io.delta.storage

GCSLogStore

class GCSLogStore extends HadoopFileSystemLogStore with Logging

:: Unstable ::

The LogStore implementation for GCS, which uses gcs-connector to provide the necessary atomicity and durability guarantees:

1. Atomic Visibility: Reads, reads-after-metadata-updates, and deletes are strongly consistent for GCS.

2. Consistent Listing: GCS guarantees strong consistency for both object and bucket listing operations. https://cloud.google.com/storage/docs/consistency

3. Mutual Exclusion: Preconditions are used to handle race conditions.

Regarding file creation, this implementation:

- Throws FileAlreadyExistsException if the file exists and overwrite is false.
- Opens a stream to write to GCS otherwise.
- Assumes file writing to be all-or-nothing, irrespective of the overwrite option.

Annotations
@Unstable()
Note

This class is not meant for direct access but for configuration based on storage system. See https://docs.delta.io/latest/delta-storage.html for details.
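
For example, a minimal configuration sketch per the delta-storage documentation linked above (the bucket and table path are hypothetical):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("delta-gcs-example")
      // Route transaction-log I/O for gs:// paths through GCSLogStore.
      .config("spark.delta.logStore.gs.impl", "io.delta.storage.GCSLogStore")
      .getOrCreate()

    // Writes to a gs:// Delta table now get the guarantees described above.
    spark.range(10).write.format("delta").save("gs://my-bucket/my-table")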

Linear Supertypes
Logging, HadoopFileSystemLogStore, org.apache.spark.sql.delta.storage.LogStore, AnyRef, Any

Instance Constructors

  1. new GCSLogStore(sparkConf: SparkConf, initHadoopConf: Configuration)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. def createTempPath(path: Path): Path
    Attributes
    protected
    Definition Classes
    HadoopFileSystemLogStore
  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. def getHadoopConfiguration: Configuration
    Attributes
    protected
    Definition Classes
    HadoopFileSystemLogStore
  12. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  13. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  14. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  15. def invalidateCache(): Unit

    Invalidate any caching that the implementation may be using

    Definition Classes
    GCSLogStore → HadoopFileSystemLogStore → LogStore
  16. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  17. def isPartialWriteVisible(path: Path, hadoopConf: Configuration): Boolean

    Whether a partial write is visible when writing to path.

    As this depends on the underlying file system implementations, we require the input of path here in order to identify the underlying file system, even though in most cases a log store only deals with one file system.

    The default value is provided here only for legacy reasons and will be removed. Any LogStore implementation should override this method instead of relying on the default.

    Note: The default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    GCSLogStore → LogStore
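
    As an illustrative sketch only (the helper name is hypothetical), a caller can branch on this to decide whether an indirect temp-file-and-rename write is required:

      import org.apache.hadoop.conf.Configuration
      import org.apache.hadoop.fs.Path
      import org.apache.spark.sql.delta.storage.LogStore

      // Hypothetical helper: if partial writes can be observed at this path,
      // the caller must stage the file elsewhere and rename it into place.
      def mustStageAndRename(store: LogStore, path: Path, conf: Configuration): Boolean =
        store.isPartialWriteVisible(path, conf)
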
  18. def isPartialWriteVisible(path: Path): Boolean

    Whether a partial write is visible when writing to path.

    As this depends on the underlying file system implementations, we require the input of path here in order to identify the underlying file system, even though in most cases a log store only deals with one file system.

    The default value is provided here only for legacy reasons and will be removed. Any LogStore implementation should override this method instead of relying on the default.

    Definition Classes
    GCSLogStore → LogStore
  19. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  20. def listFrom(path: Path, hadoopConf: Configuration): Iterator[FileStatus]

    List the paths in the same directory that are lexicographically greater than or equal to (by UTF-8 sorting) the given path. The result should also be sorted by file name.

    Note: The default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
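
    Illustrative usage (in practice the store is obtained via configuration, not constructed directly; the bucket, table, and version are hypothetical):

      import org.apache.hadoop.conf.Configuration
      import org.apache.hadoop.fs.Path
      import org.apache.spark.SparkConf

      val hadoopConf = new Configuration()
      val store = new GCSLogStore(new SparkConf(), hadoopConf)
      val logDir = new Path("gs://my-bucket/my-table/_delta_log")
      // Entries in _delta_log whose names are >= version 10's file, sorted by name.
      store.listFrom(new Path(logDir, "00000000000000000010.json"), hadoopConf)
        .foreach(status => println(status.getPath))
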
  21. def listFrom(path: Path): Iterator[FileStatus]

    List the paths in the same directory that are lexicographically greater than or equal to (by UTF-8 sorting) the given path. The result should also be sorted by file name.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  22. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  23. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  24. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  25. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  26. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  27. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  28. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  29. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  30. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  31. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  32. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  33. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  34. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  35. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  36. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  37. val preconditionFailedExceptionMessage: String
  38. def read(path: Path, hadoopConf: Configuration): Seq[String]

    Load the given file and return a Seq of lines. The line break will be removed from each line. This method loads the entire file into memory. Call readAsIterator if possible, as its implementation may be more efficient.

    Note: The default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
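
    For example (hypothetical path; store and hadoopConf as in the listFrom sketch above):

      import org.apache.hadoop.fs.Path

      val commit = new Path("gs://my-bucket/my-table/_delta_log/00000000000000000000.json")
      // The whole file in memory: one String per line, line breaks stripped.
      val lines: Seq[String] = store.read(commit, hadoopConf)
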
  39. def read(path: Path): Seq[String]

    Load the given file and return a Seq of lines. The line break will be removed from each line. This method loads the entire file into memory. Call readAsIterator if possible, as its implementation may be more efficient.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  40. def readAsIterator(path: Path, hadoopConf: Configuration): ClosableIterator[String]

    Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls read to load the entire file into memory. An implementation should provide a more efficient approach if possible; for example, the file content can be loaded on demand.

    Note: the returned ClosableIterator should be closed when it is no longer used to avoid a resource leak.

    Note: The default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
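
    A sketch of the close-after-use pattern the note calls for (hypothetical path; store and hadoopConf as above):

      import org.apache.hadoop.fs.Path

      val it = store.readAsIterator(
        new Path("gs://my-bucket/my-table/_delta_log/00000000000000000000.json"), hadoopConf)
      try {
        while (it.hasNext) println(it.next())
      } finally {
        it.close() // release the underlying stream to avoid a resource leak
      }
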
  41. def readAsIterator(path: Path): ClosableIterator[String]

    Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls read to load the entire file into memory. An implementation should provide a more efficient approach if possible; for example, the file content can be loaded on demand.

    Note: the returned ClosableIterator should be closed when it is no longer used to avoid a resource leak.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  42. def resolvePathOnPhysicalStorage(path: Path, hadoopConf: Configuration): Path

    Resolve the fully qualified path for the given path.

    Note: The default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  43. def resolvePathOnPhysicalStorage(path: Path): Path

    Resolve the fully qualified path for the given path.

    Definition Classes
    HadoopFileSystemLogStore → LogStore
  44. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  45. def toString(): String
    Definition Classes
    AnyRef → Any
  46. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  47. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  48. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  49. def write(path: Path, actions: Iterator[String], overwrite: Boolean, hadoopConf: Configuration): Unit

    Write the given actions to the given path, with or without overwrite as indicated. Implementations must throw java.nio.file.FileAlreadyExistsException if the file already exists and overwrite = false. Furthermore, implementations must ensure that the entire file is made visible atomically; that is, they should not generate partial files.

    Note: The default implementation ignores the hadoopConf parameter for backward compatibility. Subclasses should override this method and use hadoopConf properly to support passing Hadoop file system configurations through DataFrame options.

    Definition Classes
    GCSLogStore → LogStore
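
    A sketch of an exclusive, commit-style write (path and content hypothetical; store and hadoopConf as above); the losing writer in a race observes FileAlreadyExistsException:

      import java.nio.file.FileAlreadyExistsException
      import org.apache.hadoop.fs.Path

      val next = new Path("gs://my-bucket/my-table/_delta_log/00000000000000000011.json")
      try {
        // overwrite = false: the file becomes visible atomically,
        // or the call fails if the file already exists.
        store.write(next, Iterator("""{"commitInfo":{}}"""), false, hadoopConf)
      } catch {
        case _: FileAlreadyExistsException =>
          println(s"$next was committed first by another writer")
      }
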
  50. def write(path: Path, actions: Iterator[String], overwrite: Boolean = false): Unit

    Write the given actions to the given path, with or without overwrite as indicated. Implementations must throw java.nio.file.FileAlreadyExistsException if the file already exists and overwrite = false. Furthermore, implementations must ensure that the entire file is made visible atomically; that is, they should not generate partial files.

    Definition Classes
    GCSLogStore → LogStore
  51. def writeWithRename(path: Path, actions: Iterator[String], overwrite: Boolean, hadoopConf: Configuration): Unit

    An internal write implementation that uses FileSystem.rename().

    This implementation should only be used with underlying file systems that support atomic renames, e.g., Azure is OK but HDFS is not.

    Attributes
    protected
    Definition Classes
    HadoopFileSystemLogStore
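
    The shape of the technique, as an assumed illustration (this method is protected, so the standalone helper below is hypothetical):

      import java.io.IOException
      import java.util.UUID
      import org.apache.hadoop.conf.Configuration
      import org.apache.hadoop.fs.Path

      def renameBasedWrite(path: Path, actions: Iterator[String], conf: Configuration): Unit = {
        val fs = path.getFileSystem(conf)
        // Stage the content in a temp file next to the destination.
        val temp = new Path(path.getParent, s".${UUID.randomUUID()}.tmp")
        val out = fs.create(temp)
        try actions.foreach(line => out.write((line + "\n").getBytes("UTF-8")))
        finally out.close()
        // With an atomic rename, the destination appears all at once or not at all.
        if (!fs.rename(temp, path)) {
          throw new IOException(s"Failed to rename $temp to $path")
        }
      }
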

Deprecated Value Members

  1. final def listFrom(path: String): Iterator[FileStatus]

    List the paths in the same directory that are lexicographically greater than or equal to (by UTF-8 sorting) the given path. The result should also be sorted by file name.

    Definition Classes
    LogStore
    Annotations
    @deprecated
    Deprecated

    call the method that asks for a Hadoop Configuration object instead

  2. final def read(path: String): Seq[String]

    Load the given file and return a Seq of lines. The line break will be removed from each line. This method loads the entire file into memory. Call readAsIterator if possible, as its implementation may be more efficient.

    Definition Classes
    LogStore
    Annotations
    @deprecated
    Deprecated

    call the method that asks for a Hadoop Configuration object instead

  3. final def readAsIterator(path: String): ClosableIterator[String]

    Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls read to load the entire file into memory. An implementation should provide a more efficient approach if possible; for example, the file content can be loaded on demand.

    Definition Classes
    LogStore
    Annotations
    @deprecated
    Deprecated

    call the method that asks for a Hadoop Configuration object instead

  4. final def write(path: String, actions: Iterator[String]): Unit

    Write the given actions to the given path without overwriting any existing file. Implementations must throw java.nio.file.FileAlreadyExistsException if the file already exists. Furthermore, implementations must ensure that the entire file is made visible atomically; that is, they should not generate partial files.

    Definition Classes
    LogStore
    Annotations
    @deprecated
    Deprecated

    call the method that asks for a Hadoop Configuration object instead

  5. def writeWithRename(path: Path, actions: Iterator[String], overwrite: Boolean = false): Unit

    An internal write implementation that uses FileSystem.rename().

    This implementation should only be used with underlying file systems that support atomic renames, e.g., Azure is OK but HDFS is not.

    Attributes
    protected
    Definition Classes
    HadoopFileSystemLogStore
    Annotations
    @deprecated
    Deprecated

    call the method that asks for a Hadoop Configuration object instead
