class GCSLogStore extends HadoopFileSystemLogStore with Logging
:: Unstable ::
The LogStore implementation for GCS, which uses gcs-connector to provide the necessary atomicity and durability guarantees:
1. Atomic Visibility: Read/read-after-metadata-update/delete are strongly consistent for GCS.
2. Consistent Listing: GCS guarantees strong consistency for both object and bucket listing operations. https://cloud.google.com/storage/docs/consistency
3. Mutual Exclusion: Preconditions are used to handle race conditions.
Regarding file creation, this implementation:
- Throws FileAlreadyExistsException if the file exists and overwrite is false.
- Opens a stream to write to GCS otherwise.
- Assumes file writing to be all-or-nothing, irrespective of the overwrite option.
- Annotations: @Unstable()
- Note: This class is not meant for direct access; it is selected through configuration based on the storage system. See https://docs.delta.io/latest/delta-storage.html for details.
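A minimal configuration sketch, not part of this API doc: the `spark.delta.logStore.class` key and the fully qualified class name below are assumptions to be checked against the linked storage docs for your Delta version, and the bucket path is a placeholder.

```scala
import org.apache.spark.sql.SparkSession

// Sketch: select GCSLogStore via Spark configuration instead of
// instantiating it directly. Class name and config keys are assumptions;
// verify them against the storage docs for your Delta release.
val spark = SparkSession.builder()
  .appName("delta-on-gcs")
  .config("spark.delta.logStore.class",
    "org.apache.spark.sql.delta.storage.GCSLogStore")
  // The gcs-connector must be on the classpath to serve the gs:// scheme.
  .config("spark.hadoop.fs.gs.impl",
    "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
  .getOrCreate()

// Placeholder bucket/table path.
spark.range(5).write.format("delta").save("gs://my-bucket/delta-table")
```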
- Inheritance: GCSLogStore → Logging → HadoopFileSystemLogStore → LogStore → AnyRef → Any
Instance Constructors
- new GCSLogStore(sparkConf: SparkConf, initHadoopConf: Configuration)
Value Members
- final def !=(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def ##(): Int
  - Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- final def asInstanceOf[T0]: T0
  - Definition Classes: Any
- def clone(): AnyRef
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()
- def createTempPath(path: Path): Path
  - Attributes: protected
  - Definition Classes: HadoopFileSystemLogStore
- final def eq(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  - Definition Classes: AnyRef → Any
- def finalize(): Unit
  - Attributes: protected[lang]
  - Definition Classes: AnyRef
  - Annotations: @throws( classOf[java.lang.Throwable] )
- final def getClass(): Class[_]
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def getHadoopConfiguration: Configuration
  - Attributes: protected
  - Definition Classes: HadoopFileSystemLogStore
- def hashCode(): Int
  - Definition Classes: AnyRef → Any
  - Annotations: @native()
- def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
  - Attributes: protected
  - Definition Classes: Logging
- def initializeLogIfNecessary(isInterpreter: Boolean): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def invalidateCache(): Unit
  Invalidate any caching that the implementation may be using.
  - Definition Classes: GCSLogStore → HadoopFileSystemLogStore → LogStore
- final def isInstanceOf[T0]: Boolean
  - Definition Classes: Any
- def isPartialWriteVisible(path: Path, hadoopConf: Configuration): Boolean
  Whether a partial write is visible when writing to `path`. As this depends on the underlying file system implementation, we require the input of `path` here in order to identify the underlying file system, even though in most cases a log store only deals with one file system. The default value is only provided here for legacy reasons and will be removed; any LogStore implementation should override this instead of relying on the default.
  Note: The default implementation ignores the `hadoopConf` parameter to provide backward compatibility. Subclasses should override this method and use `hadoopConf` properly to support passing Hadoop file system configurations through DataFrame options.
  - Definition Classes: GCSLogStore → LogStore
- def isPartialWriteVisible(path: Path): Boolean
  Whether a partial write is visible when writing to `path`. As this depends on the underlying file system implementation, we require the input of `path` here in order to identify the underlying file system, even though in most cases a log store only deals with one file system. The default value is only provided here for legacy reasons and will be removed; any LogStore implementation should override this instead of relying on the default.
  - Definition Classes: GCSLogStore → LogStore
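A hypothetical sketch of how a caller might branch on this flag; the helper and its temp-path naming are invented for illustration. Given the class doc's all-or-nothing write guarantee, one would expect the GCS implementation to report that partial writes are not visible.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.delta.storage.LogStore

// Choose a write target based on partial-write visibility (illustrative).
def writeTarget(logStore: LogStore, path: Path, conf: Configuration): Path =
  if (logStore.isPartialWriteVisible(path, conf)) {
    // Readers could observe a half-written file: stage to a temporary
    // path first and rename into place afterwards (naming is invented).
    new Path(path.getParent, s".${path.getName}.tmp")
  } else {
    // Writes become visible atomically: write straight to the destination.
    path
  }
```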
- def isTraceEnabled(): Boolean
  - Attributes: protected
  - Definition Classes: Logging
- def listFrom(path: Path, hadoopConf: Configuration): Iterator[FileStatus]
  List the paths in the same directory that are lexicographically greater or equal to (UTF-8 sorting) the given `path`. The result should also be sorted by the file name.
  Note: The default implementation ignores the `hadoopConf` parameter to provide backward compatibility. Subclasses should override this method and use `hadoopConf` properly to support passing Hadoop file system configurations through DataFrame options.
  - Definition Classes: HadoopFileSystemLogStore → LogStore
- def listFrom(path: Path): Iterator[FileStatus]
  List the paths in the same directory that are lexicographically greater or equal to (UTF-8 sorting) the given `path`. The result should also be sorted by the file name.
  - Definition Classes: HadoopFileSystemLogStore → LogStore
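A minimal sketch of listing Delta log entries at or after a given version, under the assumption that commit file names are zero-padded to 20 digits (which is what makes UTF-8 ordering line up with version ordering); the helper name and bucket path are placeholders.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileStatus, Path}
import org.apache.spark.sql.delta.storage.LogStore

// List Delta log entries at or after `startVersion` in a table's log dir.
def commitsFrom(logStore: LogStore, logDir: Path, startVersion: Long): Iterator[FileStatus] = {
  // Delta commit files are zero-padded to 20 digits, e.g.
  // 00000000000000000010.json, so lexicographic order matches version order.
  val startFile = new Path(logDir, f"$startVersion%020d.json")
  logStore.listFrom(startFile, new Configuration())
}

// Usage with a placeholder bucket path:
// commitsFrom(store, new Path("gs://my-bucket/table/_delta_log"), 10L)
//   .foreach(s => println(s.getPath))
```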
- def log: Logger
  - Attributes: protected
  - Definition Classes: Logging
- def logDebug(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logDebug(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logError(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logError(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logInfo(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logInfo(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logName: String
  - Attributes: protected
  - Definition Classes: Logging
- def logTrace(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logTrace(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logWarning(msg: ⇒ String, throwable: Throwable): Unit
  - Attributes: protected
  - Definition Classes: Logging
- def logWarning(msg: ⇒ String): Unit
  - Attributes: protected
  - Definition Classes: Logging
- final def ne(arg0: AnyRef): Boolean
  - Definition Classes: AnyRef
- final def notify(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- final def notifyAll(): Unit
  - Definition Classes: AnyRef
  - Annotations: @native()
- val preconditionFailedExceptionMessage: String
- def read(path: Path, hadoopConf: Configuration): Seq[String]
  Load the given file and return a `Seq` of lines. The line break will be removed from each line. This method loads the entire file into memory; call `readAsIterator` if possible, as its implementation may be more efficient.
  Note: The default implementation ignores the `hadoopConf` parameter to provide backward compatibility. Subclasses should override this method and use `hadoopConf` properly to support passing Hadoop file system configurations through DataFrame options.
  - Definition Classes: HadoopFileSystemLogStore → LogStore
- def read(path: Path): Seq[String]
  Load the given file and return a `Seq` of lines. The line break will be removed from each line. This method loads the entire file into memory; call `readAsIterator` if possible, as its implementation may be more efficient.
  - Definition Classes: HadoopFileSystemLogStore → LogStore
- def readAsIterator(path: Path, hadoopConf: Configuration): ClosableIterator[String]
  Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls `read` to load the entire file into memory; an implementation should provide a more efficient approach if possible, for example by loading the file content on demand.
  Note: the returned ClosableIterator should be closed when it is no longer used, to avoid resource leaks.
  Note: The default implementation ignores the `hadoopConf` parameter to provide backward compatibility. Subclasses should override this method and use `hadoopConf` properly to support passing Hadoop file system configurations through DataFrame options.
  - Definition Classes: HadoopFileSystemLogStore → LogStore
- def readAsIterator(path: Path): ClosableIterator[String]
  Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls `read` to load the entire file into memory; an implementation should provide a more efficient approach if possible, for example by loading the file content on demand.
  Note: the returned ClosableIterator should be closed when it is no longer used, to avoid resource leaks.
  - Definition Classes: HadoopFileSystemLogStore → LogStore
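A sketch contrasting the two read paths, assuming a hypothetical helper and an already-configured LogStore; note the try/finally around the iterator per the resource-leak warning above.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.delta.storage.LogStore

// Illustrative helper: read one commit file eagerly, then stream it.
def printCommit(logStore: LogStore, commit: Path): Unit = {
  val conf = new Configuration()

  // Eager: fine for small commit files; loads every line into memory.
  val lines: Seq[String] = logStore.read(commit, conf)
  println(s"${lines.size} actions in $commit")

  // Streaming: preferred for large files; always close the iterator.
  val it = logStore.readAsIterator(commit, conf)
  try {
    it.foreach(line => println(line.take(80)))
  } finally {
    it.close()
  }
}
```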
- def resolvePathOnPhysicalStorage(path: Path, hadoopConf: Configuration): Path
  Resolve the fully qualified path for the given `path`.
  Note: The default implementation ignores the `hadoopConf` parameter to provide backward compatibility. Subclasses should override this method and use `hadoopConf` properly to support passing Hadoop file system configurations through DataFrame options.
  - Definition Classes: HadoopFileSystemLogStore → LogStore
- def resolvePathOnPhysicalStorage(path: Path): Path
  Resolve the fully qualified path for the given `path`.
  - Definition Classes: HadoopFileSystemLogStore → LogStore
- final def synchronized[T0](arg0: ⇒ T0): T0
  - Definition Classes: AnyRef
- def toString(): String
  - Definition Classes: AnyRef → Any
- final def wait(): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  - Definition Classes: AnyRef
  - Annotations: @throws( ... ) @native()
- def write(path: Path, actions: Iterator[String], overwrite: Boolean, hadoopConf: Configuration): Unit
  Write the given `actions` to the given `path`, with or without overwrite as indicated. Implementations must throw a java.nio.file.FileAlreadyExistsException if the file already exists and overwrite = false. Furthermore, implementations must ensure that the entire file is made visible atomically; that is, they should not generate partial files.
  Note: The default implementation ignores the `hadoopConf` parameter to provide backward compatibility. Subclasses should override this method and use `hadoopConf` properly to support passing Hadoop file system configurations through DataFrame options.
  - Definition Classes: GCSLogStore → LogStore
- def write(path: Path, actions: Iterator[String], overwrite: Boolean = false): Unit
  Write the given `actions` to the given `path`, with or without overwrite as indicated. Implementations must throw a java.nio.file.FileAlreadyExistsException if the file already exists and overwrite = false. Furthermore, implementations must ensure that the entire file is made visible atomically; that is, they should not generate partial files.
  - Definition Classes: GCSLogStore → LogStore
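A sketch of the mutual-exclusion contract described above: two writers racing to create the same commit file, where exactly one succeeds and the loser sees FileAlreadyExistsException. The helper name, version-padding assumption, and action payload are invented for illustration.

```scala
import java.nio.file.FileAlreadyExistsException
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.delta.storage.LogStore

// Try to commit a version; returns false if another writer got there first.
def tryCommit(logStore: LogStore, logDir: Path, version: Long,
    actions: Iterator[String]): Boolean = {
  val commit = new Path(logDir, f"$version%020d.json")  // assumed naming
  try {
    logStore.write(commit, actions, overwrite = false,
      hadoopConf = new Configuration())
    true
  } catch {
    case _: FileAlreadyExistsException =>
      // Another writer won the race. On GCS this surfaces as a failed
      // precondition rather than a partial or corrupted file.
      false
  }
}
```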
- def writeWithRename(path: Path, actions: Iterator[String], overwrite: Boolean, hadoopConf: Configuration): Unit
  An internal write implementation that uses FileSystem.rename(). This implementation should only be used for underlying file systems that support atomic renames, e.g., Azure is OK but S3 is not.
  - Attributes: protected
  - Definition Classes: HadoopFileSystemLogStore
Deprecated Value Members
- final def listFrom(path: String): Iterator[FileStatus]
  List the paths in the same directory that are lexicographically greater or equal to (UTF-8 sorting) the given `path`. The result should also be sorted by the file name.
  - Definition Classes: LogStore
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- final def read(path: String): Seq[String]
  Load the given file and return a `Seq` of lines. The line break will be removed from each line. This method loads the entire file into memory; call `readAsIterator` if possible, as its implementation may be more efficient.
  - Definition Classes: LogStore
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- final def readAsIterator(path: String): ClosableIterator[String]
  Load the given file and return an iterator of lines. The line break will be removed from each line. The default implementation calls `read` to load the entire file into memory; an implementation should provide a more efficient approach if possible, for example by loading the file content on demand.
  - Definition Classes: LogStore
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- final def write(path: String, actions: Iterator[String]): Unit
  Write the given `actions` to the given `path` without overwriting any existing file. Implementations must throw a java.nio.file.FileAlreadyExistsException if the file already exists. Furthermore, implementations must ensure that the entire file is made visible atomically; that is, they should not generate partial files.
  - Definition Classes: LogStore
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead
- def writeWithRename(path: Path, actions: Iterator[String], overwrite: Boolean = false): Unit
  An internal write implementation that uses FileSystem.rename(). This implementation should only be used for underlying file systems that support atomic renames, e.g., Azure is OK but S3 is not.
  - Attributes: protected
  - Definition Classes: HadoopFileSystemLogStore
  - Annotations: @deprecated
  - Deprecated: call the method that asks for a Hadoop Configuration object instead