Class LogStore


  • public abstract class LogStore
    extends Object
    :: DeveloperApi ::

    General interface for all critical file system operations required to read and write the Delta logs. The correctness is predicated on the atomicity and durability guarantees of the implementation of this interface. Specifically,

    1. Atomic visibility of files: If isPartialWriteVisible is false, any file written through this store must be made visible atomically. In other words, this should not generate partial files.
    2. Mutual exclusion: Only one writer must be able to create (or rename) a file at the final destination.
    3. Consistent listing: Once a file has been written in a directory, all future listings for that directory must return that file.

    All subclasses of this class are required to have a constructor that takes a Configuration as its single parameter. This constructor is used to dynamically create the LogStore.
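
    The constructor requirement can be illustrated with a minimal sketch of reflective instantiation. Everything below is an assumption made for illustration: `Configuration` and `LogStoreBase` are hypothetical stand-ins for org.apache.hadoop.conf.Configuration and LogStore so that the example runs without Hadoop or Delta on the classpath, and `MyLogStore` is an invented subclass. The code Delta actually uses to load a LogStore may differ.

```java
import java.lang.reflect.Constructor;
import java.util.HashMap;
import java.util.Map;

public class ReflectiveLogStoreDemo {
    // Hypothetical stand-in for org.apache.hadoop.conf.Configuration.
    public static class Configuration {
        private final Map<String, String> values = new HashMap<>();
        public void set(String key, String value) { values.put(key, value); }
        public String get(String key) { return values.get(key); }
    }

    // Hypothetical stand-in for the abstract LogStore base class.
    public abstract static class LogStoreBase {
        private final Configuration initHadoopConf;
        protected LogStoreBase(Configuration initHadoopConf) {
            this.initHadoopConf = initHadoopConf;
        }
        public Configuration initHadoopConf() { return initHadoopConf; }
    }

    // Illustrative subclass exposing the required single-parameter constructor.
    public static class MyLogStore extends LogStoreBase {
        public MyLogStore(Configuration initHadoopConf) { super(initHadoopConf); }
    }

    // Look up the class by name and call its (Configuration) constructor.
    // A subclass without that constructor fails here with NoSuchMethodException.
    public static LogStoreBase create(String className, Configuration conf)
            throws ReflectiveOperationException {
        Class<? extends LogStoreBase> cls =
            Class.forName(className).asSubclass(LogStoreBase.class);
        Constructor<? extends LogStoreBase> ctor = cls.getConstructor(Configuration.class);
        return ctor.newInstance(conf);
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("example.key", "example-value");
        LogStoreBase store = create(MyLogStore.class.getName(), conf);
        System.out.println(store.getClass().getSimpleName());
    }
}
```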

    LogStore and its implementations are not meant to be accessed directly; an implementation is selected through configuration according to the storage system. See https://docs.delta.io/latest/delta-storage.html for details.

    Since:
    1.0.0
    • Constructor Summary

      Constructors 
      Constructor Description
      LogStore(org.apache.hadoop.conf.Configuration initHadoopConf)  
    • Method Summary

      Modifier and Type Method Description
      org.apache.hadoop.conf.Configuration initHadoopConf()
      :: DeveloperApi :: Hadoop configuration that should only be used during initialization of LogStore.
      abstract Boolean isPartialWriteVisible(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration hadoopConf)
      :: DeveloperApi :: Whether a partial write is visible for the underlying file system of `path`.
      abstract java.util.Iterator<org.apache.hadoop.fs.FileStatus> listFrom(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration hadoopConf)
      :: DeveloperApi :: List the paths in the same directory that are lexicographically greater than or equal to (UTF-8 sorting) the given `path`.
      abstract CloseableIterator<String> read(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration hadoopConf)
      :: DeveloperApi :: Load the given file and return an `Iterator` of lines, with line breaks removed from each line.
      abstract org.apache.hadoop.fs.Path resolvePathOnPhysicalStorage(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration hadoopConf)
      :: DeveloperApi :: Resolve the fully qualified path for the given `path`.
      abstract void write(org.apache.hadoop.fs.Path path, java.util.Iterator<String> actions, Boolean overwrite, org.apache.hadoop.conf.Configuration hadoopConf)
      :: DeveloperApi :: Write the given `actions` to the given `path` with or without overwrite as indicated.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • LogStore

        public LogStore(org.apache.hadoop.conf.Configuration initHadoopConf)
    • Method Detail

      • initHadoopConf

        public org.apache.hadoop.conf.Configuration initHadoopConf()
        :: DeveloperApi :: Hadoop configuration that should only be used during initialization of LogStore. Each method should use its `hadoopConf` parameter rather than this (potentially outdated) Hadoop configuration.
      • read

        public abstract CloseableIterator<String> read(org.apache.hadoop.fs.Path path,
                                                       org.apache.hadoop.conf.Configuration hadoopConf)
                                                throws java.io.IOException
        :: DeveloperApi :: Load the given file and return an `Iterator` of lines, with line breaks removed from each line. Callers of this function are responsible for closing the iterator when they are done with it.
        Throws:
        java.io.IOException - if there's an issue resolving the FileSystem
        Since:
        1.0.0
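
        The caller's obligation to close the iterator can be met with try-with-resources. The sketch below is illustrative only: `CloseableLines` is a hypothetical stand-in for CloseableIterator<String>, assumed here to combine Iterator and Closeable, and it reads a local file through java.nio rather than a Hadoop path.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReadLinesDemo {
    // Hypothetical stand-in for CloseableIterator<String>.
    public interface CloseableLines extends Iterator<String>, Closeable {}

    // Lazily yields lines; BufferedReader.readLine() already strips line breaks.
    static final class FileLines implements CloseableLines {
        private final BufferedReader reader;
        private String next;

        FileLines(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.next = reader.readLine();
        }

        @Override public boolean hasNext() { return next != null; }

        @Override public String next() {
            if (next == null) throw new NoSuchElementException();
            String current = next;
            try {
                next = reader.readLine();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return current;
        }

        @Override public void close() throws IOException { reader.close(); }
    }

    public static CloseableLines read(Path path) throws IOException {
        BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
        try {
            return new FileLines(reader);
        } catch (IOException e) {
            reader.close(); // do not leak the handle if the first read fails
            throw e;
        }
    }

    // Caller side: the iterator is closed even if iteration throws.
    public static List<String> readAll(Path path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (CloseableLines it = read(path)) {
            while (it.hasNext()) lines.add(it.next());
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("read-demo", ".json");
        Files.write(file, Arrays.asList("{\"a\":1}", "{\"b\":2}"));
        System.out.println(readAll(file));
    }
}
```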
      • write

        public abstract void write(org.apache.hadoop.fs.Path path,
                                   java.util.Iterator<String> actions,
                                   Boolean overwrite,
                                   org.apache.hadoop.conf.Configuration hadoopConf)
                            throws java.io.IOException
        :: DeveloperApi :: Write the given `actions` to the given `path` with or without overwrite as indicated. Implementations must throw FileAlreadyExistsException if the file already exists and overwrite = false. Furthermore, if isPartialWriteVisible returns false, the implementation must ensure that the entire file is made visible atomically; that is, it must not generate partial files.
        Throws:
        java.io.IOException - if there's an issue resolving the FileSystem
        java.nio.file.FileAlreadyExistsException - if the file already exists and overwrite is false
        Since:
        1.0.0
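
        One way to picture the write contract is the following sketch on a local file system, not Delta's own implementation. It stages the content in a temporary file, then publishes it either with an atomic rename (overwrite = true) or with a hard link, which fails if the destination exists (overwrite = false). It assumes a file system that supports hard links and atomic moves within one directory and that reports an existing link target as FileAlreadyExistsException, which the java.nio documentation lists as an optional specific exception. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class AtomicWriteDemo {
    public static void write(Path path, Iterator<String> actions, boolean overwrite)
            throws IOException {
        List<String> lines = new ArrayList<>();
        actions.forEachRemaining(lines::add);

        // Stage the complete content next to the destination so that readers
        // never observe a partially written file at `path`.
        Path dir = path.toAbsolutePath().getParent();
        Path staged = Files.createTempFile(dir, ".tmp-", ".staged");
        try {
            Files.write(staged, lines, StandardCharsets.UTF_8);
            if (overwrite) {
                // Replace whatever is at `path` in a single rename.
                Files.move(staged, path,
                    StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
            } else {
                // Creating a hard link fails if `path` already exists, so only
                // one concurrent writer can publish the file.
                Files.createLink(path, staged);
            }
        } finally {
            // Remove the staging name; data published under `path` is unaffected.
            Files.deleteIfExists(staged);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("write-demo");
        Path target = dir.resolve("00000000000000000000.json");
        write(target, Arrays.asList("{\"a\":1}", "{\"b\":2}").iterator(), false);
        System.out.println(Files.readAllLines(target));
    }
}
```

        A second call with overwrite = false on the same path is expected to fail with FileAlreadyExistsException under the stated assumptions, leaving the first file intact.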
      • listFrom

        public abstract java.util.Iterator<org.apache.hadoop.fs.FileStatus> listFrom(org.apache.hadoop.fs.Path path,
                                                                                     org.apache.hadoop.conf.Configuration hadoopConf)
                                                                              throws java.io.IOException
        :: DeveloperApi :: List the paths in the same directory that are lexicographically greater than or equal to (UTF-8 sorting) the given `path`. The result should also be sorted by the file name.
        Throws:
        java.io.IOException - if there's an issue resolving the FileSystem
        java.io.FileNotFoundException - if the directory of `path` can't be found
        Since:
        1.0.0
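
        The listing semantics can be sketched on a local directory as follows. This is an illustration with hypothetical names, returning java.nio paths instead of FileStatus objects. It compares names with String.compareTo, which orders by UTF-16 code units; that matches UTF-8 byte order for ASCII names such as the zero-padded file names used in the example, but the two orders can differ for some non-ASCII names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class ListFromDemo {
    public static Iterator<Path> listFrom(Path path) throws IOException {
        Path dir = path.toAbsolutePath().getParent();
        String start = path.getFileName().toString();
        // Close the directory stream before returning; results are materialized.
        try (Stream<Path> entries = Files.list(dir)) {
            List<Path> result = entries
                // Keep names that sort at or after the requested file name.
                .filter(p -> p.getFileName().toString().compareTo(start) >= 0)
                // Return them sorted by file name.
                .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
                .collect(Collectors.toList());
            return result.iterator();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("list-demo");
        Files.createFile(dir.resolve("00000000000000000002.json"));
        Files.createFile(dir.resolve("00000000000000000000.json"));
        Files.createFile(dir.resolve("00000000000000000001.json"));
        Iterator<Path> it = listFrom(dir.resolve("00000000000000000001.json"));
        while (it.hasNext()) System.out.println(it.next().getFileName());
    }
}
```

        The requested path itself does not have to exist; it only marks where the listing starts.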
      • resolvePathOnPhysicalStorage

        public abstract org.apache.hadoop.fs.Path resolvePathOnPhysicalStorage(org.apache.hadoop.fs.Path path,
                                                                               org.apache.hadoop.conf.Configuration hadoopConf)
                                                                        throws java.io.IOException
        :: DeveloperApi :: Resolve the fully qualified path for the given `path`.
        Throws:
        java.io.IOException - if there's an issue resolving the FileSystem
        Since:
        1.0.0
      • isPartialWriteVisible

        public abstract Boolean isPartialWriteVisible(org.apache.hadoop.fs.Path path,
                                                      org.apache.hadoop.conf.Configuration hadoopConf)
                                               throws java.io.IOException
        :: DeveloperApi :: Whether a partial write is visible for the underlying file system of `path`.
        Throws:
        java.io.IOException - if there's an issue resolving the FileSystem
        Since:
        1.0.0