class DeltaTable extends DeltaTableOperations with Serializable

Main class for programmatically interacting with Delta tables. You can create DeltaTable instances using the static methods.

DeltaTable.forPath(sparkSession, pathToTheDeltaTable)
Since

0.3.0

Linear Supertypes
Serializable, Serializable, DeltaTableOperations, AnalysisHelper, AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def addFeatureSupport(featureName: String): Unit

    Modify the protocol to add a supported feature, and if the table does not support table features, upgrade the protocol automatically. In that case, if the provided feature is writer-only, the table's writer version is upgraded to 7; if the provided feature is reader-writer, both reader and writer versions are upgraded, to (3, 7).

    See online documentation and Delta's protocol specification at PROTOCOL.md for more details.

    Since

    2.3.0
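
    The upgrade rule described for addFeatureSupport can be sketched as a small standalone model. This is an illustration only, not Delta Lake's implementation: Protocol, ProtocolUpgradeSketch and upgradeFor are hypothetical names, and the model assumes versions are only ever raised, never lowered.

    ```scala
    // Illustrative model only; these names are not part of the Delta Lake API.
    final case class Protocol(readerVersion: Int, writerVersion: Int)

    object ProtocolUpgradeSketch {
      // Versions named in the description: writer 7, and reader 3 for reader-writer features.
      val TableFeaturesWriterVersion = 7
      val TableFeaturesReaderVersion = 3

      // Returns the protocol versions after adding one feature.
      // readerWriterFeature = false models a writer-only feature.
      def upgradeFor(current: Protocol, readerWriterFeature: Boolean): Protocol = {
        val writer = math.max(current.writerVersion, TableFeaturesWriterVersion)
        val reader =
          if (readerWriterFeature) math.max(current.readerVersion, TableFeaturesReaderVersion)
          else current.readerVersion
        Protocol(reader, writer)
      }
    }
    ```

    On an actual table the call itself is just deltaTable.addFeatureSupport(featureName); the valid feature names are the ones listed in PROTOCOL.md.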

  5. def alias(alias: String): DeltaTable

    Apply an alias to the DeltaTable. This is similar to Dataset.as(alias) or SQL tableName AS alias.

    Since

    0.3.0

  6. def as(alias: String): DeltaTable

    Apply an alias to the DeltaTable. This is similar to Dataset.as(alias) or SQL tableName AS alias.

    Since

    0.3.0

  7. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  8. def clone(target: String, isShallow: Boolean): DeltaTable

    Clone a DeltaTable to a given destination to mirror the existing table's data and metadata.

    An example would be

    deltaTable.clone(
      "/some/path/to/table",
      true)
    target

    The path or table name to create the clone.

    isShallow

    Whether to create a shallow clone or a deep clone.

    Since

    3.3.0

  9. def clone(target: String, isShallow: Boolean, replace: Boolean): DeltaTable

    Clone a DeltaTable to a given destination to mirror the existing table's data and metadata.

    An example would be

    deltaTable.clone(
      "/some/path/to/table",
      true,
      true)
    target

    The path or table name to create the clone.

    isShallow

    Whether to create a shallow clone or a deep clone.

    replace

    Whether to replace the destination with the clone command.

    Since

    3.3.0

  10. def clone(target: String, isShallow: Boolean, replace: Boolean, properties: Map[String, String]): DeltaTable

    Clone a DeltaTable to a given destination to mirror the existing table's data and metadata.

    Specifying properties here means that the target will override any properties with the same key in the source table with the user-defined properties.

    An example would be

    deltaTable.clone(
     "/some/path/to/table",
     true,
     true,
     Map("foo" -> "bar"))
    target

    The path or table name to create the clone.

    isShallow

    Whether to create a shallow clone or a deep clone.

    replace

    Whether to replace the destination with the clone command.

    properties

    The table properties to override in the clone.

    Since

    3.3.0
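
    The effect of the properties argument on clashing keys can be pictured as a plain map merge. The sketch below is a model of that rule only, not the clone implementation; ClonePropertiesSketch and effectiveProperties are hypothetical names.

    ```scala
    object ClonePropertiesSketch {
      // Hypothetical helper: models which value wins for each property key
      // when user-defined properties are supplied to a clone.
      def effectiveProperties(
          sourceProperties: Map[String, String],
          overrides: Map[String, String]): Map[String, String] =
        sourceProperties ++ overrides // on a key clash, the right-hand (user-defined) value is kept
    }
    ```

    Keys present only in the source table are carried over unchanged in this model; only keys that appear in both maps take the user-defined value.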

  11. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  12. def cloneAtTimestamp(timestamp: String, target: String, isShallow: Boolean): DeltaTable

    Clone a DeltaTable at a specific timestamp to a given destination to mirror the existing table's data and metadata at that timestamp.

    Timestamp can be of the format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.

    An example would be

    deltaTable.cloneAtTimestamp(
      "2019-01-01",
      "/some/path/to/table",
      true)
    timestamp

    The timestamp of this table to clone from.

    target

    The path or table name to create the clone.

    isShallow

    Whether to create a shallow clone or a deep clone.

    Since

    3.3.0

  13. def cloneAtTimestamp(timestamp: String, target: String, isShallow: Boolean, replace: Boolean): DeltaTable

    Clone a DeltaTable at a specific timestamp to a given destination to mirror the existing table's data and metadata at that timestamp.

    Timestamp can be of the format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.

    An example would be

    deltaTable.cloneAtTimestamp(
      "2019-01-01",
      "/some/path/to/table",
      true,
      true)
    timestamp

    The timestamp of this table to clone from.

    target

    The path or table name to create the clone.

    isShallow

    Whether to create a shallow clone or a deep clone.

    replace

    Whether to replace the destination with the clone command.

    Since

    3.3.0

  14. def cloneAtTimestamp(timestamp: String, target: String, isShallow: Boolean, replace: Boolean, properties: Map[String, String]): DeltaTable

    Clone a DeltaTable at a specific timestamp to a given destination to mirror the existing table's data and metadata at that timestamp.

    Timestamp can be of the format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.

    Specifying properties here means that the target will override any properties with the same key in the source table with the user-defined properties.

    An example would be

    deltaTable.cloneAtTimestamp(
      "2019-01-01",
      "/some/path/to/table",
      true,
      true,
      Map("foo" -> "bar"))
    timestamp

    The timestamp of this table to clone from.

    target

    The path or table name to create the clone.

    isShallow

    Whether to create a shallow clone or a deep clone.

    replace

    Whether to replace the destination with the clone command.

    properties

    The table properties to override in the clone.

    Since

    3.3.0
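
    Since the timestamp is passed as a string, it can be useful to check it against the two documented formats before calling cloneAtTimestamp. The following is a minimal sketch using java.time; TimestampFormatSketch and isAcceptedFormat are hypothetical names, and the check covers only yyyy-MM-dd and yyyy-MM-dd HH:mm:ss (the underlying parser may accept other forms).

    ```scala
    import java.time.{LocalDate, LocalDateTime}
    import java.time.format.DateTimeFormatter
    import scala.util.Try

    object TimestampFormatSketch {
      private val dateOnly = DateTimeFormatter.ofPattern("yyyy-MM-dd")
      private val dateTime = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

      // True when the string parses as one of the two documented formats.
      def isAcceptedFormat(timestamp: String): Boolean =
        Try(LocalDate.parse(timestamp, dateOnly)).isSuccess ||
          Try(LocalDateTime.parse(timestamp, dateTime)).isSuccess
    }
    ```

    A caller could guard the clone with this check, for example by only invoking deltaTable.cloneAtTimestamp(timestamp, target, isShallow) when isAcceptedFormat(timestamp) is true.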

  15. def cloneAtVersion(version: Long, target: String, isShallow: Boolean): DeltaTable

    Clone a DeltaTable at a specific version to a given destination to mirror the existing table's data and metadata at that version.

    An example would be

    deltaTable.cloneAtVersion(
      5,
      "/some/path/to/table",
      true)
    version

    The version of this table to clone from.

    target

    The path or table name to create the clone.

    isShallow

    Whether to create a shallow clone or a deep clone.

    Since

    3.3.0

  16. def cloneAtVersion(version: Long, target: String, isShallow: Boolean, replace: Boolean): DeltaTable

    Clone a DeltaTable at a specific version to a given destination to mirror the existing table's data and metadata at that version.

    An example would be

    deltaTable.cloneAtVersion(
      5,
      "/some/path/to/table",
      true,
      true)
    version

    The version of this table to clone from.

    target

    The path or table name to create the clone.

    isShallow

    Whether to create a shallow clone or a deep clone.

    replace

    Whether to replace the destination with the clone command.

    Since

    3.3.0

  17. def cloneAtVersion(version: Long, target: String, isShallow: Boolean, replace: Boolean, properties: Map[String, String]): DeltaTable

    Clone a DeltaTable at a specific version to a given destination to mirror the existing table's data and metadata at that version.

    Specifying properties here means that the target will override any properties with the same key in the source table with the user-defined properties.

    An example would be

    deltaTable.cloneAtVersion(
      5,
      "/some/path/to/table",
      true,
      true,
      Map("foo" -> "bar"))
    version

    The version of this table to clone from.

    target

    The path or table name to create the clone.

    isShallow

    Whether to create a shallow clone or a deep clone.

    replace

    Whether to replace the destination with the clone command.

    properties

    The table properties to override in the clone.

    Since

    3.3.0

  18. def delete(): Unit

    Delete data from the table.

    Since

    0.3.0

  19. def delete(condition: Column): Unit

    Delete data from the table that matches the given condition.

    condition

    Boolean SQL expression

    Since

    0.3.0

  20. def delete(condition: String): Unit

    Delete data from the table that matches the given condition.

    condition

    Boolean SQL expression

    Since

    0.3.0
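
    The two conditional delete overloads take the same predicate in different forms. Below is a minimal sketch showing both; it assumes the table has a date column (as in the update examples on this page), and DeleteSketch and its method names are hypothetical.

    ```scala
    import io.delta.tables.DeltaTable
    import org.apache.spark.sql.functions.col

    object DeleteSketch {
      // Delete rows older than `cutoff` using the SQL-string overload.
      // The cutoff is interpolated into SQL, so it should come from trusted input.
      def deleteBeforeExpr(table: DeltaTable, cutoff: String): Unit =
        table.delete(s"date < '$cutoff'")

      // The same predicate expressed as a Column.
      def deleteBeforeColumn(table: DeltaTable, cutoff: String): Unit =
        table.delete(col("date") < cutoff)
    }
    ```

    Calling delete() with no argument removes every row, so the conditional overloads are usually the ones to reach for.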

  21. def deltaLog: DeltaLog
    Attributes
    protected
  22. def detail(): DataFrame

    :: Evolving ::

    Get the details of a Delta table such as the format, name, and size.

    Annotations
    @Evolving()
    Since

    2.1.0

  23. def df: Dataset[Row]
    Attributes
    protected
  24. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  25. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  26. def executeClone(table: DeltaTableV2, target: String, isShallow: Boolean, replace: Boolean, properties: Map[String, String], versionAsOf: Option[Long], timestampAsOf: Option[String]): DeltaTable
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  27. def executeDelete(condition: Option[Expression]): Unit
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  28. def executeDetails(path: String, tableIdentifier: Option[TableIdentifier]): DataFrame
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  29. def executeGenerate(path: String, tableIdentifier: Option[TableIdentifier], mode: String): Unit
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  30. def executeHistory(deltaLog: DeltaLog, limit: Option[Int], tableId: Option[TableIdentifier]): DataFrame
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  31. def executeRestore(table: DeltaTableV2, versionAsOf: Option[Long], timestampAsOf: Option[String]): DataFrame
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  32. def executeUpdate(set: Map[String, Column], condition: Option[Column]): Unit
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  33. def executeVacuum(deltaLog: DeltaLog, retentionHours: Option[Double], tableId: Option[TableIdentifier]): DataFrame
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  34. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  35. def generate(mode: String): Unit

    Generate a manifest for the given Delta table.

    mode

    Specifies the mode for the generation of the manifest. The valid modes are as follows (not case sensitive):

    • "symlink_format_manifest" : This will generate manifests in symlink format for Presto and Athena read support. See the online documentation for more information.
    Since

    0.5.0

  36. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  37. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  38. def history(): DataFrame

    Get the information of the available commits on this table as a Spark DataFrame. The information is in reverse chronological order.

    Since

    0.3.0

  39. def history(limit: Int): DataFrame

    Get the information of the latest limit commits on this table as a Spark DataFrame. The information is in reverse chronological order.

    limit

    The number of previous commands to get history for.

    Since

    0.3.0
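
    Because history returns an ordinary DataFrame, the result can be narrowed with the usual DataFrame operations. The sketch below assumes the history DataFrame exposes version, timestamp and operation columns; HistorySketch and recentOperations are hypothetical names.

    ```scala
    import io.delta.tables.DeltaTable
    import org.apache.spark.sql.DataFrame

    object HistorySketch {
      // Return the latest `limit` commits, keeping only a few descriptive columns.
      // Assumption: the history DataFrame has version, timestamp and operation columns.
      def recentOperations(table: DeltaTable, limit: Int): DataFrame =
        table.history(limit).select("version", "timestamp", "operation")
    }
    ```

    If the column names differ in a given release, printSchema() on the result of history() shows what is available.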

  40. def improveUnsupportedOpError(f: ⇒ Unit): Unit
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  41. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  42. def merge(source: DataFrame, condition: Column): DeltaMergeBuilder

    Merge data from the source DataFrame based on the given merge condition. This returns a DeltaMergeBuilder object that can be used to specify the update, delete, or insert actions to be performed on rows based on whether the rows matched the condition or not.

    See the DeltaMergeBuilder for a full description of this operation and what combinations of update, delete and insert operations are allowed.

    Scala example to update a key-value Delta table with new key-values from a source DataFrame:

    deltaTable
     .as("target")
     .merge(
       source.as("source"),
       "target.key = source.key")
     .whenMatched
     .updateExpr(Map(
       "value" -> "source.value"))
     .whenNotMatched
     .insertExpr(Map(
       "key" -> "source.key",
       "value" -> "source.value"))
     .execute()

    Java example to update a key-value Delta table with new key-values from a source DataFrame:

    deltaTable
     .as("target")
     .merge(
       source.as("source"),
       "target.key = source.key")
     .whenMatched()
     .updateExpr(
        new HashMap<String, String>() {{
          put("value", "source.value");
        }})
     .whenNotMatched()
     .insertExpr(
        new HashMap<String, String>() {{
         put("key", "source.key");
         put("value", "source.value");
       }})
     .execute();
    source

    source Dataframe to be merged.

    condition

    boolean expression as a Column object

    Since

    0.3.0

  43. def merge(source: DataFrame, condition: String): DeltaMergeBuilder

    Merge data from the source DataFrame based on the given merge condition. This returns a DeltaMergeBuilder object that can be used to specify the update, delete, or insert actions to be performed on rows based on whether the rows matched the condition or not.

    See the DeltaMergeBuilder for a full description of this operation and what combinations of update, delete and insert operations are allowed.

    Scala example to update a key-value Delta table with new key-values from a source DataFrame:

    deltaTable
     .as("target")
     .merge(
       source.as("source"),
       "target.key = source.key")
     .whenMatched
     .updateExpr(Map(
       "value" -> "source.value"))
     .whenNotMatched
     .insertExpr(Map(
       "key" -> "source.key",
       "value" -> "source.value"))
     .execute()

    Java example to update a key-value Delta table with new key-values from a source DataFrame:

    deltaTable
     .as("target")
     .merge(
       source.as("source"),
       "target.key = source.key")
     .whenMatched()
     .updateExpr(
        new HashMap<String, String>() {{
          put("value", "source.value");
        }})
     .whenNotMatched()
     .insertExpr(
        new HashMap<String, String>() {{
         put("key", "source.key");
         put("value", "source.value");
       }})
     .execute();
    source

    source Dataframe to be merged.

    condition

    boolean expression as SQL formatted string

    Since

    0.3.0

  44. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  45. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  46. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  47. def optimize(): DeltaOptimizeBuilder

    Optimize the data layout of the table. This returns a DeltaOptimizeBuilder object that can be used to specify the partition filter to limit the scope of optimize and also execute different optimization techniques such as file compaction or ordering data using Z-Order curves.

    See the DeltaOptimizeBuilder for a full description of this operation.

    Scala example to run file compaction on a subset of partitions in the table:

    deltaTable
     .optimize()
     .where("date='2021-11-18'")
     .executeCompaction();
    Since

    2.0.0

  48. def resolveReferencesForExpressions(sparkSession: SparkSession, exprs: Seq[Expression], planProvidingAttrs: LogicalPlan): Seq[Expression]
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  49. def restoreToTimestamp(timestamp: String): DataFrame

    Restore the DeltaTable to an older version of the table specified by a timestamp.

    Timestamp can be of the format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.

    An example would be

    deltaTable.restoreToTimestamp("2019-01-01")
    Since

    1.2.0

  50. def restoreToVersion(version: Long): DataFrame

    Restore the DeltaTable to an older version of the table specified by version number.

    An example would be

    deltaTable.restoreToVersion(7)
    Since

    1.2.0

  51. def sparkSession: SparkSession
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  52. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  53. def toDF: Dataset[Row]

    Get a DataFrame (that is, Dataset[Row]) representation of this Delta table.

    Since

    0.3.0
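
    toDF is the bridge from a DeltaTable to the regular DataFrame API. Here is a minimal sketch that combines it with DeltaTable.forPath; ToDFSketch and loadAsDataFrame are hypothetical names, and the path is whatever location the Delta table was written to.

    ```scala
    import io.delta.tables.DeltaTable
    import org.apache.spark.sql.{DataFrame, SparkSession}

    object ToDFSketch {
      // Load the Delta table at `path` and expose it as a DataFrame,
      // so it can be filtered, joined or aggregated like any other.
      def loadAsDataFrame(spark: SparkSession, path: String): DataFrame =
        DeltaTable.forPath(spark, path).toDF
    }
    ```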

  54. def toDataset(sparkSession: SparkSession, logicalPlan: LogicalPlan): Dataset[Row]
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  55. def toStrColumnMap(map: Map[String, String]): Map[String, Column]
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  56. def toString(): String
    Definition Classes
    AnyRef → Any
  57. def tryResolveReferences(sparkSession: SparkSession)(expr: Expression, planContainingExpr: LogicalPlan): Expression
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  58. def tryResolveReferencesForExpressions(sparkSession: SparkSession)(exprs: Seq[Expression], plansProvidingAttrs: Seq[LogicalPlan]): Seq[Expression]
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  59. def tryResolveReferencesForExpressions(sparkSession: SparkSession, exprs: Seq[Expression], planContainingExpr: LogicalPlan): Seq[Expression]
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  60. def update(condition: Column, set: Map[String, Column]): Unit

    Update the rows of the table that match the given condition, based on the rules defined by set.

    Java example to increment the column data.

    import java.util.HashMap;
    import org.apache.spark.sql.Column;
    import org.apache.spark.sql.functions;
    
    deltaTable.update(
      functions.col("date").gt("2018-01-01"),
      new HashMap<String, Column>() {{
        put("data", functions.col("data").plus(1));
      }}
    );
    condition

    boolean expression as Column object specifying which rows to update.

    set

    rules to update a row as a Java map between target column names and corresponding update expressions as Column objects.

    Since

    0.3.0

  61. def update(condition: Column, set: Map[String, Column]): Unit

    Update the rows of the table that match the given condition, based on the rules defined by set.

    Scala example to increment the column data.

    import org.apache.spark.sql.functions._
    
    deltaTable.update(
      col("date") > "2018-01-01",
      Map("data" -> col("data") + 1))
    condition

    boolean expression as Column object specifying which rows to update.

    set

    rules to update a row as a Scala map between target column names and corresponding update expressions as Column objects.

    Since

    0.3.0

  62. def update(set: Map[String, Column]): Unit

    Update rows in the table based on the rules defined by set.

    Java example to increment the column data.

    import java.util.HashMap;
    import org.apache.spark.sql.Column;
    import org.apache.spark.sql.functions;
    
    deltaTable.update(
      new HashMap<String, Column>() {{
        put("data", functions.col("data").plus(1));
      }}
    );
    set

    rules to update a row as a Java map between target column names and corresponding update expressions as Column objects.

    Since

    0.3.0

  63. def update(set: Map[String, Column]): Unit

    Update rows in the table based on the rules defined by set.

    Scala example to increment the column data.

    import org.apache.spark.sql.functions._
    
    deltaTable.update(Map("data" -> col("data") + 1))
    set

    rules to update a row as a Scala map between target column names and corresponding update expressions as Column objects.

    Since

    0.3.0

  64. def updateExpr(condition: String, set: Map[String, String]): Unit

    Update the rows of the table that match the given condition, applying the rules defined by set.

    Java example to increment the column data.

    deltaTable.updateExpr(
      "date > '2018-01-01'",
      new HashMap<String, String>() {{
        put("data", "data + 1");
      }}
    );
    condition

    boolean expression as SQL formatted string object specifying which rows to update.

    set

    rules to update a row as a Java map between target column names and corresponding update expressions as SQL formatted strings.

    Since

    0.3.0

  65. def updateExpr(condition: String, set: Map[String, String]): Unit

    Update the rows of the table that match the given condition, applying the rules defined by set.

    Scala example to increment the column data.

    deltaTable.updateExpr(
      "date > '2018-01-01'",
      Map("data" -> "data + 1"))
    condition

    boolean expression as SQL formatted string object specifying which rows to update.

    set

    rules to update a row as a Scala map between target column names and corresponding update expressions as SQL formatted strings.

    Since

    0.3.0

  66. def updateExpr(set: Map[String, String]): Unit

    Update rows in the table based on the rules defined by set.

    Java example to increment the column data.

    deltaTable.updateExpr(
      new HashMap<String, String>() {{
        put("data", "data + 1");
      }}
    );
    set

    rules to update a row as a Java map between target column names and corresponding update expressions as SQL formatted strings.

    Since

    0.3.0

  67. def updateExpr(set: Map[String, String]): Unit

    Update rows in the table based on the rules defined by set.

    Scala example to increment the column data.

    deltaTable.updateExpr(Map("data" -> "data + 1"))
    set

    rules to update a row as a Scala map between target column names and corresponding update expressions as SQL formatted strings.

    Since

    0.3.0

  68. def upgradeTableProtocol(readerVersion: Int, writerVersion: Int): Unit

    Updates the protocol version of the table to leverage new features. Upgrading the reader version will prevent all clients that have an older version of Delta Lake from accessing this table. Upgrading the writer version will prevent older versions of Delta Lake from writing to this table. The reader or writer version cannot be downgraded.

    See online documentation and Delta's protocol specification at PROTOCOL.md for more details.

    Since

    0.8.0
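
    Because neither version can be downgraded, a caller may want to compare the requested versions with the current ones before calling upgradeTableProtocol. The check below is a standalone sketch of that comparison, not Delta Lake's own validation; ProtocolVersionCheckSketch and isUpgradeOrNoOp are hypothetical names, and reading the table's current versions is left to the caller.

    ```scala
    object ProtocolVersionCheckSketch {
      // Hypothetical helper: true when neither the reader nor the writer
      // version would be lowered by the requested change.
      def isUpgradeOrNoOp(
          currentReader: Int,
          currentWriter: Int,
          requestedReader: Int,
          requestedWriter: Int): Boolean =
        requestedReader >= currentReader && requestedWriter >= currentWriter
    }
    ```

    Since an upgrade also locks out older clients, it is worth confirming that every reader and writer of the table runs a recent enough Delta Lake version before going ahead.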

  69. def vacuum(): DataFrame

    Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold. This method will return an empty DataFrame on successful completion.

    note: This will use the default retention period of 7 days.

    Since

    0.3.0

  70. def vacuum(retentionHours: Double): DataFrame

    Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold. This method will return an empty DataFrame on successful completion.

    retentionHours

    The retention threshold in hours. Files required by the table for reading versions earlier than this will be preserved and the rest of them will be deleted.

    Since

    0.3.0
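
    The threshold is expressed in hours, while retention policies are often stated in days (the default noted for vacuum() is 7 days). A small conversion helper keeps the units explicit; VacuumRetentionSketch and daysToHours are hypothetical names, not part of the API.

    ```scala
    object VacuumRetentionSketch {
      // Convert a retention period in days to the hours expected by vacuum(retentionHours).
      def daysToHours(days: Double): Double = days * 24.0

      // The default retention period mentioned for vacuum() is 7 days, i.e. 168 hours.
      val DefaultRetentionHours: Double = daysToHours(7)
    }
    ```

    With a table in hand, a 30-day threshold would then be passed as deltaTable.vacuum(VacuumRetentionSketch.daysToHours(30)).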

  71. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  72. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  73. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
