Packages

class DeltaTable extends DeltaTableOperations with Serializable

Main class for programmatically interacting with Delta tables. You can create DeltaTable instances using the static methods.

DeltaTable.forPath(sparkSession, pathToTheDeltaTable)
Since

0.3.0

Linear Supertypes
Serializable, Serializable, DeltaTableOperations, AnalysisHelper, AnyRef, Any
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. DeltaTable
  2. Serializable
  3. Serializable
  4. DeltaTableOperations
  5. AnalysisHelper
  6. AnyRef
  7. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def alias(alias: String): DeltaTable

    Apply an alias to the DeltaTable.

    Apply an alias to the DeltaTable. This is similar to Dataset.as(alias) or SQL tableName AS alias.

    Since

    0.3.0

  5. def as(alias: String): DeltaTable

    Apply an alias to the DeltaTable.

    Apply an alias to the DeltaTable. This is similar to Dataset.as(alias) or SQL tableName AS alias.

    Since

    0.3.0

  6. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  7. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  8. def delete(): Unit

    Delete data from the table.

    Delete data from the table.

    Since

    0.3.0

  9. def delete(condition: Column): Unit

    Delete data from the table that match the given condition.

    Delete data from the table that match the given condition.

    condition

    Boolean SQL expression

    Since

    0.3.0

  10. def delete(condition: String): Unit

    Delete data from the table that match the given condition.

    Delete data from the table that match the given condition.

    condition

    Boolean SQL expression

    Since

    0.3.0

  11. def deltaLog: DeltaLog
    Attributes
    protected
  12. def df: Dataset[Row]
    Attributes
    protected
  13. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  14. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  15. def executeDelete(condition: Option[Expression]): Unit
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  16. def executeGenerate(tblIdentifier: String, mode: String): Unit
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  17. def executeHistory(deltaLog: DeltaLog, limit: Option[Int] = None, tableId: Option[TableIdentifier] = None): DataFrame
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  18. def executeUpdate(set: Map[String, Column], condition: Option[Column]): Unit
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  19. def executeVacuum(deltaLog: DeltaLog, retentionHours: Option[Double], tableId: Option[TableIdentifier] = None): DataFrame
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  20. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  21. def generate(mode: String): Unit

    Generate a manifest for the given Delta Table

    Generate a manifest for the given Delta Table

    mode

    Specifies the mode for the generation of the manifest. The valid modes are as follows (not case sensitive):

    • "symlink_format_manifest" : This will generate manifests in symlink format for Presto and Athena read support. See the online documentation for more information.
    Since

    0.5.0

  22. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  23. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  24. def history(): DataFrame

    Get the information available commits on this table as a Spark DataFrame.

    Get the information available commits on this table as a Spark DataFrame. The information is in reverse chronological order.

    Since

    0.3.0

  25. def history(limit: Int): DataFrame

    Get the information of the latest limit commits on this table as a Spark DataFrame.

    Get the information of the latest limit commits on this table as a Spark DataFrame. The information is in reverse chronological order.

    limit

    The number of previous commands to get history for

    Since

    0.3.0

  26. def improveUnsupportedOpError(f: ⇒ Unit): Unit
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  27. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  28. def merge(source: DataFrame, condition: Column): DeltaMergeBuilder

    Merge data from the source DataFrame based on the given merge condition.

    Merge data from the source DataFrame based on the given merge condition. This returns a DeltaMergeBuilder object that can be used to specify the update, delete, or insert actions to be performed on rows based on whether the rows matched the condition or not.

    See the DeltaMergeBuilder for a full description of this operation and what combinations of update, delete and insert operations are allowed.

    Scala example to update a key-value Delta table with new key-values from a source DataFrame:

    deltaTable
     .as("target")
     .merge(
       source.as("source"),
       "target.key = source.key")
     .whenMatched
     .updateExpr(Map(
       "value" -> "source.value"))
     .whenNotMatched
     .insertExpr(Map(
       "key" -> "source.key",
       "value" -> "source.value"))
     .execute()

    Java example to update a key-value Delta table with new key-values from a source DataFrame:

    deltaTable
     .as("target")
     .merge(
       source.as("source"),
       "target.key = source.key")
     .whenMatched
     .updateExpr(
        new HashMap<String, String>() {{
          put("value" -> "source.value")
        }})
     .whenNotMatched
     .insertExpr(
        new HashMap<String, String>() {{
         put("key", "source.key");
         put("value", "source.value");
       }})
     .execute()
    source

    source Dataframe to be merged.

    condition

    boolean expression as a Column object

    Since

    0.3.0

  29. def merge(source: DataFrame, condition: String): DeltaMergeBuilder

    Merge data from the source DataFrame based on the given merge condition.

    Merge data from the source DataFrame based on the given merge condition. This returns a DeltaMergeBuilder object that can be used to specify the update, delete, or insert actions to be performed on rows based on whether the rows matched the condition or not.

    See the DeltaMergeBuilder for a full description of this operation and what combinations of update, delete and insert operations are allowed.

    Scala example to update a key-value Delta table with new key-values from a source DataFrame:

    deltaTable
     .as("target")
     .merge(
       source.as("source"),
       "target.key = source.key")
     .whenMatched
     .updateExpr(Map(
       "value" -> "source.value"))
     .whenNotMatched
     .insertExpr(Map(
       "key" -> "source.key",
       "value" -> "source.value"))
     .execute()

    Java example to update a key-value Delta table with new key-values from a source DataFrame:

    deltaTable
     .as("target")
     .merge(
       source.as("source"),
       "target.key = source.key")
     .whenMatched
     .updateExpr(
        new HashMap<String, String>() {{
          put("value" -> "source.value");
        }})
     .whenNotMatched
     .insertExpr(
        new HashMap<String, String>() {{
         put("key", "source.key");
         put("value", "source.value");
       }})
     .execute();
    source

    source Dataframe to be merged.

    condition

    boolean expression as SQL formatted string

    Since

    0.3.0

  30. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  31. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  32. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  33. def resolveReferencesForExpressions(sparkSession: SparkSession, exprs: Seq[Expression], planProvidingAttrs: LogicalPlan): Seq[Expression]

    Resolve expressions using the attributes provided by planProvidingAttrs.

    Resolve expressions using the attributes provided by planProvidingAttrs. Throw an error if failing to resolve any expressions.

    Attributes
    protected
    Definition Classes
    AnalysisHelper
  34. def sparkSession: SparkSession
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  35. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  36. def toDF: Dataset[Row]

    Get a DataFrame (that is, Dataset[Row]) representation of this Delta table.

    Get a DataFrame (that is, Dataset[Row]) representation of this Delta table.

    Since

    0.3.0

  37. def toDataset(sparkSession: SparkSession, logicalPlan: LogicalPlan): Dataset[Row]
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  38. def toStrColumnMap(map: Map[String, String]): Map[String, Column]
    Attributes
    protected
    Definition Classes
    DeltaTableOperations
  39. def toString(): String
    Definition Classes
    AnyRef → Any
  40. def tryResolveReferences(sparkSession: SparkSession)(expr: Expression, planContainingExpr: LogicalPlan): Expression
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  41. def tryResolveReferencesForExpressions(sparkSession: SparkSession, exprs: Seq[Expression], planContainingExpr: LogicalPlan): Seq[Expression]
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  42. def update(condition: Column, set: Map[String, Column]): Unit

    Update data from the table on the rows that match the given condition based on the rules defined by set.

    Update data from the table on the rows that match the given condition based on the rules defined by set.

    Java example to increment the column data.

    import org.apache.spark.sql.Column;
    import org.apache.spark.sql.functions;
    
    deltaTable.update(
      functions.col("date").gt("2018-01-01"),
      new HashMap<String, Column>() {{
        put("data", functions.col("data").plus(1));
      }}
    );
    condition

    boolean expression as Column object specifying which rows to update.

    set

    rules to update a row as a Java map between target column names and corresponding update expressions as Column objects.

    Since

    0.3.0

  43. def update(condition: Column, set: Map[String, Column]): Unit

    Update data from the table on the rows that match the given condition based on the rules defined by set.

    Update data from the table on the rows that match the given condition based on the rules defined by set.

    Scala example to increment the column data.

    import org.apache.spark.sql.functions._
    
    deltaTable.update(
      col("date") > "2018-01-01",
      Map("data" -> col("data") + 1))
    condition

    boolean expression as Column object specifying which rows to update.

    set

    rules to update a row as a Scala map between target column names and corresponding update expressions as Column objects.

    Since

    0.3.0

  44. def update(set: Map[String, Column]): Unit

    Update rows in the table based on the rules defined by set.

    Update rows in the table based on the rules defined by set.

    Java example to increment the column data.

    import org.apache.spark.sql.Column;
    import org.apache.spark.sql.functions;
    
    deltaTable.update(
      new HashMap<String, Column>() {{
        put("data", functions.col("data").plus(1));
      }}
    );
    set

    rules to update a row as a Java map between target column names and corresponding update expressions as Column objects.

    Since

    0.3.0

  45. def update(set: Map[String, Column]): Unit

    Update rows in the table based on the rules defined by set.

    Update rows in the table based on the rules defined by set.

    Scala example to increment the column data.

    import org.apache.spark.sql.functions._
    
    deltaTable.update(Map("data" -> col("data") + 1))
    set

    rules to update a row as a Scala map between target column names and corresponding update expressions as Column objects.

    Since

    0.3.0

  46. def updateExpr(condition: String, set: Map[String, String]): Unit

    Update data from the table on the rows that match the given condition, which performs the rules defined by set.

    Update data from the table on the rows that match the given condition, which performs the rules defined by set.

    Java example to increment the column data.

    deltaTable.update(
      "date > '2018-01-01'",
      new HashMap<String, String>() {{
        put("data", "data + 1");
      }}
    );
    condition

    boolean expression as SQL formatted string object specifying which rows to update.

    set

    rules to update a row as a Java map between target column names and corresponding update expressions as SQL formatted strings.

    Since

    0.3.0

  47. def updateExpr(condition: String, set: Map[String, String]): Unit

    Update data from the table on the rows that match the given condition, which performs the rules defined by set.

    Update data from the table on the rows that match the given condition, which performs the rules defined by set.

    Scala example to increment the column data.

    deltaTable.update(
      "date > '2018-01-01'",
      Map("data" -> "data + 1"))
    condition

    boolean expression as SQL formatted string object specifying which rows to update.

    set

    rules to update a row as a Scala map between target column names and corresponding update expressions as SQL formatted strings.

    Since

    0.3.0

  48. def updateExpr(set: Map[String, String]): Unit

    Update rows in the table based on the rules defined by set.

    Update rows in the table based on the rules defined by set.

    Java example to increment the column data.

    deltaTable.updateExpr(
      new HashMap<String, String>() {{
        put("data", "data + 1");
      }}
    );
    set

    rules to update a row as a Java map between target column names and corresponding update expressions as SQL formatted strings.

    Since

    0.3.0

  49. def updateExpr(set: Map[String, String]): Unit

    Update rows in the table based on the rules defined by set.

    Update rows in the table based on the rules defined by set.

    Scala example to increment the column data.

    deltaTable.updateExpr(Map("data" -> "data + 1")))
    set

    rules to update a row as a Scala map between target column names and corresponding update expressions as SQL formatted strings.

    Since

    0.3.0

  50. def upgradeTableProtocol(readerVersion: Int, writerVersion: Int): Unit

    Updates the protocol version of the table to leverage new features.

    Updates the protocol version of the table to leverage new features. Upgrading the reader version will prevent all clients that have an older version of Delta Lake from accessing this table. Upgrading the writer version will prevent older versions of Delta Lake to write to this table. The reader or writer version cannot be downgraded.

    See online documentation and Delta's protocol specification at PROTOCOL.md for more details.

    Since

    0.8.0

  51. def vacuum(): DataFrame

    Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold.

    Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold. This method will return an empty DataFrame on successful completion.

    note: This will use the default retention period of 7 days.

    Since

    0.3.0

  52. def vacuum(retentionHours: Double): DataFrame

    Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold.

    Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold. This method will return an empty DataFrame on successful completion.

    retentionHours

    The retention threshold in hours. Files required by the table for reading versions earlier than this will be preserved and the rest of them will be deleted.

    Since

    0.3.0

  53. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  54. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  55. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()

Inherited from Serializable

Inherited from Serializable

Inherited from DeltaTableOperations

Inherited from AnalysisHelper

Inherited from AnyRef

Inherited from Any

Ungrouped