Delta Lake 2.1.0 ScalaDoc - io.delta.tables.DeltaTable

final def !=(arg0: Any): Boolean

Definition Classes: AnyRef → Any

final def ##(): Int

Definition Classes: AnyRef → Any

final def ==(arg0: Any): Boolean

Definition Classes: AnyRef → Any

def alias(alias: String): DeltaTable

Apply an alias to the DeltaTable. This is similar to Dataset.as(alias) or SQL tableName AS alias.

Since: 0.3.0

def as(alias: String): DeltaTable

Apply an alias to the DeltaTable. This is similar to Dataset.as(alias) or SQL tableName AS alias.

Since: 0.3.0

final def asInstanceOf[T0]: T0

Definition Classes: Any

def clone(): AnyRef

Attributes: protected[lang]
Definition Classes: AnyRef
Annotations: @throws( ... ) @native()

def delete(): Unit

Delete data from the table.

Since: 0.3.0

def delete(condition: Column): Unit

Delete data from the table that match the given condition.

condition: Boolean SQL expression

Since: 0.3.0

def delete(condition: String): Unit

Delete data from the table that match the given condition.

condition: Boolean SQL expression

Since: 0.3.0
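
The two conditional delete overloads can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes delta-core 2.1.0 and a matching Spark build on the classpath, and the object name DeleteExample and the path /tmp/delta/delete-example are hypothetical.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object DeleteExample {
  // Local session with the Delta SQL extension and catalog enabled.
  def session(): SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("delete-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()

  // Writes ids 0..9 to `path`, deletes two subsets, and returns the rows left.
  def run(path: String): Long = {
    val spark = session()
    spark.range(10).write.format("delta").mode("overwrite").save(path)
    val table = DeltaTable.forPath(spark, path)
    table.delete("id < 3")        // SQL-string overload: removes ids 0, 1, 2
    table.delete(col("id") >= 8)  // Column overload: removes ids 8, 9
    table.toDF.count()            // ids 3..7 remain
  }

  def main(args: Array[String]): Unit =
    println(run("/tmp/delta/delete-example"))
}
```

Calling delete() with no argument would remove every row instead.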

def deltaLog: DeltaLog

Attributes: protected

def detail(): DataFrame

:: Evolving ::

Get the details of a Delta table such as the format, name, and size.

Annotations: @Evolving()
Since: 2.1.0
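
A short sketch of reading a few fields from detail(). It assumes delta-core 2.1.0 on the classpath. The object name DetailExample and the path are hypothetical, and the selected column names (format, name, sizeInBytes, numFiles) are the ones reported by DESCRIBE DETAIL.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object DetailExample {
  // Local session with the Delta SQL extension and catalog enabled.
  def session(): SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("detail-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()

  // Creates a small table at `path` and returns the reported table format.
  def run(path: String): String = {
    val spark = session()
    spark.range(10).write.format("delta").mode("overwrite").save(path)
    val detail = DeltaTable.forPath(spark, path).detail()
    // detail() yields a single-row DataFrame; print a few of its columns.
    detail.select("format", "name", "sizeInBytes", "numFiles").show(truncate = false)
    detail.select("format").head().getString(0)
  }

  def main(args: Array[String]): Unit =
    println(run("/tmp/delta/detail-example"))
}
```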

def df: Dataset[Row]

Attributes: protected

final def eq(arg0: AnyRef): Boolean

Definition Classes: AnyRef

def equals(arg0: Any): Boolean

Definition Classes: AnyRef → Any

def executeDelete(condition: Option[Expression]): Unit

Attributes: protected
Definition Classes: DeltaTableOperations

def executeDetails(path: String, tableIdentifier: Option[TableIdentifier]): DataFrame

Attributes: protected
Definition Classes: DeltaTableOperations

def executeGenerate(tblIdentifier: String, mode: String): Unit

Attributes: protected
Definition Classes: DeltaTableOperations

def executeHistory(deltaLog: DeltaLog, limit: Option[Int] = None, tableId: Option[TableIdentifier] = None): DataFrame

Attributes: protected
Definition Classes: DeltaTableOperations

def executeRestore(table: DeltaTableV2, versionAsOf: Option[Long], timestampAsOf: Option[String]): DataFrame

Attributes: protected
Definition Classes: DeltaTableOperations

def executeUpdate(set: Map[String, Column], condition: Option[Column]): Unit

Attributes: protected
Definition Classes: DeltaTableOperations

def executeVacuum(deltaLog: DeltaLog, retentionHours: Option[Double], tableId: Option[TableIdentifier] = None): DataFrame

Attributes: protected
Definition Classes: DeltaTableOperations

def finalize(): Unit

Attributes: protected[lang]
Definition Classes: AnyRef
Annotations: @throws( classOf[java.lang.Throwable] )

def generate(mode: String): Unit

Generate a manifest for the given Delta table.

mode: Specifies the mode for the generation of the manifest. The valid modes are as follows (not case sensitive):

"symlink_format_manifest": Generates manifests in symlink format for Presto and Athena read support. See the online documentation for more information.

Since: 0.5.0
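
A minimal sketch of generating a symlink manifest and checking that it landed on disk. It assumes delta-core 2.1.0 on the classpath and a table stored on the local filesystem. The object name GenerateExample and the path are hypothetical, and the _symlink_format_manifest directory name reflects where Delta is understood to write the manifest for an unpartitioned table.

```scala
import java.nio.file.{Files, Paths}

import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object GenerateExample {
  // Local session with the Delta SQL extension and catalog enabled.
  def session(): SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("generate-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()

  // Creates a table at `path`, generates the manifest, and reports whether
  // the manifest directory now exists under the table root.
  def run(path: String): Boolean = {
    val spark = session()
    spark.range(10).write.format("delta").mode("overwrite").save(path)
    DeltaTable.forPath(spark, path).generate("symlink_format_manifest")
    Files.isDirectory(Paths.get(path, "_symlink_format_manifest"))
  }

  def main(args: Array[String]): Unit =
    println(run("/tmp/delta/generate-example"))
}
```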

final def getClass(): Class[_]

Definition Classes: AnyRef → Any
Annotations: @native()

def hashCode(): Int

Definition Classes: AnyRef → Any
Annotations: @native()

def history(): DataFrame

Get information about the available commits on this table as a Spark DataFrame. The information is in reverse chronological order.

Since: 0.3.0

def history(limit: Int): DataFrame

Get the information of the latest limit commits on this table as a Spark DataFrame. The information is in reverse chronological order.

limit: The number of previous commands to get history for

Since: 0.3.0
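
The following sketch shows how history(limit) might be used to list the most recent operations. It assumes delta-core 2.1.0 on the classpath. The object name HistoryExample and the path are hypothetical, and version and operation are two of the columns of the history DataFrame.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object HistoryExample {
  // Local session with the Delta SQL extension and catalog enabled.
  def session(): SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("history-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()

  // Makes two commits (a write, then a delete) and returns the operation
  // names of the latest two commits, newest first.
  def run(path: String): Seq[String] = {
    val spark = session()
    spark.range(10).write.format("delta").mode("overwrite").save(path)
    val table = DeltaTable.forPath(spark, path)
    table.delete("id < 5")
    table.history(2)
      .orderBy(col("version").desc) // make the newest-first order explicit
      .select("operation")
      .collect()
      .map(_.getString(0))
      .toSeq
  }

  def main(args: Array[String]): Unit =
    run("/tmp/delta/history-example").foreach(println)
}
```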

def improveUnsupportedOpError(f: ⇒ Unit): Unit

Attributes: protected
Definition Classes: AnalysisHelper

final def isInstanceOf[T0]: Boolean

Definition Classes: Any

def merge(source: DataFrame, condition: Column): DeltaMergeBuilder

Merge data from the source DataFrame based on the given merge condition. This returns a DeltaMergeBuilder object that can be used to specify the update, delete, or insert actions to be performed on rows based on whether the rows matched the condition or not.

See the DeltaMergeBuilder for a full description of this operation and what combinations of update, delete and insert operations are allowed.

Scala example to update a key-value Delta table with new key-values from a source DataFrame:

deltaTable
 .as("target")
 .merge(
   source.as("source"),
   "target.key = source.key")
 .whenMatched
 .updateExpr(Map(
   "value" -> "source.value"))
 .whenNotMatched
 .insertExpr(Map(
   "key" -> "source.key",
   "value" -> "source.value"))
 .execute()

Java example to update a key-value Delta table with new key-values from a source DataFrame:

import java.util.HashMap;

deltaTable
 .as("target")
 .merge(
   source.as("source"),
   "target.key = source.key")
 .whenMatched()
 .updateExpr(
    new HashMap<String, String>() {{
      put("value", "source.value");
    }})
 .whenNotMatched()
 .insertExpr(
    new HashMap<String, String>() {{
      put("key", "source.key");
      put("value", "source.value");
    }})
 .execute();

source: source DataFrame to be merged.
condition: boolean expression as a Column object

Since: 0.3.0
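
The examples above pass the merge condition as a SQL string. A sketch of the same upsert with the condition built as a Column object, which is what this overload accepts, might look like the following. It assumes delta-core 2.1.0 on the classpath. The object name MergeColumnExample, the path, and the sample data are hypothetical.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object MergeColumnExample {
  // Local session with the Delta SQL extension and catalog enabled.
  def session(): SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("merge-column-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()

  // Upserts the source rows into a key-value table and returns the result.
  def run(path: String): Map[String, Int] = {
    val spark = session()
    import spark.implicits._

    // Target table: a -> 1, b -> 2
    Seq(("a", 1), ("b", 2)).toDF("key", "value")
      .write.format("delta").mode("overwrite").save(path)
    // Source rows: b is updated, c is inserted
    val source = Seq(("b", 20), ("c", 30)).toDF("key", "value")

    DeltaTable.forPath(spark, path)
      .as("target")
      .merge(source.as("source"), col("target.key") === col("source.key"))
      .whenMatched
      .updateExpr(Map("value" -> "source.value"))
      .whenNotMatched
      .insertExpr(Map("key" -> "source.key", "value" -> "source.value"))
      .execute()

    DeltaTable.forPath(spark, path).toDF.as[(String, Int)].collect().toMap
  }

  def main(args: Array[String]): Unit =
    println(run("/tmp/delta/merge-column-example"))
}
```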

def merge(source: DataFrame, condition: String): DeltaMergeBuilder

Merge data from the source DataFrame based on the given merge condition. This returns a DeltaMergeBuilder object that can be used to specify the update, delete, or insert actions to be performed on rows based on whether the rows matched the condition or not.

See the DeltaMergeBuilder for a full description of this operation and what combinations of update, delete and insert operations are allowed.

Scala example to update a key-value Delta table with new key-values from a source DataFrame:

deltaTable
 .as("target")
 .merge(
   source.as("source"),
   "target.key = source.key")
 .whenMatched
 .updateExpr(Map(
   "value" -> "source.value"))
 .whenNotMatched
 .insertExpr(Map(
   "key" -> "source.key",
   "value" -> "source.value"))
 .execute()

Java example to update a key-value Delta table with new key-values from a source DataFrame:

import java.util.HashMap;

deltaTable
 .as("target")
 .merge(
   source.as("source"),
   "target.key = source.key")
 .whenMatched()
 .updateExpr(
    new HashMap<String, String>() {{
      put("value", "source.value");
    }})
 .whenNotMatched()
 .insertExpr(
    new HashMap<String, String>() {{
      put("key", "source.key");
      put("value", "source.value");
    }})
 .execute();

source: source DataFrame to be merged.
condition: boolean expression as SQL formatted string

Since: 0.3.0

final def ne(arg0: AnyRef): Boolean

Definition Classes: AnyRef

final def notify(): Unit

Definition Classes: AnyRef
Annotations: @native()

final def notifyAll(): Unit

Definition Classes: AnyRef
Annotations: @native()

def optimize(): DeltaOptimizeBuilder

Optimize the data layout of the table. This returns a DeltaOptimizeBuilder object that can be used to specify the partition filter to limit the scope of optimize and also execute different optimization techniques such as file compaction or order data using Z-Order curves.

See the DeltaOptimizeBuilder for a full description of this operation.

Scala example to run file compaction on a subset of partitions in the table:

deltaTable
 .optimize()
 .where("date='2021-11-18'")
 .executeCompaction()

Since: 2.0.0
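
The description also mentions ordering data with Z-Order curves. A minimal sketch of that path, using DeltaOptimizeBuilder.executeZOrderBy, could look like this. It assumes delta-core 2.1.0 on the classpath. The object name ZOrderExample, the path, and the choice of id as the Z-Order column are hypothetical.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object ZOrderExample {
  // Local session with the Delta SQL extension and catalog enabled.
  def session(): SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("zorder-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()

  // Writes several small files, rewrites them clustered by `id`, and returns
  // the row count afterwards (optimize changes the layout, not the data).
  def run(path: String): Long = {
    val spark = session()
    spark.range(1000).repartition(8)
      .write.format("delta").mode("overwrite").save(path)
    val table = DeltaTable.forPath(spark, path)
    val metrics = table.optimize().executeZOrderBy("id")
    metrics.show(truncate = false) // operation metrics reported by optimize
    table.toDF.count()
  }

  def main(args: Array[String]): Unit =
    println(run("/tmp/delta/zorder-example"))
}
```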

def resolveReferencesForExpressions(sparkSession: SparkSession, exprs: Seq[Expression], planProvidingAttrs: LogicalPlan): Seq[Expression]

Resolve expressions using the attributes provided by planProvidingAttrs. Throw an error if failing to resolve any expressions.

Attributes: protected
Definition Classes: AnalysisHelper

def restoreToTimestamp(timestamp: String): DataFrame

Restore the DeltaTable to an older version of the table specified by a timestamp.

The timestamp can be in the format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.

For example, given a DeltaTable instance named deltaTable:

deltaTable.restoreToTimestamp("2019-01-01")

Since: 1.2.0

def restoreToVersion(version: Long): DataFrame

Restore the DeltaTable to an older version of the table specified by version number.

For example, given a DeltaTable instance named deltaTable:

deltaTable.restoreToVersion(7)

Since: 1.2.0
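
An end-to-end sketch of a restore: commit two versions, then roll back to the first. It assumes delta-core 2.1.0 on the classpath. The object name RestoreExample and the path are hypothetical.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object RestoreExample {
  // Local session with the Delta SQL extension and catalog enabled.
  def session(): SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("restore-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()

  // Version 0 holds ids 0..9; version 1 deletes half of them. Restoring to
  // version 0 brings the deleted rows back. Returns the final row count.
  def run(path: String): Long = {
    val spark = session()
    spark.range(10).write.format("delta").mode("overwrite").save(path) // version 0
    val table = DeltaTable.forPath(spark, path)
    table.delete("id < 5")                                             // version 1
    val metrics = table.restoreToVersion(0)                            // new commit restoring version 0
    metrics.show(truncate = false) // restore metrics returned as a DataFrame
    table.toDF.count()
  }

  def main(args: Array[String]): Unit =
    println(run("/tmp/delta/restore-example"))
}
```

A restore is itself recorded as a new commit, so it also appears in history().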

def sparkSession: SparkSession

Attributes: protected
Definition Classes: DeltaTableOperations

final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes: AnyRef

def toDF: Dataset[Row]

Get a DataFrame (that is, Dataset[Row]) representation of this Delta table.

Since: 0.3.0

def toDataset(sparkSession: SparkSession, logicalPlan: LogicalPlan): Dataset[Row]

Attributes: protected
Definition Classes: AnalysisHelper

def toStrColumnMap(map: Map[String, String]): Map[String, Column]

Attributes: protected
Definition Classes: DeltaTableOperations

def toString(): String

Definition Classes: AnyRef → Any

def tryResolveReferences(sparkSession: SparkSession)(expr: Expression, planContainingExpr: LogicalPlan): Expression

Attributes: protected
Definition Classes: AnalysisHelper

def tryResolveReferencesForExpressions(sparkSession: SparkSession, exprs: Seq[Expression], planContainingExpr: LogicalPlan): Seq[Expression]

Attributes: protected
Definition Classes: AnalysisHelper

def update(condition: Column, set: Map[String, Column]): Unit

Update the rows that match the given condition, applying the rules defined by set.

Java example to increment the column data.

import java.util.HashMap;
import org.apache.spark.sql.Column;
import org.apache.spark.sql.functions;

deltaTable.update(
  functions.col("date").gt("2018-01-01"),
  new HashMap<String, Column>() {{
    put("data", functions.col("data").plus(1));
  }}
);

condition: boolean expression as Column object specifying which rows to update.
set: rules to update a row as a Java map between target column names and corresponding update expressions as Column objects.

Since: 0.3.0

def update(condition: Column, set: Map[String, Column]): Unit

Update the rows that match the given condition, applying the rules defined by set.

Scala example to increment the column data.

import org.apache.spark.sql.functions._

deltaTable.update(
  col("date") > "2018-01-01",
  Map("data" -> col("data") + 1))

condition: boolean expression as Column object specifying which rows to update.
set: rules to update a row as a Scala map between target column names and corresponding update expressions as Column objects.

Since: 0.3.0

def update(set: Map[String, Column]): Unit

Update rows in the table based on the rules defined by set.

Java example to increment the column data.

import java.util.HashMap;
import org.apache.spark.sql.Column;
import org.apache.spark.sql.functions;

deltaTable.update(
  new HashMap<String, Column>() {{
    put("data", functions.col("data").plus(1));
  }}
);

set: rules to update a row as a Java map between target column names and corresponding update expressions as Column objects.

Since: 0.3.0

def update(set: Map[String, Column]): Unit

Update rows in the table based on the rules defined by set.

Scala example to increment the column data.

import org.apache.spark.sql.functions._

deltaTable.update(Map("data" -> col("data") + 1))

set: rules to update a row as a Scala map between target column names and corresponding update expressions as Column objects.

Since: 0.3.0

def updateExpr(condition: String, set: Map[String, String]): Unit

Update the rows that match the given condition, applying the rules defined by set.

Java example to increment the column data.

deltaTable.updateExpr(
  "date > '2018-01-01'",
  new HashMap<String, String>() {{
    put("data", "data + 1");
  }}
);

condition: boolean expression as SQL formatted string object specifying which rows to update.
set: rules to update a row as a Java map between target column names and corresponding update expressions as SQL formatted strings.

Since: 0.3.0

def updateExpr(condition: String, set: Map[String, String]): Unit

Update the rows that match the given condition, applying the rules defined by set.

Scala example to increment the column data.

deltaTable.updateExpr(
  "date > '2018-01-01'",
  Map("data" -> "data + 1"))

condition: boolean expression as SQL formatted string object specifying which rows to update.
set: rules to update a row as a Scala map between target column names and corresponding update expressions as SQL formatted strings.

Since: 0.3.0

def updateExpr(set: Map[String, String]): Unit

Update rows in the table based on the rules defined by set.

Java example to increment the column data.

deltaTable.updateExpr(
  new HashMap<String, String>() {{
    put("data", "data + 1");
  }}
);

set: rules to update a row as a Java map between target column names and corresponding update expressions as SQL formatted strings.

Since: 0.3.0

def updateExpr(set: Map[String, String]): Unit

Update rows in the table based on the rules defined by set.

Scala example to increment the column data.

deltaTable.updateExpr(Map("data" -> "data + 1"))

set: rules to update a row as a Scala map between target column names and corresponding update expressions as SQL formatted strings.

Since: 0.3.0

def upgradeTableProtocol(readerVersion: Int, writerVersion: Int): Unit

Updates the protocol version of the table to leverage new features. Upgrading the reader version will prevent all clients that have an older version of Delta Lake from accessing this table. Upgrading the writer version will prevent older versions of Delta Lake from writing to this table. The reader or writer version cannot be downgraded.

See online documentation and Delta's protocol specification at PROTOCOL.md for more details.

Since: 0.8.0
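
A hedged sketch of a protocol upgrade, reading the result back through detail(). It assumes delta-core 2.1.0 on the classpath, where a freshly created table is expected to start below writer version 3. The object name UpgradeProtocolExample, the path, and the target versions (1, 3) are hypothetical choices for illustration.

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object UpgradeProtocolExample {
  // Local session with the Delta SQL extension and catalog enabled.
  def session(): SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("upgrade-protocol-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()

  // Upgrades the table to reader version 1 / writer version 3 and returns
  // the writer version reported afterwards. This cannot be undone.
  def run(path: String): Int = {
    val spark = session()
    spark.range(10).write.format("delta").mode("overwrite").save(path)
    val table = DeltaTable.forPath(spark, path)
    table.upgradeTableProtocol(1, 3)
    table.detail().select("minWriterVersion").head().getAs[Int]("minWriterVersion")
  }

  def main(args: Array[String]): Unit =
    println(run("/tmp/delta/upgrade-protocol-example"))
}
```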

def vacuum(): DataFrame

Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold. This method will return an empty DataFrame on successful completion.

note: This will use the default retention period of 7 days.

Since: 0.3.0

def vacuum(retentionHours: Double): DataFrame

Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold. This method will return an empty DataFrame on successful completion.

retentionHours: The retention threshold in hours. Files required by the table for reading versions earlier than this will be preserved and the rest of them will be deleted.

Since: 0.3.0
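
A minimal sketch of vacuum with an explicit threshold. It assumes delta-core 2.1.0 on the classpath. The object name VacuumExample and the path are hypothetical, and 168 hours is used because it matches the 7-day default noted for vacuum().

```scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.SparkSession

object VacuumExample {
  // Local session with the Delta SQL extension and catalog enabled.
  def session(): SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("vacuum-example")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()

  // Overwrites the table once so the first write's files are no longer
  // referenced by the current version, then vacuums with a 168-hour threshold.
  // Returns the row count of the current version, which vacuum does not alter.
  def run(path: String): Long = {
    val spark = session()
    spark.range(10).write.format("delta").mode("overwrite").save(path)
    spark.range(5).write.format("delta").mode("overwrite").save(path)
    val table = DeltaTable.forPath(spark, path)
    // The unreferenced files were written moments ago, so they are younger
    // than the threshold and stay on disk.
    table.vacuum(168)
    table.toDF.count()
  }

  def main(args: Array[String]): Unit =
    println(run("/tmp/delta/vacuum-example"))
}
```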

final def wait(): Unit

Definition Classes: AnyRef
Annotations: @throws( ... )

final def wait(arg0: Long, arg1: Int): Unit

Definition Classes: AnyRef
Annotations: @throws( ... )

final def wait(arg0: Long): Unit

Definition Classes: AnyRef
Annotations: @throws( ... ) @native()

Packages

DeltaTable

Companion object DeltaTable

class DeltaTable extends DeltaTableOperations with Serializable

Value Members

Inherited from Serializable

Inherited from DeltaTableOperations

Inherited from AnalysisHelper

Inherited from AnyRef

Inherited from Any

Ungrouped
