Class DeltaTable
- Object
  - io.delta.tables.DeltaTable
- All Implemented Interfaces:
  io.delta.tables.execution.DeltaTableOperations, java.io.Serializable, org.apache.spark.sql.delta.util.AnalysisHelper, scala.Serializable
public class DeltaTable extends Object implements io.delta.tables.execution.DeltaTableOperations, scala.Serializable
Main class for programmatically interacting with Delta tables. You can create DeltaTable instances using the static methods.

DeltaTable.forPath(sparkSession, pathToTheDeltaTable)
- Since:
- 0.3.0
- See Also:
- Serialized Form
Method Summary

- DeltaTable alias(String alias)
  Apply an alias to the DeltaTable.
- DeltaTable as(String alias)
  Apply an alias to the DeltaTable.
- static DeltaColumnBuilder columnBuilder(String colName)
  :: Evolving ::
- static DeltaColumnBuilder columnBuilder(org.apache.spark.sql.SparkSession spark, String colName)
  :: Evolving ::
- static DeltaTable convertToDelta(org.apache.spark.sql.SparkSession spark, String identifier)
  Create a DeltaTable from the given parquet table.
- static DeltaTable convertToDelta(org.apache.spark.sql.SparkSession spark, String identifier, String partitionSchema)
  Create a DeltaTable from the given parquet table and partition schema.
- static DeltaTable convertToDelta(org.apache.spark.sql.SparkSession spark, String identifier, org.apache.spark.sql.types.StructType partitionSchema)
  Create a DeltaTable from the given parquet table and partition schema.
- static DeltaTableBuilder create()
  :: Evolving ::
- static DeltaTableBuilder create(org.apache.spark.sql.SparkSession spark)
  :: Evolving ::
- static DeltaTableBuilder createIfNotExists()
  :: Evolving ::
- static DeltaTableBuilder createIfNotExists(org.apache.spark.sql.SparkSession spark)
  :: Evolving ::
- static DeltaTableBuilder createOrReplace()
  :: Evolving ::
- static DeltaTableBuilder createOrReplace(org.apache.spark.sql.SparkSession spark)
  :: Evolving ::
- void delete()
  Delete data from the table.
- void delete(String condition)
  Delete data from the table that match the given condition.
- void delete(org.apache.spark.sql.Column condition)
  Delete data from the table that match the given condition.
- org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> detail()
  :: Evolving ::
- static DeltaTable forName(String tableOrViewName)
  Instantiate a DeltaTable object using the given table or view name.
- static DeltaTable forName(org.apache.spark.sql.SparkSession sparkSession, String tableName)
  Instantiate a DeltaTable object using the given table or view name and the given SparkSession.
- static DeltaTable forPath(String path)
  Instantiate a DeltaTable object representing the data at the given path. If the given path is invalid, it throws a "not a Delta table" error.
- static DeltaTable forPath(org.apache.spark.sql.SparkSession sparkSession, String path)
  Instantiate a DeltaTable object representing the data at the given path. If the given path is invalid, it throws a "not a Delta table" error.
- void generate(String mode)
  Generate a manifest for the given Delta table.
- org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> history()
  Get the information of available commits on this table as a Spark DataFrame.
- org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> history(int limit)
  Get the information of the latest limit commits on this table as a Spark DataFrame.
- static boolean isDeltaTable(String identifier)
  Check if the provided identifier string, in this case a file path, is the root of a Delta table.
- static boolean isDeltaTable(org.apache.spark.sql.SparkSession sparkSession, String identifier)
  Check if the provided identifier string, in this case a file path, is the root of a Delta table using the given SparkSession.
- DeltaMergeBuilder merge(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source, String condition)
  Merge data from the source DataFrame based on the given merge condition.
- DeltaMergeBuilder merge(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source, org.apache.spark.sql.Column condition)
  Merge data from the source DataFrame based on the given merge condition.
- DeltaOptimizeBuilder optimize()
  Optimize the data layout of the table.
- static DeltaTableBuilder replace()
  :: Evolving ::
- static DeltaTableBuilder replace(org.apache.spark.sql.SparkSession spark)
  :: Evolving ::
- org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> restoreToTimestamp(String timestamp)
  Restore the DeltaTable to an older version of the table specified by a timestamp.
- org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> restoreToVersion(long version)
  Restore the DeltaTable to an older version of the table specified by version number.
- org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDF()
  Get a DataFrame (that is, Dataset[Row]) representation of this Delta table.
- void update(java.util.Map<String,org.apache.spark.sql.Column> set)
  Update rows in the table based on the rules defined by set.
- void update(org.apache.spark.sql.Column condition, java.util.Map<String,org.apache.spark.sql.Column> set)
  Update data from the table on the rows that match the given condition based on the rules defined by set.
- void update(org.apache.spark.sql.Column condition, scala.collection.immutable.Map<String,org.apache.spark.sql.Column> set)
  Update data from the table on the rows that match the given condition based on the rules defined by set.
- void update(scala.collection.immutable.Map<String,org.apache.spark.sql.Column> set)
  Update rows in the table based on the rules defined by set.
- void updateExpr(String condition, java.util.Map<String,String> set)
  Update data from the table on the rows that match the given condition, which performs the rules defined by set.
- void updateExpr(String condition, scala.collection.immutable.Map<String,String> set)
  Update data from the table on the rows that match the given condition, which performs the rules defined by set.
- void updateExpr(java.util.Map<String,String> set)
  Update rows in the table based on the rules defined by set.
- void updateExpr(scala.collection.immutable.Map<String,String> set)
  Update rows in the table based on the rules defined by set.
- void upgradeTableProtocol(int readerVersion, int writerVersion)
  Updates the protocol version of the table to leverage new features.
- org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> vacuum()
  Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold.
- org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> vacuum(double retentionHours)
  Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold.

Methods inherited from class java.lang.Object
  equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Method Detail
-
convertToDelta
public static DeltaTable convertToDelta(org.apache.spark.sql.SparkSession spark, String identifier, org.apache.spark.sql.types.StructType partitionSchema)
Create a DeltaTable from the given parquet table and partition schema. Takes an existing parquet table and constructs a delta transaction log in the base path of that table.

Note: Any changes to the table during the conversion process may not result in a consistent state at the end of the conversion. Users should stop any changes to the table before the conversion is started.

An example usage would be

io.delta.tables.DeltaTable.convertToDelta(
  spark,
  "parquet.`/path`",
  new StructType().add(StructField("key1", LongType)).add(StructField("key2", StringType)))

- Parameters:
  - spark - (undocumented)
  - identifier - (undocumented)
  - partitionSchema - (undocumented)
- Since:
- 0.4.0
-
convertToDelta
public static DeltaTable convertToDelta(org.apache.spark.sql.SparkSession spark, String identifier, String partitionSchema)
Create a DeltaTable from the given parquet table and partition schema. Takes an existing parquet table and constructs a delta transaction log in the base path of that table.

Note: Any changes to the table during the conversion process may not result in a consistent state at the end of the conversion. Users should stop any changes to the table before the conversion is started.

An example usage would be

io.delta.tables.DeltaTable.convertToDelta(
  spark,
  "parquet.`/path`",
  "key1 long, key2 string")

- Parameters:
  - spark - (undocumented)
  - identifier - (undocumented)
  - partitionSchema - (undocumented)
- Since:
- 0.4.0
-
convertToDelta
public static DeltaTable convertToDelta(org.apache.spark.sql.SparkSession spark, String identifier)
Create a DeltaTable from the given parquet table. Takes an existing parquet table and constructs a delta transaction log in the base path of the table.

Note: Any changes to the table during the conversion process may not result in a consistent state at the end of the conversion. Users should stop any changes to the table before the conversion is started.

An example usage would be

io.delta.tables.DeltaTable.convertToDelta(
  spark,
  "parquet.`/path`")

- Parameters:
  - spark - (undocumented)
  - identifier - (undocumented)
- Since:
- 0.4.0
-
forPath
public static DeltaTable forPath(String path)
Instantiate a DeltaTable object representing the data at the given path. If the given path is invalid (i.e. either no table exists or an existing table is not a Delta table), it throws a "not a Delta table" error.

Note: This uses the active SparkSession in the current thread to read the table data. Hence, this throws an error if the active SparkSession has not been set, that is, SparkSession.getActiveSession() is empty.

- Parameters:
  - path - (undocumented)
- Since:
- 0.3.0
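A minimal usage sketch in Scala (the path /tmp/delta/events is a hypothetical example location):

import io.delta.tables.DeltaTable

// Uses the active SparkSession in the current thread.
val deltaTable = DeltaTable.forPath("/tmp/delta/events")

// Or pass the SparkSession explicitly.
val deltaTable2 = DeltaTable.forPath(spark, "/tmp/delta/events")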
-
forPath
public static DeltaTable forPath(org.apache.spark.sql.SparkSession sparkSession, String path)
Instantiate a DeltaTable object representing the data at the given path. If the given path is invalid (i.e. either no table exists or an existing table is not a Delta table), it throws a "not a Delta table" error.

- Parameters:
  - sparkSession - (undocumented)
  - path - (undocumented)
- Since:
- 0.3.0
-
forName
public static DeltaTable forName(String tableOrViewName)
Instantiate a DeltaTable object using the given table or view name. If the given tableOrViewName is invalid (i.e. either no table exists or an existing table is not a Delta table), it throws a "not a Delta table" error.

The given tableOrViewName can also be the absolute path of a Delta datasource (i.e. delta.`path`). If so, it instantiates a DeltaTable object representing the data at the given path (consistent with forPath(java.lang.String)).

Note: This uses the active SparkSession in the current thread to read the table data. Hence, this throws an error if the active SparkSession has not been set, that is, SparkSession.getActiveSession() is empty.

- Parameters:
  - tableOrViewName - (undocumented)
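A minimal usage sketch in Scala (the table name events and the path are hypothetical):

import io.delta.tables.DeltaTable

// By catalog table name, using the active SparkSession.
val byName = DeltaTable.forName("events")

// By absolute path, using the delta.`...` form with an explicit SparkSession.
val byPath = DeltaTable.forName(spark, "delta.`/tmp/delta/events`")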
-
forName
public static DeltaTable forName(org.apache.spark.sql.SparkSession sparkSession, String tableName)
Instantiate a DeltaTable object using the given table or view name and the given SparkSession. If the given tableOrViewName is invalid (i.e. either no table exists or an existing table is not a Delta table), it throws a "not a Delta table" error.

The given tableOrViewName can also be the absolute path of a Delta datasource (i.e. delta.`path`). If so, it instantiates a DeltaTable object representing the data at the given path (consistent with forPath(java.lang.String)).

- Parameters:
  - sparkSession - (undocumented)
  - tableName - (undocumented)
-
isDeltaTable
public static boolean isDeltaTable(org.apache.spark.sql.SparkSession sparkSession, String identifier)
Check if the provided identifier string, in this case a file path, is the root of a Delta table using the given SparkSession.

An example would be

DeltaTable.isDeltaTable(spark, "path/to/table")

- Parameters:
  - sparkSession - (undocumented)
  - identifier - (undocumented)
- Since:
- 0.4.0
-
isDeltaTable
public static boolean isDeltaTable(String identifier)
Check if the provided identifier string, in this case a file path, is the root of a Delta table.

Note: This uses the active SparkSession in the current thread to search for the table. Hence, this throws an error if the active SparkSession has not been set, that is, SparkSession.getActiveSession() is empty.

An example would be

DeltaTable.isDeltaTable("/path/to/table")

- Parameters:
  - identifier - (undocumented)
- Since:
- 0.4.0
-
create
public static DeltaTableBuilder create()
:: Evolving ::

Return an instance of DeltaTableBuilder to create a Delta table, erroring if the table exists (the same as SQL CREATE TABLE). Refer to DeltaTableBuilder for more details.

Note: This uses the active SparkSession in the current thread to read the table data. Hence, this throws an error if the active SparkSession has not been set, that is, SparkSession.getActiveSession() is empty.

- Since:
- 1.0.0
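A minimal builder sketch in Scala; the table name, columns, and location are hypothetical, and the builder methods used (tableName, addColumn, partitionedBy, location, execute) are described in DeltaTableBuilder:

import io.delta.tables.DeltaTable

DeltaTable.create(spark)
  .tableName("events")
  .addColumn("eventId", "LONG")
  .addColumn("eventDate", "DATE")
  .partitionedBy("eventDate")
  .location("/tmp/delta/events")
  .execute()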
-
create
public static DeltaTableBuilder create(org.apache.spark.sql.SparkSession spark)
:: Evolving ::

Return an instance of DeltaTableBuilder to create a Delta table, erroring if the table exists (the same as SQL CREATE TABLE). Refer to DeltaTableBuilder for more details.

- Parameters:
  - spark - the SparkSession passed by the user
- Since:
- 1.0.0
-
createIfNotExists
public static DeltaTableBuilder createIfNotExists()
:: Evolving ::

Return an instance of DeltaTableBuilder to create a Delta table if it does not exist (the same as SQL CREATE TABLE IF NOT EXISTS). Refer to DeltaTableBuilder for more details.

Note: This uses the active SparkSession in the current thread to read the table data. Hence, this throws an error if the active SparkSession has not been set, that is, SparkSession.getActiveSession() is empty.

- Since:
- 1.0.0
-
createIfNotExists
public static DeltaTableBuilder createIfNotExists(org.apache.spark.sql.SparkSession spark)
:: Evolving ::

Return an instance of DeltaTableBuilder to create a Delta table if it does not exist (the same as SQL CREATE TABLE IF NOT EXISTS). Refer to DeltaTableBuilder for more details.

- Parameters:
  - spark - the SparkSession passed by the user
- Since:
- 1.0.0
-
replace
public static DeltaTableBuilder replace()
:: Evolving ::

Return an instance of DeltaTableBuilder to replace a Delta table, erroring if the table doesn't exist (the same as SQL REPLACE TABLE). Refer to DeltaTableBuilder for more details.

Note: This uses the active SparkSession in the current thread to read the table data. Hence, this throws an error if the active SparkSession has not been set, that is, SparkSession.getActiveSession() is empty.

- Since:
- 1.0.0
-
replace
public static DeltaTableBuilder replace(org.apache.spark.sql.SparkSession spark)
:: Evolving ::

Return an instance of DeltaTableBuilder to replace a Delta table, erroring if the table doesn't exist (the same as SQL REPLACE TABLE). Refer to DeltaTableBuilder for more details.

- Parameters:
  - spark - the SparkSession passed by the user
- Since:
- 1.0.0
-
createOrReplace
public static DeltaTableBuilder createOrReplace()
:: Evolving ::

Return an instance of DeltaTableBuilder to replace a Delta table, or create it if it does not exist (the same as SQL CREATE OR REPLACE TABLE). Refer to DeltaTableBuilder for more details.

Note: This uses the active SparkSession in the current thread to read the table data. Hence, this throws an error if the active SparkSession has not been set, that is, SparkSession.getActiveSession() is empty.

- Since:
- 1.0.0
-
createOrReplace
public static DeltaTableBuilder createOrReplace(org.apache.spark.sql.SparkSession spark)
:: Evolving ::

Return an instance of DeltaTableBuilder to replace a Delta table, or create it if it does not exist (the same as SQL CREATE OR REPLACE TABLE). Refer to DeltaTableBuilder for more details.

- Parameters:
  - spark - the SparkSession passed by the user
- Since:
- 1.0.0
-
columnBuilder
public static DeltaColumnBuilder columnBuilder(String colName)
:: Evolving ::

Return an instance of DeltaColumnBuilder to specify a column. Refer to DeltaTableBuilder for examples and to DeltaColumnBuilder for the detailed APIs.

Note: This uses the active SparkSession in the current thread to read the table data. Hence, this throws an error if the active SparkSession has not been set, that is, SparkSession.getActiveSession() is empty.

- Parameters:
  - colName - string, the column name
- Since:
- 1.0.0
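A minimal sketch in Scala of building a column and passing it to the table builder; the names are hypothetical, and the DeltaColumnBuilder methods used (dataType, nullable, build) are described in DeltaColumnBuilder:

import io.delta.tables.DeltaTable

val idColumn = DeltaTable.columnBuilder("eventId")
  .dataType("LONG")
  .nullable(false)
  .build()

DeltaTable.createIfNotExists(spark)
  .tableName("events")
  .addColumn(idColumn)
  .execute()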
-
columnBuilder
public static DeltaColumnBuilder columnBuilder(org.apache.spark.sql.SparkSession spark, String colName)
:: Evolving ::

Return an instance of DeltaColumnBuilder to specify a column. Refer to DeltaTableBuilder for examples and to DeltaColumnBuilder for the detailed APIs.

- Parameters:
  - spark - the SparkSession passed by the user
  - colName - string, the column name
- Since:
- 1.0.0
-
as
public DeltaTable as(String alias)
Apply an alias to the DeltaTable. This is similar to Dataset.as(alias) or SQL tableName AS alias.

- Parameters:
  - alias - (undocumented)
- Since:
- 0.3.0
-
alias
public DeltaTable alias(String alias)
Apply an alias to the DeltaTable. This is similar to Dataset.as(alias) or SQL tableName AS alias.

- Parameters:
  - alias - (undocumented)
- Since:
- 0.3.0
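A minimal sketch in Scala; an alias is typically used to disambiguate columns in a later merge (the path, column names, and the source DataFrame sourceDF are hypothetical):

val deltaTable = DeltaTable.forPath(spark, "/tmp/delta/events").alias("target")

deltaTable
  .merge(sourceDF.as("source"), "target.eventId = source.eventId")
  .whenMatched.updateAll()
  .whenNotMatched.insertAll()
  .execute()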
-
toDF
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDF()
Get a DataFrame (that is, Dataset[Row]) representation of this Delta table.

- Since:
- 0.3.0
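A minimal sketch in Scala (the path and column name are hypothetical):

val df = DeltaTable.forPath(spark, "/tmp/delta/events").toDF
df.filter("eventDate >= '2021-01-01'").show()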
-
vacuum
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> vacuum(double retentionHours)
Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold. This method will return an empty DataFrame on successful completion.- Parameters:
retentionHours
- The retention threshold in hours. Files required by the table for reading versions earlier than this will be preserved and the rest of them will be deleted.- Since:
- 0.3.0
-
vacuum
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> vacuum()
Recursively delete files and directories in the table that are not needed by the table for maintaining older versions up to the given retention threshold. This method will return an empty DataFrame on successful completion.

Note: This will use the default retention period of 7 days.
- Since:
- 0.3.0
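A minimal sketch in Scala (the path is hypothetical and 168 hours is an illustrative threshold):

val deltaTable = DeltaTable.forPath(spark, "/tmp/delta/events")

deltaTable.vacuum()      // default retention period of 7 days
deltaTable.vacuum(168)   // explicit retention threshold of 168 hours (7 days)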
-
history
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> history(int limit)
Get the information of the latest limit commits on this table as a Spark DataFrame. The information is in reverse chronological order.

- Parameters:
  - limit - The number of previous commands to get history for
- Since:
- 0.3.0
-
history
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> history()
Get the information of available commits on this table as a Spark DataFrame. The information is in reverse chronological order.

- Since:
- 0.3.0
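A minimal sketch in Scala (the path is hypothetical):

val deltaTable = DeltaTable.forPath(spark, "/tmp/delta/events")

deltaTable.history().show()    // all available commits, newest first
deltaTable.history(5).show()   // only the 5 most recent commits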
-
detail
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> detail()
:: Evolving ::

Get the details of a Delta table such as the format, name, and size.
- Since:
- 2.1.0
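A minimal sketch in Scala (the path is hypothetical; the selected columns are ones commonly returned by table detail, such as format, name, and sizeInBytes):

val detailDF = DeltaTable.forPath(spark, "/tmp/delta/events").detail()
detailDF.select("format", "name", "sizeInBytes").show()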
-
generate
public void generate(String mode)
Generate a manifest for the given Delta table.

- Parameters:
  - mode - Specifies the mode for the generation of the manifest. The valid modes are as follows (not case sensitive):
    - "symlink_format_manifest": This will generate manifests in symlink format for Presto and Athena read support. See the online documentation for more information.
- Since:
- 0.5.0
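A minimal sketch in Scala (the path is hypothetical):

DeltaTable.forPath(spark, "/tmp/delta/events").generate("symlink_format_manifest")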
-
delete
public void delete(String condition)
Delete data from the table that match the given condition.

- Parameters:
  - condition - Boolean SQL expression
- Since:
- 0.3.0
-
delete
public void delete(org.apache.spark.sql.Column condition)
Delete data from the table that match the given condition.

- Parameters:
  - condition - Boolean SQL expression
- Since:
- 0.3.0
-
delete
public void delete()
Delete data from the table.

- Since:
- 0.3.0
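A minimal sketch in Scala of the three variants (the path and column name are hypothetical):

import org.apache.spark.sql.functions._

val deltaTable = DeltaTable.forPath(spark, "/tmp/delta/events")

deltaTable.delete("eventDate < '2020-01-01'")          // SQL string condition
deltaTable.delete(col("eventDate") < "2020-01-01")     // Column condition
deltaTable.delete()                                    // delete all rows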
-
optimize
public DeltaOptimizeBuilder optimize()
Optimize the data layout of the table. This returns a DeltaOptimizeBuilder object that can be used to specify the partition filter to limit the scope of optimize and also execute different optimization techniques such as file compaction or ordering data using Z-Order curves.

See the DeltaOptimizeBuilder for a full description of this operation.

Scala example to run file compaction on a subset of partitions in the table:

deltaTable
  .optimize()
  .where("date='2021-11-18'")
  .executeCompaction()
- Since:
- 2.0.0
-
update
public void update(scala.collection.immutable.Map<String,org.apache.spark.sql.Column> set)
Update rows in the table based on the rules defined by set.

Scala example to increment the column data:

import org.apache.spark.sql.functions._

deltaTable.update(Map("data" -> (col("data") + 1)))

- Parameters:
  - set - rules to update a row as a Scala map between target column names and corresponding update expressions as Column objects.
- Since:
- 0.3.0
-
update
public void update(java.util.Map<String,org.apache.spark.sql.Column> set)
Update rows in the table based on the rules defined by set.

Java example to increment the column data:

import java.util.HashMap;
import org.apache.spark.sql.Column;
import org.apache.spark.sql.functions;

deltaTable.update(
  new HashMap<String, Column>() {{
    put("data", functions.col("data").plus(1));
  }}
);

- Parameters:
  - set - rules to update a row as a Java map between target column names and corresponding update expressions as Column objects.
- Since:
- 0.3.0
-
update
public void update(org.apache.spark.sql.Column condition, scala.collection.immutable.Map<String,org.apache.spark.sql.Column> set)
Update data from the table on the rows that match the given condition based on the rules defined by set.

Scala example to increment the column data:

import org.apache.spark.sql.functions._

deltaTable.update(
  col("date") > "2018-01-01",
  Map("data" -> (col("data") + 1)))

- Parameters:
  - condition - boolean expression as Column object specifying which rows to update.
  - set - rules to update a row as a Scala map between target column names and corresponding update expressions as Column objects.
- Since:
- 0.3.0
-
update
public void update(org.apache.spark.sql.Column condition, java.util.Map<String,org.apache.spark.sql.Column> set)
Update data from the table on the rows that match the given condition based on the rules defined by set.

Java example to increment the column data:

import java.util.HashMap;
import org.apache.spark.sql.Column;
import org.apache.spark.sql.functions;

deltaTable.update(
  functions.col("date").gt("2018-01-01"),
  new HashMap<String, Column>() {{
    put("data", functions.col("data").plus(1));
  }}
);

- Parameters:
  - condition - boolean expression as Column object specifying which rows to update.
  - set - rules to update a row as a Java map between target column names and corresponding update expressions as Column objects.
- Since:
- 0.3.0
-
updateExpr
public void updateExpr(scala.collection.immutable.Map<String,String> set)
Update rows in the table based on the rules defined by set.

Scala example to increment the column data:

deltaTable.updateExpr(Map("data" -> "data + 1"))

- Parameters:
  - set - rules to update a row as a Scala map between target column names and corresponding update expressions as SQL formatted strings.
- Since:
- 0.3.0
-
updateExpr
public void updateExpr(java.util.Map<String,String> set)
Update rows in the table based on the rules defined by set.

Java example to increment the column data:

deltaTable.updateExpr(
  new HashMap<String, String>() {{
    put("data", "data + 1");
  }}
);

- Parameters:
  - set - rules to update a row as a Java map between target column names and corresponding update expressions as SQL formatted strings.
- Since:
- 0.3.0
-
updateExpr
public void updateExpr(String condition, scala.collection.immutable.Map<String,String> set)
Update data from the table on the rows that match the given condition, which performs the rules defined by set.

Scala example to increment the column data:

deltaTable.updateExpr(
  "date > '2018-01-01'",
  Map("data" -> "data + 1"))

- Parameters:
  - condition - boolean expression as SQL formatted string object specifying which rows to update.
  - set - rules to update a row as a Scala map between target column names and corresponding update expressions as SQL formatted strings.
- Since:
- 0.3.0
-
updateExpr
public void updateExpr(String condition, java.util.Map<String,String> set)
Update data from the table on the rows that match the given condition, which performs the rules defined by set.

Java example to increment the column data:

deltaTable.updateExpr(
  "date > '2018-01-01'",
  new HashMap<String, String>() {{
    put("data", "data + 1");
  }}
);

- Parameters:
  - condition - boolean expression as SQL formatted string object specifying which rows to update.
  - set - rules to update a row as a Java map between target column names and corresponding update expressions as SQL formatted strings.
- Since:
- 0.3.0
-
merge
public DeltaMergeBuilder merge(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source, String condition)
Merge data from the source DataFrame based on the given merge condition. This returns a DeltaMergeBuilder object that can be used to specify the update, delete, or insert actions to be performed on rows based on whether the rows matched the condition or not.

See the DeltaMergeBuilder for a full description of this operation and what combinations of update, delete and insert operations are allowed.

Scala example to update a key-value Delta table with new key-values from a source DataFrame:

deltaTable
  .as("target")
  .merge(
    source.as("source"),
    "target.key = source.key")
  .whenMatched
  .updateExpr(Map(
    "value" -> "source.value"))
  .whenNotMatched
  .insertExpr(Map(
    "key" -> "source.key",
    "value" -> "source.value"))
  .execute()

Java example to update a key-value Delta table with new key-values from a source DataFrame:

deltaTable
  .as("target")
  .merge(
    source.as("source"),
    "target.key = source.key")
  .whenMatched()
  .updateExpr(
    new HashMap<String, String>() {{
      put("value", "source.value");
    }})
  .whenNotMatched()
  .insertExpr(
    new HashMap<String, String>() {{
      put("key", "source.key");
      put("value", "source.value");
    }})
  .execute();

- Parameters:
  - source - source DataFrame to be merged.
  - condition - boolean expression as SQL formatted string
- Since:
- 0.3.0
-
merge
public DeltaMergeBuilder merge(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source, org.apache.spark.sql.Column condition)
Merge data from the source DataFrame based on the given merge condition. This returns a DeltaMergeBuilder object that can be used to specify the update, delete, or insert actions to be performed on rows based on whether the rows matched the condition or not.

See the DeltaMergeBuilder for a full description of this operation and what combinations of update, delete and insert operations are allowed.

Scala example to update a key-value Delta table with new key-values from a source DataFrame:

deltaTable
  .as("target")
  .merge(
    source.as("source"),
    "target.key = source.key")
  .whenMatched
  .updateExpr(Map(
    "value" -> "source.value"))
  .whenNotMatched
  .insertExpr(Map(
    "key" -> "source.key",
    "value" -> "source.value"))
  .execute()

Java example to update a key-value Delta table with new key-values from a source DataFrame:

deltaTable
  .as("target")
  .merge(
    source.as("source"),
    "target.key = source.key")
  .whenMatched()
  .updateExpr(
    new HashMap<String, String>() {{
      put("value", "source.value");
    }})
  .whenNotMatched()
  .insertExpr(
    new HashMap<String, String>() {{
      put("key", "source.key");
      put("value", "source.value");
    }})
  .execute();

- Parameters:
  - source - source DataFrame to be merged.
  - condition - boolean expression as a Column object
- Since:
- 0.3.0
-
restoreToVersion
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> restoreToVersion(long version)
Restore the DeltaTable to an older version of the table specified by version number.

An example would be

io.delta.tables.DeltaTable.restoreToVersion(7)

- Parameters:
  - version - (undocumented)
- Since:
  - 1.2.0
-
restoreToTimestamp
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> restoreToTimestamp(String timestamp)
Restore the DeltaTable to an older version of the table specified by a timestamp.

Timestamp can be of the format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss

An example would be

io.delta.tables.DeltaTable.restoreToTimestamp("2019-01-01")

- Parameters:
  - timestamp - (undocumented)
- Since:
  - 1.2.0
-
upgradeTableProtocol
public void upgradeTableProtocol(int readerVersion, int writerVersion)
Updates the protocol version of the table to leverage new features. Upgrading the reader version will prevent all clients that have an older version of Delta Lake from accessing this table. Upgrading the writer version will prevent older versions of Delta Lake from writing to this table. The reader or writer version cannot be downgraded.

See the online documentation and Delta's protocol specification at PROTOCOL.md for more details.

- Parameters:
  - readerVersion - (undocumented)
  - writerVersion - (undocumented)
- Since:
- 0.8.0
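A minimal sketch in Scala; the path and protocol versions are illustrative, and the versions you need depend on the table features you want to enable:

val deltaTable = DeltaTable.forPath(spark, "/tmp/delta/events")
deltaTable.upgradeTableProtocol(1, 3)   // request reader version 1 and writer version 3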
-