public class DeltaTable
extends Object
implements io.delta.tables.execution.DeltaTableOperations, scala.Serializable
DeltaTable.forPath(sparkSession, pathToTheDeltaTable)
Modifier and Type | Method and Description |
---|---|
void |
addFeatureSupport(String featureName)
Modify the protocol to add a supported feature, and if the table does not support table
features, upgrade the protocol automatically.
|
DeltaTable |
alias(String alias)
Apply an alias to the DeltaTable.
|
DeltaTable |
as(String alias)
Apply an alias to the DeltaTable.
|
DeltaTable |
clone(String target,
boolean isShallow)
Clone a DeltaTable to a given destination to mirror the existing table's data and metadata.
|
DeltaTable |
clone(String target,
boolean isShallow,
boolean replace)
Clone a DeltaTable to a given destination to mirror the existing table's data and metadata.
|
DeltaTable |
clone(String target,
boolean isShallow,
boolean replace,
scala.collection.immutable.Map<String,String> properties)
Clone a DeltaTable to a given destination to mirror the existing table's data and metadata.
|
DeltaTable |
cloneAtTimestamp(String timestamp,
String target,
boolean isShallow)
Clone a DeltaTable at a specific timestamp to a given destination to mirror the existing
table's data and metadata at that timestamp.
|
DeltaTable |
cloneAtTimestamp(String timestamp,
String target,
boolean isShallow,
boolean replace)
Clone a DeltaTable at a specific timestamp to a given destination to mirror the existing
table's data and metadata at that timestamp.
|
DeltaTable |
cloneAtTimestamp(String timestamp,
String target,
boolean isShallow,
boolean replace,
scala.collection.immutable.Map<String,String> properties)
Clone a DeltaTable at a specific timestamp to a given destination to mirror the existing
table's data and metadata at that timestamp.
|
DeltaTable |
cloneAtVersion(long version,
String target,
boolean isShallow)
Clone a DeltaTable at a specific version to a given destination to mirror the existing
table's data and metadata at that version.
|
DeltaTable |
cloneAtVersion(long version,
String target,
boolean isShallow,
boolean replace)
Clone a DeltaTable at a specific version to a given destination to mirror the existing
table's data and metadata at that version.
|
DeltaTable |
cloneAtVersion(long version,
String target,
boolean isShallow,
boolean replace,
scala.collection.immutable.Map<String,String> properties)
Clone a DeltaTable at a specific version to a given destination to mirror the existing
table's data and metadata at that version.
|
static DeltaColumnBuilder |
columnBuilder(org.apache.spark.sql.SparkSession spark,
String colName)
:: Evolving ::
|
static DeltaColumnBuilder |
columnBuilder(String colName)
:: Evolving ::
|
static DeltaTable |
convertToDelta(org.apache.spark.sql.SparkSession spark,
String identifier)
Create a DeltaTable from the given parquet table.
|
static DeltaTable |
convertToDelta(org.apache.spark.sql.SparkSession spark,
String identifier,
String partitionSchema)
Create a DeltaTable from the given parquet table and partition schema.
|
static DeltaTable |
convertToDelta(org.apache.spark.sql.SparkSession spark,
String identifier,
org.apache.spark.sql.types.StructType partitionSchema)
Create a DeltaTable from the given parquet table and partition schema.
|
static DeltaTableBuilder |
create()
:: Evolving ::
|
static DeltaTableBuilder |
create(org.apache.spark.sql.SparkSession spark)
:: Evolving ::
|
static DeltaTableBuilder |
createIfNotExists()
:: Evolving ::
|
static DeltaTableBuilder |
createIfNotExists(org.apache.spark.sql.SparkSession spark)
:: Evolving ::
|
static DeltaTableBuilder |
createOrReplace()
:: Evolving ::
|
static DeltaTableBuilder |
createOrReplace(org.apache.spark.sql.SparkSession spark)
:: Evolving ::
|
void |
delete()
Delete data from the table.
|
void |
delete(org.apache.spark.sql.Column condition)
Delete data from the table that match the given
condition . |
void |
delete(String condition)
Delete data from the table that match the given
condition . |
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
detail()
:: Evolving ::
|
static DeltaTable |
forName(org.apache.spark.sql.SparkSession sparkSession,
String tableName)
Instantiate a
DeltaTable object using the given table name using the given
SparkSession. |
static DeltaTable |
forName(String tableOrViewName)
Instantiate a
DeltaTable object using the given table name. |
static DeltaTable |
forPath(org.apache.spark.sql.SparkSession sparkSession,
String path)
Instantiate a
DeltaTable object representing the data at the given path, If the given
path is invalid (i.e. |
static DeltaTable |
forPath(org.apache.spark.sql.SparkSession sparkSession,
String path,
scala.collection.Map<String,String> hadoopConf)
Instantiate a
DeltaTable object representing the data at the given path, If the given
path is invalid (i.e. |
static DeltaTable |
forPath(org.apache.spark.sql.SparkSession sparkSession,
String path,
java.util.Map<String,String> hadoopConf)
Java friendly API to instantiate a
DeltaTable object representing the data at the given
path, If the given path is invalid (i.e. |
static DeltaTable |
forPath(String path)
Instantiate a
DeltaTable object representing the data at the given path, If the given
path is invalid (i.e. |
void |
generate(String mode)
Generate a manifest for the given Delta Table
|
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
history()
Get the information available commits on this table as a Spark DataFrame.
|
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
history(int limit)
Get the information of the latest
limit commits on this table as a Spark DataFrame. |
static boolean |
isDeltaTable(org.apache.spark.sql.SparkSession sparkSession,
String identifier)
Check if the provided
identifier string, in this case a file path,
is the root of a Delta table using the given SparkSession. |
static boolean |
isDeltaTable(String identifier)
Check if the provided
identifier string, in this case a file path,
is the root of a Delta table. |
DeltaMergeBuilder |
merge(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source,
org.apache.spark.sql.Column condition)
Merge data from the
source DataFrame based on the given merge condition . |
DeltaMergeBuilder |
merge(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source,
String condition)
Merge data from the
source DataFrame based on the given merge condition . |
DeltaOptimizeBuilder |
optimize()
Optimize the data layout of the table.
|
static DeltaTableBuilder |
replace()
:: Evolving ::
|
static DeltaTableBuilder |
replace(org.apache.spark.sql.SparkSession spark)
:: Evolving ::
|
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
restoreToTimestamp(String timestamp)
Restore the DeltaTable to an older version of the table specified by a timestamp.
|
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
restoreToVersion(long version)
Restore the DeltaTable to an older version of the table specified by version number.
|
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
toDF()
Get a DataFrame (that is, Dataset[Row]) representation of this Delta table.
|
void |
update(org.apache.spark.sql.Column condition,
scala.collection.immutable.Map<String,org.apache.spark.sql.Column> set)
Update data from the table on the rows that match the given
condition
based on the rules defined by set . |
void |
update(org.apache.spark.sql.Column condition,
java.util.Map<String,org.apache.spark.sql.Column> set)
Update data from the table on the rows that match the given
condition
based on the rules defined by set . |
void |
update(scala.collection.immutable.Map<String,org.apache.spark.sql.Column> set)
Update rows in the table based on the rules defined by
set . |
void |
update(java.util.Map<String,org.apache.spark.sql.Column> set)
Update rows in the table based on the rules defined by
set . |
void |
updateExpr(scala.collection.immutable.Map<String,String> set)
Update rows in the table based on the rules defined by
set . |
void |
updateExpr(java.util.Map<String,String> set)
Update rows in the table based on the rules defined by
set . |
void |
updateExpr(String condition,
scala.collection.immutable.Map<String,String> set)
Update data from the table on the rows that match the given
condition ,
which performs the rules defined by set . |
void |
updateExpr(String condition,
java.util.Map<String,String> set)
Update data from the table on the rows that match the given
condition ,
which performs the rules defined by set . |
void |
upgradeTableProtocol(int readerVersion,
int writerVersion)
Updates the protocol version of the table to leverage new features.
|
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
vacuum()
Recursively delete files and directories in the table that are not needed by the table for
maintaining older versions up to the given retention threshold.
|
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
vacuum(double retentionHours)
Recursively delete files and directories in the table that are not needed by the table for
maintaining older versions up to the given retention threshold.
|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
$init$, executeClone, executeClone$default$6, executeClone$default$7, executeDelete, executeDetails, executeGenerate, executeHistory, executeHistory$default$2, executeHistory$default$3, executeRestore, executeUpdate, executeVacuum, executeVacuum$default$3, sparkSession, toStrColumnMap
public static DeltaTable convertToDelta(org.apache.spark.sql.SparkSession spark, String identifier, org.apache.spark.sql.types.StructType partitionSchema)
Note: Any changes to the table during the conversion process may not result in a consistent state at the end of the conversion. Users should stop any changes to the table before the conversion is started.
An example usage would be
io.delta.tables.DeltaTable.convertToDelta(
spark,
"parquet.`/path`",
new StructType().add(StructField("key1", LongType)).add(StructField("key2", StringType)))
spark
- (undocumented)identifier
- (undocumented)partitionSchema
- (undocumented)public static DeltaTable convertToDelta(org.apache.spark.sql.SparkSession spark, String identifier, String partitionSchema)
Note: Any changes to the table during the conversion process may not result in a consistent state at the end of the conversion. Users should stop any changes to the table before the conversion is started.
An example usage would be
io.delta.tables.DeltaTable.convertToDelta(
spark,
"parquet.`/path`",
"key1 long, key2 string")
spark
- (undocumented)identifier
- (undocumented)partitionSchema
- (undocumented)public static DeltaTable convertToDelta(org.apache.spark.sql.SparkSession spark, String identifier)
Note: Any changes to the table during the conversion process may not result in a consistent state at the end of the conversion. Users should stop any changes to the table before the conversion is started.
An Example would be
io.delta.tables.DeltaTable.convertToDelta(
spark,
"parquet.`/path`"
spark
- (undocumented)identifier
- (undocumented)public static DeltaTable forPath(String path)
DeltaTable
object representing the data at the given path, If the given
path is invalid (i.e. either no table exists or an existing table is not a Delta table),
it throws a not a Delta table
error.
Note: This uses the active SparkSession in the current thread to read the table data. Hence,
this throws error if active SparkSession has not been set, that is,
SparkSession.getActiveSession()
is empty.
path
- (undocumented)public static DeltaTable forPath(org.apache.spark.sql.SparkSession sparkSession, String path)
DeltaTable
object representing the data at the given path, If the given
path is invalid (i.e. either no table exists or an existing table is not a Delta table),
it throws a not a Delta table
error.
sparkSession
- (undocumented)path
- (undocumented)public static DeltaTable forPath(org.apache.spark.sql.SparkSession sparkSession, String path, scala.collection.Map<String,String> hadoopConf)
DeltaTable
object representing the data at the given path, If the given
path is invalid (i.e. either no table exists or an existing table is not a Delta table),
it throws a not a Delta table
error.
hadoopConf
- Hadoop configuration starting with "fs." or "dfs." will be picked up
by DeltaTable
to access the file system when executing queries.
Other configurations will not be allowed.
val hadoopConf = Map(
"fs.s3a.access.key" -> "<access-key>",
"fs.s3a.secret.key" -> "<secret-key>"
)
DeltaTable.forPath(spark, "/path/to/table", hadoopConf)
sparkSession
- (undocumented)path
- (undocumented)public static DeltaTable forPath(org.apache.spark.sql.SparkSession sparkSession, String path, java.util.Map<String,String> hadoopConf)
DeltaTable
object representing the data at the given
path, If the given path is invalid (i.e. either no table exists or an existing table is not a
Delta table), it throws a not a Delta table
error.
hadoopConf
- Hadoop configuration starting with "fs." or "dfs." will be picked up
by DeltaTable
to access the file system when executing queries.
Other configurations will be ignored.
val hadoopConf = Map(
"fs.s3a.access.key" -> "<access-key>",
"fs.s3a.secret.key", "<secret-key>"
)
DeltaTable.forPath(spark, "/path/to/table", hadoopConf)
sparkSession
- (undocumented)path
- (undocumented)public static DeltaTable forName(String tableOrViewName)
DeltaTable
object using the given table name. If the given
tableOrViewName is invalid (i.e. either no table exists or an existing table is not a
Delta table), it throws a not a Delta table
error. Note: Passing a view name will also
result in this error as views are not supported.
The given tableOrViewName can also be the absolute path of a delta datasource (i.e.
delta.path
), If so, instantiate a DeltaTable
object representing the data at
the given path (consistent with the forPath
).
Note: This uses the active SparkSession in the current thread to read the table data. Hence,
this throws error if active SparkSession has not been set, that is,
SparkSession.getActiveSession()
is empty.
tableOrViewName
- (undocumented)public static DeltaTable forName(org.apache.spark.sql.SparkSession sparkSession, String tableName)
DeltaTable
object using the given table name using the given
SparkSession. If the given tableName is invalid (i.e. either no table exists or an
existing table is not a Delta table), it throws a not a Delta table
error. Note:
Passing a view name will also result in this error as views are not supported.
The given tableName can also be the absolute path of a delta datasource (i.e.
delta.path
), If so, instantiate a DeltaTable
object representing the data at
the given path (consistent with the forPath
).
sparkSession
- (undocumented)tableName
- (undocumented)public static boolean isDeltaTable(org.apache.spark.sql.SparkSession sparkSession, String identifier)
identifier
string, in this case a file path,
is the root of a Delta table using the given SparkSession.
An example would be
DeltaTable.isDeltaTable(spark, "path/to/table")
sparkSession
- (undocumented)identifier
- (undocumented)public static boolean isDeltaTable(String identifier)
identifier
string, in this case a file path,
is the root of a Delta table.
Note: This uses the active SparkSession in the current thread to search for the table. Hence,
this throws error if active SparkSession has not been set, that is,
SparkSession.getActiveSession()
is empty.
An example would be
DeltaTable.isDeltaTable(spark, "/path/to/table")
identifier
- (undocumented)public static DeltaTableBuilder create()
Return an instance of DeltaTableBuilder
to create a Delta table,
error if the table exists (the same as SQL CREATE TABLE
).
Refer to DeltaTableBuilder
for more details.
Note: This uses the active SparkSession in the current thread to read the table data. Hence,
this throws error if active SparkSession has not been set, that is,
SparkSession.getActiveSession()
is empty.
public static DeltaTableBuilder create(org.apache.spark.sql.SparkSession spark)
Return an instance of DeltaTableBuilder
to create a Delta table,
error if the table exists (the same as SQL CREATE TABLE
).
Refer to DeltaTableBuilder
for more details.
spark
- sparkSession sparkSession passed by the userpublic static DeltaTableBuilder createIfNotExists()
Return an instance of DeltaTableBuilder
to create a Delta table,
if it does not exists (the same as SQL CREATE TABLE IF NOT EXISTS
).
Refer to DeltaTableBuilder
for more details.
Note: This uses the active SparkSession in the current thread to read the table data. Hence,
this throws error if active SparkSession has not been set, that is,
SparkSession.getActiveSession()
is empty.
public static DeltaTableBuilder createIfNotExists(org.apache.spark.sql.SparkSession spark)
Return an instance of DeltaTableBuilder
to create a Delta table,
if it does not exists (the same as SQL CREATE TABLE IF NOT EXISTS
).
Refer to DeltaTableBuilder
for more details.
spark
- sparkSession sparkSession passed by the userpublic static DeltaTableBuilder replace()
Return an instance of DeltaTableBuilder
to replace a Delta table,
error if the table doesn't exist (the same as SQL REPLACE TABLE
)
Refer to DeltaTableBuilder
for more details.
Note: This uses the active SparkSession in the current thread to read the table data. Hence,
this throws error if active SparkSession has not been set, that is,
SparkSession.getActiveSession()
is empty.
public static DeltaTableBuilder replace(org.apache.spark.sql.SparkSession spark)
Return an instance of DeltaTableBuilder
to replace a Delta table,
error if the table doesn't exist (the same as SQL REPLACE TABLE
)
Refer to DeltaTableBuilder
for more details.
spark
- sparkSession sparkSession passed by the userpublic static DeltaTableBuilder createOrReplace()
Return an instance of DeltaTableBuilder
to replace a Delta table
or create table if not exists (the same as SQL CREATE OR REPLACE TABLE
)
Refer to DeltaTableBuilder
for more details.
Note: This uses the active SparkSession in the current thread to read the table data. Hence,
this throws error if active SparkSession has not been set, that is,
SparkSession.getActiveSession()
is empty.
public static DeltaTableBuilder createOrReplace(org.apache.spark.sql.SparkSession spark)
Return an instance of DeltaTableBuilder
to replace a Delta table,
or create table if not exists (the same as SQL CREATE OR REPLACE TABLE
)
Refer to DeltaTableBuilder
for more details.
spark
- sparkSession sparkSession passed by the user.public static DeltaColumnBuilder columnBuilder(String colName)
Return an instance of DeltaColumnBuilder
to specify a column.
Refer to DeltaTableBuilder
for examples and DeltaColumnBuilder
detailed APIs.
Note: This uses the active SparkSession in the current thread to read the table data. Hence,
this throws error if active SparkSession has not been set, that is,
SparkSession.getActiveSession()
is empty.
colName
- string the column namepublic static DeltaColumnBuilder columnBuilder(org.apache.spark.sql.SparkSession spark, String colName)
Return an instance of DeltaColumnBuilder
to specify a column.
Refer to DeltaTableBuilder
for examples and DeltaColumnBuilder
detailed APIs.
spark
- sparkSession sparkSession passed by the usercolName
- string the column namepublic DeltaTable as(String alias)
Dataset.as(alias)
or
SQL tableName AS alias
.
alias
- (undocumented)public DeltaTable alias(String alias)
Dataset.as(alias)
or
SQL tableName AS alias
.
alias
- (undocumented)public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDF()
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> vacuum(double retentionHours)
retentionHours
- The retention threshold in hours. Files required by the table for
reading versions earlier than this will be preserved and the
rest of them will be deleted.public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> vacuum()
note: This will use the default retention period of 7 days.
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> history(int limit)
limit
commits on this table as a Spark DataFrame.
The information is in reverse chronological order.
limit
- The number of previous commands to get history for
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> history()
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> detail()
Get the details of a Delta table such as the format, name, and size.
public void generate(String mode)
mode
- Specifies the mode for the generation of the manifest.
The valid modes are as follows (not case sensitive):
- "symlink_format_manifest" : This will generate manifests in symlink format
for Presto and Athena read support.
See the online documentation for more information.public void delete(String condition)
condition
.
condition
- Boolean SQL expression
public void delete(org.apache.spark.sql.Column condition)
condition
.
condition
- Boolean SQL expression
public void delete()
public DeltaOptimizeBuilder optimize()
DeltaOptimizeBuilder
object that can be used to specify
the partition filter to limit the scope of optimize and
also execute different optimization techniques such as file
compaction or order data using Z-Order curves.
See the DeltaOptimizeBuilder
for a full description
of this operation.
Scala example to run file compaction on a subset of partitions in the table:
deltaTable
.optimize()
.where("date='2021-11-18'")
.executeCompaction();
public void update(scala.collection.immutable.Map<String,org.apache.spark.sql.Column> set)
set
.
Scala example to increment the column data
.
import org.apache.spark.sql.functions._
deltaTable.update(Map("data" -> col("data") + 1))
set
- rules to update a row as a Scala map between target column names and
corresponding update expressions as Column objects.public void update(java.util.Map<String,org.apache.spark.sql.Column> set)
set
.
Java example to increment the column data
.
import org.apache.spark.sql.Column;
import org.apache.spark.sql.functions;
deltaTable.update(
new HashMap<String, Column>() {{
put("data", functions.col("data").plus(1));
}}
);
set
- rules to update a row as a Java map between target column names and
corresponding update expressions as Column objects.public void update(org.apache.spark.sql.Column condition, scala.collection.immutable.Map<String,org.apache.spark.sql.Column> set)
condition
based on the rules defined by set
.
Scala example to increment the column data
.
import org.apache.spark.sql.functions._
deltaTable.update(
col("date") > "2018-01-01",
Map("data" -> col("data") + 1))
condition
- boolean expression as Column object specifying which rows to update.set
- rules to update a row as a Scala map between target column names and
corresponding update expressions as Column objects.public void update(org.apache.spark.sql.Column condition, java.util.Map<String,org.apache.spark.sql.Column> set)
condition
based on the rules defined by set
.
Java example to increment the column data
.
import org.apache.spark.sql.Column;
import org.apache.spark.sql.functions;
deltaTable.update(
functions.col("date").gt("2018-01-01"),
new HashMap<String, Column>() {{
put("data", functions.col("data").plus(1));
}}
);
condition
- boolean expression as Column object specifying which rows to update.set
- rules to update a row as a Java map between target column names and
corresponding update expressions as Column objects.public void updateExpr(scala.collection.immutable.Map<String,String> set)
set
.
Scala example to increment the column data
.
deltaTable.updateExpr(Map("data" -> "data + 1")))
set
- rules to update a row as a Scala map between target column names and
corresponding update expressions as SQL formatted strings.public void updateExpr(java.util.Map<String,String> set)
set
.
Java example to increment the column data
.
deltaTable.updateExpr(
new HashMap<String, String>() {{
put("data", "data + 1");
}}
);
set
- rules to update a row as a Java map between target column names and
corresponding update expressions as SQL formatted strings.public void updateExpr(String condition, scala.collection.immutable.Map<String,String> set)
condition
,
which performs the rules defined by set
.
Scala example to increment the column data
.
deltaTable.update(
"date > '2018-01-01'",
Map("data" -> "data + 1"))
condition
- boolean expression as SQL formatted string object specifying
which rows to update.set
- rules to update a row as a Scala map between target column names and
corresponding update expressions as SQL formatted strings.public void updateExpr(String condition, java.util.Map<String,String> set)
condition
,
which performs the rules defined by set
.
Java example to increment the column data
.
deltaTable.update(
"date > '2018-01-01'",
new HashMap<String, String>() {{
put("data", "data + 1");
}}
);
condition
- boolean expression as SQL formatted string object specifying
which rows to update.set
- rules to update a row as a Java map between target column names and
corresponding update expressions as SQL formatted strings.public DeltaMergeBuilder merge(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source, String condition)
source
DataFrame based on the given merge condition
. This returns
a DeltaMergeBuilder
object that can be used to specify the update, delete, or insert
actions to be performed on rows based on whether the rows matched the condition or not.
See the DeltaMergeBuilder
for a full description of this operation and what combinations of
update, delete and insert operations are allowed.
Scala example to update a key-value Delta table with new key-values from a source DataFrame:
deltaTable
.as("target")
.merge(
source.as("source"),
"target.key = source.key")
.whenMatched
.updateExpr(Map(
"value" -> "source.value"))
.whenNotMatched
.insertExpr(Map(
"key" -> "source.key",
"value" -> "source.value"))
.execute()
Java example to update a key-value Delta table with new key-values from a source DataFrame:
deltaTable
.as("target")
.merge(
source.as("source"),
"target.key = source.key")
.whenMatched
.updateExpr(
new HashMap<String, String>() {{
put("value" -> "source.value");
}})
.whenNotMatched
.insertExpr(
new HashMap<String, String>() {{
put("key", "source.key");
put("value", "source.value");
}})
.execute();
source
- source Dataframe to be merged.condition
- boolean expression as SQL formatted stringpublic DeltaMergeBuilder merge(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source, org.apache.spark.sql.Column condition)
source
DataFrame based on the given merge condition
. This returns
a DeltaMergeBuilder
object that can be used to specify the update, delete, or insert
actions to be performed on rows based on whether the rows matched the condition or not.
See the DeltaMergeBuilder
for a full description of this operation and what combinations of
update, delete and insert operations are allowed.
Scala example to update a key-value Delta table with new key-values from a source DataFrame:
deltaTable
.as("target")
.merge(
source.as("source"),
"target.key = source.key")
.whenMatched
.updateExpr(Map(
"value" -> "source.value"))
.whenNotMatched
.insertExpr(Map(
"key" -> "source.key",
"value" -> "source.value"))
.execute()
Java example to update a key-value Delta table with new key-values from a source DataFrame:
deltaTable
.as("target")
.merge(
source.as("source"),
"target.key = source.key")
.whenMatched
.updateExpr(
new HashMap<String, String>() {{
put("value" -> "source.value")
}})
.whenNotMatched
.insertExpr(
new HashMap<String, String>() {{
put("key", "source.key");
put("value", "source.value");
}})
.execute()
source
- source Dataframe to be merged.condition
- boolean expression as a Column objectpublic org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> restoreToVersion(long version)
An example would be
io.delta.tables.DeltaTable.restoreToVersion(7)
@since 1.2.0version
- (undocumented)public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> restoreToTimestamp(String timestamp)
Timestamp can be of the format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss
An example would be
io.delta.tables.DeltaTable.restoreToTimestamp("2019-01-01")
@since 1.2.0timestamp
- (undocumented)public void upgradeTableProtocol(int readerVersion, int writerVersion)
See online documentation and Delta's protocol specification at PROTOCOL.md for more details.
readerVersion
- (undocumented)writerVersion
- (undocumented)public void addFeatureSupport(String featureName)
7
, and when the provided
feature is reader-writer, both reader and writer versions will be upgraded, to (3, 7)
.
See online documentation and Delta's protocol specification at PROTOCOL.md for more details.
featureName
- (undocumented)public DeltaTable clone(String target, boolean isShallow, boolean replace, scala.collection.immutable.Map<String,String> properties)
Specifying properties here means that the target will override any properties with the same key in the source table with the user-defined properties.
An example would be
io.delta.tables.DeltaTable.clone(
"/some/path/to/table",
true,
true,
Map("foo" -> "bar"))
target
- The path or table name to create the clone.isShallow
- Whether to create a shallow clone or a deep clone.replace
- Whether to replace the destination with the clone command.properties
- The table properties to override in the clone.
public DeltaTable clone(String target, boolean isShallow, boolean replace)
An example would be
io.delta.tables.DeltaTable.clone(
"/some/path/to/table",
true,
true)
target
- The path or table name to create the clone.isShallow
- Whether to create a shallow clone or a deep clone.replace
- Whether to replace the destination with the clone command.
public DeltaTable clone(String target, boolean isShallow)
An example would be
io.delta.tables.DeltaTable.clone(
"/some/path/to/table",
true)
target
- The path or table name to create the clone.isShallow
- Whether to create a shallow clone or a deep clone.
public DeltaTable cloneAtVersion(long version, String target, boolean isShallow, boolean replace, scala.collection.immutable.Map<String,String> properties)
Specifying properties here means that the target will override any properties with the same key in the source table with the user-defined properties.
An example would be
io.delta.tables.DeltaTable.cloneAtVersion(
5,
"/some/path/to/table",
true,
true,
Map("foo" -> "bar"))
version
- The version of this table to clone from.target
- The path or table name to create the clone.isShallow
- Whether to create a shallow clone or a deep clone.replace
- Whether to replace the destination with the clone command.properties
- The table properties to override in the clone.
public DeltaTable cloneAtVersion(long version, String target, boolean isShallow, boolean replace)
An example would be
io.delta.tables.DeltaTable.cloneAtVersion(
5,
"/some/path/to/table",
true,
true)
version
- The version of this table to clone from.target
- The path or table name to create the clone.isShallow
- Whether to create a shallow clone or a deep clone.replace
- Whether to replace the destination with the clone command.
public DeltaTable cloneAtVersion(long version, String target, boolean isShallow)
An example would be
io.delta.tables.DeltaTable.cloneAtVersion(
5,
"/some/path/to/table",
true)
version
- The version of this table to clone from.target
- The path or table name to create the clone.isShallow
- Whether to create a shallow clone or a deep clone.
public DeltaTable cloneAtTimestamp(String timestamp, String target, boolean isShallow, boolean replace, scala.collection.immutable.Map<String,String> properties)
Timestamp can be of the format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.
Specifying properties here means that the target will override any properties with the same key in the source table with the user-defined properties.
An example would be
io.delta.tables.DeltaTable.cloneAtTimestamp(
"2019-01-01",
"/some/path/to/table",
true,
true,
Map("foo" -> "bar"))
timestamp
- The timestamp of this table to clone from.target
- The path or table name to create the clone.isShallow
- Whether to create a shallow clone or a deep clone.replace
- Whether to replace the destination with the clone command.properties
- The table properties to override in the clone.
public DeltaTable cloneAtTimestamp(String timestamp, String target, boolean isShallow, boolean replace)
Timestamp can be of the format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.
An example would be
io.delta.tables.DeltaTable.cloneAtTimestamp(
"2019-01-01",
"/some/path/to/table",
true,
true)
timestamp
- The timestamp of this table to clone from.target
- The path or table name to create the clone.isShallow
- Whether to create a shallow clone or a deep clone.replace
- Whether to replace the destination with the clone command.
public DeltaTable cloneAtTimestamp(String timestamp, String target, boolean isShallow)
Timestamp can be of the format yyyy-MM-dd or yyyy-MM-dd HH:mm:ss.
An example would be
io.delta.tables.DeltaTable.cloneAtTimestamp(
"2019-01-01",
"/some/path/to/table",
true)
timestamp
- The timestamp of this table to clone from.target
- The path or table name to create the clone.isShallow
- Whether to create a shallow clone or a deep clone.