public class DeltaMergeBuilder
extends Object
implements org.apache.spark.sql.delta.util.AnalysisHelper, org.apache.spark.internal.Logging
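
Builder to specify how to merge data from a source DataFrame into the target Delta table. You can specify any number of whenMatched, whenNotMatched, and whenNotMatchedBySource clauses. Here are the constraints on these clauses.
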
- whenMatched clauses:
  - The condition in a whenMatched clause is optional. However, if there are multiple whenMatched clauses, then only the last one may omit the condition.
  - When there is more than one whenMatched clause and the conditions (or the lack of them) are such that a row satisfies multiple clauses, the action for the first clause satisfied is executed. In other words, the order of the whenMatched clauses matters.
  - If none of the whenMatched clauses matches a source-target row pair that satisfies the merge condition, then the target row will not be updated or deleted.
  - If you want to update all the columns of the target Delta table with the corresponding columns of the source DataFrame, you can use whenMatched(...).updateAll(). This is equivalent to whenMatched(...).updateExpr(Map( ("col1", "source.col1"), ("col2", "source.col2"), ...))
- whenNotMatched clauses:
  - The condition in a whenNotMatched clause is optional. However, if there are multiple whenNotMatched clauses, then only the last one may omit the condition.
  - When there is more than one whenNotMatched clause and the conditions (or the lack of them) are such that a row satisfies multiple clauses, the action for the first clause satisfied is executed. In other words, the order of the whenNotMatched clauses matters.
  - If no whenNotMatched clause is present, or if one is present but the non-matching source row does not satisfy its condition, then the source row is not inserted.
  - If you want to insert all the columns of the target Delta table with the corresponding columns of the source DataFrame, you can use whenNotMatched(...).insertAll(). This is equivalent to whenNotMatched(...).insertExpr(Map( ("col1", "source.col1"), ("col2", "source.col2"), ...))
- whenNotMatchedBySource clauses:
  - The condition in a whenNotMatchedBySource clause is optional. However, if there are multiple whenNotMatchedBySource clauses, then only the last one may omit the condition.
  - When there is more than one whenNotMatchedBySource clause and the conditions (or the lack of them) are such that a row satisfies multiple clauses, the action for the first clause satisfied is executed. In other words, the order of the whenNotMatchedBySource clauses matters.
  - If no whenNotMatchedBySource clause is present, or if one is present but the non-matching target row does not satisfy any of the whenNotMatchedBySource clause conditions, then the target row will not be updated or deleted.
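
The ordering rules in the list above can be modeled in a few lines of plain Java. This is only a sketch of the selection logic, not Delta Lake code: the class ClauseOrderSketch, its nested Clause type, and the action labels are hypothetical names introduced for illustration, and a null condition stands in for a clause that omits its condition.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Toy model of clause selection; NOT Delta Lake code. All names are hypothetical.
final class ClauseOrderSketch {

    /** One clause: an optional condition (null means "no condition") plus an action label. */
    static final class Clause<R> {
        final Predicate<R> condition; // null when the clause omits its condition
        final String action;

        Clause(Predicate<R> condition, String action) {
            this.condition = condition;
            this.action = action;
        }
    }

    /** Enforces the rule that only the last clause of a group may omit its condition. */
    static <R> void validate(List<Clause<R>> clauses) {
        for (int i = 0; i < clauses.size() - 1; i++) {
            if (clauses.get(i).condition == null) {
                throw new IllegalArgumentException(
                    "only the last clause may omit the condition (clause " + i + ")");
            }
        }
    }

    /** Returns the action of the first clause the row satisfies, or empty if none applies. */
    static <R> Optional<String> resolve(List<Clause<R>> clauses, R row) {
        validate(clauses);
        for (Clause<R> c : clauses) {
            if (c.condition == null || c.condition.test(row)) {
                return Optional.of(c.action);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Clause<Integer>> whenMatched = Arrays.asList(
            new Clause<Integer>(v -> v < 0, "delete"),
            new Clause<Integer>(v -> v < 100, "update"),
            new Clause<Integer>(null, "updateAll"));
        // -5 satisfies the first two conditions; the first clause in the list wins.
        System.out.println(resolve(whenMatched, -5));
        // 500 satisfies neither condition, so the unconditional last clause applies.
        System.out.println(resolve(whenMatched, 500));
    }
}
```

Swapping the first two clauses in the list would change the outcome for negative values, which is the sense in which clause order matters. When no clause applies, resolve returns an empty Optional, mirroring the rule that the row is then left untouched.
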
Scala example to update a key-value Delta table with new key-values from a source DataFrame:

    deltaTable
      .as("target")
      .merge(
        source.as("source"),
        "target.key = source.key")
      .withSchemaEvolution()
      .whenMatched()
      .updateExpr(Map(
        "value" -> "source.value"))
      .whenNotMatched()
      .insertExpr(Map(
        "key" -> "source.key",
        "value" -> "source.value"))
      .whenNotMatchedBySource()
      .updateExpr(Map(
        "value" -> "target.value + 1"))
      .execute()

Java example to update a key-value Delta table with new key-values from a source DataFrame:

    deltaTable
      .as("target")
      .merge(
        source.as("source"),
        "target.key = source.key")
      .withSchemaEvolution()
      .whenMatched()
      .updateExpr(
        new HashMap<String, String>() {{
          put("value", "source.value");
        }})
      .whenNotMatched()
      .insertExpr(
        new HashMap<String, String>() {{
          put("key", "source.key");
          put("value", "source.value");
        }})
      .whenNotMatchedBySource()
      .updateExpr(
        new HashMap<String, String>() {{
          put("value", "target.value + 1");
        }})
      .execute();

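
To make the effect of the three clauses in the key-value example concrete, the following sketch traces the same logic over in-memory maps. It is not Delta Lake code and does not call the merge API: the class KeyValueMergeTrace and its merge method are hypothetical, keys are assumed to be unique on both sides, values are assumed to be integers, and schema evolution is not modeled.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java trace of the key-value example; NOT Delta Lake code. Names are hypothetical.
final class KeyValueMergeTrace {

    /** Applies the three clauses of the example to in-memory maps and returns the new target. */
    static Map<String, Integer> merge(Map<String, Integer> target, Map<String, Integer> source) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, Integer> t : target.entrySet()) {
            if (source.containsKey(t.getKey())) {
                // whenMatched: value = source.value
                result.put(t.getKey(), source.get(t.getKey()));
            } else {
                // whenNotMatchedBySource: value = target.value + 1
                result.put(t.getKey(), t.getValue() + 1);
            }
        }
        for (Map.Entry<String, Integer> s : source.entrySet()) {
            // whenNotMatched: insert key and value from the source row
            result.putIfAbsent(s.getKey(), s.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> target = new TreeMap<>();
        target.put("a", 1);
        target.put("b", 2);
        Map<String, Integer> source = new TreeMap<>();
        source.put("b", 20);
        source.put("c", 30);
        System.out.println(merge(target, source)); // prints {a=2, b=20, c=30}
    }
}
```

Key "a" exists only in the target, so its value is incremented. Key "b" exists on both sides, so it takes the source value. Key "c" exists only in the source, so it is inserted.
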
Constructor and Description |
---|
DeltaMergeBuilder(DeltaTable targetTable, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source, org.apache.spark.sql.Column onCondition, scala.collection.Seq<org.apache.spark.sql.catalyst.plans.logical.DeltaMergeIntoClause> whenClauses) |

Modifier and Type | Method and Description |
---|---|
void | execute() - Execute the merge operation based on the built matched and not matched actions. |
DeltaMergeMatchedActionBuilder | whenMatched() - Build the actions to perform when the merge condition was matched. |
DeltaMergeMatchedActionBuilder | whenMatched(org.apache.spark.sql.Column condition) - Build the actions to perform when the merge condition was matched and the given condition is true. |
DeltaMergeMatchedActionBuilder | whenMatched(String condition) - Build the actions to perform when the merge condition was matched and the given condition is true. |
DeltaMergeNotMatchedActionBuilder | whenNotMatched() - Build the action to perform when the merge condition was not matched. |
DeltaMergeNotMatchedActionBuilder | whenNotMatched(org.apache.spark.sql.Column condition) - Build the actions to perform when the merge condition was not matched and the given condition is true. |
DeltaMergeNotMatchedActionBuilder | whenNotMatched(String condition) - Build the actions to perform when the merge condition was not matched and the given condition is true. |
DeltaMergeNotMatchedBySourceActionBuilder | whenNotMatchedBySource() - Build the actions to perform when the merge condition was not matched by the source. |
DeltaMergeNotMatchedBySourceActionBuilder | whenNotMatchedBySource(org.apache.spark.sql.Column condition) - Build the actions to perform when the merge condition was not matched by the source and the given condition is true. |
DeltaMergeNotMatchedBySourceActionBuilder | whenNotMatchedBySource(String condition) - Build the actions to perform when the merge condition was not matched by the source and the given condition is true. |
DeltaMergeBuilder | withSchemaEvolution() - Enable schema evolution for the merge operation. |

Methods inherited from class Object: equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.sql.delta.util.AnalysisHelper: $init$, improveUnsupportedOpError, resolveReferencesForExpressions, toDataset, tryResolveReferences, tryResolveReferencesForExpressions, tryResolveReferencesForExpressions

Methods inherited from interface org.apache.spark.internal.Logging: $init$, initializeForcefully, initializeLogIfNecessary, initializeLogIfNecessary, initializeLogIfNecessary$default$2, initLock, isTraceEnabled, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning, org$apache$spark$internal$Logging$$log__$eq, org$apache$spark$internal$Logging$$log_, uninitialize

public DeltaMergeBuilder(DeltaTable targetTable, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> source, org.apache.spark.sql.Column onCondition, scala.collection.Seq<org.apache.spark.sql.catalyst.plans.logical.DeltaMergeIntoClause> whenClauses)

public DeltaMergeMatchedActionBuilder whenMatched()
Build the actions to perform when the merge condition was matched. This returns a DeltaMergeMatchedActionBuilder object which can be used to specify how to update or delete the matched target table row with the source row.

public DeltaMergeMatchedActionBuilder whenMatched(String condition)
Build the actions to perform when the merge condition was matched and the given condition is true. This returns a DeltaMergeMatchedActionBuilder object which can be used to specify how to update or delete the matched target table row with the source row.
Parameters:
condition - boolean expression as a SQL-formatted string

public DeltaMergeMatchedActionBuilder whenMatched(org.apache.spark.sql.Column condition)
Build the actions to perform when the merge condition was matched and the given condition is true. This returns a DeltaMergeMatchedActionBuilder object which can be used to specify how to update or delete the matched target table row with the source row.
Parameters:
condition - boolean expression as a Column object

public DeltaMergeNotMatchedActionBuilder whenNotMatched()
Build the action to perform when the merge condition was not matched. This returns a DeltaMergeNotMatchedActionBuilder object which can be used to specify how to insert the new source row into the target table.

public DeltaMergeNotMatchedActionBuilder whenNotMatched(String condition)
Build the actions to perform when the merge condition was not matched and the given condition is true. This returns a DeltaMergeNotMatchedActionBuilder object which can be used to specify how to insert the new source row into the target table.
Parameters:
condition - boolean expression as a SQL-formatted string

public DeltaMergeNotMatchedActionBuilder whenNotMatched(org.apache.spark.sql.Column condition)
Build the actions to perform when the merge condition was not matched and the given condition is true. This returns a DeltaMergeNotMatchedActionBuilder object which can be used to specify how to insert the new source row into the target table.
Parameters:
condition - boolean expression as a Column object

public DeltaMergeNotMatchedBySourceActionBuilder whenNotMatchedBySource()
Build the actions to perform when the merge condition was not matched by the source. This returns a DeltaMergeNotMatchedBySourceActionBuilder object which can be used to specify how to update or delete the target table row.

public DeltaMergeNotMatchedBySourceActionBuilder whenNotMatchedBySource(String condition)
Build the actions to perform when the merge condition was not matched by the source and the given condition is true. This returns a DeltaMergeNotMatchedBySourceActionBuilder object which can be used to specify how to update or delete the target table row.
Parameters:
condition - boolean expression as a SQL-formatted string

public DeltaMergeNotMatchedBySourceActionBuilder whenNotMatchedBySource(org.apache.spark.sql.Column condition)
Build the actions to perform when the merge condition was not matched by the source and the given condition is true. This returns a DeltaMergeNotMatchedBySourceActionBuilder object which can be used to specify how to update or delete the target table row.
Parameters:
condition - boolean expression as a Column object

public DeltaMergeBuilder withSchemaEvolution()
Enable schema evolution for the merge operation.

public void execute()
Execute the merge operation based on the built matched and not matched actions.