Interface TransactionBuilder


@Evolving public interface TransactionBuilder
Builder for creating a Transaction to mutate a Delta table.
Since:
3.2.0
  • Method Details

    • withSchema

      TransactionBuilder withSchema(Engine engine, StructType schema)
      Set the schema of the table. Setting the schema on an existing table (a schema evolution) requires column mapping to be enabled. This API preserves field metadata such as field IDs and physical names. A field with no field metadata is treated as a new column and is assigned a new ID and physical name. The supported schema evolutions are column additions, removals, renames, and moves. When a schema evolution is performed, implementations must perform the following validations:
      • No duplicate columns are allowed
      • Column names contain only valid characters
      • Data types are supported
      • No new non-nullable fields are added
      • Physical column name consistency is preserved in the new schema
      • No type changes
      • ToDo: Nested IDs for array/map types are preserved in the new schema
      • ToDo: Validate invalid field reorderings
      Parameters:
      engine - Engine instance to use.
      schema - The new schema of the table.
      Returns:
      updated TransactionBuilder instance.
      Throws:
      KernelException - in case column mapping is not enabled
      IllegalArgumentException - in case of any validation failure
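      A few of the validations listed for withSchema can be sketched in plain Java. This is a simplified model under stated assumptions, not Kernel's implementation: SchemaChecks, Field, and validateEvolution are hypothetical names, a schema is reduced to a flat list of fields, and duplicate names are compared case-insensitively as an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of a few schema-evolution checks. */
final class SchemaChecks {

    /** Hypothetical flat field; fieldId is null when no field metadata was supplied. */
    static final class Field {
        final String name;
        final String type;
        final boolean nullable;
        final Integer fieldId;

        Field(String name, String type, boolean nullable, Integer fieldId) {
            this.name = name;
            this.type = type;
            this.nullable = nullable;
            this.fieldId = fieldId;
        }
    }

    static void validateEvolution(List<Field> current, List<Field> updated) {
        // Index the existing fields by their column-mapping ID.
        Map<Integer, Field> currentById = new HashMap<>();
        for (Field f : current) {
            currentById.put(f.fieldId, f);
        }

        Set<String> seenNames = new HashSet<>();
        for (Field f : updated) {
            // No duplicate columns (case-insensitive comparison is an assumption here).
            if (!seenNames.add(f.name.toLowerCase(Locale.ROOT))) {
                throw new IllegalArgumentException("Duplicate column: " + f.name);
            }
            Field existing = f.fieldId == null ? null : currentById.get(f.fieldId);
            if (existing == null) {
                // A field without a known ID is treated as a new column,
                // and new columns must be nullable.
                if (!f.nullable) {
                    throw new IllegalArgumentException(
                        "New column must be nullable: " + f.name);
                }
            } else if (!existing.type.equals(f.type)) {
                // Same ID with a different type is a type change, which is rejected.
                // A different name with the same ID and type is a rename and is allowed.
                throw new IllegalArgumentException(
                    "Type change is not allowed for column: " + f.name);
            }
        }
    }
}
```

      In this model a rename keeps the field ID and changes only the name, which is why the ID, rather than the name, is used to match old and new fields.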
    • withPartitionColumns

      TransactionBuilder withPartitionColumns(Engine engine, List<String> partitionColumns)
      Set the list of partition columns when creating a new partitioned table.
      Parameters:
      engine - Engine instance to use.
      partitionColumns - The partition columns of the table. These should be a subset of the columns in the schema. Only top-level columns are allowed to be partitioned. Note: Clustering columns and partition columns cannot coexist in a table.
      Returns:
      updated TransactionBuilder instance.
    • withClusteringColumns

      TransactionBuilder withClusteringColumns(Engine engine, List<Column> clusteringColumns)
      Set the list of clustering columns when creating a new clustered table.
      Parameters:
      engine - Engine instance to use.
      clusteringColumns - The clustering columns of the table. These should be a subset of the columns in the schema. Both top-level and nested columns are allowed to be clustered. Note: Clustering columns and partition columns cannot coexist in a table.
      Returns:
      updated TransactionBuilder instance.
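      The rules stated for partition and clustering columns can be sketched as a standalone check. This is an illustrative model only: LayoutChecks and validate are hypothetical names, clustering columns are represented as lists of name parts rather than Kernel's Column type, and names are matched exactly, which is a simplification.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of the partition/clustering column rules. */
final class LayoutChecks {

    /**
     * @param topLevelColumns   names of the top-level columns in the schema
     * @param partitionColumns  requested partition columns (may be empty)
     * @param clusteringColumns requested clustering columns as name paths,
     *                          e.g. ["address", "city"] for a nested column (may be empty)
     */
    static void validate(
            Set<String> topLevelColumns,
            List<String> partitionColumns,
            List<List<String>> clusteringColumns) {
        // Partition columns and clustering columns cannot coexist in a table.
        if (!partitionColumns.isEmpty() && !clusteringColumns.isEmpty()) {
            throw new IllegalArgumentException(
                "Partition columns and clustering columns cannot coexist");
        }
        // Only top-level columns may be used as partition columns.
        for (String column : partitionColumns) {
            if (!topLevelColumns.contains(column)) {
                throw new IllegalArgumentException(
                    "Partition column is not a top-level column: " + column);
            }
        }
        // Clustering columns may be nested; this sketch only checks the root name.
        for (List<String> path : clusteringColumns) {
            if (path.isEmpty() || !topLevelColumns.contains(path.get(0))) {
                throw new IllegalArgumentException(
                    "Clustering column not found in schema: " + path);
            }
        }
    }
}
```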
    • withTransactionId

      TransactionBuilder withTransactionId(Engine engine, String applicationId, long transactionVersion)
      Set the transaction identifier for idempotent writes. Incremental processing systems (e.g., streaming systems) that track progress using their own application-specific versions need to record what progress has been made in order to avoid duplicating data in the face of failures and retries during writes. Setting the transaction identifier lets the Delta table ensure that data with the same identifier is not written multiple times. For more information, refer to the Delta protocol section "Transaction Identifiers".
      Parameters:
      engine - Engine instance to use.
      applicationId - The application ID that is writing to the table.
      transactionVersion - The version of the transaction. This should be monotonically increasing with each write for the same application ID.
      Returns:
      updated TransactionBuilder instance.
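      The idea behind transaction identifiers can be illustrated with a small in-memory model. This is a sketch under assumptions, not how Kernel stores or checks identifiers: IdempotentWriteLog and commitIfNew are hypothetical names, and the rule that a version at or below the last recorded one counts as already written is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of idempotent writes keyed by application ID. */
final class IdempotentWriteLog {

    // Last committed transaction version for each application ID.
    private final Map<String, Long> lastVersionByApp = new HashMap<>();

    /**
     * Records the write if it is new for this application.
     *
     * @return true if the write was recorded, false if it was skipped as a duplicate
     */
    boolean commitIfNew(String applicationId, long transactionVersion) {
        Long last = lastVersionByApp.get(applicationId);
        // Assumption: versions increase monotonically per application, so any
        // version at or below the last recorded one has already been written.
        if (last != null && transactionVersion <= last) {
            return false;
        }
        lastVersionByApp.put(applicationId, transactionVersion);
        return true;
    }
}
```

      A retried write that reuses the same applicationId and transactionVersion is then skipped instead of duplicating data, while a different application ID is tracked independently.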
    • withTableProperties

      TransactionBuilder withTableProperties(Engine engine, Map<String,String> properties)
      Set the table properties for the table. If the table already contains a property with the same key, its value is replaced when the new value differs. Note that user properties (those without a 'delta.' prefix) are case-sensitive. Delta properties are case-insensitive and are normalized to their expected case before being written to the log.
      Parameters:
      engine - Engine instance to use.
      properties - The table properties to set. These key-value pairs configure the table and are stored in the table metadata.
      Returns:
      updated TransactionBuilder instance.
      Since:
      3.3.0
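      The case-handling rule for property keys can be sketched as follows. This is an illustrative model, not Kernel's code: PropertyKeys and normalize are hypothetical names, the table of canonical keys contains only two well-known Delta properties as examples, and IllegalArgumentException stands in for whatever Kernel raises for an unknown Delta property.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical model of table-property key normalization. */
final class PropertyKeys {

    // Small illustrative subset of canonical Delta keys, indexed by lower-cased form.
    private static final Map<String, String> CANONICAL = new LinkedHashMap<>();

    static {
        for (String key : new String[] {"delta.appendOnly", "delta.enableChangeDataFeed"}) {
            CANONICAL.put(key.toLowerCase(Locale.ROOT), key);
        }
    }

    static Map<String, String> normalize(Map<String, String> properties) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            String key = entry.getKey();
            String lower = key.toLowerCase(Locale.ROOT);
            if (lower.startsWith("delta.")) {
                // Delta properties are case-insensitive: map to the canonical spelling.
                String canonical = CANONICAL.get(lower);
                if (canonical == null) {
                    throw new IllegalArgumentException("Unknown Delta property: " + key);
                }
                key = canonical;
            }
            // User properties are case-sensitive and are kept exactly as given.
            result.put(key, entry.getValue());
        }
        return result;
    }
}
```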
    • withTablePropertiesRemoved

      TransactionBuilder withTablePropertiesRemoved(Set<String> propertyKeys)
      Unset the provided table properties on the table. If a property does not exist, this is a no-op. For now, this is only supported for user properties (that is, 'delta.'-prefixed properties are not supported). An exception is thrown upon calling build(Engine) if the same key is both set and unset in the same transaction. Note that user properties (those without a 'delta.' prefix) are case-sensitive.
      Parameters:
      propertyKeys - the table property keys to unset (remove from the table properties)
      Returns:
      updated TransactionBuilder instance.
      Throws:
      IllegalArgumentException - if 'delta.' prefixed keys are provided
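      The two checks described for withTablePropertiesRemoved can be modeled as below. This is a sketch with hypothetical names (PropertyRemovalChecks, checkUnsetKeys, checkNoOverlap); matching the 'delta.' prefix case-insensitively and using IllegalArgumentException for the set/unset conflict are assumptions, since the documentation does not name the exception raised by build(Engine) in that case.

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of the checks applied when unsetting table properties. */
final class PropertyRemovalChecks {

    /** Rejects 'delta.' prefixed keys; only user properties can be unset. */
    static void checkUnsetKeys(Set<String> keysToUnset) {
        for (String key : keysToUnset) {
            // Assumption: the prefix is matched case-insensitively.
            if (key.toLowerCase(Locale.ROOT).startsWith("delta.")) {
                throw new IllegalArgumentException(
                    "Cannot unset 'delta.' prefixed property: " + key);
            }
        }
    }

    /** Rejects a transaction that both sets and unsets the same key. */
    static void checkNoOverlap(Set<String> keysToSet, Set<String> keysToUnset) {
        // User property keys are case-sensitive, so keys are compared exactly.
        Set<String> overlap = new TreeSet<>(keysToSet);
        overlap.retainAll(keysToUnset);
        if (!overlap.isEmpty()) {
            throw new IllegalArgumentException(
                "Keys are both set and unset in the same transaction: " + overlap);
        }
    }
}
```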
    • withMaxRetries

      TransactionBuilder withMaxRetries(int maxRetries)
      Set the maximum number of times to retry a transaction if a concurrent write is detected. Defaults to 200.
      Parameters:
      maxRetries - The number of times to retry
      Returns:
      updated TransactionBuilder instance
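      A bounded retry loop of the kind that maxRetries controls might look like the sketch below. It is not Kernel's commit loop: RetryingCommitter and commitWithRetries are hypothetical names, java.util.ConcurrentModificationException stands in for a detected concurrent write, and counting one initial attempt plus up to maxRetries retries is an assumption.

```java
import java.util.ConcurrentModificationException;
import java.util.function.Supplier;

/** Hypothetical model of retrying a commit when a concurrent write is detected. */
final class RetryingCommitter {

    /**
     * Runs the attempt once, then retries up to maxRetries more times
     * while it keeps failing with a conflict.
     */
    static <T> T commitWithRetries(int maxRetries, Supplier<T> attempt) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        ConcurrentModificationException lastConflict = null;
        for (int tries = 0; tries <= maxRetries; tries++) {
            try {
                return attempt.get();
            } catch (ConcurrentModificationException conflict) {
                // Stand-in for a detected concurrent write; remember it and retry.
                lastConflict = conflict;
            }
        }
        // All attempts conflicted; surface the last conflict to the caller.
        throw lastConflict;
    }
}
```

      With maxRetries set to 0 in this model, the attempt runs exactly once and any conflict is reported immediately.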
    • withLogCompactionInverval

      TransactionBuilder withLogCompactionInverval(int logCompactionInterval)
      Set the number of commits between log compactions. Defaults to 0 (disabled). For more information see the Delta protocol section Log Compaction Files.
      Parameters:
      logCompactionInterval - The number of commits between log compactions
      Returns:
      updated TransactionBuilder instance
    • withDomainMetadataSupported

      TransactionBuilder withDomainMetadataSupported()
      Enables support for Domain Metadata on this table if it is not supported already. The table feature must be supported on the table to add or remove domain metadata using Transaction.addDomainMetadata(java.lang.String, java.lang.String) or Transaction.removeDomainMetadata(java.lang.String). See "How does Delta Lake manage feature compatibility?" for more details on table feature support.

      See the Delta protocol for more information on how to use Domain Metadata. This may break existing writers that do not support the Domain Metadata feature; readers will be unaffected.

    • build

      Transaction build(Engine engine)
      Build the transaction. Also validates the given info to ensure that a valid transaction can be created.
      Parameters:
      engine - Engine instance to use.
      Throws:
      ConcurrentTransactionException - if the table already has a committed transaction with the same given transaction identifier.
      InvalidConfigurationValueException - if the value of the property is invalid.
      UnknownConfigurationException - if any of the properties are unknown to TableConfig.
      DomainDoesNotExistException - if the transaction removes a domain that does not exist in the latest version of the table.
      TableAlreadyExistsException - if the operation provided when calling Table.createTransactionBuilder(Engine, String, Operation) is CREATE_TABLE and the table already exists.