Apache Spark connector
This is the documentation page for the Delta Lake Spark connector.
- Quickstart
- Table batch reads and writes
  - Create a table
  - Read a table
  - Query an older snapshot of a table (time travel)
  - Write to a table
  - Schema validation
  - Update table schema
  - Replace table schema
  - Views on tables
  - Table properties
  - Syncing table schema and properties to the Hive metastore
  - Table metadata
  - Configure SparkSession
  - Configure storage credentials
- Table streaming reads and writes
- Table deletes, updates, and merges
- Change data feed
- Table utility commands
  - Remove files no longer referenced by a Delta table
  - Retrieve Delta table history
  - Retrieve Delta table details
  - Generate a manifest file
  - Convert a Parquet table to a Delta table
  - Convert an Iceberg table to a Delta table
  - Convert a Delta table to a Parquet table
  - Restore a Delta table to an earlier state
  - Shallow clone a Delta table
  - Clone Parquet or Iceberg table to Delta
- Constraints
- How does Delta Lake manage feature compatibility?
- Delta default column values
- Delta column mapping
- Use liquid clustering for Delta tables
- What are deletion vectors?
- Drop Delta table features
- Use row tracking for Delta tables
- Storage configuration
- Delta type widening
- Universal Format (UniForm)
  - Requirements
  - Enable Delta Lake UniForm
  - When does UniForm generate metadata?
  - Check Iceberg/Hudi metadata generation status
  - Read UniForm tables as Iceberg tables in Apache Spark
  - Read UniForm tables as Iceberg tables using a metadata JSON path
  - Read UniForm tables as Hudi tables in Apache Spark
  - Delta and Iceberg/Hudi table versions
  - Limitations
- Read Delta Sharing Tables
- Concurrency control
- Migration guide
- Best practices
- Frequently asked questions (FAQ)
  - What is Delta Lake?
  - How is Delta Lake related to Apache Spark?
  - What format does Delta Lake use to store data?
  - How can I read and write data with Delta Lake?
  - Where does Delta Lake store the data?
  - Can I copy my Delta Lake table to another location?
  - Can I stream data directly into and from Delta tables?
  - Does Delta Lake support writes or reads using the Spark Streaming DStream API?
  - When I use Delta Lake, will I be able to port my code to other Spark platforms easily?
  - Does Delta Lake support multi-table transactions?
  - How can I change the type of a column?
- Optimizations