Delta Lake is an open-source storage layer that brings reliability to data lakes. Delta Lake provides ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing. Delta Lake runs on top of your existing data lake and is fully compatible with Apache Spark APIs.
Delta Lake stores your data as versioned Parquet files in your cloud storage. Alongside the data files, Delta Lake keeps a transaction log that records every commit made to the table or blob store directory. This log is what allows Delta Lake to provide ACID transactions.
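To illustrate how a commit log can determine which Parquet files make up a given table version, here is a deliberately simplified sketch in plain Python. It assumes each commit is a set of newline-delimited JSON actions and handles only `add` and `remove` actions with a `path` field. The real log carries more action types and fields, and the file names and the `live_files` helper are hypothetical.

```python
import json


def live_files(commits):
    """Replay commits in version order and return the data files in the latest version."""
    files = set()
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue  # skip blank lines
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
            # Any other action type is ignored in this toy model.
    return files


# Hypothetical commits: version 0 adds two files, version 1 rewrites one of them.
commit_0 = (
    '{"add": {"path": "part-0000.parquet"}}\n'
    '{"add": {"path": "part-0001.parquet"}}'
)
commit_1 = (
    '{"remove": {"path": "part-0000.parquet"}}\n'
    '{"add": {"path": "part-0002.parquet"}}'
)

print(sorted(live_files([commit_0])))            # files visible at version 0
print(sorted(live_files([commit_0, commit_1])))  # files visible at version 1
```

Because older files are only marked as removed in the log rather than dropped from the replay input, replaying a prefix of the commits reconstructs an earlier version of the table.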
When writing data, you can specify the location in your cloud storage. Delta Lake stores the data in that location in Parquet format.
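A minimal sketch of writing to and reading from a chosen location with the Spark DataFrame API might look like the following. The helper names and the example path are hypothetical, and the sketch assumes a Spark session that already has Delta Lake configured.

```python
def write_delta(df, path, mode="append"):
    """Write a Spark DataFrame to `path` in Delta format."""
    df.write.format("delta").mode(mode).save(path)


def read_delta(spark, path):
    """Load the Delta table stored at `path` as a DataFrame."""
    return spark.read.format("delta").load(path)


# Example usage (assumes `spark` is an active SparkSession with Delta Lake available;
# the path is a placeholder for a location in your own cloud storage):
#
#   df = spark.range(0, 5)
#   write_delta(df, "/tmp/delta/events", mode="overwrite")
#   read_delta(spark, "/tmp/delta/events").show()
```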
Delta Lake does not support the DStream API. Instead, we recommend the Structured Streaming integration described in Table Streaming Reads and Writes.
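As a rough sketch of the recommended alternative, a Delta table can act as both a streaming source and a streaming sink through Spark Structured Streaming. The function name and all paths below are hypothetical, and the sketch assumes a Spark session with Delta Lake configured.

```python
def stream_copy(spark, source_path, sink_path, checkpoint_path):
    """Continuously append new rows from one Delta table to another."""
    # Read the source Delta table as a stream.
    stream = spark.readStream.format("delta").load(source_path)
    # Write the stream to the sink Delta table; the checkpoint location
    # lets the query resume from where it left off after a restart.
    return (
        stream.writeStream
        .format("delta")
        .outputMode("append")
        .option("checkpointLocation", checkpoint_path)
        .start(sink_path)
    )


# Example usage (placeholder paths):
#
#   query = stream_copy(spark, "/tmp/delta/events", "/tmp/delta/events_copy",
#                       "/tmp/delta/_checkpoints/events_copy")
#   query.awaitTermination()
```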
Yes, code written for Delta Lake can be ported to other Spark platforms. When you use Delta Lake, you are using open Apache Spark APIs, so porting is straightforward. To port your code, replace
delta format with