Additional Resources
Blog posts
- Scalable near real-time S3 access logging analytics with Apache Spark™ and Delta Lake from Zalando
- Brand Safety with Spark Streaming and Delta Lake from Eyeview
- Simple, Reliable Upserts and Deletes on Delta Lake Tables using Python APIs from Databricks
- Diving Into Delta Lake: Unpacking The Transaction Log from Databricks
- Diving Into Delta Lake: Schema Enforcement & Evolution from Databricks
- Introducing Delta Time Travel for Large Scale Data Lakes from Databricks
- Productionizing Machine Learning with Delta Lake from Databricks
- Simplifying Streaming Stock Analysis using Delta Lake and Apache Spark from Databricks
- Parallelizing SAIGE Across Hundreds of Cores from Databricks
Visit Databricks blogs for the latest posts on Delta Lake.
Talks
- Making Apache Spark™ Better with Delta Lake from Databricks
- Delta Architecture, A Step Beyond Lambda Architecture from Databricks
- Building Data Pipelines Using Structured Streaming and Delta Lake from Databricks
- Building Reliable Data Lakes at Scale with Delta Lake from Databricks
  - This self-paced tutorial is hosted in the Delta Lake GitHub repository
- Near Real-Time Data Warehousing with Apache Spark and Delta Lake from Eventbrite
- Power Your Delta Lake with Streaming Transactional Changes from StreamSets
- Building an AI-Powered Retail Experience with Delta Lake, Spark, and Databricks from Zalando
- Driver Location Intelligence at Scale using Apache Spark, Delta Lake, and MLflow on Databricks from TomTom
Examples
The Delta Lake GitHub repository has Scala and Python examples.
Delta Lake transaction log specification
The Delta Lake transaction log has a well-defined open protocol that any system can use to read the log. See Delta Transaction Log Protocol.
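
As a rough illustration of what "any system can read the log" means in practice, the sketch below replays the JSON commit files in a table's `_delta_log` directory using only the Python standard library. The function name `live_files` is hypothetical. The sketch assumes every JSON commit since version 0 is still present, and it deliberately ignores checkpoints, deletion vectors, and protocol version checks, so treat it as a reading aid rather than a compliant reader.

```python
import json
from pathlib import Path


def live_files(table_path):
    """Replay JSON commits in version order; return the set of active data file paths.

    Simplified sketch: skips Parquet checkpoints and all other protocol features.
    """
    log_dir = Path(table_path) / "_delta_log"
    # Commit files are named with a zero-padded version number, for example
    # 00000000000000000000.json, so a lexical sort is also a version sort.
    commits = sorted(p for p in log_dir.glob("*.json") if p.stem.isdigit())

    active = set()
    for commit in commits:
        with commit.open() as fh:
            for line in fh:
                if not line.strip():
                    continue
                # Each line of a commit file is one JSON-encoded action.
                action = json.loads(line)
                if "add" in action:
                    active.add(action["add"]["path"])
                elif "remove" in action:
                    active.discard(action["remove"]["path"])
    return active
```

Calling `live_files("/path/to/table")` on a small table returns the relative paths of the data files that make up its latest version. A production reader should follow the full protocol, starting from the most recent checkpoint.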