Merger to Improve Lakehouse Architecture Interoperability

Pioneering lakehouse architecture company Databricks will acquire data management company Tabular. The acquisition brings together the original creators of Apache Iceberg and Linux Foundation Delta Lake, the two leading open source lakehouse formats.

Databricks intends to work closely with the Delta Lake and Iceberg communities to bring format compatibility to the lakehouse: in the short term, within Delta Lake UniForm, and in the long term, by evolving towards a single, open, common standard of interoperability.

“With Tabular joining Databricks, we intend to build the best data management platform based on open lakehouse formats so that companies don’t have to worry about picking the ‘right’ format or getting locked into proprietary data formats,” said Tabular Co-founder and CEO Ryan Blue.

Databricks pioneered the lakehouse architecture in 2020 to enable the integration of traditional data warehousing workloads with AI workloads on a single, governed copy of data. The foundation of the lakehouse is open source data formats that enable ACID transactions on data stored in object storage.

Around the same time Delta Lake was created, Ryan Blue and Daniel Weeks developed the Iceberg project at Netflix and donated it to the Apache Software Foundation. Since then, Delta Lake and Iceberg have emerged as the two leading open source standards for lakehouse formats.

Over time, a number of other open source and proprietary engines have adopted these formats. However, most adopted only one of the standards and, more often than not, only part of it. The result is fragmented, siloed enterprise data, which undermines the value of the lakehouse architecture.

Databricks says companies need data interoperability to realise the benefits of the lakehouse, and it will work closely with the Delta Lake and Iceberg communities to bring interoperability to the formats over time.

“Databricks pioneered the lakehouse and over the past four years, the world has embraced the lakehouse architecture,” said Databricks Co-founder and CEO Ali Ghodsi. “Unfortunately, the lakehouse paradigm has been split between the two most popular formats: Delta Lake and Iceberg. Databricks and Tabular will work with the open-source community to bring the two formats closer to each other over time, increasing openness, and reducing silos and friction for customers.”

Databricks describes itself as the world’s largest and most successful independent open source company by revenue and says it has donated 12 million lines of code to open source projects. The company says the Tabular acquisition highlights its commitment to open formats and open source data in the cloud.

The proposed acquisition is subject to customary closing conditions and is expected to close in Databricks’ second fiscal quarter.
