Databricks targets data pipeline automation with Delta Live Tables

Databricks has unveiled a new extract, transform, load (ETL) framework, dubbed Delta Live Tables, which is now generally available across the Microsoft Azure, AWS and Google Cloud platforms.

According to the data lake and warehouse provider, Delta Live Tables takes a simple declarative approach to building reliable data pipelines and automatically manages the related infrastructure at scale, reducing the time data engineers and data scientists spend on complex operational tasks.

“Table structures are common in databases and data management. Delta Live Tables are an upgrade for the multicloud Databricks platform that support the authoring, management and scheduling of pipelines in a more automated and less code-intensive way,” said Doug Henschen, principal analyst at Constellation Research.

By making authoring low-code and declarative through SQL-like statements, Databricks is looking to lower the barriers to entry for complex data work such as keeping ETL pipelines healthy.
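To give a sense of what "declarative" means here, the following is a minimal sketch in plain Python. It is not the Delta Live Tables API: the `table` decorator, the `TABLES` registry, the `materialize` helper and the sample order data are all hypothetical names invented for illustration. The idea it shows is that each table is declared in terms of the tables it reads from, and a small runner, rather than hand-written orchestration code, works out the order in which to build them.

```python
import inspect

# Hypothetical registry (not part of any Databricks library):
# maps a table name to the function that declares it.
TABLES = {}


def table(fn):
    """Register fn as the declaration of a table named after the function."""
    TABLES[fn.__name__] = fn
    return fn


@table
def raw_orders():
    # Stand-in for a source read; the records are made up for this sketch.
    return [
        {"id": 1, "amount": 20.0},
        {"id": 2, "amount": None},  # a bad record with no amount
        {"id": 3, "amount": 5.5},
    ]


@table
def clean_orders(raw_orders):
    # Naming raw_orders as a parameter declares the dependency on it.
    return [row for row in raw_orders if row["amount"] is not None]


@table
def order_total(clean_orders):
    # Aggregates the cleaned table into a single number.
    return sum(row["amount"] for row in clean_orders)


def materialize(name, cache=None):
    """Build the named table, building its upstream tables first."""
    cache = {} if cache is None else cache
    if name not in cache:
        fn = TABLES[name]
        deps = inspect.signature(fn).parameters  # upstream table names
        cache[name] = fn(**{dep: materialize(dep, cache) for dep in deps})
    return cache[name]


print(materialize("order_total"))  # prints 25.5
```

The author of such a pipeline states only what each table contains; the execution order, and in a production system the scheduling, retries and infrastructure, are left to the framework.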

“The bigger the company, the more likely it is to be struggling with all the code writing and technical challenges of building, maintaining and running myriad data pipelines,” Henschen said. “Delta Live Tables is aimed at easing and automating much of the coding, administrative and optimization work required to keep data pipelines flowing smoothly.”

Early days for the data lakehouse

However, Henschen warned that it is still early days for combined lake and warehouse platforms in enterprise environments. “We’re seeing more greenfield deployments and experiments for new use cases rather than straight up replacements of existing data lakes and data warehouses,” he said, adding that Delta Live Tables faces competition from the open source Apache Iceberg project.

Copyright © 2022 IDG Communications, Inc.
