We're excited to announce that Delta Lake on Azure Data Lake Storage (ADLS) is now generally available.
Delta Lake is the preferred data format on both Microsoft Azure and Databricks, and Fivetran's Managed Data Lake Service automates the process of extracting your data, converting it to Delta Lake format and loading it into Azure Data Lake Storage. So whether you're already using Azure, Databricks or both, you can skip the pain of data movement and start leveraging advanced analytics and AI faster. Additionally, Fivetran eliminates ingestion costs into the data lake, making it even more cost-effective for your business.
Why is Fivetran integrating Delta Lake files into ADLS a big deal?
Keeping a data pipeline running is harder than most people imagine. Change data capture (CDC) can become a major headache, as evolving schemas and API updates can break pipelines at inopportune times.
Fivetran's ability to load data in Delta Lake format into an ADLS destination is the simplest way to unify all of your data in one place: the lakehouse. We've spent years building and refining our more than 650 connectors, and we have hundreds of engineers whose sole responsibility is maintaining them to ensure a 99.9 percent uptime SLA. Our Managed Data Lake Service adds even more value by providing automated data integration, standardization on open table formats, continuous maintenance and robust governance tools. In addition, Fivetran automates data lake metadata management for better discoverability and governance compliance, populating metadata into data catalogs such as Polaris Catalog, Unity Catalog and AWS Glue, creating a governed data foundation upon which you can build analytics and AI use cases.
Delta Lake on Azure is an especially big deal because it natively integrates with Databricks Unity Catalog. Data catalogs allow you to govern your data and track its lineage, giving you full visibility into how your teams use different data sets to build downstream tables and applications.
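For example, once Fivetran has populated table metadata into Unity Catalog, you can inspect the downstream consumers of a table from Databricks' lineage system tables. The snippet below is a minimal sketch and assumes a Databricks workspace with Unity Catalog and lineage system tables enabled; the source table name is a hypothetical placeholder, and the columns referenced follow the documented system.access.table_lineage schema, which may evolve.

```python
# Minimal sketch: list downstream consumers of a Fivetran-landed table using
# Unity Catalog lineage system tables. Assumes a Databricks cluster where a
# SparkSession is available and system.access.table_lineage is enabled.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

downstream = spark.sql("""
    SELECT
        target_table_full_name,   -- table or view built from the source table
        entity_type,              -- e.g. NOTEBOOK, JOB, PIPELINE
        MAX(event_time) AS last_used
    FROM system.access.table_lineage
    WHERE source_table_full_name = 'main.fivetran_salesforce.account'  -- hypothetical table
      AND target_table_full_name IS NOT NULL
    GROUP BY target_table_full_name, entity_type
    ORDER BY last_used DESC
""")

downstream.show(truncate=False)
```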
How does Fivetran's new Delta Lake on Azure destination work?
With Fivetran, you can seamlessly load data from over 650 sources, including on-premises and cloud data warehouses, databases and SaaS applications. Regardless of where your data resides, you can effortlessly copy it into your lakehouse for advanced analytics and AI in no time.
But we don't stop at basic data replication. We also cleanse, conform, deduplicate and normalize your data, ensuring its quality and consistency. Say goodbye to messy, fragmented data, and say hello to synchronized, reliable datasets ready for advanced analytics.
Finally, no matter what format your initial data is in, our newest destination automates the process of converting it into Delta Lake format. Delta Lake provides enhanced reliability, scalability and performance for your lakehouse, enabling efficient query processing and data manipulation with Python, SQL or Scala.
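To make that concrete, here is a minimal PySpark sketch of reading a Delta table that has been landed on ADLS and querying it with SQL. The storage account, container and table path are hypothetical placeholders, and your cluster needs the delta-spark package plus ADLS authentication (account key, SAS token or service principal) configured for your environment.

```python
# Minimal sketch: query a Delta table stored on ADLS with PySpark.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("query-delta-on-adls")
    # Enable Delta Lake support (requires the delta-spark package).
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Hypothetical abfss:// path to a table landed in the lake; ADLS credentials
# must already be configured (e.g. via Spark/Hadoop configuration).
table_path = "abfss://lakehouse@mystorageaccount.dfs.core.windows.net/salesforce/account"

df = spark.read.format("delta").load(table_path)

# The same table is equally usable from SQL or Scala.
df.createOrReplaceTempView("account")
spark.sql("SELECT industry, COUNT(*) AS accounts FROM account GROUP BY industry").show()
```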
How does Delta Lake on ADLS differ from Fivetran's existing support for Databricks as a destination?
Our new lakehouse offering isn't limited to Databricks. Many of our customers on the Azure cloud also use Delta Lake with Azure Synapse Analytics and other native Azure services.
For customers planning to land data in Delta Lake on ADLS using Databricks, we've made a number of enhancements that set this offering apart, including:
- Integration with Databricks Unity Catalog
- Lower ingestion costs, since data is loaded using Fivetran compute rather than Databricks compute, with Fivetran covering those ingestion costs
NOTE: As of July 2024, we support Microsoft Fabric and OneLake as well.
To get started with Delta Lake on Azure as a destination, start a 14-day free trial.