At Fivetran, we understand the complexities and challenges of managing data lakes. That’s why we’re excited to introduce our latest innovation: Fivetran Managed Data Lake Service. This new offering is designed to automate and streamline your data lake management, allowing you to focus on what truly matters: making use of your data and driving innovation. Fivetran Managed Data Lake Service is currently available on Amazon S3, Azure Data Lake Storage (ADLS), and Microsoft OneLake.
Fivetran Managed Data Lake Service helps transform traditionally ungoverned data lakes into organized, governed, continuously optimized data stores. Through native integrations with data catalogs, including AWS Glue, Databricks Unity Catalog, and Polaris Catalog, users can quickly discover, access, and govern key datasets from the lake. From there, users can query and modify the data with Python, SQL, or other supported languages by leveraging compatible compute engines like Databricks, Snowflake, Starburst, or Redshift. They can also transform the data with tools like dbt, visualize it with Power BI, or build and deploy AI/ML models with tools like Amazon SageMaker, Azure Machine Learning, or Databricks Mosaic AI.
The power of a managed data lake
Data lakes are critical for organizations looking to leverage big data for analytics, machine learning, and AI. However, the upkeep of a data lake — handling data ingestion, ensuring data quality, managing schema changes, and optimizing performance — can be resource-intensive and complicated. Recognizing these challenges, Fivetran has developed a service that not only simplifies these tasks but also transforms data lakes from cumbersome data stores into dynamic, efficient, and governed data environments.
Fivetran Managed Data Lake Service automatically integrates data from over 700 pre-built or custom sources, then normalizes, compacts, and deduplicates it before landing it in your data lake in Delta Lake or Apache Iceberg open table formats. By automating this conversion, we provide features typical of data warehouses, such as ACID transactions and scalable metadata handling, directly on the data lake. From there, we continuously monitor and maintain your data lake, handling updates, merges, and deletes, ensuring it’s always optimized, up-to-date, and query-ready.
This level of automation and maintenance is crucial for many organizations. As Nick Chmura, Head of Data at Luma Financial Technologies, explains, “Automated table maintenance is the killer feature for us with Fivetran because we have so many different source connectors. To try to build change data capture and manage that for everything…would be prohibitively costly in terms of time.”
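To make the idea of automated updates, merges, and deletes more concrete, here is a minimal Python sketch of how a batch of captured changes could be deduplicated and then applied to a table keyed by primary key. It is an illustration only, not Fivetran's implementation: the `Change` record, its `version` field, and the `deduplicate` and `apply_changes` helpers are all hypothetical names invented for this example, and a real open table format such as Delta Lake or Apache Iceberg performs this work against data files and table metadata rather than an in-memory dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Change:
    """One captured change to a source row (hypothetical shape)."""
    key: int                    # primary key of the affected row
    version: int                # increasing sequence number of the change
    op: str                     # "upsert" or "delete"
    row: Optional[dict] = None  # new row contents for an upsert


def deduplicate(changes: Iterable[Change]) -> List[Change]:
    """Keep only the highest-version change for each primary key."""
    latest: Dict[int, Change] = {}
    for change in changes:
        current = latest.get(change.key)
        if current is None or change.version > current.version:
            latest[change.key] = change
    return list(latest.values())


def apply_changes(table: Dict[int, dict], changes: Iterable[Change]) -> Dict[int, dict]:
    """Return a new table with the deduplicated changes merged in."""
    result = dict(table)  # leave the input table untouched
    for change in deduplicate(changes):
        if change.op == "delete":
            result.pop(change.key, None)
        elif change.op == "upsert":
            result[change.key] = change.row
        else:
            raise ValueError(f"unknown operation: {change.op!r}")
    return result


# Example data (invented for illustration).
example_table = {1: {"name": "Ada"}, 2: {"name": "Bo"}}
example_changes = [
    Change(key=1, version=2, op="upsert", row={"name": "Ada L."}),
    Change(key=1, version=1, op="upsert", row={"name": "Ada (stale)"}),
    Change(key=2, version=3, op="delete"),
    Change(key=3, version=4, op="upsert", row={"name": "Cy"}),
]
merged = apply_changes(example_table, example_changes)
print(merged)
```

Collapsing each key to its latest change before merging means a row is written at most once per batch, which is one reason deduplication typically comes before the merge step.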
Key features and benefits
- Automated data integration: Fivetran supports ingestion from over 700 applications, databases, files, and event data sources, enabling seamless integration into any major data lake destination. This ensures that all your data is consolidated, organized, and easily accessible. Plus, Fivetran covers the costs of ingestion into your data lake, greatly reducing your total cost of ownership (TCO).
- Data standardization on open table formats: By normalizing and standardizing your data into query-ready open table formats (Apache Iceberg or Delta Lake), we make it easier for you to perform analytics and gain insights without the hassle and compute cost of manually converting data to a standard format.
- Continuous maintenance: Fivetran handles all aspects of ongoing data lake maintenance, from schema evolution to performance optimization. This ensures your data lake is always up-to-date and functioning at its best.
- Robust governance tools: With built-in data governance features and native integrations with popular data catalogs, your data is not only well-managed but also compliant with industry standards and regulations like GDPR.
“We are very excited about Fivetran supporting Delta Lake as a direct destination,” said Himanshu Raja, Director of Product, Databricks. “With this new capability, customers can now use Fivetran to build an open lakehouse with Delta Lake powered by the Databricks Data Intelligence Platform. We are also very excited about the upcoming Fivetran integration with Unity Catalog to provide out-of-the-box governance and security for all Fivetran-generated tables.”
We're eager for you to try the new Managed Data Lake Service, but it's not a perfect fit for everyone. If your organization relies primarily on real-time streaming data with sub-second latencies, or if you prefer not to use an open table format like Delta Lake or Iceberg, this service may not be the ideal choice. However, we encourage you to get in touch with us — we have other data lake options that may better align with your requirements.
Ready to experience the future of data lake management?
With Fivetran Managed Data Lake Service, we're making data as accessible and reliable as electricity, empowering businesses to unlock new opportunities and drive innovation.
As data continues to be a pivotal asset for businesses, managing it efficiently and effectively becomes crucial. We fully automate and manage data standardization as we move data to data lake destinations, so businesses can find new ways to innovate with it.
Summer at the lakehouse
Now, Fivetran users can try our Managed Data Lake Service with free usage from June through August. Connectors set up with new data lake destinations will be eligible for this summer promotion*.
To take advantage of this promotion, you need to:
- Have a Fivetran account in good standing, and
- Create a new connector with S3, ADLS, or OneLake as the destination during the Promotion Period (between June 1, 2024, at 12:01 a.m. UTC and August 31, 2024, at 11:59 p.m. UTC).
To get started, head straight to your Fivetran dashboard, sign up for a 14-day free trial of Fivetran, or reach out to sales@fivetran.com with any questions.