How to automate SAP ERP data movement into Databricks with Fivetran

How to make siloed SAP data accessible with automated data integration and a data intelligence platform.
July 19, 2024

I recently had the pleasure of attending Data + AI Summit, which was packed with members of the data community (16,000 attendees from 140 countries) and loaded with major announcements and perspectives, including the upcoming Databricks + Tabular combination with complete Delta and Iceberg interoperability, the open sourcing of Unity Catalog, using proprietary data for Gen AI and much more. Check out the recap here.

During his keynote, Ali called out that one of the key problems he hears from every CIO is the need to reduce data estate fragmentation, get rid of silos and eliminate complexity. Not surprisingly, one of the top requests we received from everyone visiting our booth was, “How can you help me move my SAP ERP data (a traditional, fragmented data silo) into Databricks for all data workloads and unlock my SAP data for all my data use cases?”

Many organizations are still building time-consuming, custom DIY pipelines, while others use supposedly low-code solutions yet still struggle to deliver time to value from their SAP data.

In this post, I'll walk you through how Fivetran’s automated data movement platform, combined with the Databricks Data Intelligence Platform, accelerates, democratizes and standardizes the movement of SAP ERP data into Databricks for a wide array of data workloads, allowing you to quickly unlock data innovation.

The power of Fivetran and Databricks

If you use Databricks as the foundation for all your data workloads, Fivetran can act as the bridge between nearly 600 data sources and the Databricks platform, ensuring fully automated, reliable and secure data movement as well as change data capture and schema management. Data that Fivetran moves is always a faithful representation of the data source and is high quality, trusted, organized, understandable and ready for all Databricks workloads.

Fivetran moves data into Databricks and powers innovation across all workloads

Fivetran deployment options for SAP ERP as a source

This post is focused on one of Fivetran’s SaaS SAP source deployment options, the SAP ERP for HANA connector, but Fivetran offers other deployment options as well, including self-hosted HVR, which can be deployed behind your firewall and within your own VPC and provides continuous replication from SAP. 

Fivetran options for SAP ERP as a source

Setting up the SAP ERP for HANA connector

The SAP ERP for HANA connector features application-based replication, with both the initial sync and incremental CDC initiated via an SAP NetWeaver remote function call. It supports SAP HANA Enterprise Cloud and RISE Private and can work with any SAP license type, either enterprise or runtime. As I'll show you, setting it up is exceptionally fast and easy.

Fivetran SaaS SAP ERP on HANA connector

Getting SAP data flowing into Databricks

Most customers have a wide range of data sources flowing into Databricks, including other SAP ERP sources, Workday, ServiceNow, SQL Server, Oracle, Salesforce and many more supporting a range of data workloads. With Fivetran, you can utilize the latest innovations from Databricks to manage this data, such as Databricks Serverless, Delta table format and Unity Catalog. 

If you check out Ali’s DAIS keynote in a bit more detail, he talks about decoupling storage from compute and storing your data in an open table format (Delta or Iceberg) on inexpensive cloud storage in a basic data lake such as S3 or ADLS that you pay for independently. Fivetran supports those managed data lake options as well, for both Delta and Iceberg table formats.

Fivetran supports a range of Databricks and Data Lake destinations
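
If you go the managed data lake route, the tables Fivetran lands are ordinary Delta (or Iceberg) tables that any engine can read. Here's a minimal sketch, assuming a hypothetical S3 location (your managed data lake bucket and path will differ), of reading one of those landed tables directly from object storage in a Databricks notebook:

```python
# Minimal sketch: read a Fivetran-landed Delta table directly from object
# storage. The bucket and path below are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already defined in Databricks notebooks

materials = (
    spark.read.format("delta")
    .load("s3://example-datalake-bucket/sap_erp/mara")  # placeholder location
)
materials.printSchema()
print(materials.count())
```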

Configuring the connection to SAP

Fivetran's SAP ERP for HANA connector is straightforward to set up. After downloading and installing the required Fivetran transport in your SAP NetWeaver environment, you simply name your destination schema. Fivetran handles schema creation and schema drift management in the background.

Provide the SAP NetWeaver host identifier, username and other credentials for Fivetran authentication. You can connect via SSH, reverse SSH or VPN, with all data encrypted both in motion and at rest. After hitting "Save and Test," Fivetran confirms your SSH key and runs checks on the SAP ERP system.

Setup is quick and easy for the SAP ERP on HANA connector
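
Everything above is point-and-click, but the same setup can also be scripted against Fivetran's REST API. The sketch below is hedged: the service slug and config keys are placeholders I've chosen for illustration, so check the SAP ERP for HANA connector documentation for the exact values your environment requires.

```python
# Hedged sketch: create a Fivetran connector through the REST API.
# The service slug and config keys are illustrative placeholders, not the
# documented values for the SAP ERP for HANA connector.
import requests

API_KEY = "your-fivetran-api-key"        # generated in the Fivetran dashboard
API_SECRET = "your-fivetran-api-secret"

payload = {
    "service": "sap_erp_on_hana",             # placeholder service slug
    "group_id": "your_destination_group_id",
    "paused": True,                            # review table selection before the first sync
    "config": {
        "schema": "sap_erp",                   # destination schema name
        "host": "netweaver.example.internal",  # placeholder SAP NetWeaver host
        "user": "FIVETRAN_RFC_USER",           # placeholder SAP user for the RFC
    },
}

response = requests.post(
    "https://api.fivetran.com/v1/connectors",
    auth=(API_KEY, API_SECRET),  # Fivetran uses HTTP basic auth with key/secret
    json=payload,
    timeout=30,
)
response.raise_for_status()
print(response.json()["data"]["id"])  # connector ID used in later API calls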

Selecting and syncing tables

SAP ERP contains over 100,000 tables, so you select only the ones you need to create your dataset. I’m using SAP materials management tables, which store and manage materials, inventory, procurement and vendor interactions.

Select the SAP ERP tables you want to sync to Databricks

That’s it. Once the table selections are saved, the initial sync begins, and CDC is automatically set up for continuous updates. This process is entirely no-code and allows you to set the sync schedule as needed. Fivetran defaults to every six hours, but you can select anything from one-minute incremental syncs up to every 24 hours.

Select an incremental change data capture frequency
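
If you prefer to manage schedules as code, the sync frequency can also be set through the REST API. Here's a small sketch, assuming a hypothetical connector ID; sync_frequency is expressed in minutes, and the one-minute option depends on your plan:

```python
# Sketch: update a connector's incremental sync frequency via the REST API.
import requests

API_KEY = "your-fivetran-api-key"
API_SECRET = "your-fivetran-api-secret"
CONNECTOR_ID = "your_connector_id"  # returned when the connector was created

response = requests.patch(
    f"https://api.fivetran.com/v1/connectors/{CONNECTOR_ID}",
    auth=(API_KEY, API_SECRET),
    json={"sync_frequency": 60},  # minutes between incremental syncs
    timeout=30,
)
response.raise_for_status()
print(response.json()["data"]["sync_frequency"])
```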

On the schema tab, you can view your current dataset, add more tables, block tables or columns, change table sync mode and hash columns for additional PII data privacy if needed. Fivetran’s UI makes it easy to manage your data security and privacy requirements.

Continue to add to the existing dataset or select additional data security and privacy options
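
These schema, table and column controls are also exposed through the connector schema config API. As a hedged sketch (the schema, table and column names below are placeholders I've chosen, not values from this walkthrough), hashing a column before it ever lands in Databricks might look like this:

```python
# Hedged sketch: hash a column for PII protection via the connector schema
# config API. The schema, table and column names are placeholders.
import requests

API_KEY = "your-fivetran-api-key"
API_SECRET = "your-fivetran-api-secret"
CONNECTOR_ID = "your_connector_id"

response = requests.patch(
    f"https://api.fivetran.com/v1/connectors/{CONNECTOR_ID}"
    "/schemas/sap_erp/tables/lfa1/columns/stceg",  # placeholder: a vendor tax field
    auth=(API_KEY, API_SECRET),
    json={"hashed": True},  # only a hashed value reaches Databricks
    timeout=30,
)
response.raise_for_status()
```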

Viewing data in Databricks

Here’s a glimpse into the Databricks Unity Catalog destination where Fivetran moved the SAP ERP data. The catalog this Fivetran destination was configured with (ts-catalog-demo) shows all datasets for this destination.

Below you can see the new SAP ERP schema that Fivetran created, along with the four selected tables it moved over to Databricks. Fivetran lands the SAP data in the Databricks bronze layer, immediately ready to be further enriched or processed. I also like the AI that Databricks has integrated into Unity Catalog: it suggested comments for the table descriptions that are very accurate and make it quick and easy to provide more context for my new dataset.

Databricks Unity Catalog with the new SAP ERP dataset from Fivetran
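
Once the tables land, they behave like any other Unity Catalog table. Here's a minimal sketch, assuming the standard SAP material master table MARA was among the four tables selected and that Fivetran wrote it to a schema named sap_erp (your catalog, schema and table names will reflect your own setup):

```python
# Sketch: query the Fivetran-landed SAP data through Unity Catalog.
# `spark` is predefined in Databricks notebooks; names below are assumptions.
df = spark.table("`ts-catalog-demo`.sap_erp.mara")  # SAP material master
df.select("matnr", "mtart", "matkl").show(10)       # material, type, material group
```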

Fivetran Transformations

Fivetran offers three integrated options for transformations: Quickstart data models, integration with your dbt Core project or integration with your dbt Cloud service. There are over 50 connectors with Quickstart data models, including the SAP dbt package, which comes with pre-packaged models.

Go from bronze to gold immediately with Fivetran Quickstart Transformations

Creating dashboards in Databricks using the new SAP ERP data

In Databricks, you can create impactful dashboards using an extremely capable AI assistant. It generates SQL queries in the background based on your specifications. For example, you can quickly create visualizations like total material inventory, unique material types and materials described in multiple languages. All I need to do is tell the assistant what visualization I want created. 

Metrics automatically pop up for me to use. I can accept, reject or change whatever I need based on my downstream data product requirements. Databricks visualizations also offer a number of other options if you want to mix and match on your dashboard.

AI-enabled Databricks dashboards with new SAP ERP data visualizations
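
Under the hood, the assistant is writing SQL along these lines. This is a hedged sketch of the kinds of queries those metrics imply, assuming the material master (MARA) and material descriptions (MAKT) tables were synced; your table and column names depend on what you selected:

```python
# Hedged sketch: the kind of SQL the assistant generates for these metrics.
# Table and column names assume the standard SAP MM tables MARA and MAKT.
unique_material_types = spark.sql("""
    SELECT COUNT(DISTINCT mtart) AS unique_material_types
    FROM `ts-catalog-demo`.sap_erp.mara
""")

multilingual_materials = spark.sql("""
    SELECT COUNT(*) AS materials_in_multiple_languages
    FROM (
        SELECT matnr
        FROM `ts-catalog-demo`.sap_erp.makt
        GROUP BY matnr
        HAVING COUNT(DISTINCT spras) > 1
    ) AS m
""")

unique_material_types.show()
multilingual_materials.show()
```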

Easy connection with Databricks Partner Connect

Don’t forget the Databricks Partner Connect easy button: connecting your Databricks account to Fivetran is simple. Just navigate to Partner Connect in Databricks, click the Fivetran tile and connect to Fivetran or start a free 14-day trial. From there, you can begin moving data into Databricks immediately from any of Fivetran's nearly 600 connectors.

Databricks Partner Connect is the easy button to begin using Fivetran with Databricks

Get started now

It only takes a few minutes to connect to an SAP ERP source and then replicate and move data from an SAP system to the Databricks Data Intelligence Platform using Fivetran. Incremental change data capture and all associated schema drift handling are set up automatically, giving you up-to-date, usable data accessible via Unity Catalog. I’d encourage you to start your journey today and see how easy it is to connect any data source to the Databricks Data Intelligence Platform with Fivetran and reduce your data estate fragmentation.

You can check out an end-to-end video of this solution with SAP, Fivetran and Databricks on YouTube below.

It would be great to hear from you on any connectors, data workloads and industry use cases you’d like to see profiled next. Take care!
