“The combination of Databricks and Fivetran has enabled us to build a robust and modern data pipeline in a very short amount of time. Fivetran had all the right connectors and integrations we needed.” — Justin Wille, Director of Insights and Analytics, Kreg Tool
- Destination: Databricks Lakehouse Platform running on AWS S3
- Connectors: Salesforce Commerce Cloud, Facebook, Instagram Business, Google Analytics, Mailchimp, Zendesk, MySQL
- Cloud Platform: AWS
- BI: Power BI
- Company size: 230 employees
- Industry: Manufacturing
Iowa-based Kreg Tool was founded just over 30 years ago and has grown from a small family business with a single product into a large manufacturer selling multiple models of wood-joining tools in over 60 countries through big-box retailers and online.
Before Fivetran & Databricks
Kreg Tool relied on numerous data sources for customer information, but the lack of a central data repository created data silos, making it difficult to obtain the right insights, integrate ERP and workforce management systems, or leverage data from Salesforce. Custom data schemas as well as API development and data ingestion processes resulted in cumbersome access and lengthy programming projects. Multiple ETL platforms added to the complexity.
Kreg Tool had little control over its data and had to build an individual connector for each pipeline. The company wanted to move to a cloud-based platform that would let it integrate data easily and scale up as needed. Kreg Tool undertook a rigorous review of the top tools and was wowed by the combination of Fivetran and Databricks. When Kreg Tool’s data team tested the solutions, it got 11 connections operating on the Databricks Lakehouse within a few hours; previously, coding a single pipeline had taken Kreg Tool more than a day.
Fivetran & Databricks Solution
With Databricks and Fivetran, Kreg Tool has opened the floodgates for the data analytics team to develop insights and deliver on use cases.
"The biggest thing that shocked me about Databricks Partner Connect was literally within four hours we had Fivetran pipelines bringing in data from Google Analytics, Mailchimp, Facebook, Instagram, MySQL, and Zendesk," said Justin Wille, Director of Insights and Analytics. "I can't imagine how many weeks that would have taken us to do otherwise."
Setting up Fivetran from Databricks Partner Connect, a dedicated portal within the Lakehouse, takes only a few minutes because there is no need to manually create resources or configure the connection. Fivetran loads data directly into Kreg Tool’s new Databricks Lakehouse Platform and uses an existing schema so the data can be queried natively, which means fewer data pipelines are needed. Fivetran helped Kreg Tool move data from SaaS systems into a Databricks environment, enabling that data to sit alongside data from Kreg Tool’s on-premises systems. Databricks simplified infrastructure management and, with Delta Lake, made data pipelines fast and highly reliable. Joining all that data gave end users within the business greater detail, ease and speed.
“We built a single customer experience solution, and as a result we better understand and anticipate our customers’ needs and can help them build great woodworking projects.”
“I don’t have to spend any time managing schema changes or updating data pipelines, so there is a lot less management required. We have a single source of truth and don’t have to track down data sources or find a particular origin application as we did before we implemented Fivetran.” — Justin Wille, Director of Insights and Analytics
- Democratize data and insights for data analysts and engineers, who work collaboratively on personalized products and new go-to-market capabilities
- Easy setup of data pipelines to ingest SaaS app data
- Data engineering time savings: Using Databricks Partner Connect and Fivetran, the company created its first connection in about 45 minutes and had 10 data connections running on the first day
- Data ingestion time reduced from eight weeks to four hours, with no custom coding required
- Automated data pipeline operations after setup
- Ability to ingest data from both first-party sources (such as warehouse management systems and hosted databases) and third-party sources (such as reviewers and social media customer posts)
- Greater personalization for customers
- Increased revenue
- Modern, easy-to-use cloud-based platform that can grow with the company over time
- Single platform to manage all ETL and batch jobs
- Cost- and support resource-neutral compared to legacy methods (only two BI analysts are available to handle the entire workload)
- Handle a wide variety of data sources, including shipping and logistics companies, third-party product reviewers, and various social network customer posts, on a single lakehouse platform