Five ways Fivetran lays the foundation for machine learning

Data integration is essential for analytics, regression analysis and your first forays into generative AI.
October 30, 2023

The ability to move data from business applications, operational databases, event streams and files to repositories such as data warehouses and data lakes is the foundational capability that underpins all analytics, from reporting and dashboards to predictive modeling, machine learning and artificial intelligence. Fivetran supports a wide range of data sources, allowing teams not only to break down data silos but also to pursue higher-value uses of data – proactively answering important questions about business operations and creating innovative data products.

The following are five practical examples of how you can use Fivetran to connect data from disparate sources and combine records to solve important business problems using predictive modeling, machine learning and artificial intelligence.

1. Campaign forecasting

Marketing analytics enables marketers to evaluate the success of digital marketing initiatives, identify trends and patterns over time and make data-driven decisions. Content, advertisements, events and other marketing initiatives are usually combined into campaigns to influence the decisions of prospective customers.

Campaign forecasting is thus a must-have for top-tier marketing teams who want full visibility into their efforts and the ability to design campaigns with more intentionality. Databricks, for instance, uses campaign forecasting so that managers can easily evaluate their performance and act on new insights.

Social media and search engines are common channels for advertising, and Fivetran supports a multitude of advertising platforms. Popular examples include Facebook Ads, Google Ads, Microsoft Advertising and LinkedIn Ads. Unifying data from these sources also means standardizing the fields and tables from each. This is a considerable transformation task handled directly by our Ad Reporting data model.
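To make the idea of standardizing fields concrete, here is a minimal Python sketch. The platform names, field names and the `FIELD_MAPS` table are hypothetical illustrations, not the schema of any real connector or of the Ad Reporting data model.

```python
# Hypothetical per-platform field names mapped onto one shared schema.
FIELD_MAPS = {
    "platform_a": {"campaign_name": "campaign", "amount_spent": "spend", "link_clicks": "clicks"},
    "platform_b": {"campaign": "campaign", "cost": "spend", "clicks": "clicks"},
}

def standardize(platform, record):
    """Rename a raw record's fields to the shared schema and tag its source."""
    mapping = FIELD_MAPS[platform]
    row = {target: record[source] for source, target in mapping.items()}
    row["platform"] = platform
    return row

# Synthetic raw records, one per platform.
raw = [
    ("platform_a", {"campaign_name": "fall_launch", "amount_spent": 120.0, "link_clicks": 30}),
    ("platform_b", {"campaign": "fall_launch", "cost": 80.0, "clicks": 20}),
]
unified = [standardize(platform, record) for platform, record in raw]

# With a shared schema, cross-platform totals become a simple aggregation.
spend_by_campaign = {}
for row in unified:
    spend_by_campaign[row["campaign"]] = spend_by_campaign.get(row["campaign"], 0.0) + row["spend"]
```

Once every platform reports `campaign`, `spend` and `clicks` under the same names, comparisons across channels no longer need platform-specific logic.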

Likewise, Fivetran supports a number of marketing automation platforms with a heavy emphasis on email marketing, such as Marketo and HubSpot. These, too, have accompanying data models to speed the transition from raw data to usable data assets. Marketo and HubSpot, like many other sources, also support Fivetran Quickstart data models, further shortening the time it takes to get data ready for analysis.

Finally, to track success further down the funnel, you can look at Fivetran data from customer relationship management, enterprise resource planning and point-of-sale platforms such as Salesforce, NetSuite, Shopify and Stripe, most of which are also complemented by data models.

The data from these disparate platforms enables analysts to track all customer interactions and the magnitude of purchases. This data can be fed into a regression model to determine the strongest associations. Regression analysis can be a powerful method for forecasting, allowing you to identify the relationship between spending on various platforms or assets and outcomes. You can also directly evaluate different campaigns head-to-head.

Sources:

  • Advertising platforms
  • Marketing automation platforms
  • CRM/ERP/POS

Suggested model:

  • Multiple linear regression
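
As a rough illustration of the suggested model, the sketch below fits a multiple linear regression by ordinary least squares using only the Python standard library. The spend and revenue figures are synthetic, and the function names are assumptions made for this example; in practice you would more likely use a statistics library.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(XtX, Xty)

def predict(weights, features):
    """Intercept plus the weighted sum of the features."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))

# Synthetic weekly spend (search ads, social ads) and revenue.
# Revenue is constructed as 10 + 3*search + 2*social, so the fit should
# recover roughly those coefficients.
spend = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5), (6, 2)]
revenue = [10 + 3 * s + 2 * t for s, t in spend]

weights = fit_linear_regression(spend, revenue)  # [intercept, search, social]
forecast = predict(weights, (7, 3))              # revenue for a planned budget
```

The fitted coefficients estimate how much revenue is associated with each additional unit of spend on a channel, which is what makes head-to-head comparisons possible. Real campaign data is noisy, so the coefficients indicate association rather than guaranteed returns.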

2. Financial forecasting

Finance analytics is a critical need for organizations in every industry. Forecasting, budget reviews and competitive analysis are key financial tasks that organizations must address. Consider the case of Intercom, which uses Fivetran to integrate data from subscription platforms, ERPs and payment processing software to automate reports for finance teams and executives.

Financial forecasting forms the basis for critical business decisions surrounding hiring, budgeting, predicting revenue and all manner of planning. Simple heuristic methods such as percent of sales, straight line and moving average forecasting are well known, as are qualitative methods such as the Delphi method. Regression analysis, however, once again offers a more rigorous, interpretable approach. It allows analysts to determine strong correlations, if not causality, and to identify major leverage points over their financial outcomes.

Sales, profits, cash flow, expenses, growth and similar metrics can all serve as explanatory variables. The data sources used to construct your model will reflect this range, consisting mainly of accounting software such as QuickBooks or Xero, as well as the CRM, ERP and POS systems mentioned previously. You can use data models for QuickBooks and Xero to quickly convert raw accounting data into ledgers, balance sheets, cash flow statements and other important reports.

Sources:

  • Accounting software
  • CRM/ERP/POS

Suggested model:

  • Multiple linear regression
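
For comparison with regression, the simple heuristics named in this section can each be written in a few lines. This is a sketch of common textbook versions of moving average, straight line and percent of sales forecasting; the function names and the figures are assumptions made for illustration.

```python
def moving_average_forecast(series, window):
    """Forecast the next period as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)

def straight_line_forecast(series, periods_ahead=1):
    """Extend the series using its average period-over-period growth rate."""
    growth_rates = [b / a - 1 for a, b in zip(series, series[1:])]
    avg_growth = sum(growth_rates) / len(growth_rates)
    return series[-1] * (1 + avg_growth) ** periods_ahead

def percent_of_sales_forecast(expense_history, sales_history, forecast_sales):
    """Assume an expense line keeps its historical share of sales."""
    ratio = sum(expense_history) / sum(sales_history)
    return ratio * forecast_sales

# Synthetic quarterly revenue that grows 10% per quarter.
quarterly_revenue = [100.0, 110.0, 121.0, 133.1]

next_quarter_ma = moving_average_forecast(quarterly_revenue, window=2)
next_quarter_sl = straight_line_forecast(quarterly_revenue)
marketing_budget = percent_of_sales_forecast([20.0, 22.0], [100.0, 110.0], 150.0)
```

Each heuristic looks at a single series in isolation. A regression model, by contrast, can relate an outcome such as revenue to several explanatory variables at once, which is why it is the suggested model here.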

3. Customer churn prediction

Product analytics involves assessing how users interact with your product. User activities, feature usage and the reception of your product are just the tip of the iceberg. Product analytics is the hub around which all of your other business analytics – finance, customer success, sales, marketing and engineering – rotate. Fivetran customer Ritual, for instance, combines data from customer support applications, marketing platforms, operational databases and more to improve retention through personalized messaging to customers.

Although many products have some kind of natural expiration date, one outcome that most product teams want to avoid is churn, when a customer prematurely abandons a product. Since churning is a yes/no proposition, a logistic regression model used for classification may be an appropriate approach to predicting future churn.

Many variables can be associated with a customer’s likelihood of churning, ranging from demographics to specific behaviors, interactions with your product and other aspects of the customer lifecycle. This information may be stored in the back end of your application; common examples of such transactional databases include SQL Server, PostgreSQL and MySQL. Specific behaviors and actions may also be tracked through event streams such as Webhooks, Segment and Snowplow.

Sources:

  • Databases
  • Event streams
  • CRM/POS

Suggested model:

  • Logistic regression
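
A minimal sketch of logistic regression for churn classification follows, trained by batch gradient descent in plain Python. The two features (weekly logins and recent support tickets), the labels and the hyperparameters are all synthetic assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_regression(X, y, learning_rate=0.1, epochs=2000):
    """Fit weights (intercept first) by batch gradient descent on log loss."""
    weights = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for features, label in zip(X, y):
            row = [1.0] + list(features)
            error = sigmoid(sum(w * v for w, v in zip(weights, row))) - label
            for j, v in enumerate(row):
                gradient[j] += error * v / n
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights

def churn_probability(weights, features):
    """Predicted probability that a customer with these features churns."""
    row = [1.0] + list(features)
    return sigmoid(sum(w * v for w, v in zip(weights, row)))

# Synthetic customers: (logins per week, support tickets last month).
# Label 1 means the customer churned.
customers = [(9, 0), (8, 1), (7, 0), (6, 1), (2, 3), (1, 4), (1, 2), (0, 5)]
churned = [0, 0, 0, 0, 1, 1, 1, 1]

weights = fit_logistic_regression(customers, churned)
engaged = churn_probability(weights, (8, 0))   # frequent user, no tickets
at_risk = churn_probability(weights, (1, 4))   # rare user, many tickets
```

Because the model outputs a probability rather than a hard yes or no, teams can rank customers by risk and choose their own threshold for intervention.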

4. Recommendation engine

Recommendation engines are an essential part of many platforms, such as streaming services, ecommerce, business review aggregators and even online dating. As products in their own right, they are central to guiding users to goods, services, content and other offerings that stand a high chance of pleasing (and retaining) them. CarOnSale, for example, uses Fivetran to integrate data from Heroku Postgres, SendGrid, Stripe and Freshdesk to build a recommendation engine that suggests cars to buyers based on previous browsing and purchasing behavior.

A common model for recommendation engines is collaborative filtering, in which users are recommended items that other, similar users are known to have enjoyed.

The data you will use to build a recommendation engine is most likely to come from your operational systems, including backend databases, CRMs and ERPs. If you aggregate data from external sources, it may involve third-party data as well.

Sources depend on the exact use case, but could encompass:

  • Databases
  • CRM/ERP/POS
  • Third-party data

Suggested model:

  • Collaborative filtering
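
The sketch below shows one simple form of collaborative filtering: user-based filtering with cosine similarity. The users, items and ratings are invented for illustration, and production systems typically use more scalable techniques such as matrix factorization.

```python
import math

# Synthetic ratings: user -> {item: rating}. All names are illustrative.
ratings = {
    "ana":   {"sedan": 5, "hatchback": 4, "suv": 1},
    "ben":   {"sedan": 4, "hatchback": 5, "suv": 1, "coupe": 5, "van": 2},
    "chloe": {"sedan": 1, "hatchback": 1, "suv": 5, "coupe": 1, "van": 5},
}

def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=1):
    """Rank unseen items by the similarity-weighted average rating of other users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {i: scores[i] / weights[i] for i in scores if weights[i] > 0}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

recommendations = recommend("ana", ratings)
```

Here `ana` rates items much as `ben` does, so `ben`'s opinions carry more weight than `chloe`'s when scoring the items `ana` has not yet seen.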

5. Natural language processing and automated customer support

Generative AI has the potential to profoundly transform the world, offering massive productivity boosts to almost every imaginable creative or intellectual endeavor. By creating new media in response to prompts, generative AI can complement a wide range of business functions, decreasing the time and effort required to ideate, iterate and prototype new content.

Many applications involved with customer support and success produce data that is both structured and text-rich. Examples include customer support and ticketing software such as Zendesk and Intercom, as well as dedicated chatbot applications such as Ada and Drift. Correspondence with customers is also often captured in CRMs, most notably Salesforce. As with many other complex applications, Fivetran offers data models for Zendesk and Intercom.

Large language models can perform various natural language processing (NLP) tasks on textual data – summarizing it, extracting tone, sentiment and recurring themes, translating it and more. This gives you the opportunity to make sense of the very large volume of semi-structured text produced by your operations without personally reading everything.

With further access to documentation, large language models can also answer domain-specific questions about your company and can form the basis for automated customer support of all kinds.

Sources: 

  • CRMs
  • Customer support and ticketing software
  • Chatbots

Suggested models:

You will need to combine a pre-trained foundation model with a retrieval model. The retrieval model uses embeddings stored in a vector database to find the content most relevant to a question and submits that content as part of a prompt to the foundation model. You can further supplement the retrieval model with a knowledge graph produced from your data catalog.
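
The retrieval step can be sketched in a few lines of Python. In this toy version, a bag-of-words count vector stands in for a real embedding model and an in-memory list stands in for a vector database. The support snippets and function names are hypothetical, and the sketch stops at building the prompt rather than calling any particular foundation model.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical support snippets; the (text, vector) list stands in for a vector database.
documents = [
    "Refunds are issued within five business days of a cancellation request.",
    "Password resets are sent to the email address on the account.",
    "Invoices can be downloaded from the billing page.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(question, k=1):
    """Return the k documents whose vectors are most similar to the question."""
    query = embed(question)
    ranked = sorted(index, key=lambda pair: cosine(query, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question, k=1):
    """Assemble retrieved text and the question into a prompt for a foundation model."""
    context = "\n".join(retrieve(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("How do I download invoices")
```

Grounding the prompt in retrieved company content is what lets a general-purpose model answer domain-specific questions. A real embedding model would also match on meaning rather than on exact shared words, as this toy version does.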

Data centralization is fundamental

For all our discussion of machine learning and artificial intelligence, it’s important not to lose sight of the need to combine data from multiple sources to create a coherent view of your organization’s operations in the first place. To pursue innovative, higher-value uses of data, you must first establish a solid foundation of data operations with automation, governance and a proven track record of using data to support decisions.


