Source-side filtering with Fivetran

Cloud Functions will revolutionize your data pipeline.

Fivetran Cloud Function Connectors address the data integration challenges posed by data sources that off-the-shelf connectors don't support. Let's break that down into a few key concepts.

What is a Cloud Function Connector? 

In plain terms, a Cloud Function Connector is custom code orchestrated and maintained by Fivetran. Think of the "cloud function" as the custom code and the "connector" as Fivetran's management of it. The code runs on a serverless platform such as AWS Lambda, Google Cloud Functions or Azure Functions, and Fivetran takes care of the rest, including invocation, state management and data normalization.
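To make that division of labor concrete, here is a minimal Python sketch of the "cloud function" half. It assumes the JSON contract described in Fivetran's function connector documentation (a request carrying `state` and `secrets`, and a response carrying `state`, `insert`, `schema` and `hasMore`), so check the current docs before relying on those field names. The `fetch_rows` helper, the `orders` table and the in-memory `SOURCE_ROWS` are hypothetical stand-ins for a real source.

```python
# Hypothetical in-memory "source"; a real function would call an API or database.
SOURCE_ROWS = [
    {"id": 1, "updated_at": "2024-03-01T00:00:00Z", "total": 20.0},
    {"id": 2, "updated_at": "2024-03-05T00:00:00Z", "total": 35.5},
    {"id": 3, "updated_at": "2024-03-09T00:00:00Z", "total": 12.0},
]


def fetch_rows(cursor):
    """Return source rows changed after the cursor (hypothetical helper)."""
    # ISO 8601 timestamps in the same format compare correctly as strings.
    return [r for r in SOURCE_ROWS if cursor is None or r["updated_at"] > cursor]


def handler(request, context=None):
    """Entry point the serverless platform invokes on each sync."""
    # The state saved by the previous sync is assumed to arrive in the request.
    state = request.get("state") or {}
    cursor = state.get("cursor")

    rows = fetch_rows(cursor)
    # Advance the cursor so the next invocation only picks up newer rows.
    new_cursor = max((r["updated_at"] for r in rows), default=cursor)

    return {
        "state": {"cursor": new_cursor},                 # persisted between syncs
        "insert": {"orders": rows},                      # rows to upsert, keyed by table
        "schema": {"orders": {"primary_key": ["id"]}},   # assumed schema shape
        "hasMore": False,                                # no further pages in this sync
    }
```

The function only decides what to return and where the cursor should move; scheduling, retries and loading into the destination stay on the "connector" side.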

Cloud functions can be written in many languages and runtime versions, including Python, Node.js and Ruby. See your cloud service provider's documentation for the runtimes it supports.

When to Use a Cloud Function Connector?

Fivetran's Cloud Function Connectors are most valuable when there isn't a native Fivetran connector for your data source. While you can request custom connectors, Cloud Functions provide more flexibility and faster implementation. Once you have access to the source, a cloud function can start landing its data in your warehouse within a few hours.

Beyond sources that Fivetran does not natively support, cloud functions suit a wide range of use cases for any data source that can be accessed via code, including:

  • Enhancing existing Fivetran connectors with additional data
  • In-flight transformations of data such that it lands in your destination analytics-ready
  • Complex pipelines that pull data from multiple sources
  • In-flight data quality and integrity checks
  • External orchestration
  • Upgrading legacy pipelines to a modern standard
  • Filtering to extract only a subset of data from a given source
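As one illustration of the in-flight checks and transformations listed above, the sketch below validates and normalizes rows before they would be returned to Fivetran. It is plain Python with no Fivetran-specific calls; the `clean_rows` name and the `id` and `email` fields are hypothetical.

```python
def clean_rows(rows, required=("id", "email")):
    """Split rows into valid and rejected, normalizing the valid ones in flight."""
    valid, rejected = [], []
    for row in rows:
        # Integrity check: every required field must be present and non-empty.
        if any(row.get(field) in (None, "") for field in required):
            rejected.append(row)
            continue
        # Transformation: normalize the email so it lands analytics-ready.
        cleaned = dict(row)  # copy so the caller's row is left untouched
        cleaned["email"] = cleaned["email"].strip().lower()
        valid.append(cleaned)
    return valid, rejected
```

Rejected rows could be logged or routed to a separate table for review rather than silently dropped; which is appropriate depends on the pipeline.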

Source-side filtering 

One powerful use case is source-side filtering. Imagine extracting only the data you need from a source, reducing the volume of data transferred and stored. This helps during development and testing and ultimately saves resources. Use cases include:

  • Data science & ML
  • Third-party management
  • Data sharing and monetization

For example, a Cloud Function can load data from disparate sources and filter each one down to the pertinent data and date range. That filtered data can then be used to train an ML model for a specific task. Because the Cloud Function connector can target non-native sources, the same infrastructure scales with the rest of your Modern Data Stack environment. The approach can also move data-sharing or monetization projects forward, since the filtered source data can be tailored to data extract specifications or contractual requirements, helping you get the most out of your data.
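The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Fivetran's implementation: the `filter_rows` function, the `event_date` field and the sample field names are all hypothetical. Where the source supports it, you would push the same predicate into the source query or API parameters so unwanted rows never leave the source.

```python
from datetime import date


def filter_rows(rows, start, end, keep_fields):
    """Keep rows dated within [start, end] and only the requested fields."""
    filtered = []
    for row in rows:
        # Assumes each row carries an ISO-formatted "event_date" (hypothetical field).
        row_date = date.fromisoformat(row["event_date"])
        if start <= row_date <= end:
            # Project down to the fields the extract specification allows.
            filtered.append({k: row[k] for k in keep_fields if k in row})
    return filtered


# Example: keep only Q1 2024 events and drop the sensitive "email" column.
sample = [
    {"id": 1, "event_date": "2023-12-31", "amount": 10, "email": "a@example.com"},
    {"id": 2, "event_date": "2024-02-15", "amount": 25, "email": "b@example.com"},
]
q1_rows = filter_rows(sample, date(2024, 1, 1), date(2024, 3, 31),
                      keep_fields=("id", "event_date", "amount"))
```

Keeping the date range and field list as parameters means the same function can serve a development subset, an ML training extract or a contract-specific data share.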

Conclusion

Fivetran Cloud Function Connectors are a powerful solution for large international enterprises to streamline data integration, reduce costs and boost productivity. By leveraging source-side filtering and customization, you can optimize your data pipelines for today's needs and tomorrow's. The level of customization Cloud Functions provide will help data science, data sharing and monetization projects throughout the Modern Data Stack environment. We have covered what Cloud Function Connectors are, when to use them and why source-side filtering matters; all that remains is deciding when to start. With over 500 prebuilt connectors and growing, Fivetran ensures that you have the tools you need to orchestrate your data success. At a time when data agility and efficiency are paramount, Fivetran is your trusted partner.

For practical instructions on source-side filtering and other Cloud Function Connector use cases, check out our community article.

March 20, 2024