At Fivetran, our core business is building and maintaining data connectors at scale so our customers don’t have to. Our central principle is that our connectors just work with minimal intervention from users. There are many challenges involved with delivering and sustaining such a product. Not only is it inherently difficult to build and maintain data pipelines, but the pipelines must serve thousands of customers and cope with a wide range of unique corner cases.

The Fivetran engineering team consists of over 300 developers divided into teams that own specific connectors. This poses the additional challenge of ensuring that teams don’t inadvertently form functional silos that produce inconsistencies and incompatibilities between disparate Fivetran features and products.

Ideally, all Fivetran connectors should move data the same way, recover from stoppages the same way and generally behave identically from the perspectives of both end users and developers. Consistent and predictable behavior across connectors builds trust with customers and enables developers to quickly debug problems.

Unclear error messages have clear costs

In mid-2023, the Fivetran Product team identified several recurring issues, corroborated by customer feedback, with the accuracy, consistency and helpfulness of error messages. As the team investigated further, it became clear that different connectors were surfacing different error messages for the same errors, a result of engineers from different teams using different phrasing to describe the same issues.

Beyond hurting the user experience and posing a brand risk, poor error handling has tangible monetary costs. Error messages that are difficult for users to act on consume the provider’s support resources and prevent customers from fully utilizing the product.

We on the Analytics team partnered with the Product team to identify the most common and impactful errors and analyzed the variation in messaging for each error. The goal was to develop a lightweight solution that would proactively and systematically address these issues and enable the relevant Engineering teams to consolidate and rationalize the messages on the basis of the actual failure modes they represent.

To assemble the relevant data, we combined the output of several Fivetran connectors in our central data warehouse: our own Fivetran Platform Connector for sync error logs, the Salesforce connector for account information and the Google Sheets connector for organizational information.

Finding patterns in error messages using NLP

Given both the inconsistencies in error messages and the huge volume and variety of data routinely synced, we had to identify semantically equivalent error messages that had been worded differently but referred to the same failures. We approached this as an unsupervised learning problem – we would identify clusters of error messages and then determine the actual failure states associated with them.

We used Python’s Natural Language Toolkit (NLTK), the open-source YAKE (Yet Another Keyword Extractor) package and pandas to query the error logs from our data warehouse and then extract and tokenize all the relevant keywords. With all unique keywords and parts of speech across all error messages accounted for, we wound up with a 755-dimensional space in which to embed our vectors.
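To make the embedding step concrete, here is a minimal sketch of turning error messages into keyword-count vectors. It is not our production pipeline: to stay dependency-free it swaps NLTK and YAKE for a plain regex tokenizer and a tiny hand-written stopword list, and the sample messages, `STOPWORDS` and all function names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample messages; not taken from real Fivetran logs.
MESSAGES = [
    "Connection expired. Please reconnect the connector.",
    "Authentication token expired; reconnect required.",
    "Permission denied for table orders.",
]

# Tiny illustrative stopword list (an assumption; NLTK ships a far larger one).
STOPWORDS = {"the", "for", "please", "a", "an", "to", "is", "of"}


def tokenize(message):
    """Lowercase a message and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", message.lower()) if t not in STOPWORDS]


def build_vocabulary(messages):
    """Map every unique token across all messages to a dimension index."""
    tokens = sorted({t for m in messages for t in tokenize(m)})
    return {token: i for i, token in enumerate(tokens)}


def vectorize(message, vocabulary):
    """Return a term-count vector with one dimension per vocabulary token."""
    counts = Counter(tokenize(message))
    # Dicts preserve insertion order, so this follows the index order above.
    return [counts.get(token, 0) for token in vocabulary]


vocabulary = build_vocabulary(MESSAGES)
vectors = [vectorize(m, vocabulary) for m in MESSAGES]
print(len(vocabulary), "dimensions")
```

The number of dimensions equals the number of unique keywords, which is how a large and varied set of messages can end up in a space with hundreds of dimensions.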

We then used a K-means clustering algorithm — chosen for its relative simplicity — to cluster data points based on Euclidean distance, the length of a straight line between two points in an n-dimensional space. After careful analysis and review of the clusters, we named the clusters based on our understanding of the actual underlying failures.
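For readers who want to see the mechanics, the following is a small, self-contained sketch of K-means using Euclidean distance. It is a toy version of the standard algorithm (Lloyd's iteration) written with the standard library only, not the code we ran. The function name, the seed-based initialization and the 2D sample points are assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Cluster points by alternating assignment and centroid-update steps."""
    rng = random.Random(seed)
    # Assumption: start from k randomly chosen points as initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid,
        # measured by Euclidean (straight-line) distance.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical 2D points standing in for high-dimensional message vectors.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
labels, centroids = kmeans(points, k=2)
print(labels)
```

The same loop works unchanged in any number of dimensions, because `math.dist` accepts points of arbitrary length. Naming the resulting clusters still requires a human review of the messages in each one.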

*Note: The accompanying illustration is a 2D conceptual sketch, not a literal representation of the 755D space we used.*

We discovered that “reconnect” failures were the most common and were a frequent problem for popular connectors with large volumes of syncs, including MySQL, Google Ads, Salesforce and YouTube Analytics.

To act on this information, we paired the error messages, now clustered according to actual failures, with customer account information and internal engineering organization information using the aforementioned Salesforce and Google Sheets connectors. We then created a prioritization plan based on customer impact and identified the relevant engineering teams and key stakeholders.
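A rough sketch of this pairing and prioritization step is shown below. Everything in it is hypothetical: the row fields, the connector-to-team mapping and the choice to measure customer impact as the number of distinct affected accounts are assumptions for illustration, not our actual schema or scoring.

```python
from collections import defaultdict

# Hypothetical clustered error rows; field names and values are illustrative.
clustered_errors = [
    {"account_id": "a1", "connector": "mysql", "cluster": "reconnect"},
    {"account_id": "a2", "connector": "mysql", "cluster": "reconnect"},
    {"account_id": "a2", "connector": "salesforce", "cluster": "permission"},
    {"account_id": "a3", "connector": "google_ads", "cluster": "reconnect"},
]

# Hypothetical mapping from connector to owning engineering team.
owning_team = {
    "mysql": "Databases",
    "salesforce": "Applications",
    "google_ads": "Applications",
}


def prioritize(errors, teams):
    """Rank (cluster, team) pairs by the number of distinct affected accounts."""
    affected = defaultdict(set)
    for row in errors:
        key = (row["cluster"], teams.get(row["connector"], "unknown"))
        affected[key].add(row["account_id"])
    # Most affected accounts first; ties are broken alphabetically.
    ranked = sorted(affected.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(cluster, team, len(accounts)) for (cluster, team), accounts in ranked]


plan = prioritize(clustered_errors, owning_team)
for cluster, team, impact in plan:
    print(f"{cluster:<12}{team:<14}{impact}")
```

Grouping by both failure cluster and owning team keeps each line of the plan actionable: it names what is failing, who can fix it and how many customers are affected.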

With an easy-to-use dashboard that consolidated the current state of the errors and their customer impact, we partnered with the leaders of the teams that originally built these connectors to develop a framework for rewriting the messages in consistent and helpful language. Given the scope and variety of the connectors Fivetran has built and will continue to build, this remains an ongoing effort.

However, with the help of machine learning models making sense of the consolidated data (made possible by Fivetran’s connectors), we have significantly reduced the time from insight to action for the Engineering org, saving 122 weeks of engineering time and enabling 30-50% increases in error messaging accuracy across the services involved in pilot programs. We continue to iterate on these findings to deliver a high-quality product that makes access to data as simple and reliable as electricity.

Subtle machine learning use cases

Discussions of machine learning often focus on high-concept pursuits like autonomous vehicles, image recognition, automated fraud detection and other powerful uses. With the rise of generative AI, rapid content creation and ideation have joined the fray as well.

As our experience shows, however, there are opportunities to use machine learning that are less readily explained but equally valuable. Fundamentally, these subtle but practical machine learning use cases consist of two types:

Pattern recognition problems, solved by programmatically uncovering patterns that have previously escaped notice or gaining deeper insight into known (or suspected) patterns.
Predictive problems, using known factors about what has already happened to make a robust prediction about what can happen, or to classify things into known categories.

If you encounter a problem that meets either of those criteria, as we did here, you may have a solid candidate for machine learning.

How we use machine learning to improve our product

May 20, 2024

Meera Sharma

Senior Engineering Data Analyst

Fivetran

Topics

machine learning

Learn how the Fivetran Analytics team uses natural language processing for rationalizing and addressing customer error messages.
