Build vs. Buy: Some Back-of-the-Envelope Costs

A few simple calculations illustrate why it's ill-advised to build your own data pipeline.
November 1, 2018

Note: We have additional, newer materials on this subject! Check out Build vs. Buy: An Illustrated Guide, as well as a pair of related infographics.

There's a good chance your company is among the more than 150,000 global customers of Salesforce. Chances are also strong that you use services like Marketo, Zendesk, Jira and Zuora to gain a comprehensive view of your business operations.

The research and advisory firm Gartner estimates that 70% to 80% of business intelligence efforts fail, in part because of outdated technology, clunky processes and inaccessibility. The path to failure can be expensive, too, and the goal of this article is to present some of the costs in money, time and anguish associated with building a bespoke business intelligence solution.

Building Your Own Data Pipeline Connectors Is Complicated and Expensive

Suppose you use the five services listed above and want to build connectors that automatically ingest data from their API endpoints and store it in a data warehouse. Here are some slightly optimistic calculations for monetary costs:

Each of five connectors will take about five weeks for an engineer to build:

(5 connectors) * (5 person-weeks)

Based on what Fivetran has found through previous experience, each connector will likely need a dedicated week of maintenance work per quarter, adding up to four weeks per year:

(5 connectors) * (5 person-weeks + 4 person-weeks)

(5 connectors) * (9 person-weeks) = 45 person-weeks

That makes 45 weeks out of 52 weeks in a year. Multiply that fraction by a typical software engineer salary ($120,000) to see how much building your own connectors will cost during your first year:

(45 weeks/52 weeks) * ($120,000) = $103,846.15

In subsequent years, your engineer will continue to perform quarterly updates (four weeks) and handle bugs and edge cases as they crop up (one week), for a total of five person-weeks of work per connector:

(5 connectors) * (5 person-weeks) = 25 person-weeks

That makes 25 weeks out of 52 weeks in a year, all dedicated to ongoing maintenance. This is how much maintaining your own connectors will cost in subsequent years:

(25 weeks/52 weeks) * ($120,000) = $57,692.31
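
The arithmetic above can be reproduced with a short Python sketch, which makes it easy to substitute your own connector count, effort estimates or salary. The function and variable names are illustrative assumptions; only the input figures (5 connectors, 5 + 4 and 4 + 1 person-weeks, a $120,000 salary, 52 weeks per year) come from the text.

```python
WEEKS_PER_YEAR = 52


def connector_cost(connectors, weeks_per_connector, annual_salary):
    """Return (person_weeks, cost) for one year of connector work.

    Cost is the fraction of a working year consumed, times one salary.
    """
    person_weeks = connectors * weeks_per_connector
    cost = person_weeks / WEEKS_PER_YEAR * annual_salary
    return person_weeks, round(cost, 2)


# First year: 5 weeks to build plus 4 weeks of maintenance per connector.
first_year = connector_cost(5, 5 + 4, 120_000)
# Later years: 4 weeks of quarterly updates plus 1 week of bug fixes.
later_years = connector_cost(5, 4 + 1, 120_000)

print(first_year)   # (45, 103846.15)
print(later_years)  # (25, 57692.31)
```

Changing a single input, such as the number of connectors, shows how quickly the estimate grows as more data sources are added.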

Based on our consumption-based pricing model, we at Fivetran have found that customers using a complement of five connectors will virtually always spend less than either of the figures above.

Manual Reporting Takes Too Long and Turns Your Analysts Into Bottlenecks

The alternative to constructing a sophisticated and maintenance-intensive business intelligence and data science infrastructure is to assemble reports and analyses manually. An analyst from one of our customers estimated that their manual reports routinely took “a month,” or 160 hours of work.

Consider the following workflow:

120 hours

  • Collect files manually (spreadsheets, CSVs, JSON files)
  • Consult managers
  • Wait on replies
  • Run API ingestion scripts

40 hours

  • Clean, format, and transform data
  • Perform analysis
  • Build visualizations
  • Write report

= 160 hours

Such a lengthy workflow effectively limits the frequency of reports and findings, needlessly consumes an analyst’s time and makes simple metrics inaccessible to the business users who need them to make decisions.

When our customer switched to Fivetran, the time commitment for each report shrank to less than a week, more than quadrupling the speed at which the company could make data-driven decisions. Another customer similarly lopped off 140 hours of work every week and estimates that they gained a 200% ROI by switching to Fivetran.

Now, the workflow looks more like:

40 hours

  • Clean, format, and transform data
  • Perform analysis
  • Build visualizations
  • Write report

= 40 hours

Furthermore, there is now time to dedicate to more sophisticated analyses. The flow could be something like this:

60 hours

  • Clean, format, and transform data
  • Perform analysis
  • Build visualizations
  • Machine learning and statistical modeling
  • Write report

= 60 hours
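
As a rough illustration, the three workflows above can be compared in a few lines of Python. The hour totals come from the text; the helper names and the assumption of a 160-hour working month (taken from the analyst's "a month, or 160 hours" estimate) are only for the sake of the sketch. Since the text says each report shrank to less than a week, the 40-hour figure is an upper bound.

```python
# Hours per report for each workflow described in the text.
MANUAL_HOURS = 120 + 40   # manual collection plus analysis
AUTOMATED_HOURS = 40      # analysis only, after automating ingestion
EXTENDED_HOURS = 60       # analysis plus machine learning and modeling


def speedup(before_hours, after_hours):
    """How many times faster a report is delivered."""
    return before_hours / after_hours


def reports_per_month(hours_per_report, hours_in_month=160):
    """Whole reports one analyst can finish in a month (assumed 160 hours)."""
    return hours_in_month // hours_per_report


workflows = {
    "manual": MANUAL_HOURS,
    "automated": AUTOMATED_HOURS,
    "automated + modeling": EXTENDED_HOURS,
}

for name, hours in workflows.items():
    print(
        f"{name}: {hours} h per report, "
        f"{speedup(MANUAL_HOURS, hours):.2f}x the manual speed, "
        f"{reports_per_month(hours)} report(s) per month"
    )
```

Even the extended workflow, with modeling added, leaves room for two reports in the time the manual process needed for one.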

Those of us who are familiar with agile methodology, OODA loops, or competitive esports understand that it is always advantageous to make informed decisions more rapidly. Imagine only being able to act on data once a month!

Morale Is Hard to Quantify but Very Important

If you want to keep your analysts, engineers and managers happy, you should consider the following problems associated with building your own connectors or reporting manually:

  1. Diversion from other software engineering, data science or analytics duties — this is a very common irritant among new data scientists at understaffed organizations and leads to turnover
  2. Frustration and exhaustion from the complexity of maintaining data integrity, particularly by persons lacking the appropriate training
  3. Continually increasing complexity (and downtime) as additional sources of data are added
  4. Misguided decisions caused by lags between requests for business intelligence and delivery of actionable insights — insights that might be stale by the time they arrive

An engineer friend of mine jokes that “database maintenance” is the worst chore he has ever encountered in his career.

It isn’t a very funny joke.

Learning Curves Can Be Steep; Let the Experts Handle It

The five-week estimates I introduced earlier refer to APIs that are straightforward or user-friendly. But not all APIs are straightforward: some ignore best practices, some are poorly documented and some are just very complex.

One of the most popular connectors Fivetran offers is the NetSuite connector. It took Fivetran six months to build out the initial version. The second iteration took a year, and the third took yet another year, for a total of 2.5 years. Only then did we have a truly mature, stable, and well-functioning piece of software.

The NetSuite connector is popular largely because so many companies attempting to use the API have been stymied by its complexity. The DIY approach to the NetSuite connector is not advisable. At Fivetran, on the other hand, we have people who have spent the last two and a half years thinking about how to crack this particular nut.

Make the Division of Labor Work in Your Favor

The division of labor is directly responsible for humanity’s greatest commercial, scientific and technological accomplishments.

But many of the data engineering skills necessary to construct data pipelines are not formally taught in academic programs, boot camps, or training programs. These skills are scarce human capital, often developed the hard and expensive way: through experience and trial and error. People in adjacent roles (analysts, software engineers and data scientists) often find themselves performing these duties poorly and against their druthers.

Given the value of labor specialization, there’s no reason for you and the thousands of other companies using Salesforce, Marketo, and other software to build your own API connectors when an off-the-shelf solution exists. Fivetran has already scaled the learning curve for you so that you can spend your time and energy building your core product and making sense of your operations.

Experience the benefits of an automated data pipeline firsthand with a free trial of Fivetran.
