A brittle ETL pipeline, transformation code scattered across several languages, and degrading warehouse performance inhibited customer retention analysis. With a modern data stack, Ritual achieved a 95% reduction in data pipeline issues, a 75% reduction in query times, and a threefold increase in data team velocity. By empowering its teams with data, the company has seen a sustained improvement in retention.
- Pipeline: Fivetran
- Sources: Google Sheets, Heroku Postgres, Iterable, Kustomer, Segment, SendGrid, Stripe, Webhooks
- Transformations: dbt
- Destination: Snowflake
- Business Intelligence Tool: Looker
Ritual is an LA-based, direct-to-consumer subscription wellness brand. It currently offers three vitamin products for women: multivitamin, prenatal, and post-menopause, with more products on the way.
Since customers can pause, rush or cancel at any time, retention is a critical aspect of the company’s long-term growth. But the existing data architecture made it difficult to dig into the retention data and make informed business decisions.
An Infrastructure That Didn’t Meet Demand
When Brett Trani, Director of Data and Analytics, joined Ritual in late 2017, the site was hosted on Shopify and a third-party plugin was managing the subscriptions. The quality of the data was inconsistent and the business relied on a nightly snapshot of the subscription, retention and customer data that was stored in the warehouse. While it met the basic reporting needs at the time, the business couldn’t dig any deeper into what drove the numbers it was seeing.
As the company grew, more teams started requesting additional dimensions such as acquisition sources for customers, A/B testing flags, and website feature interactions, which were added into the retention tables. While this was exciting for the data team, the existing architecture could not keep up with the growing demand. The three main challenges were:
- Brittle pipeline. Ritual had a brittle ETL pipeline that regularly failed or ran behind schedule. This caused the nightly snapshot job to run with stale data, or not at all, resulting in a blank space where retention data should be in the company’s daily Looker report. “With the failures and data gaps, people lost trust in the data and would go out and find their own sources – spreadsheets, ad platforms, random notes – and end up with different numbers for the same metric,” Trani explains. “There was no single source of truth for retention.”
- A tangle of transformations. Code for the retention table was spread across Python, SQL and LookML, making it difficult to understand which transformations produced the final number and to update the data models. “Business users became frustrated because seemingly simple requests took an inordinate amount of time to complete,” says Trani. “We were also worried that we were missing opportunities to pick up on emerging trends for promising test results because it was such a slow process.”
- Degrading data warehouse performance. Tables were getting larger and, consequently, warehouse performance was degrading over time. “I spent many nights trying to understand how I could get higher performance out of the tables,” Trani laments. “There were rich retention analyses that we weren’t able to do because running the reports would take the better part of a day.”
Moving to a Modern Data Stack
After extensive research and conversation, Ritual decided to migrate to a modern data stack with Fivetran, dbt and Snowflake. Each piece of the stack solved a different pain point for the business:
- Fivetran. “Fivetran is a fully-automated data pipeline that gracefully handles changes in data sources. We have no more data pipeline failures – it just works. Sometimes I don’t log in for days or weeks, because I really just don’t need to touch it,” Trani explains.
- dbt. With dbt, the business no longer needs to search through random Python scripts and layered LookML files. dbt is the single source of truth for analytics code: in-warehouse transformations reduce complexity and ensure everything is in one place. Additional features like automated testing and data freshness checks help the data team build more confidence in the data. Trani estimates that, by speeding up the process of testing transformations, dbt made a 68% increase in new feature development possible.
- Snowflake. Snowflake enables the business to scale its warehouse seamlessly with the separation of storage and compute. It is easy to separate jobs so processes such as loading and transforming data don’t impact the end-users in Looker.
So how does it all work together? Fivetran extracts and loads transactional data, email events, website interactions, ad platform data and post-purchase feedback into a raw database in Snowflake. dbt snapshots the raw data and transforms it. The data is then run through the business logic and tested before it is put into a separate analytics database, which houses the clean, accurate data ready for business users. Analytics tools then query the analytics database: Looker for BI and Databricks for data science modeling.
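The raw-to-analytics flow described above can be illustrated with a small Python sketch. This is not Ritual's code and it does not use dbt or Snowflake: the row layout, the function names, the 24-hour freshness window and the rule that only cancelled customers count as not retained are all assumptions made for illustration. It only shows the general idea of gating what reaches the analytics layer behind a freshness check and basic data tests.

```python
from datetime import datetime, timedelta

# Hypothetical raw rows, standing in for what a loader lands in the raw database.
RAW_SUBSCRIPTIONS = [
    {"customer_id": 1, "status": "active", "loaded_at": datetime(2020, 1, 2, 6, 0)},
    {"customer_id": 2, "status": "paused", "loaded_at": datetime(2020, 1, 2, 6, 0)},
    {"customer_id": 3, "status": "cancelled", "loaded_at": datetime(2020, 1, 2, 6, 0)},
]


def is_fresh(rows, now, max_age):
    """Freshness check: the newest row must have been loaded within max_age."""
    newest = max(row["loaded_at"] for row in rows)
    return now - newest <= max_age


def transform(rows):
    """Illustrative business logic: anyone who has not cancelled counts as retained."""
    return [
        {"customer_id": row["customer_id"], "is_retained": row["status"] != "cancelled"}
        for row in rows
    ]


def run_tests(rows):
    """Basic data tests: customer ids must be non-null and unique."""
    ids = [row["customer_id"] for row in rows]
    return None not in ids and len(ids) == len(set(ids))


def build_analytics_table(raw_rows, now, max_age=timedelta(hours=24)):
    """Publish transformed rows only if the raw data is fresh and the tests pass."""
    if not is_fresh(raw_rows, now, max_age):
        raise ValueError("raw data is stale")
    transformed = transform(raw_rows)
    if not run_tests(transformed):
        raise ValueError("data tests failed")
    return transformed


analytics_rows = build_analytics_table(RAW_SUBSCRIPTIONS, now=datetime(2020, 1, 2, 12, 0))
```

Failing loudly on stale or invalid data, rather than publishing a partial table, is one way to avoid the blank or outdated reports that eroded trust under the old pipeline.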
Enabling Testing and Data Exploration
The modern data stack has enabled the business to get clean, easy-to-use data into the hands of people across the business and to test and iterate to scale. By using the data to build a stronger business and empower decision-makers, Trani has seen a sustained month-over-month improvement in retention. He highlights some results that the stack has enabled:
Ad-hoc data exploration for business users. Through Looker’s explore functionality, teams can easily jump in, identify meaningful insights and accelerate decision-making. With a 75% reduction in query times, individuals confidently run reports in Looker knowing they will get an answer quickly. Examples of questions people can quickly answer include:
- Does a payment type people use at checkout impact their retention?
- Does exposure to a specific A/B test pre-purchase have any impact on long-term retention?
- Has the quality of subscribers coming from a channel like paid social changed compared to this time last year?
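Questions like these come down to comparing a retention rate across the values of one dimension. The Python sketch below shows that calculation on made-up records. The field names (`payment_type`, `months_active`), the sample values and the three-month threshold are hypothetical, and at Ritual this kind of cut would be run in Looker against the analytics database rather than in a script.

```python
from collections import defaultdict

# Hypothetical subscriber records; field names and values are made up.
SUBSCRIBERS = [
    {"payment_type": "card", "months_active": 6},
    {"payment_type": "card", "months_active": 1},
    {"payment_type": "paypal", "months_active": 4},
    {"payment_type": "paypal", "months_active": 5},
]


def retention_by(rows, dimension, month):
    """Share of subscribers in each group that stayed active for at least `month` months."""
    total = defaultdict(int)
    retained = defaultdict(int)
    for row in rows:
        key = row[dimension]
        total[key] += 1
        if row["months_active"] >= month:
            retained[key] += 1
    return {key: retained[key] / total[key] for key in total}


rates = retention_by(SUBSCRIBERS, "payment_type", month=3)
```

Swapping `"payment_type"` for another column, such as an A/B test flag or an acquisition channel, gives the same comparison for the other questions in the list.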
Cross-functional collaboration. For core business metrics, the entire business has a single source of truth that has bridged the gap between teams. People can connect the entire funnel, from the first website touch all the way through the subscription journey.
“Since switching to Fivetran we’ve seen a 95% reduction in data pipeline issues. Recently, the acquisition team was testing a new channel and wanted to know about the retention of the cohort. In the middle of a growth meeting, they were able to pull up the cohort report, run the retention numbers, and make the necessary decisions to scale the test based on the data. This is only possible because the data is up-to-date, reliable and available for the entire organization.”
Personalizing the customer journey. With reliable, accessible data, the data team collaborated with the lifecycle team to understand what was driving retention. They looked at multiple dimensions to identify at-risk and high-value cohorts to test into different post-purchase experiences:
“We saw that some customers were pausing very early on in their subscription. Since our sales rely on the daily habit of taking two vitamins a day, we figured people were having trouble committing to the habit. So, we identified a cohort of folks pausing their subscription very early on, sent them a personalized content series with different tips and tricks on how to build a habit, and set up the reporting to track the cohort long-term. Over two to three months, there was a slow separation of the group receiving the personalized messaging from the control group, which resulted in better retention and higher LTV.”
Accelerated understanding of the customer. Because Ritual can quickly respond to customer needs and external changes, it has seen a threefold increase in data team velocity and enabled people to quickly build dashboards and conduct tests to understand and improve the customer journey:
“The Director of Lifecycle used retention data to identify minor changes that would add value, including tweaks around customer journey and emailing. With the available data and reports, she could identify the cohorts that would be impacted, model out what each of those cohorts’ incremental impact would be to churn and set up a dashboard to track this over time. She was able to monitor how these small projects impacted churn over time – and she had everything she needed to do it on her own.”
About Fivetran: Shaped by the real-world needs of data analysts, Fivetran technology is the smartest, fastest way to replicate your applications, databases, events and files into a high-performance cloud warehouse. Fivetran connectors deploy in minutes, require zero maintenance, and automatically adjust to source changes — so your data team can stop worrying about engineering and focus on driving insights.
About Snowflake: Snowflake is the leading data warehouse built for the cloud. Its unique architecture delivers proven breakthroughs in performance, concurrency and simplicity. For the first time, multiple groups can access petabytes of data at the same time, up to 200 times faster and 10 times less expensive than solutions not built for the cloud. Snowflake is a fully managed service with a pay-as-you-go model that works on structured and semi-structured data.
About dbt: dbt is a development environment built and maintained by Fishtown Analytics that speaks the preferred language of data analysts everywhere—SQL. With dbt, analysts take ownership of the entire analytics engineering workflow, from writing data transformation code to deployment and documentation.
About Looker: Looker is a modern platform for data that offers data analytics and business insights to every department at scale, and easily integrates into applications to deliver data directly into the decision-making process.