Strava is the social network for athletes to track progress and share results. The 12-year-old company has seen massive growth in recent years and consistently seeks ways to improve business insights and accelerate data-driven decision-making. By embracing a modern data stack and educating internal stakeholders on the impact of data analytics, Strava’s data team has been able to exceed expectations and foster Strava’s data culture.
At Data Engineer Appreciation Day, Strava’s Data Engineering Lead, Daniel Huang, unveiled his data team’s process for scaling to meet the growing company’s data needs through automation and a reimagined data culture.
1. Realize when change is needed
Companies often experience an inflection point in the growth of the data engineering team that has engineers rethinking not just their tools and processes, but their entire data culture, says Huang.
Not unlike the cyclists and runners who track their activity on Strava’s app, Huang’s team has logged plenty of time racing their hardest just to keep up with the needs of the fast-growing, 12-year-old company.
“Where we saw more of our troubles was around the ‘people scale’ side of things,” recalls Huang.
When his team stopped to catch their breath, they envisioned what their company’s data engineering culture should look like in the coming years. “It started with a shift toward a platform,” Huang recalls. “Our role as data engineers should be to build the platform and guide people through it. Let the platform serve the data needs.”
For context, Strava's original data infrastructure is diagramed below.
2. Determine the bottlenecks
Strava certainly wasn’t alone in having a delayed response to growing data needs. A global survey conducted by Dimensional Research shows that 63% of companies still rely on manual scripting, even though companies are moving more data faster than ever. In fact, 72% of organizations now need data to be moved daily, hourly or even every few seconds.
“In the beginning, all of our ETL jobs were authored by a couple of data engineers, and that meant we were maintaining all these jobs,” Huang says. “We were on call to fix these jobs instead of building underlying infrastructure or services.”
The small team had just started using a Redshift cluster, a welcome change from their initial MySQL reader instance. But they quickly realized even that solution had a ceiling — queries were still taking too long or failing, and storage was getting expensive.
“Bottom line: We were becoming the interface for data more often than we liked, and we were becoming a bottleneck for the company.”
3. Reach scale via a modern data stack
With a new vision guiding them, Strava’s data team set off down their path to scale data culture, knowing it would take the endurance of a 10K and not a jog around the block. Strava implemented a cloud data stack built on Snowflake as a data warehouse, Tableau as a BI tool, and Fivetran as its data pipeline provider to automatically land data in Snowflake from third-party vendors.
The visual map below shows Strava's current approach to data management, built on the modern data stack.
“We’re still a small team but we have made progress,” says Huang, reflecting today on the midpoint of their journey. “The ease of cloud-based tools has freed up our team to think about the company’s data culture as a whole, and they’ve set up a categorization of internal data users to better understand and meet their needs.”
There’s still room to grow, and their next goals are to improve data democratization and create a centralized data catalog.
4. Enjoy the ride and recognize your progress
It’s been a worthwhile journey, though not necessarily an easy one. With Strava’s new data stack in place, Huang says, the team’s data engineers are finding their work more rewarding and impactful, and other parts of the business are building on that work to add even more value.
“Pushing for the adoption of new tools takes time and effort,” he notes. “Even though we got plenty of buy-in from everyone, it still takes time. It has to be treated like building a product. You have to sell, seek input, hold trainings, and lead by example.”