Shutterstock builds AI-ready Open Data Infrastructure with Fivetran

Company size: 500-1999
Region: North America
Industry: Sports, media & entertainment
Key results
  • Reduced data integration time from 1 sprint to minutes
  • Cut pipeline issue resolution time from weeks to days
  • Enabled near real-time reporting
  • Built a future-proof foundation that can grow with the business and support new use cases without re-architecting
“An Open Data Infrastructure gives us the flexibility to use the right tool for the job. Whether it’s Snowflake, AWS services, or something new in the future, our data is already where it needs to be without having to move or rebuild.”
– Jitesh Kumar, Senior Software Development Manager at Shutterstock

Shutterstock is a leading global creative platform, providing high-quality images, video, and music to millions of customers worldwide. Behind the scenes, the business runs on a distributed architecture of microservices that power content delivery, subscriptions, licensing, and global commerce at scale.

Supporting this ecosystem requires a highly reliable data foundation. Shutterstock’s data warehouse team is responsible for delivering trusted data for internal reporting, product analytics, and financial reporting, including revenue recognition used in SEC filings.

As the business scaled, the team saw an opportunity to evolve its data architecture to better match that growth. Data was generated across dozens of microservices and systems, alongside critical business inputs stored in tools like Google Sheets. While this enabled flexibility for the business, integrating and governing that data required significant engineering effort. At the same time, the team's existing Debezium-based replication approach introduced data gaps and slow recovery times, making it difficult to consistently meet SLAs and audit expectations.

Rather than continuing to scale a tightly coupled warehouse-centric model, the team set out to adopt a more flexible lakehouse approach — one that would decouple storage and compute, improve control over data, and support a wider range of downstream use cases.

A shift to an open, decoupled data architecture on S3

To support this shift, Shutterstock standardized on Fivetran Managed Data Lake Service as the foundation for data ingestion, building a lakehouse architecture on Amazon S3 with Snowflake as the compute and query engine.

“We wanted to decouple storage from compute and keep control of our data. With Fivetran loading into S3, we’ve built an Open Data Infrastructure that isn’t tied to any one vendor and can evolve with us over time.”
– Jitesh Kumar, Senior Software Development Manager at Shutterstock

Fivetran initially solved a targeted need: reliably ingesting Google Sheets data into a governed environment, creating a consistent way to incorporate business-critical inputs like budgets and categorization logic. As adoption expanded, Fivetran replaced manual pipelines and Debezium-based replication, eliminating data gaps and improving reliability.

Today, Fivetran connects more than 70 microservices and data sources into an S3-based data lake, with Iceberg tables forming the foundation (bronze) layer. Snowflake serves as the compute engine to query that data, with dbt powering transformations and curated models in the final (gold) layer.
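
The layering described above can be sketched in a few lines. This is a conceptual illustration only: in Shutterstock's stack the bronze data lives in Iceberg tables on S3 and the gold models are built with dbt on Snowflake, whereas the example below uses plain Python over in-memory rows. The function name and the fields (`order_id`, `version`, `day`, `amount_cents`) are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def build_gold_daily_revenue(bronze_rows):
    """Curate raw (bronze) rows into a gold-style daily revenue model.

    Step 1: keep only the latest version of each order (deduplication).
    Step 2: aggregate the surviving rows by day.
    All field names are illustrative assumptions, not a real schema.
    """
    latest = {}
    for row in bronze_rows:
        current = latest.get(row["order_id"])
        if current is None or row["version"] > current["version"]:
            latest[row["order_id"]] = row

    revenue = defaultdict(int)
    for row in latest.values():
        revenue[row["day"]] += row["amount_cents"]
    return dict(sorted(revenue.items()))


# Made-up bronze rows: order 1 was updated once after it first landed.
bronze = [
    {"order_id": 1, "version": 1, "day": "2024-01-01", "amount_cents": 1000},
    {"order_id": 1, "version": 2, "day": "2024-01-01", "amount_cents": 1200},
    {"order_id": 2, "version": 1, "day": "2024-01-02", "amount_cents": 500},
]

gold = build_gold_daily_revenue(bronze)
print(gold)
```

The point of the separation is that the raw rows stay untouched in the bronze layer, so a curated model like this one can be rebuilt or replaced without re-ingesting data.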

By centralizing raw data in S3, Shutterstock created a foundation that can be accessed by multiple teams and tools, without duplication or locking data into a single system. Data and AI teams can work directly from S3 using AWS-native services, while business teams continue to rely on Snowflake for governed reporting.

From weeks to minutes: Faster delivery, more reliable data

With Fivetran, Shutterstock significantly improved both the speed and reliability of its data platform. The team reduced data integration time from weeks of planning and development to minutes of setup. Pipeline reliability also improved, cutting issue resolution time from weeks to days and eliminating the data gaps that previously impacted SLAs.

More predictable and timely data availability allowed the team to start daily processing earlier and deliver insights faster across the business. This shift is already enabling new use cases, including near real-time visibility into product performance — allowing teams to monitor launches and customer activity within minutes instead of waiting for next-day reports.

Fivetran also strengthened Shutterstock’s ability to support audit and compliance requirements. By centralizing connector logs in Snowflake, the team can query pipeline performance, track failures, and provide a complete audit trail for financial reporting.
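
As a rough illustration of the kind of audit question this makes answerable, the sketch below summarizes failed syncs per connector from a handful of log records. It runs in plain Python over in-memory rows rather than against Snowflake, and the record fields (`connector`, `event`, `logged_at`) and event names (`sync_end`, `sync_failed`) are hypothetical, not Fivetran's actual log schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogRecord:
    """One connector log row. Field names are illustrative assumptions."""
    connector: str
    event: str  # hypothetical values: "sync_end" or "sync_failed"
    logged_at: datetime


def failure_summary(records):
    """Return {connector: (failed_syncs, total_syncs)} for an audit report."""
    totals = Counter()
    failures = Counter()
    for rec in records:
        # Ignore log events that do not mark the end of a sync attempt.
        if rec.event not in ("sync_end", "sync_failed"):
            continue
        totals[rec.connector] += 1
        if rec.event == "sync_failed":
            failures[rec.connector] += 1
    return {name: (failures[name], totals[name]) for name in sorted(totals)}


# Made-up sample rows, for illustration only.
sample = [
    LogRecord("google_sheets_budget", "sync_end", datetime(2024, 1, 1, 6, 0)),
    LogRecord("orders_service", "sync_failed", datetime(2024, 1, 1, 6, 5)),
    LogRecord("orders_service", "sync_end", datetime(2024, 1, 1, 6, 20)),
]

for connector, (failed, total) in failure_summary(sample).items():
    print(f"{connector}: {failed} failed of {total} syncs")
```

In practice the same aggregation would more likely be expressed as a query in the warehouse where the logs are stored; the Python version only shows the shape of the calculation.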

Positioned for AI and the next generation of use cases

Shutterstock’s architecture is designed for long-term flexibility, not just immediate gains. By keeping raw data in S3 in an open format, the team has created a foundation that supports multiple tools and processing engines, from Snowflake to AWS-native services and future platforms, without needing to duplicate data or re-architect.

With near real-time data availability and an open, decoupled architecture, Shutterstock is well positioned to support AI-driven use cases, advanced analytics, and evolving business needs — while maintaining the governance and auditability required for financial reporting.

“By building an AI-ready data foundation with Fivetran, we’re not just improving reporting — we’re enabling the next generation of analytics and AI use cases on top of trusted, real-time data.”
– Jitesh Kumar, Senior Software Development Manager at Shutterstock
