Real-time benchmarking for database replication
Look under the hood at the live performance benchmarking data for Fivetran data pipelines. With fast throughput and low latency, Fivetran replicates large data volumes from source databases quickly and efficiently.
High-performance database replication
We built this benchmark to showcase our performance loading from common OLTP relational databases like Oracle, PostgreSQL, SQL Server, and MySQL. We use the TPROC-C workload to benchmark and highlight our ability to ingest data from heavily loaded relational databases. The results below show our benchmarks for loading Oracle data into Snowflake.
Consistent high volume throughput across historical syncs
Fivetran efficiently handles the historical sync of large data volumes with a throughput greater than 500 GB/hour. For the best performance, and to ensure a database can release additional transactional data while an import is running, Fivetran breaks large tables into consumable chunks, reducing the duration of each read transaction during a historical sync. The data below shows Fivetran consistently replicating data at high throughput, saving our users time and ensuring their data is readily available for downstream workflows.
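As a rough sketch of the chunking idea (illustrative only, not Fivetran's actual internals), a large table can be read in keyset-paginated batches so each read transaction stays short; the table and column names here are hypothetical, and SQLite stands in for the source database only to keep the example self-contained:

```python
import sqlite3

def chunked_rows(conn, table, key, chunk_size=100_000):
    """Yield a large table in keyset-paginated chunks so each read
    stays short and the source keeps serving other transactions.
    Sketch only: assumes `key` is the first selected column and is
    unique and monotonically orderable (e.g. a primary key)."""
    last = None
    while True:
        if last is None:
            rows = conn.execute(
                f"SELECT * FROM {table} ORDER BY {key} LIMIT ?",
                (chunk_size,)).fetchall()
        else:
            rows = conn.execute(
                f"SELECT * FROM {table} WHERE {key} > ? "
                f"ORDER BY {key} LIMIT ?",
                (last, chunk_size)).fetchall()
        if not rows:
            return
        yield rows
        last = rows[-1][0]  # resume after the last key seen

# Hypothetical usage: 5 rows read in chunks of 2 -> batches of 2, 2, 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i, "x") for i in range(1, 6)])
chunks = list(chunked_rows(conn, "t", "id", chunk_size=2))
print([len(c) for c in chunks])  # [2, 2, 1]
```

Keyset pagination (filtering on the last key seen) avoids the long-running cursor a single giant `SELECT` would hold open.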
To understand the impact of a historical data sync, Fivetran runs a performance benchmark roughly once per week to measure the throughput of replicating data from Oracle to Snowflake. The throughput values are calculated with the following formula:
Throughput (GB/hr) = Data Volume (GB) / Historical Sync Time (hr)
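The formula is straightforward to apply; for example, a hypothetical 1,000 GB historical sync that completes in 2 hours works out to 500 GB/hr:

```python
def throughput_gb_per_hr(data_volume_gb: float, sync_time_hr: float) -> float:
    """Throughput (GB/hr) = Data Volume (GB) / Historical Sync Time (hr)."""
    if sync_time_hr <= 0:
        raise ValueError("sync time must be positive")
    return data_volume_gb / sync_time_hr

print(throughput_gb_per_hr(1000, 2))  # 500.0
```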
The freshest data, always available
To better understand our syncs, Fivetran captures the total time for each sync during a period of high load on the Oracle database. With 16,000+ transactions written to the database per second, Fivetran incrementally replicates each change to Snowflake for better performance. When one incremental sync finishes, the next incremental sync begins to ensure the data pipeline doesn’t fall behind and users always have access to the freshest data.
Fivetran supports sync frequencies as low as 1 minute. When a sync takes longer than the configured sync frequency, the next sync kicks off automatically, ensuring that the data in the destination is always up to date.
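The scheduling behavior described above can be sketched as a simple loop (an illustration of the concept, not Fivetran's implementation; `sync_once` and the timings are assumptions):

```python
import time

def run_syncs(sync_once, frequency_s=60, cycles=3):
    """Run incremental syncs on a fixed frequency. If a sync runs
    longer than the frequency, the next sync starts immediately;
    otherwise, wait out the remainder of the interval."""
    for _ in range(cycles):
        start = time.monotonic()
        sync_once()  # hypothetical callable doing one incremental sync
        elapsed = time.monotonic() - start
        if elapsed < frequency_s:
            time.sleep(frequency_s - elapsed)

# Hypothetical usage with a short frequency so the example runs quickly.
calls = []
run_syncs(lambda: calls.append(1), frequency_s=0.01, cycles=3)
print(len(calls))  # 3
```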
High-performance pipelines for high-volume workloads
Even with some of the largest change-data volumes our customers replicate (greater than 16,000 transactions per second), Fivetran keeps incremental syncs running in near real time. Fivetran uses change data capture (CDC) replication to incrementally update large volumes of data. The benchmark measures the latency and throughput of incremental syncs to highlight how performant Fivetran is even under intense load: 250+ GB/hr throughput and 15 minutes or less of latency. Many enterprises require a 30-minute to 1-hour replication SLA, and Fivetran meets these needs when replicating transactional data into data warehouses or data lakes.
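One common way to frame the latency metric here (our assumption about the definition, not Fivetran's published one) is the gap between when a change commits at the source and when it lands in the destination:

```python
from datetime import datetime, timezone

def replication_lag_minutes(source_commit: datetime,
                            destination_write: datetime) -> float:
    """Replication latency: how long after a change commits at the
    source it becomes available in the destination."""
    return (destination_write - source_commit).total_seconds() / 60

# Hypothetical timestamps: a row committed at 12:00 and written
# to the destination at 12:15 has 15 minutes of latency.
src = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
dst = datetime(2024, 1, 1, 12, 15, tzinfo=timezone.utc)
print(replication_lag_minutes(src, dst))  # 15.0
```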
This HammerDB workload represents some of the largest real-world databases, and we see similar fluctuations in data volumes on real-world production databases. Learn more about our benchmark tests here.
Over the course of 2 hours, 16,000+ new records are created per second on the Oracle database for Fivetran to replicate to Snowflake. Given the high volume of change data, Fivetran measures latency and throughput to ensure all changed data is written to the destination in a timely manner. The graph above shows latency at the P50, P80, and P95 percentiles for full transparency, demonstrating that we quickly and efficiently replicate all changes regardless of data volume.
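As a sketch of how such percentiles can be computed, here is the nearest-rank method over a set of per-sync latency samples (the sample values are made up for illustration):

```python
import math

def percentile(latencies, p):
    """Nearest-rank percentile: the smallest sample with at least
    p% of all samples at or below it."""
    ranked = sorted(latencies)
    k = math.ceil(p * len(ranked) / 100)  # 1-based rank
    return ranked[max(k - 1, 0)]

# Hypothetical per-sync latencies in minutes.
lat = [2, 4, 5, 7, 9, 11, 14, 20, 30, 60]
print(percentile(lat, 50), percentile(lat, 80), percentile(lat, 95))
# 9 20 60
```

P50 describes the typical sync, while P95 exposes the slowest tail, which is why reporting several percentiles gives a more honest picture than an average.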
Looking at Fivetran by the numbers
Historical sync throughput speeds
Amount of data synced per month
Schema changes handled per month
Transformation models run per month
Pipeline syncs per month
Rows synced per month