How to manage data integration with hybrid cloud

Bi-directional data movement and low latency are key to effectively integrating data between on-prem and cloud for real-time analysis.
July 5, 2022

IDC forecasts that cloud spending will surpass $1.3 trillion by 2025, a sign of the cloud’s central role in the enterprise. One reason for the growth: Organizations often adopt cloud technologies for analytic use cases.

The cloud enables access to robust analytics services at scale using a pay-for-use scheme. Importantly, companies can avoid significant upfront investments. Instead of building a configuration in a data center, you can experiment with different options available to you — and quickly scale up or down with the cloud provider’s consumption-based model. These are desirable attributes for analytical environments that require scalability and can be very expensive when scaled for maximum capacity over an extended period.

An organization’s primary business process is generally supported by one or more operational systems. Because these systems are crucial, the business needs fast access to them and continuous availability.

Cloud migrations of operational systems require considerable testing to ensure that the new environment performs as well as or better than the existing, often on-prem, systems. The new systems must offer the same or higher levels of availability, and data must be secure. Environments also have to perform well under heavy user load. Integrations must be rewritten to work with the new system, too. Depending on the criticality of the system, organizations may also need a fallback for some time after the initial migration.

The following best practices help ensure a smooth data integration that delivers high performance, security and availability across heterogeneous data sources. Download the ebook to learn more.

Free ebook: 6 best practices for cloud integration


1. Learn the impact on operations 

Consolidated analytical environments depend on data from the operational systems that drive the business. It’s therefore important to find a solution that captures the changes going into your operational systems with minimal overhead.

For many self-hosted applications, consider a database-level change data capture (CDC) solution, of which log-based CDC is widely considered the least intrusive. Because critical systems contain the most important data for driving decisions, real-time access to this data helps organizations compete. Log-based CDC can handle high volumes of change data in near real time, enabling organizations to make informed, data-driven decisions more quickly.

Depending on where you are in your hybrid cloud adoption, you may consider alternatives for your current deployment model. For example, if you currently host your organization's ERP in an on-prem data center, you may consider cloud-hosted or software-as-a-service options. Or you may need the primary system to synchronize with the new or old configuration during the migration to allow for testing and provide a fallback option.

2. Consider bi-directional data movement

You may have started your hybrid cloud journey with one or more analytical use cases. Any migration is daunting, especially one that affects your organization’s primary business process. What’s going to happen when all users switch to the new environment? And, if things don’t work out, what’s your fallback option?

Consider bi-directional data movement. You probably don’t need active/active replication because most applications are not prepared to run in active/active mode. However, running active/passive replication is a powerful way to mitigate data loss if a fallback is required. Instead of asking users to redo their work or re-run routines that were already processed, you replicate the changes made in the new environment back to the original source system.
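
As a rough illustration of the active/passive idea, the sketch below keeps an old system current while users write only to the new one. The `System` class, its `change_log` and the `replicate` function are hypothetical stand-ins invented for this example; they do not represent any vendor's replication API.

```python
from dataclasses import dataclass, field


@dataclass
class System:
    """Hypothetical stand-in for an operational system (old or new)."""
    name: str
    rows: dict = field(default_factory=dict)
    change_log: list = field(default_factory=list)

    def write(self, key, value):
        # Every user write is also recorded as a change for replication.
        self.rows[key] = value
        self.change_log.append((key, value))


def replicate(active: System, passive: System, start: int = 0) -> int:
    """Apply changes from active.change_log[start:] to the passive system.

    Returns the log position to resume from on the next run.
    """
    for key, value in active.change_log[start:]:
        # Applied directly, so the passive side records no user writes.
        passive.rows[key] = value
    return len(active.change_log)


old = System("on-prem")
new = System("cloud")

# After cutover, users write only to the new (active) system.
new.write("order-1", "created")
new.write("order-1", "shipped")
new.write("order-2", "created")

# Reverse replication keeps the old (passive) system current.
position = replicate(new, old)

# If the migration fails, users switch back and find their work intact.
fallback_ready = old.rows == new.rows
```

Only one side accepts user writes at any time, which is why the application itself does not need to support active/active operation.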

If the migration is not successful, then you switch back. Data processing continues with minimal disruption. How long you keep the old system around is a risk assessment. Some organizations want to see initial successes to feel comfortable about not needing the old system. Others want to see at least a couple of months of successful processing before giving up the fallback option.

3. Low latency is a must-have during migration

Business requirements will determine the maximum allowable latency. Cloud-based environments are built to be available 24x7, and users (as well as customers) have become accustomed to instant access to information. These combined factors drive organizations to look for near real-time or continuous data integration solutions.

Consider the competitive differentiation you can achieve with consolidated data available for analytics closer to real-time. A solution you sell to your customers may become more valuable. Your team may become better equipped to identify fraudulent behavior. You may have opportunities to save costs simply by reacting more quickly.

During the data migration, low latency is a must. If a critical operational system does not meet expectations post-migration, you want to resume processing on the old environment without delay. That is only possible if the old environment is up to date with the latest changes.
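
One way to make "up to date" measurable is to compare the last commit on the active system with the last change applied to the fallback system. The sketch below is a minimal example under that assumption; the function names, the timestamps and the 30-second threshold are all illustrative, and the real threshold comes from your business requirements.

```python
from datetime import datetime, timedelta, timezone


def replication_lag(last_source_commit: datetime,
                    last_applied_commit: datetime) -> timedelta:
    """How far the fallback system trails the active one (never negative)."""
    return max(last_source_commit - last_applied_commit, timedelta(0))


def safe_to_fall_back(lag: timedelta, max_allowed: timedelta) -> bool:
    """True when the fallback system is within the allowed latency."""
    return lag <= max_allowed


# Illustrative timestamps: the old system is 4 seconds behind the new one.
commit_on_new = datetime(2022, 7, 5, 12, 0, 30, tzinfo=timezone.utc)
applied_on_old = datetime(2022, 7, 5, 12, 0, 26, tzinfo=timezone.utc)

lag = replication_lag(commit_on_new, applied_on_old)
ok = safe_to_fall_back(lag, max_allowed=timedelta(seconds=30))
```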

How to continually integrate data between on-prem and cloud for real-time analysis

Whether data arrives from a SaaS platform or directly from a database, change data capture technology can enable near real-time updates to the analytical environment. Log-based CDC — reading changes from a database transaction log — is widely considered the least-intrusive method to retrieve database changes.
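
To show what applying log-based changes looks like in principle, here is a simplified sketch that replays ordered change events against a replica. The event shape (`lsn`, `op`, `key`, `row`) is an assumption made for this example; real transaction logs differ by database, and this is not how any particular product reads them.

```python
def apply_changes(replica: dict, log_entries: list) -> dict:
    """Replay change events on a replica in log order."""
    # Sort by the (hypothetical) log sequence number so changes apply in commit order.
    for entry in sorted(log_entries, key=lambda e: e["lsn"]):
        op, key = entry["op"], entry["key"]
        if op in ("insert", "update"):
            replica[key] = entry["row"]
        elif op == "delete":
            replica.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return replica


# Illustrative changes captured from a source table.
log_entries = [
    {"lsn": 101, "op": "insert", "key": 1, "row": {"status": "new"}},
    {"lsn": 102, "op": "update", "key": 1, "row": {"status": "paid"}},
    {"lsn": 103, "op": "insert", "key": 2, "row": {"status": "new"}},
    {"lsn": 104, "op": "delete", "key": 2},
]

analytics_copy = apply_changes({}, log_entries)
```

Because the changes come from the log rather than from queries against the tables, the source system does no extra work to identify what changed.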

Fivetran offers CDC for most of our connectors to applications — and all connectors to databases. After the initial sync of your historical data, Fivetran performs incremental updates of any new or modified data from your source system. 

During incremental syncs, Fivetran maintains an internal set of progress cursors that allow us to track the exact point where our last successful sync left off. If your service is interrupted (for example, your destination goes down), we automatically resume syncing from where we left off, even hours or days later, as long as the log data is still present. You can also track deletions to view your archived records.
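
The cursor idea can be sketched generically as follows. This is not Fivetran's implementation; `incremental_sync`, the `state` dictionary, the `destination_up` callback and `LogExpiredError` are hypothetical names used only to illustrate resuming after an interruption and why the log data must still be present.

```python
class LogExpiredError(Exception):
    """Raised when the cursor points at log data that is no longer retained."""


def incremental_sync(log, destination, state, destination_up=lambda pos: True):
    """Apply log entries newer than state["cursor"] to the destination.

    log: list of (position, key, value) tuples with increasing positions.
    state: dict holding the cursor, so progress survives an interruption.
    """
    if log and state["cursor"] + 1 < log[0][0]:
        raise LogExpiredError("changes after the cursor were purged")
    for position, key, value in log:
        if position <= state["cursor"]:
            continue  # already synced in an earlier run
        if not destination_up(position):
            raise ConnectionError("destination unavailable")
        destination[key] = value
        state["cursor"] = position  # advance only after a successful write


log = [(1, "a", "v1"), (2, "b", "v1"), (3, "a", "v2"), (4, "c", "v1")]
destination = {}
state = {"cursor": 0}

# First run: the destination goes down before position 3 is written.
try:
    incremental_sync(log, destination, state, destination_up=lambda pos: pos < 3)
except ConnectionError:
    pass
cursor_after_failure = state["cursor"]

# Later run: the sync resumes from the saved cursor.
incremental_sync(log, destination, state)
```

If the retained log no longer reaches back to the cursor, the sketch raises `LogExpiredError` instead of silently skipping changes; in that situation a full re-sync would be needed.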

To learn more about our approach to cloud data integration, sign up for a 14-day free trial and test drive our system.



