You can’t afford a data breach. Here’s how to avoid one.

Automation and hybrid deployment complement each other to provide the utmost security in data integration.
January 7, 2025

This article was first published on Forbes Tech Council on January 6, 2025

Recent security failures have resulted in massive breaches, compromising terabytes of data and hundreds of millions of records. Government bodies and major companies – including entertainment and cybersecurity firms – have all experienced leaks, ransomware, and cyberattacks. Highly regulated organizations like the UK’s NHS, the Indian Council of Medical Research, and the US Consumer Financial Protection Bureau have not been immune. Uber even faced a $324 million fine for failing to sufficiently safeguard the transfer of sensitive data from the EU to the US.

With increased scrutiny from frameworks like the EU-US Data Privacy Framework and the American Privacy Rights Act, secure data handling is more critical than ever. Modern enterprises handle data at massive scale from a wide variety of sources, which compounds the challenge.

In today’s landscape, companies can’t afford the risk of a data breach. They need a systematic, scalable approach to securely manage their data, and investing in secure, automated data integration is the key to reliably and efficiently safeguarding valuable information.

Data pipelines can be the weakest link in your ecosystem

Modern data workflows use data pipelines to move data from applications, operational systems, and other sources to a data warehouse or data lake. Even though they are not meant to store sensitive data, data pipelines must access and handle it to perform backups, data syncs, and other tasks. 
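To make that concrete, here is a minimal, hypothetical sketch of an in-house sync step: it reads customer rows, including PII such as names and email addresses, from an operational source and loads them into an analytics destination. The table, column, and connection names are illustrative assumptions, not any particular product’s schema; the point is that even though the pipeline persists nothing, the sensitive values still pass through its memory and connections in transit.

```python
import sqlite3

# Hypothetical sketch: a DIY sync step that copies customer rows from an
# operational source into an analytics destination. The pipeline itself
# stores nothing, yet names and emails (PII) pass through its memory and
# connections in transit -- which is exactly where exposure risk lives.

def sync_customers(source: sqlite3.Connection, destination: sqlite3.Connection) -> int:
    # Extract: sensitive fields flow through pipeline memory here.
    rows = source.execute(
        "SELECT id, full_name, email, signup_date FROM customers"
    ).fetchall()

    # Load: the same sensitive values cross to the destination.
    destination.executemany(
        "INSERT OR REPLACE INTO analytics_customers VALUES (?, ?, ?, ?)", rows
    )
    destination.commit()
    return len(rows)

if __name__ == "__main__":
    # In-memory databases stand in for a production source and warehouse.
    src, dst = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
    src.execute("CREATE TABLE customers (id, full_name, email, signup_date)")
    src.execute("INSERT INTO customers VALUES (1, 'Ada Lovelace', 'ada@example.com', '2024-11-02')")
    dst.execute("CREATE TABLE analytics_customers (id, full_name, email, signup_date)")
    print(sync_customers(src, dst), "row(s) synced")
```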

Many organizations build their data pipelines in-house. However, DIY data integration is inherently complicated and engineering-heavy, with a high potential for creating inadvertent security weaknesses. DIY data pipelines create both technological and organizational points of failure. Not only is designing, building, and maintaining a secure data pipeline intrinsically tricky, but analytics and engineering teams also contend with competing priorities and do not specialize in security and governance. 

Such teams may not implement security and governance best practices, leading to serious design and engineering flaws. For example, running all processes on a single server or container for simpler management means that one malicious or accidental exposure can compromise the entire stack. Other oversights include missing security and governance features, such as the ability to monitor and control access. Breaches are inherently difficult to track: even if the pipeline never persists the data, it may be accidentally exposed or replicated in transit. Pipelines may also break down, losing critical data that is difficult or impossible to recover.
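As one hedged illustration of the kind of governance controls DIY pipelines often lack, the sketch below adds two basics to a sync step: masking of sensitive columns before rows leave the process, and an audit log of who accessed which columns and when. The column names and masking policy are illustrative assumptions, not a prescribed design.

```python
import hashlib
import logging
from datetime import datetime, timezone

# Hypothetical governance controls a DIY pipeline frequently omits:
# column-level masking for data in transit and an audit trail of access.
# The column names and masking policy below are illustrative assumptions.

SENSITIVE_COLUMNS = {"email", "full_name"}

audit_log = logging.getLogger("pipeline.audit")
logging.basicConfig(level=logging.INFO)

def mask(value: str) -> str:
    # Irreversibly hash sensitive values so raw PII never leaves this process.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]

def prepare_row(row: dict, accessed_by: str) -> dict:
    # Record who touched the data and when, so access can be monitored later.
    audit_log.info(
        "row accessed by=%s at=%s columns=%s",
        accessed_by,
        datetime.now(timezone.utc).isoformat(),
        sorted(row),
    )
    # Apply the masking policy before the row is handed to the destination.
    return {k: mask(v) if k in SENSITIVE_COLUMNS else v for k, v in row.items()}

print(prepare_row(
    {"id": 1, "full_name": "Ada Lovelace", "email": "ada@example.com"},
    accessed_by="sync-job-42",
))
```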

These issues all grow with the volume and variety of data an organization handles. Security is a highly specialized field in its own right, and public-facing systems should be validated through audits, penetration tests, and design reviews.

Organizations, particularly in highly regulated industries like government, defense, healthcare, and finance, try to mitigate this problem by keeping data on-premises or in private clouds for additional security. Yet recent breaches demonstrate that this approach is far from foolproof.

How to leverage a secure, automated solution for data integration

Automated data integration offers a technological solution to both labor scarcity and the challenges of security and governance for data in transit, addressing the vulnerabilities of DIY data pipelines. 

Traditionally, data teams could not automate on-premises data integration because the data lived in a proprietary environment that outside parties and tools could not (and should not) access. Even with the help of commercial tools and technologies, data teams had to assume direct responsibility for building and maintaining pipelines as well as securing and governing data. On-premises data integration consumed engineering hours, created openings for oversights to expose sensitive data, and limited opportunities to scale both the volume and variety of data.

To address these limitations, some organizations are turning to architectures that separate the mechanics of data movement from the systems that control and manage integration workflows. This involves distinguishing between the data plane, where information is transferred, and the control plane, which governs processes without directly handling the data itself.

We call this approach hybrid deployment, a model designed to allow secure automation of on-premises data integration. By maintaining this separation, sensitive data—such as personally identifiable information—stays within its originating environment, even as workflows are managed remotely. This enables organizations to centralize control while sidestepping many of the risks associated with DIY pipelines.
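A minimal sketch of that separation, assuming a hypothetical control-plane API and a locally deployed agent: the agent (data plane) fetches only a job specification from the remote control plane, moves the data entirely inside its own environment, and reports back nothing but operational metadata such as status and row counts. The endpoints and field names below are assumptions for illustration, not Fivetran’s actual interfaces.

```python
# Hypothetical sketch of a hybrid deployment agent. The control-plane URL,
# payloads, and local sync function are illustrative assumptions; the point
# is that only job specs and operational metadata cross the boundary --
# the data itself never leaves the customer environment.

import json
from urllib import request

CONTROL_PLANE = "https://control-plane.example.com"  # assumed endpoint

def fetch_job_spec(agent_id: str) -> dict:
    # Inbound from the control plane: configuration only (which source,
    # destination, and tables to sync), never customer records.
    with request.urlopen(f"{CONTROL_PLANE}/agents/{agent_id}/next-job") as resp:
        return json.load(resp)

def run_sync_locally(spec: dict) -> dict:
    # Data plane: extract and load happen entirely inside the local network.
    # Stubbed here; a real agent would connect to on-prem sources and destinations.
    rows_copied = 0  # ... perform the sync against local systems ...
    return {"job_id": spec["job_id"], "status": "success", "rows_copied": rows_copied}

def report_metadata(agent_id: str, result: dict) -> None:
    # Outbound to the control plane: operational metadata only, no row contents.
    body = json.dumps(result).encode("utf-8")
    req = request.Request(
        f"{CONTROL_PLANE}/agents/{agent_id}/results",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    request.urlopen(req)

def agent_loop(agent_id: str) -> None:
    spec = fetch_job_spec(agent_id)
    result = run_sync_locally(spec)
    report_metadata(agent_id, result)
```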

As an emerging technology, hybrid deployments are still developing in the breadth and complexity of use cases they cover. For example, some solutions might not yet support all of the data sources or destinations that legacy on-premises solutions do. There are also niche cases where companies cannot let even operational metadata leave their environment, requiring them to use fully self-hosted solutions.

However, for organizations obligated to maintain data on-premises for security, governance, and compliance, automated data integration with hybrid deployment presents an opportunity to not only provide access to important on-premises data from sensitive operations but also unify it with less sensitive data from cloud-based applications and other sources. This ability to comprehensively centralize and access data is key to enabling analytics of all kinds, from reporting and predictive modeling to AI.
