7 data and AI predictions for 2025

Discover expert insights to guide your data and AI strategies for 2025 and beyond.
January 24, 2025

2025 is poised to be a pivotal year as organizations worldwide position generative AI as a top strategic priority, with 83% of CDOs and data leaders making it a key focus. This growing emphasis on AI mirrors broader data trends: Data modernization was the top area of investment in 2024, and 82% of organizations with advanced data and analytics maturity saw consistent year-over-year revenue growth.

In this article, 7 experts from Fivetran, Databricks, Snowflake, dbt Labs, IFF, and Hakkoda share their insights into the key trends and predictions for data and AI in 2025.

1. Data lakes and modern data tools will define the future of AI

Taylor Brown, COO and co-founder of Fivetran, predicts 2 major shifts in 2025: the rise of data lakes as the backbone of modern infrastructure and the pivotal role of tools like Fivetran in advancing AI innovation.

Brown likens the current trajectory of data lakes to the early days of data warehouses in 2012, a period marked by rapid growth and investment. Data lakes are set to become the cornerstone of data infrastructure, offering businesses scalability, flexibility, and a strong foundation for managing and leveraging data.

“Advances in cloud technology over the past decade have unlocked new opportunities, especially with modern data warehouses and lakes, which are 100x better than on-premise systems. The key challenge now is ensuring scalable, automated data integration for companies managing thousands of locations across multiple geographies. Manual processes can’t keep up, and automated, managed tools like Fivetran are essential for applying proper governance, security, and profiling during data movement.”
—Taylor Brown, COO and co-founder of Fivetran

2. Open data lakes will become the standard

Anjan Kundavaram, Chief Product Officer at Fivetran, foresees that open data lakes will take center stage in 2025 enterprise strategies, driving cost efficiency and enabling innovation across industries. With open table formats like Iceberg and Delta gaining traction, Kundavaram anticipates a surge in specialized query engines designed for specific data needs, enabling more efficient analysis of large data sets.

Lakehouse architectures, which pair an open table format with an independent storage layer, will allow businesses to experiment with different query engines and reduce vendor lock-in.
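
To illustrate the idea, here is a minimal sketch of engine independence: once data sits in an open columnar format on an independent storage layer, more than one engine can query the exact same files. The file and column names below are invented for illustration, and a production lakehouse would use a table format like Iceberg or Delta rather than bare Parquet files.

```python
# Minimal sketch: the same files on an independent storage layer can be
# queried by more than one engine. File and column names are illustrative.
import duckdb
import pyarrow as pa
import pyarrow.parquet as pq

# Write a small table to the "storage layer" (here, a local Parquet file).
orders = pa.table({
    "order_id": [1, 2, 3],
    "region": ["EMEA", "APAC", "AMER"],
    "amount": [120.0, 75.5, 210.0],
})
pq.write_table(orders, "orders.parquet")

# Engine A: DuckDB queries the file in place.
print(duckdb.sql("SELECT region, SUM(amount) FROM 'orders.parquet' GROUP BY region"))

# Engine B: pyarrow reads the very same file. No engine owns the data.
print(pq.read_table("orders.parquet"))
```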

With key industry players also advancing this shift, Kundavaram believes open data formats will provide businesses with greater flexibility and enhanced value from their data.

“The entire data ecosystem is moving toward being more open. Customers want to get more value from their analytics investments — and open data formats and architectures like lakehouses are making that possible. It’s exciting to see Snowflake and Databricks demonstrating that open format innovation with Polaris and Unity. At Fivetran, we’re going to play a huge role in this paradigm shift, helping customers explore new engines and run analytics in a more cost-effective way.”
— Anjan Kundavaram, Chief Product Officer at Fivetran

[CTA_MODULE]

3. Knowledge graphs will unlock valuable insights from unstructured data

As interest in data shifts from traditional systems to less structured sources like emails, chat conversations, and shared drives, Moin Haque, Head of Enterprise Data, Analytics & AI at IFF, projects that organizations will begin recognizing knowledge graphs as critical sources of insight.

A knowledge graph is a digital representation of human-created knowledge that can be understood by a machine. Unlike traditional databases, which store data in rigid tables, knowledge graphs organize information into nodes (entities) and edges (relationships), creating a web of connections that mimic how humans naturally link ideas.

According to Haque, knowledge graphs can help provide businesses with a more holistic view of their operations, customers, and opportunities, making it easier for organizations to uncover hidden patterns and navigate complex data in a more natural and intuitive way.
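
As a minimal illustration of the nodes-and-edges model described above, the sketch below builds a tiny knowledge graph with the open source networkx library and walks its connections. The entities and relationships are invented for illustration.

```python
# Minimal sketch of a knowledge graph: entities as nodes, relationships as
# labeled edges. Entities and relations here are invented for illustration.
import networkx as nx

kg = nx.DiGraph()
kg.add_edge("Alice", "Acme Corp", relation="works_at")
kg.add_edge("Acme Corp", "Widget X", relation="manufactures")
kg.add_edge("Widget X", "EU market", relation="sold_in")

# Traverse the web of connections, e.g. everything reachable from "Alice".
for src, dst in nx.bfs_edges(kg, "Alice"):
    print(f"{src} --{kg.edges[src, dst]['relation']}--> {dst}")
# Alice --works_at--> Acme Corp
# Acme Corp --manufactures--> Widget X
# Widget X --sold_in--> EU market
```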

“Unstructured data sources that were never designed for discovery, governance, or metadata hold the signals we need to drive real value. Personally, I welcome anything that moves us away from reports and dashboards — they’re one of the least efficient ways to drive insights.”
— Moin Haque, Head of Enterprise Data, Analytics & AI at IFF

4. Governance will enable AI accuracy and reliability

Until now, discussions around AI governance have largely focused on security and regulation. In 2025, however, Trâm Phi, General Counsel at Databricks, highlights a shift in priorities; more executives are recognizing the critical link between data governance and AI accuracy and reliability.

As AI systems become increasingly complex, adopting a more holistic governance strategy will be key to promoting responsible AI, all while minimizing risks and ensuring compliance. According to Phi, improving oversight in this way will empower businesses to scale AI faster and realize its full potential.
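
The quote below mentions monitoring access, usage, and risk. As one small, hypothetical illustration of that idea, the sketch below wraps dataset reads in an audit log; a real governance framework covers far more, including lineage, retention, and policy enforcement. All dataset names and sensitivity labels here are invented.

```python
# Hypothetical sketch of access auditing as one small piece of an
# end-to-end governance framework. Dataset names and labels are invented.
import functools
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
SENSITIVITY = {"customer_pii": "restricted", "sales_summary": "internal"}

def audited(func):
    """Record who read which dataset, and when, before returning the data."""
    @functools.wraps(func)
    def wrapper(user, dataset):
        logging.info("%s read %s (%s) at %s", user, dataset,
                     SENSITIVITY.get(dataset, "unclassified"),
                     datetime.now(timezone.utc).isoformat())
        return func(user, dataset)
    return wrapper

@audited
def read_dataset(user, dataset):
    return f"rows from {dataset}"  # stand-in for a real query

read_dataset("analyst@example.com", "customer_pii")
```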

“As more businesses embrace data intelligence, leaders need to think critically about how to balance widespread access with privacy, security, and cost concerns. The right end-to-end governance framework will allow companies to more easily monitor access, usage and risk, and uncover ways to improve efficiency and cut costs, giving enterprises the confidence to invest even more in their AI strategies.”
— Trâm Phi, General Counsel at Databricks

5. RAG technology will solve GenAI’s accuracy challenges

In 2025, the evolving landscape of GenAI will focus on mitigating risks like “hallucinations,” or AI-generated false statements, which remain a significant challenge for deploying AI at scale. Benoit Dageville, Co-Founder and President of Product at Snowflake, predicts that advancements in guardrails and retrieval-augmented generation (RAG) technology will play a crucial role in addressing these issues.

Guardrails will limit what GenAI can say, ensuring that outputs are not only factually accurate but also unbiased and aligned with the intended tone. This will increase trust in AI-generated content, particularly for enterprises planning to use AI in external customer-facing applications.
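
As a rough illustration of the concept, the sketch below shows the simplest possible output guardrail: a policy check applied to generated text before it reaches a user. The policy list and fallback message are hypothetical; production guardrails typically rely on classifiers and much richer policies.

```python
# Hypothetical sketch of the simplest output guardrail: a policy check on
# generated text. Real guardrails use classifiers and much richer policies.
BANNED_PHRASES = {"guaranteed returns", "medical diagnosis"}

def apply_guardrail(answer: str) -> str:
    """Return the answer only if it passes the policy check."""
    if any(phrase in answer.lower() for phrase in BANNED_PHRASES):
        return "I can't help with that. Please contact a qualified specialist."
    return answer

print(apply_guardrail("Our fund offers guaranteed returns!"))  # blocked
print(apply_guardrail("Your order shipped on Tuesday."))       # passes
```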

In addition to guardrails, Dageville emphasizes that RAG will be key to solving accuracy challenges. By augmenting language generation with information retrieval, RAG ensures that AI outputs are grounded in real, factual information, reducing the likelihood of hallucinations.
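
A minimal sketch of the RAG pattern follows: retrieve relevant passages first, then build a prompt that grounds the model in them. The corpus is invented, keyword overlap stands in for the vector search a real system would use, and in practice the final prompt would be sent to an LLM.

```python
# Minimal sketch of the RAG pattern: retrieve relevant passages, then build
# a prompt grounded in them. The corpus is invented, and keyword overlap
# stands in for the vector search a real system would use.
CORPUS = [
    "Returns are accepted within 30 days of purchase.",
    "Shipping to the EU takes 5 to 7 business days.",
]

def retrieve(question: str, k: int = 1) -> list[str]:
    """Rank passages by word overlap with the question; keep the top k."""
    words = set(question.lower().split())
    scored = [(len(words & set(p.lower().split())), p) for p in CORPUS]
    return [p for score, p in sorted(scored, reverse=True)[:k] if score > 0]

def grounded_prompt(question: str) -> str:
    context = "\n".join(retrieve(question))
    return f"Answer using ONLY this context:\n{context}\n\nQ: {question}"

# Grounding the model in retrieved facts is what reduces hallucinations.
print(grounded_prompt("How long do I have to return an item?"))
```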

“In the enterprise, you want to ground generative AI with real facts. RAG can analyze structured and unstructured data and give you a summary that is more human, more accurate and more transparent.”
— Benoit Dageville, Co-Founder and President of Product at Snowflake

6. Utility compute is the future of specialized workloads

Tristan Handy, Founder and CEO of dbt Labs, predicts the rise of utility compute — purpose-built engines designed to handle specific workloads with exceptional efficiency. This approach allows vendors to optimize costs and performance by tailoring solutions to unique use cases.

Handy highlights Fivetran's recent innovation — internalizing ingestion costs using the DuckDB query engine — as a prime example of utility compute, showcasing how it can reduce costs, improve efficiency, and provide a new model for scaling operations. 
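
Fivetran's service internals aren't spelled out here, but the general pattern is easy to sketch: embed a lightweight in-process engine such as DuckDB and point it at exactly one job. The hypothetical example below deduplicates staged change records and writes the result straight to Parquet; it illustrates the pattern, not Fivetran's actual implementation.

```python
# Hypothetical sketch of the pattern, not Fivetran's implementation: embed
# an in-process engine (DuckDB) to run one specific workload, here
# deduplicating staged change records straight into a Parquet file.
import duckdb

con = duckdb.connect()  # in-process: no cluster or server to operate
con.sql("""
    CREATE TABLE staged AS SELECT * FROM (VALUES
        (1, 'alice@example.com', '2025-01-01'),
        (1, 'alice@new.example', '2025-01-15'),
        (2, 'bob@example.com',   '2025-01-10')
    ) AS t(id, email, updated_at)
""")

# Keep only the latest record per id and write the result to Parquet.
con.sql("""
    COPY (
        SELECT id, email, updated_at FROM (
            SELECT *, ROW_NUMBER() OVER (
                PARTITION BY id ORDER BY updated_at DESC) AS rn
            FROM staged)
        WHERE rn = 1
    ) TO 'users.parquet' (FORMAT PARQUET)
""")
```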

Handy believes that more companies will adopt specialty implementations of query engines for specific performance, usability, or scale parameters in 2025, unlocking new opportunities for cost-effective, workload-specific optimization.

“When you build a specialized engine for a specific workload, you can be more efficient, because you can rely on special characteristics of your workload. For example, when Fivetran built our data lake writer service, we were able to make it so efficient that we can simply absorb the ingest cost as part of our existing pricing model. Ingest is free for Fivetran data lake users.”
— Tristan Handy, Founder and CEO of dbt Labs

7. Centralized data will enable smarter AI in supply chain operations

Rico Mawcinitt, Global Head of Supply Chain and Logistics at Hakkoda, views data fragmentation as 2025’s most significant challenge hindering AI's potential in supply chain operations. As companies rely on a mix of tools serving different regions, customers, and functions, the resulting silos make it difficult to create a unified, cohesive view — preventing AI from delivering meaningful insights and scaling automation across the business.

To tackle this challenge, Hakkoda has focused on building centralized data ecosystems that break down these silos and enable seamless integration of AI technologies. In doing so, they create the foundation necessary for AI to drive smarter, faster decisions and deliver consistent value across the supply chain.
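
As a toy illustration of why centralization matters, the sketch below merges two hypothetical regional inventory extracts into one table; once the silos are gone, global questions become one-liners. Real pipelines would also reconcile schemas, units, and currencies.

```python
# Toy sketch: merge hypothetical regional extracts into one centralized
# table. Real pipelines also reconcile schemas, units, and currencies.
import pandas as pd

emea = pd.DataFrame({"sku": ["A1", "B2"], "on_hand": [40, 8], "region": "EMEA"})
apac = pd.DataFrame({"sku": ["A1"], "on_hand": [15], "region": "APAC"})

inventory = pd.concat([emea, apac], ignore_index=True)

# With the silos merged, global questions become one-liners.
print(inventory.groupby("sku")["on_hand"].sum())  # A1: 55, B2: 8
```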

“At Hakkoda, one of the things we’re most excited about is implementing AI-driven automation to improve our operations like inventory management, order processing, and route optimization. By getting our data foundation right, we can also open new doors like creating smarter virtual agents that handle customer inquiries, adjust schedules, or make real-time decisions based on changes in supply and demand.”
— Rico Mawcinitt, Global Head of Supply Chain and Logistics at Hakkoda

In 2025, look for ways to make your AI journey easier

Each of these predictions highlights the growing importance of certain tools and capabilities as GenAI continues to enter the mainstream. Notably, none of them calls for inventing new tools: all the pieces of a working AI architecture already exist, but they need to be thoughtfully combined to ensure effective and responsible AI deployments.

In the coming year, focus less on building from scratch and more on identifying tools and best practices that your team can easily combine, assemble, and evaluate. The bulk of your engineering talent should be devoted to the last mile of AI development: testing and iterating on the responses a model produces, indexing and curating data, and tuning prompts and retrieval methods.
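
To make that last mile concrete, here is a minimal, hypothetical sketch of an evaluation loop: run a fixed set of questions through the system and score each answer against facts it must contain. The ask() function is a stand-in for whatever model or RAG pipeline is under test.

```python
# Hypothetical sketch of a last-mile evaluation loop: run fixed questions
# through the system and score answers against facts they must contain.
# ask() is a stand-in for whatever model or RAG pipeline is under test.
EVAL_SET = [
    {"question": "What is the return window?", "must_contain": "30 days"},
    {"question": "Where do we ship?", "must_contain": "EU"},
]

def ask(question: str) -> str:
    return "Returns are accepted within 30 days."  # placeholder answer

passed = sum(case["must_contain"] in ask(case["question"]) for case in EVAL_SET)
print(f"{passed}/{len(EVAL_SET)} checks passed")
# Iterate on prompts, retrieval, and data curation until this stays stable.
```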

[CTA_MODULE]
