Hybrid Deployment, AI & data: Fivetran’s take on the future of innovation
Fivetran’s COO and CPO explore key trends in data management, discuss the launch of Hybrid Deployment and share predictions for AI and data strategy in 2025.
More about the episode
Fivetran’s Chief Operating Officer Taylor Brown and Chief Product Officer Anjan Kundavaram explore the future of data innovation in a fireside chat. In this discussion, Taylor and Anjan share their insights on the rapidly changing data landscape and the role of Fivetran's Hybrid Deployment in addressing today’s security and scalability challenges.
As data silos, AI adoption and privacy regulations evolve, businesses need solutions for seamless data management. Fivetran’s Hybrid Deployment offers a unique blend of cloud convenience and on-premise security, making it possible to move data securely without leaving company networks — a crucial advantage for industries like finance and healthcare.
Their conversation also dives into top trends, from open data formats to the transformative impact of AI and machine learning. Discover how enterprises can unlock data’s potential by treating AI as an extension of their data strategy, enabling everything from real-time analytics to personalized customer insights. Taylor and Anjan provide a roadmap for navigating these advancements and discuss Fivetran’s vision for helping enterprises stay ahead.
Key takeaways:
- How Hybrid Deployment bridges the gap between security and scalability.
- Top trends in data management, including open data formats and AI.
- Predictions for the future of AI and data infrastructure innovation.
Transcript
Jonathan Lincheck (0:00)
Hello. I'm Jonathan Lincheck, the Global Vice President of Sales Engineering at Fivetran, and your host for today's fireside chat. I'm excited to be joined by Fivetran's co-founder and COO, Taylor Brown, and Chief Product Officer, Anjan Kundavaram. They'll share their insights today on everything data. They'll talk about the evolving data landscape and the need for modern solutions to security challenges, and even share a little bit about Fivetran's newest innovation, Hybrid Deployment. They'll highlight key industry trends and offer their insights and predictions for the future. Gentlemen.
Anjan Kundavaram (0:28)
Hey, Jonathan.
Taylor Brown (0:29)
Jonathan, good to be here.
Jonathan Lincheck (0:31)
Anjan, what do you think are the most significant challenges facing enterprises today when it comes to data management and integration?
Anjan Kundavaram (0:37)
You know, as I talk to customers, Jonathan, the thing that keeps coming up, what we've heard for a while, is data silos. Data is still in spreadsheets, mainframes, databases. The rate at which customers are adopting SaaS applications continues to increase. And if you don't kind of bring all that information, that's a massive disadvantage. And if you think about it, if you're asking a business question and you want to kind of get an industry analytic, it's about context. And the context could be in a spreadsheet, it could be in a database, it could be in an application. And if you don't have the right context, you're not getting the right answer.
The second dimension to this is customers have sensitive data. It could be customer data, it could be financial data. And if you're not bringing that to make business decisions, that's going to impact you. That's a disadvantage again, and you see that all the time. There's a strong correlation between that kind of data and business impact.
And then finally, we've all seen the emergence of gen AI and LLMs. LLMs can consume this information in a way that analytics couldn't. So if you're a CEO or a data leader, and you want a data advantage, fix the data silo problem.
Jonathan Lincheck (1:50)
That's great. And Taylor, we know when we read the newspapers every day, we see things about cyber threats. We see new regulations that are being put out. How do you think that is affecting how enterprises deal with these data strategy challenges?
Taylor Brown (2:03)
Yeah, it's a good question. There's a mountain of challenges that I think customers are facing every single day with their data. And it starts with there's a lot more data created than ever before, and then there's all these silos and different applications that customers have across the organization. Then there's a bunch of regulations around GDPR, CCPA. On top of that, you have a global workforce that's decentralized, so a lot of people are not in office. And then you have a lot of governance challenges over who has access to what, by when. And all of these things add up for any company who's trying to do anything with their data. And then you add the additional pressure of AI on top of that.
So every company is now facing this need to say, "I need to get the most out of my data. I need to have really great AI strategies," and the existing toolset or the previous sort of toolset just can't stand up to the challenges. Customers are unable to take their legacy systems and actually modify them appropriately to get the most out of their data, and so they have to find more modern approaches for this.
Jonathan Lincheck (3:12)
How do you ensure that your systems are secure and compliant while being able to scale for the large amounts of data necessary for AI and ML solutions?
Taylor Brown (3:22)
The advances that have taken place in technology, largely with the cloud, over the last 10 years have created a whole lot of new opportunity, particularly on the data warehouses and the data lakes and data platforms. These are 100 times better than the existing on-premise systems. And so the first and most important thing is for customers to be getting the data into one of these cloud data platforms. Now, the second challenge is how they're going to get it into these different systems. It has to be scalable. It has to be more automated because a lot of these companies have thousands of different locations across multiple different geos that they need to get data in from.
And if they're trying to do this in a manual way, there's just no way they're actually going to be able to keep up and then apply the right governance, apply the right security profiling. All of these other pieces have to get applied at the time the data's being moved, and so you have to pick tools like Fivetran that have a very automated approach and a managed approach to this.
So also, your AI strategy is just a continuation of your data strategy. A lot of companies end up thinking they need to have a data strategy, and then they need to have an AI strategy. At the end of the day, AI just runs on top of the data that you have, and so it's a pitfall for companies to think they need a completely separate strategy. They should have one strategy.
And when they do that, they get the best out of the data for their business intelligence, then they can build AI on top of that. An example of this would be Saks. We started working with Saks about two years ago. We helped them modernize their entire data strategy for business intelligence. And then when the AI wave happened, they started building a lot of AI applications on top of that. And as you may have seen in the last six months, their growth has been fantastic. And so I think this is how companies can really leverage their AI strategy and data strategy.
Anjan Kundavaram (5:08)
You said something, Taylor, about customers trying to do this manually. It always seems easy when you do the first connector. But the challenge is when you're trying to make that connector reliable, with 99.9% uptime and the throughput that customers care about, and the thing they often forget is the maintenance. Data keeps changing. Your business really wants up-to-date information. And so you're like, "Are you going to keep these data pipelines up and running and updating? Or are you going to work on the business problem?" So I think this is the issue that you see with customers is, hey, go use us for building these data pipelines and go focus on the business problem.
Taylor Brown (5:50)
Absolutely.
Jonathan Lincheck (5:52)
So Anjan, considering all those challenges you two both discussed, what innovations is Fivetran specifically working on to address them?
Anjan Kundavaram (5:58)
Yeah. One, so we've been working with customers for a long time. And one of the things we've heard consistently is, “We like the SaaS application. It's so simple to use. I can just get up and running. I don't need to worry about managing.” But some customers, certainly with sensitive data, want to keep the data in their network. The data should never leave their network. So we've been innovating with customers and built and launched our Hybrid Deployment offering.
And the way that works is one, you get the benefits of SaaS, and you get the security guarantees of say, an on-prem solution. And it's split into control plane and data plane. So the control plane is only metadata, your configuration. The data plane runs on the premises of the customer. That could be on-prem, that could be a VPC. That could be the customer's cloud. It doesn't matter. So as the data moves through the data plane, you get the benefits of sort of both sides. And we've validated this with financial institutions, with different institutions.
They like the model. They're like, "Hey, I don't need to go build pipelines to move sensitive data. I can just rely on Fivetran's offering." And so the way you do it, you can go into our product. You can download a simple agent. You could install that. You can connect to your local database. And we will manage that for you. And the benefit of that is it runs through your premises, data never leaves your network.
But the Fivetran product team, the engineering team, the support team, they are managing your pipelines, so it's the best of both worlds. So kind of the last frontier, if you will, is if you've had data pipelines that you couldn't move because they're sensitive, business is like, "Hey, I need you to move the data," your security team is saying, "No, you can't," now, we finally have a solution. So very excited, and we're getting great adoption on that, Jonathan.
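The control plane and data plane split Anjan describes can be illustrated with a minimal sketch. This is not Fivetran's agent or API: every class, field and connector name below (`ControlPlane`, `DataPlaneAgent`, `SyncReport`, `pg_accounts`) is hypothetical, and in-memory dictionaries stand in for the local database and destination. The only point it shows is that rows stay between stores inside the network, while configuration and status metadata are all that the hosted side sees.

```python
from dataclasses import dataclass, field


@dataclass
class SyncReport:
    """Metadata-only message sent to the (hypothetical) control plane."""
    connector_id: str
    rows_moved: int
    status: str


@dataclass
class ControlPlane:
    """Stands in for the vendor-hosted side: configuration and status only."""
    configs: dict = field(default_factory=dict)
    reports: list = field(default_factory=list)

    def get_config(self, connector_id: str) -> dict:
        return self.configs[connector_id]

    def receive(self, report: SyncReport) -> None:
        self.reports.append(report)


class DataPlaneAgent:
    """Stands in for an agent installed inside the customer's network."""

    def __init__(self, control_plane: ControlPlane):
        self.control_plane = control_plane

    def sync(self, connector_id: str, source: dict, destination: dict) -> SyncReport:
        config = self.control_plane.get_config(connector_id)
        table = config["table"]
        # Rows are copied between two stores that both live inside the network.
        rows = list(source[table])
        destination.setdefault(table, []).extend(rows)
        # Only a row count and a status cross the boundary, never the rows.
        report = SyncReport(connector_id, len(rows), "succeeded")
        self.control_plane.receive(report)
        return report


control = ControlPlane(configs={"pg_accounts": {"table": "accounts"}})
agent = DataPlaneAgent(control)
local_source = {"accounts": [{"id": 1, "balance": 100}, {"id": 2, "balance": 250}]}
local_destination = {}
report = agent.sync("pg_accounts", local_source, local_destination)
```

In this toy version the "network boundary" is just the `SyncReport` type: anything not in that dataclass cannot reach the control plane object.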
Jonathan Lincheck (7:44)
Taylor, how do you think that hybrid is going to change the market? And why is it so important for the data movement space?
Taylor Brown (7:50)
Yeah. I mean, at the end of the day, people have to move their data, and as I mentioned, the challenges with thousands of different data sources is they can't do it manually and they need to have a managed service. Now a managed service in the past has meant something that's cloud. And obviously, as Anjan mentioned, there's a lot of data sources that customers need to keep very secure, they need to keep on-premise for whatever reason. And so this is the best of cloud with the best of on-premise put together in a single product. And customers have been asking for this since 2015-ish, but it's a very challenging proposition to be able to do both of these things. And we've spent a lot of time over the last few years building this. And now we're excited to be launching it.
National Australia Bank is another company that's using Hybrid Deployment. For them, they basically wanted to move faster. They wanted to be on the cutting edge. And so Hybrid Deployment offers them an ability to move all of their data, in a secure way behind their firewalls, but managed through fivetran.com.
And what they've seen as a result of it is 30% lower cost on ingest. And they've seen a much faster speed in terms of replication for their ingest and setup for their ingest. So these are two examples that I think are really pertinent.
Anjan Kundavaram (9:05)
Yeah. Taylor, I talked to a customer recently, a large bank, and they have 90 separate installs of databases across the globe. And so on-prem, they have so many installs of replication, and they have to manage it. And their data team is constantly thinking, "Which update? Where do I manage it? Where's the data movement? Where's an outage?" And when we told them about Hybrid, they're like, "Oh, okay. We're going to get on that," because it saves them so much time to go work on something that actually matters to them.
Taylor Brown (9:32)
Absolutely. Another company we're working with has 2,500 databases. Same kind of setup. Right?
Anjan Kundavaram (9:38)
It's like more scale.
Taylor Brown (9:39)
Each one of these has to have the same type of replication, but it's behind their customer's firewalls. This allows them to have a very automated approach towards, hey, we're going to install this instance. We're going to run it all behind your firewall. We're only going to take the data out that needs to be taken out. And we're going to do it in a programmatic way because 2,500 databases, there's no way you can possibly manage that if you're doing one-offs. It'll take you five years just to get going.
Anjan Kundavaram (10:01)
Yeah. I don't think there's anybody in our industry doing this for data movement, like with the comprehensive set of data sources and the security guarantees and the simplicity of the experience that we provide, so I think it'll be a very good product.
Taylor Brown (10:14)
Absolutely. And then the crazy part is once customers actually have the data within their cloud data platforms, then they can start to build all these really great AI strategies on top of that, as we're seeing with some of the best AI companies in the world that are our customers, for example, OpenAI.
Anjan Kundavaram (10:30)
Absolutely.
Jonathan Lincheck (10:39)
That's a perfect segue to my next question, which really is around AI and ML and the impact that it's making on helping businesses make better decisions. Anjan, what do you feel Fivetran's role is in that transformation?
Anjan Kundavaram (10:41)
Yeah, it's a question of our times. Right? The first thing I want to point out, and I think most of you know this, is LLMs are trained on the web. Right? A search engine was built, you go crawl the internet, make a copy of it, and train it for a long time and use lots of GPUs, you get a really good LLM. But those LLMs don't have your company's DNA. They don't have how you build your customer adoption plan. They don't have how you build your product strategy, so that's a big limitation. And sort of the core value prop of Fivetran is to bring all your data in one place so LLMs can consume.
The two models through which you're seeing that are sort of RAG applications, retrieval-augmented generation, and then sort of fine-tuning. So the bare minimum for a RAG application to work is your data needs to be in one place and you need to be creating vector embeddings. If you need to be best in class, to really consume that information, you probably need to go do further transformations because even a generic embeddings model isn't going to really understand what your company DNA is, so you need to do vector transformations or you need to do fine-tuning. And for both those strategies, you need a coherent data strategy, with data in one place.
The second theme we're seeing as we talk to our customers is unstructured data. And unstructured data comes in lots of formats. It could be just textual data in Zendesk tickets or Slack information. Or it could be an attachment in Salesforce, and that's all context that you couldn't really leverage before on analytics. Now with foundation models and LLMs, you really can. So it's paramount you really think about a holistic data strategy and get that data in one place.
And then finally, we've been talking about sort of sensitive data and how that plays a part, so you continue to go move that. We are very excited by this trend. We think we can help our customers a lot more, and we're sort of foundational as to what we're doing. And we're continuing to innovate on these dimensions with our customers.
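To make the "bare minimum for a RAG application" concrete, here is a minimal retrieval sketch under loud assumptions: the `embed` function is a toy bag-of-words count vector standing in for a real embeddings model, the three documents are invented examples of the kinds of sources mentioned above, and no LLM is called. It only shows the retrieval step: embed the documents held in one place, embed the question, rank by cosine similarity, and place the best match into the prompt.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embeddings model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, documents: list, k: int = 1) -> list:
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, documents: list) -> str:
    """Assemble the retrieved context and the question into one prompt string."""
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Invented sample documents, already centralized in one list.
documents = [
    "Zendesk ticket: customer reports login failures after password reset",
    "Slack thread: product roadmap review moved to Thursday",
    "Salesforce note: renewal for Acme closes in March",
]
question = "when does the Acme renewal close"
top = retrieve(question, documents)
prompt = build_prompt(question, documents)
```

The design point matches the discussion: retrieval can only surface a document that is already in the collection being searched, which is why getting data into one place comes first.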
Jonathan Lincheck (12:41)
Incredible. Taylor, the other trend we’re hearing a lot about that seems to be moving the market is around open data formats. How is Fivetran leading the way in helping customers understand and take advantage of this new innovation?
Taylor Brown (12:53)
Great question. So a lot of customers want to load data into what is known as a data lake. And the difference between a data lake and a data warehouse is that a data lake is somewhere where it's a commodity layer, storage layer, where you can just put in unstructured or structured data. And the challenge with it has been that there's not been a standard way to organize, secure that data that didn't just turn into a big data swamp, meaning you just dump everything in and you spend a lot of time sort of trying to make sense of that data.
And in the last few years, these open file formats like Delta or Iceberg have come onto the market, which are ACID compliant, which effectively they take the sort of best organizational pieces of a data warehouse, but they make it available within a commodity storage layer like S3. This is really the holy grail for customers because ultimately, what they want to do is they want to just move all their data into one location. They want it to be cheap. They want to have ownership over that. They don't want to get locked into a single platform. And they want to be able to use that in a lot of different places, for AI, for ML, and there's a whole plethora of different things that customers want to do with that data, from putting it back into applications.
And so what Fivetran has built is a loader for what we call our Fivetran Data Lake Service for loading data into Iceberg or Delta automatically. We do all the work to transform that data and to update it within the warehouse, which is a fair amount of work. And the sort of benefits that our customers see from this, besides being more future-proofed, is that they have a lot of cost savings on the ingestion side. So when customers are loading into a cloud data warehouse or even a database, part of the cost that they're incurring there is just actually loading that data.
And what we've found is it's quite high. It's about 35%, it can be upwards of 50% for some use cases, where you're just spending 50% of your overall data warehouse bill in moving data into it. And so with Fivetran, that ingestion is now free. And then they can query that data directly from whatever cloud data warehouse, like Snowflake, or Databricks, or Starburst, whatever else that they want to use. This is a massive game changer, and I really think the whole industry is going to move towards this architecture over the next few years.
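The ingestion share Taylor cites can be worked through with simple arithmetic. The sketch below is illustrative only: the monthly bill of 10,000 is an invented figure, the function name is hypothetical, and it assumes the warehouse-side ingestion cost drops to zero while everything else on the bill stays the same. It does not model any vendor's actual pricing.

```python
def warehouse_bill_split(total_bill: float, ingest_pct: float) -> tuple:
    """Split a warehouse bill into (ingestion cost, everything else).

    ingest_pct is the share of the bill spent loading data, as a percentage.
    """
    ingest_cost = total_bill * ingest_pct / 100
    return ingest_cost, total_bill - ingest_cost


# Hypothetical monthly bill, using the two shares mentioned in the conversation.
typical_ingest, typical_rest = warehouse_bill_split(10_000, 35)
heavy_ingest, heavy_rest = warehouse_bill_split(10_000, 50)
```

Under those assumptions, a 35% ingestion share on a 10,000 bill is 3,500 spent purely on loading, and at a 50% share it is 5,000, which is the portion the lake-based approach is described as removing.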
Anjan Kundavaram (15:19)
Yeah. I think this is a very exciting development. The whole ecosystem is sort of pushing us to be more open. Customers are trying to value engineer, get more return on their analytics dollars. And the open data formats and the lake house architecture and the data lake architecture is letting us do that. Make one data format, one independent storage layer, and you can go use any number of query engines. That's choice. You talked about cost savings. They can go do more things with that, so that's exciting.
And then it's really exciting to also see Snowflake and Databricks drive and push for this open format innovation. They're pushing the catalogs with Polaris and Unity. So I think the entire industry is pushing, and it's going to drive a lot of value to customers. And we at Fivetran are going to play a huge role because we're going to enable this shift, this paradigm shift, where customers can try out new engines and kind of do their analytics in a very cost-effective manner.
Jonathan Lincheck (16:24)
What is something else that excites you about where the market is going? And what predictions do you have for the next year?
Anjan Kundavaram (16:29)
Open data lakes, I think that's going to be very pervasive. Most customers are going to leverage that. We're going to see an explosion of query engines, likely. And if you're a customer that often scans data off a gig, you might want to pick a query engine that's purpose built for that. You may see a new innovation in the type of query engines that run on Iceberg or Delta, so that's going to be a very exciting time for us.
Jonathan Lincheck (16:59)
Taylor?
Taylor Brown (17:00)
There's sort of two things, I would say. The first one is definitely, I think the whole industry is moving towards data lakes and that will be the future. It sort of feels like 2012 for data warehouses. And we're just going to see a whole lot more investment from companies, from partners, from the ecosystem to support data lakes and building around data lakes. And ultimately, that's the best for the customer in the short-term and the long-term.
The other thing in thinking about what role AI plays in the world today and in the future, I've been thinking Fivetran's mission is to make access to data as simple and reliable as electricity. And that analogy is really from when Thomas Edison brought electricity into the house for the light bulb. And so if you think about that analogy as BI, or business intelligence, is really the light bulb, I would say the hair dryer and all the other thousands of things that were built because the access to electricity in the house was so easy is really what's going to happen with AI. So we're already seeing a lot of innovation that's happening now because the access is there where companies can now build all kinds of different things downstream of this really easy access to all this really critical information that they have, that's all internal information. It's not just public information.
And so I'm excited just to see what happens. Thomas Edison didn't know that the hair dryer, or the washing machine or the electric shaver was going to come out. Right? But that all happened as a result of the hard work that he put in. I think that's the work that Fivetran is doing for companies today.
Jonathan Lincheck (18:31)
What a great analogy. I love that. So once again, thank you both for joining today. What an exciting time to be in the data space. I want to thank everyone for watching today, and if you’d like more information on anything you’ve learned, please feel free to go to fivetran.com. Thank you so much.