Fivetran at Databricks Data + AI Summit 2024: Key takeaways

June 28, 2024

Charles Wang

Lead Product Evangelist

Fivetran

Sydney Ceccato

Partner Marketing Manager

A longtime top innovation partner to Databricks, we continue to demonstrate the power of better analytics through automation

As a multiple-time Databricks Technology Partner of the Year and strategic partner, Fivetran was a key presence at Data + AI Summit 2024.

Our team led or participated in several sessions:

Fivetran COO Taylor Brown spoke with Joanna Gurry, Executive of Data Platforms at National Australia Bank (NAB), at a customer session about NAB’s infrastructure modernization journey.
Senior Global Director of Sales Engineering Kelly Kohlleffel gave a presentation called “Build your data legacy,” demonstrating the synergies between Fivetran and the Databricks ecosystem and how they can radically streamline your data architecture and accelerate your analytics.
Our new CPO Anjan Kundavaram participated in a panel discussion with leading industry figures called “Make generative AI work.”

Anecdotally, at last year’s Data + AI Summit, generative AI was the talk of the town. By comparison, the 2024 Data + AI Summit saw a heavier emphasis on infrastructure, governance and security. Although the promise of generative AI remains real, organizations and innovators now recognize that certain foundational needs cannot be leapfrogged. Our sessions reflect this maturation of thinking.

Data innovation with National Australia Bank

Joanna Gurry of National Australia Bank sat down with Taylor Brown at Databricks Data + AI Summit to discuss NAB’s infrastructure modernization efforts. Despite the scale of its operations, NAB has become an industry leader in the adoption of modern, cloud-native data technologies. With the help of Fivetran and Databricks, NAB undertook an infrastructure modernization project called Project Ada.

Gurry shared four key lessons from Project Ada that may help other organizations that want to undergo similar modernization efforts:

Change should be business-led

Infrastructure modernization should be informed primarily by the business consequences of failing to update technologies, rather than experimenting with new tools and platforms for the sake of novelty. Business stakeholders should be able to provide the perspective needed to set the appropriate priorities.

Changes require clear communication and management

Data engineering teams may be deeply professionally (and even emotionally) attached to legacy systems. Their buy-in is critical and must be gained by clearly communicating the business and technical benefits of change, paying appropriate respect to the work they have previously done and offering a leading role in adopting and continuing to use new tools and platforms.

Reward successes and positive outcomes

As a corollary to the previous lesson, the outcomes of a successful infrastructure modernization program should be made very clear to all involved and deserve generous recognition, especially for technical stakeholders who may otherwise have been attached to the old way of doing things.

Focus on the basics, including risk management

Security and governance are perennial issues in every industry, though they are felt especially acutely in financial services. Organizations should not lose sight of the fact that a solid, secure and governed foundation of data is essential before pursuing more innovative and speculative projects.

See the full talk here.

Build your data legacy

Kelly Kohlleffel’s presentation emphasized the importance of infrastructure modernization, remarking that Gartner estimates a massive and radical shift toward the cloud in the coming years.

Organizations of all kinds face the following challenges:

Remote, distributed teams that rely on data for decision support and identifying opportunities
Budget and headcount constraints in a competitive, uncertain macroeconomic environment
Increasing volumes and variety of data, workloads, tools and technologies to integrate, manage and support

To meet these needs, organizations must modernize data infrastructure, including both destinations and pipelines. Successful data modernization starts with three capabilities:

A platform that can move all your data – Fivetran, combined with the Databricks Data Intelligence Platform, offers automation, reliability and scalability across a wide variety of sources, including applications, ERPs, files, event streams and databases. From the standpoint of the user, it must be as simple as possible, working securely, reliably and performantly out of the box with minimal configuration.
Support for hybrid environments – Enterprises often use a combination of cloud and on-premises tools and platforms. Fivetran offers the ability to meet data where it is and move it where it is needed, an essential capability for infrastructure modernization.
Data readiness for innovation and GenAI – A solid foundation of data that addresses the first two capabilities enables organizations to combine their proprietary data with foundation models and produce unique, innovative data products.

Watch the full presentation here.

Make generative AI work

Our new CPO Anjan Kundavaram participated in an all-star panel discussion about generative AI with Amit Prakash, Co-founder and CTO of ThoughtSpot; Bruno Aziza, Partner at CapitalG; and Sudhir Hasbe, Chief Product Officer at Neo4j. Some highlights:

What are barriers to success in generative AI?

LLMs are probabilistic but most enterprise tasks are deterministic. Models need comprehensive data as well as fine-tuning.
Companies need to prioritize data quality and centralization, including of unstructured data.
Companies often spend a lot of energy on RAG without getting the basics of infrastructure right. LLMs are more powerful than standard queries and therefore stand to be both more helpful and more harmful.

What are the technological and organizational steps to ensure quality, deterministic results from genAI?

An LLM must be provided with a good semantic model/ontology (i.e., a map and definition of all the concepts involved in the conduct of your company’s operations) and large volumes of relevant data.
Successful AI implementation requires three bodies of knowledge:
- Common sense, public knowledge
- Proprietary data from your systems
- Proprietary knowledge from the company – this last one is often underappreciated but is required to enforce accurate LLM outputs
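
To make the idea of a semantic model more concrete, here is a minimal Python sketch, not something presented by the panel. It assumes the public, common-sense knowledge comes from the model itself, while the proprietary data and proprietary knowledge are supplied in the prompt. The names `Concept` and `build_prompt`, the "active customer" definition and the sample record are all hypothetical and serve only as illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """One entry in a hypothetical semantic model: a business term and its agreed definition."""
    name: str
    definition: str


def build_prompt(question: str, concepts: list[Concept], records: list[str]) -> str:
    """Combine company definitions (proprietary knowledge) and records (proprietary data) into one prompt."""
    glossary = "\n".join(f"- {c.name}: {c.definition}" for c in concepts)
    data = "\n".join(f"- {r}" for r in records)
    return (
        "Use only the definitions and records below.\n"
        f"Definitions:\n{glossary}\n"
        f"Records:\n{data}\n"
        f"Question: {question}"
    )


# Illustrative values only; a real semantic model would cover far more concepts.
concepts = [Concept("active customer", "a customer with at least one transaction in the last 90 days")]
records = ["customer 17: last transaction 12 days ago"]
prompt = build_prompt("Is customer 17 active?", concepts, records)
print(prompt)
```

The resulting string would be sent to whichever LLM you use. The point of the sketch is that the company's own definition of a term travels with the data, so the model does not have to guess what "active" means.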

For more notable quotables, listen to the discussion here.

Interviews with theCUBE

Anjan and Fivetran CEO George Fraser each shared their thoughts at length with theCUBE, a publication by SiliconANGLE Media.

In Anjan’s conversation with Savannah Peterson and John Furrier, he emphasized the importance of the user-facing simplicity of a tool like Fivetran. Fivetran can involve as little as “four clicks, and you move your data.” At the same time, such a simple interface conceals extraordinary complexity under the hood. Anjan likened Fivetran to a plumbing company and data to a utility like running water. As generative AI grows in importance, so will the need for performant, reliable solutions that can handle the huge range of different data sources involved in an organization’s operations, as well as open, modular data architectures and formats that allow teams to customize and standardize exactly as needed.

During George’s conversation with Savannah Peterson, he added that delegating responsibility is emotionally challenging for many data professionals, but the tradeoff in terms of extreme throughput, low latency, high reliability and sheer variety of data sources overwhelmingly tilts in favor of automated data integration. Reacting to connectors built by other vendors, George remarked, “It’s easy to build a connector that works some of the time, even most of the time. It is extremely difficult to make a connector work 99.9% of the time.” For larger companies operating on longer timescales, the ease of setting up a connector is less important than the ease of maintaining it indefinitely.

Anjan and George covered many other topics as well, including personal reflections on their careers.

Live from the Lakehouse

Taylor Brown joined Holly Smith and Kobie Crawford of Databricks for a candid conversation from the Expo floor about the state of Fivetran and the broader data industry. Taylor discussed the recent launch of Fivetran Managed Data Lake Service, which supports large data volumes and AI workloads in a data lake with a flexible, scalable and secure platform. The conversation covered other topics such as National Australia Bank’s successful infrastructure modernization program, findings on the importance of data integration from a recent MIT survey of C-suite executives and the capabilities that Fivetran offers for the governed data lake.

Watch the full conversation here.

Parting thoughts

During the Data + AI Summit Keynote, Ali Ghodsi, CEO of Databricks, made an appeal to modularity, future-proofing and avoiding vendor lock-in. He envisions a future in which storage and compute alike are pieces of a flexible architecture that can be easily tailored to an organization’s specific needs and constraints, and in which standardization applies mainly to table formats for common legibility, much like the USB standard.

Fivetran’s participation in Data + AI Summit demonstrates our confidence in this vision, and we look forward to continuing to use our partnership to power analytics of all kinds.

Consider registering for our follow-up hands-on lab, in which we’ll demonstrate how you can stand up your first retrieval-augmented generation (RAG) application in no more than 60 minutes using Fivetran and Databricks.
