Snowflake Data Cloud Summit 2024: opening keynote recap

Peter Hanssens

May 04, 2024

The Cloud Shuttle team is thrilled to be part of the Snowflake Data Cloud Summit, and we're reporting live from the Moscone Center in San Francisco!

The first day of the summit got off to a rocking start, with swag, coffee, and a roaring hallway track in the expo hall.

The hallway track - one of the most valuable tracks at any conference!

Setting aside our jetlag, we're excited to put on our journalistic caps and present a summary of the opening keynote that took place at the tail end of Day 1.

The stage is all set for the opening keynote.

Enterprise AI, it's all about Enterprise AI

Snowflake CEO Sridhar Ramaswamy at the opening keynote.

Snowflake CEO Sridhar Ramaswamy opened the keynote by welcoming attendees to the Snowflake Summit, which he calls the "epicentre of all things AI and Data". Only three months into the CEO role, Ramaswamy clearly wants to cement his legacy as a CEO making big bets on enabling enterprises to embed AI faster and more cheaply. (A stark contrast to CEOs who have been more reticent and cautious about going all in on AI.)

Ramaswamy's thesis is a story of AI democratisation. With close to 10,000 customers collectively running an average of 5 billion queries each day, Ramaswamy is acutely aware that Snowflake is well-positioned to make AI adoption a lot easier for its customers. He stresses the much higher bar for enterprise-grade AI compared to broader, consumer-oriented AI models. Enterprise AI has to be far more trustworthy, reliable, private and secure. And cost-effective to boot - businesses want to know what it's going to cost them and what they'll get in return before they dive in.

Snowflake is going all in on its concept of the Data AI Cloud: a single, global, unified platform where businesses can bring their data, AI infrastructure and compute under one roof. (I mean, surprise, surprise, it's practically the name of their summit!)

Introducing Polaris Catalog

The first major announcement of the keynote was the introduction of Polaris Catalog, an open-source catalog for Apache Iceberg. Polaris removes the need for a separate third-party data catalog for Iceberg and allows Snowflake to provide a governance layer (via Snowflake Horizon) on top of data stored in your cloud provider's data lake. Interestingly, it also supports query engines other than Snowflake, such as Spark, Flink, PyIceberg and Trino. The public preview is coming soon, and it will also be released as open source, giving you the option of hosting it yourself on Docker, Kubernetes or similar.

Leather jackets and fireside chats: Featuring NVIDIA Founder and CEO, Jensen Huang

Man of the moment, NVIDIA Founder and CEO Jensen Huang

We come to the reason the queues for the keynote were snaking several lines deep in the Moscone Center well before the 5pm Pacific Time start: a chance to glimpse the leather-jacketed man of the moment, NVIDIA Founder and CEO Jensen Huang. While Huang wasn't there in person, attendees were given the opportunity to hear him dial in live from Taipei.

AI is defying Moore's Law, Huang says. Instead of the 2x advancement every two years that Moore observed, AI is advancing 2x in a quarter of that time - every six months!
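Compounding makes that difference stark. As a back-of-the-envelope illustration of our own (not from the keynote), here's what the two doubling cadences imply over a two-year horizon:

```python
# Illustrative comparison of capability doubling cadences (our own sketch).
def growth_factor(months: int, doubling_period_months: int) -> int:
    """Capability multiple after `months`, doubling once per `doubling_period_months`."""
    return 2 ** (months // doubling_period_months)

horizon = 24  # two years

moore = growth_factor(horizon, 24)  # classic Moore's law: 2x every two years
ai = growth_factor(horizon, 6)      # Huang's claimed pace: 2x every six months

print(moore)  # 2  -> 2x over two years
print(ai)     # 16 -> 16x over the same two years
```

Doubling four times as often doesn't give 4x the progress over two years - it gives 16x, which is the point Huang was driving at.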

The two big announcements from Ramaswamy and Huang's fireside chat were as follows:

1. Snowflake Arctic is now supported with NVIDIA TensorRT-LLM

Snowflake Arctic, Snowflake's family of enterprise-grade LLMs, is now supported with NVIDIA TensorRT-LLM, an open-source project that makes it easy to optimise LLM inference on NVIDIA GPU hardware. The amount of computation and energy required to meet the needs of GenAI can be absolutely astounding, so, as Ramaswamy said earlier, there needs to be a way to minimise costs and reduce total cost of ownership (TCO) to enable much broader GenAI adoption among enterprises.

Engineers can experiment with new LLMs without needing deep knowledge of C++ or NVIDIA CUDA. In some cases, performance can be increased by up to 8x, making it a no-brainer not only for its simplicity but also for its price-performance benefits.

2. Snowflake and NVIDIA are partnering to integrate NeMo Retriever into Snowflake Cortex AI

NVIDIA's NeMo Retriever provides high-accuracy, high-performance information retrieval for enterprises building retrieval-augmented generation (RAG) AI apps on Cortex AI (Snowflake's fully managed LLM service). The idea is that businesses can quickly build custom AI applications on their proprietary data without a ton of specialised AI expertise. This reduces time-to-market, cost and security concerns, since everything happens within the Snowflake platform, underpinned by Snowflake's governance framework.

The semantic query library allows businesses to embed proprietary data and do indexing, searching, and direct interaction with the data, as well as connecting to other microservices. As Huang notes, the NVIDIA partnership with Snowflake is about bringing high-performance compute directly to the data (a reversal of the traditional way of bringing data to compute).

Customer panel: moderated by Snowflake CMO, Denise Persson

The Snowflake customer panel, hosted directly after the opening speech and the fireside chat between the Snowflake and NVIDIA CEOs.

Right after the fireside chat, a panel discussion took place, consisting of several Chief Data Officers from some of America's biggest organisations. Moderated by Snowflake CMO, Denise Persson, the panelists represented a wide variety of sectors, from banking and healthcare to communications and travel.

The panel lineup:

  • Anu Jain, Head of Data and Technology at JP Morgan Chase
  • Caitlin Halferty, Chief Data Officer at Ericsson
  • Shahran Haider, Deputy Chief Data Officer at NYC Health + Hospitals (the largest public health system in the US, with over 1.5 million patients a year)
  • Thomas Davey, Chief Data Officer at Booking.com

Some of the key themes the panel touched on are as follows.

The role of the Chief Data Officer (CDO)

Caitlin Halferty talked about how the CDO role, once met with resistance, is now gaining much more acceptance and traction as data leadership, accountability, and data stewardship become even more critical in the age of data and AI. Thomas Davey backed this up with his observation that, despite Booking.com's highly experimental and data-driven culture, the CDO role only really became a thing for them in 2022. While Anu Jain emphasised the importance of continuous learning and improvement, Shahran Haider's advice to CDOs was to prioritise the initiatives that drive impact and support the organisation's core mission, and to continuously build the organisation's data capability muscle.

Ensuring data strategy supports the core business mission

As GenAI comes to the fore, data teams are being inundated with a long list of demands and wishlists; it's a key responsibility of CDOs to prioritise these requests according to business impact and alignment with broader company strategy. Shahran Haider's example from the healthcare sector was that any AI exploration always comes back to how, and to what extent, it can address the highest priorities: optimising healthcare delivery, reducing costs for patients, and improving the quality of care.

The Chief Data Officers shared several strategic initiatives they felt were crucial for success: ensuring data integrity; implementing governance and security systems; promoting data democratisation to empower users across their organisations; and maintaining operating models that support all business units and data consumers, regardless of their data maturity. And last but not least, attracting and retaining global talent.

The pace of GenAI adoption across different industries

What quickly became apparent from the panel was that different industries are at varying stages of adopting GenAI, each with opportunities and challenges unique to their circumstances.

Booking.com, for example, has long used traditional ML, and increasingly GenAI, to deliver personalised recommendations and services using chatbots and voice search technology. Many of their applications are already customer-facing, including an AI trip planner that provides personalised travel recommendations in natural language (currently available only in the US; a shame, as we were keen to give it a go!). Ericsson has invested in AI initiatives for digital twin technology, which helps with occupational health and safety (OHS) and with reducing waste by enabling remote and virtual inspections of physical sites.

In contrast, the banking and healthcare sectors, being much more heavily regulated, have traditionally focused on using AI for risk avoidance and cost reduction. For JP Morgan Chase, there are now more opportunities to leverage AI for revenue generation, personalised offers, and intelligent customer service routing. NYC Health + Hospitals, being a public health system, is taking a cautious approach, first setting up an advisory board and communities of practice for ethical AI usage and data safety, and to ensure real-world biases are not baked into AI in a way that would adversely affect patient outcomes.

Conclusion

All in all, it's been a packed first day at the Snowflake Data Cloud Summit, with some very interesting sessions and an unsurprisingly huge focus on GenAI (especially enterprise-level AI). The main highlight was the collaboration between Snowflake and NVIDIA to make AI faster, more efficient and cheaper for enterprises to adopt, helping them leverage their most precious asset - their proprietary data - to drive business value.

As the summit continues this week, the Cloud Shuttle team look forward to bringing you more updates and insights. Stay tuned!