Snowflake Summit 2024: The Future of AI and Data Integration
Last week, the annual Snowflake Summit took place at the Moscone Center in San Francisco. Snowflake hosts this yearly, multi-day conference to showcase new features through keynotes, to give partners a place to share their Snowflake experiences in breakout sessions, and to offer learning labs where developers can level up their data integration skills.
Snowflake Summit 2023 led with AI news, featuring the announcement of Snowpark Container Services and the Snowflake/NVIDIA partnership, which allows users to run workloads on NVIDIA graphics processing units (GPUs). This year brought more AI announcements, but the focus shifted to AI management and use cases, with updates to Cortex, AI development support and features, and the Polaris Catalog. Also, Snowflake announced the key feature that proves they’re now a real developer’s tool: Dark Mode!
Snowflake shared that Cortex Analyst and Cortex Search now let you quickly build chatbots on structured and unstructured data. For Cortex Analyst, Snowflake is leveraging Meta’s Llama 3 and Mistral Large models, while Cortex Search uses Snowflake Arctic embed and Neeva to build on top of flat and text data. Snowflake also announced Document AI, which uses Snowflake Arctic-TILT, their multimodal LLM for visual document question answering. These announcements are a huge deal for companies with data sets they want to ask questions about or use to retrieve specific information.
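To make the "retrieve specific information" idea concrete, here is a minimal sketch of the retrieval step behind a document chatbot. It is not Cortex Search and does not use Arctic embed: it swaps in a simple bag-of-words cosine similarity using only the Python standard library. The documents and the names `tokenize`, `cosine_similarity`, and `search` are all hypothetical.

```python
from __future__ import annotations

import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, documents: dict[str, str], top_k: int = 1) -> list[tuple[str, float]]:
    """Rank documents by similarity to the query and return the top_k matches."""
    query_vec = Counter(tokenize(query))
    scored = [
        (doc_id, cosine_similarity(query_vec, Counter(tokenize(text))))
        for doc_id, text in documents.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical unstructured documents (assumption: not from the Summit).
DOCUMENTS = {
    "refund_policy": "Customers may request a refund within 30 days of purchase.",
    "shipping_faq": "Standard shipping takes five business days within the country.",
    "warranty": "The warranty covers hardware defects for two years.",
}

results = search("How many days do I have to request a refund?", DOCUMENTS)
best_doc_id, best_score = results[0]
print(best_doc_id, round(best_score, 3))
```

A production system would replace the word counts with learned embeddings and pass the retrieved text to an LLM to phrase the answer, but the rank-then-answer shape stays the same.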
At this year’s Summit, Snowflake confirmed they’re continuing their cross-vendor partnerships with Google, Meta, Mistral AI, and Reka through Snowflake AI & ML Studio, a no-code user interface for organizations beginning their AI journey. This tool lets you build and test models and compare them on business-relevant measures like cost and performance.
In other AI advancements, Snowflake announced their Notebook interface. Notebooks come with integrations to Snowpark ML, Streamlit, and Cortex AI, along with all the functions of the Snowflake platform, giving users one place to develop in Python, SQL, and Markdown. Snowflake also revealed their feature store and ML Lineage, which allow data scientists and machine learning (ML) engineers to maintain consistent ML features and trace the usage of datasets, features, and models. Rounding out the additions to Snowpark, Snowflake added a pandas API to their library of tools, so more model development can happen in a syntax commonly used across other platforms.
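For readers new to lineage, tracing "datasets, features, and models" comes down to walking a dependency graph. The sketch below is a toy illustration in plain Python, not Snowflake's ML Lineage API. The `LineageGraph` class and the node names are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


class LineageGraph:
    """Toy lineage tracker: nodes are artifact names, edges point downstream."""

    def __init__(self) -> None:
        self._downstream: dict[str, set[str]] = defaultdict(set)

    def add_edge(self, source: str, target: str) -> None:
        """Record that `target` was derived from `source`."""
        self._downstream[source].add(target)

    def downstream(self, node: str) -> set[str]:
        """Return every artifact that directly or indirectly depends on `node`."""
        seen: set[str] = set()
        stack = [node]
        while stack:
            current = stack.pop()
            # .get avoids creating empty entries for leaf nodes.
            for child in self._downstream.get(current, ()):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen


# Hypothetical artifacts: two datasets feed two features, which feed one model.
lineage = LineageGraph()
lineage.add_edge("dataset:orders", "feature:avg_order_value")
lineage.add_edge("dataset:customers", "feature:tenure_days")
lineage.add_edge("feature:avg_order_value", "model:churn_v1")
lineage.add_edge("feature:tenure_days", "model:churn_v1")

impacted = lineage.downstream("dataset:orders")
print(sorted(impacted))
```

The practical payoff is impact analysis: when an upstream dataset changes or breaks, the same traversal tells you which features and models need a second look.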
The Snowflake Summit 2024 announcement I was most excited by is the Polaris Catalog. This is a vendor-neutral, open catalog implementation for Apache Iceberg, an open-source format for analytics tables that has become the go-to table format for data lakes. Snowflake’s decision to go open source allows for interoperability with key cloud platforms and tools like Amazon, Azure, Confluent, Dremio, Google, and Salesforce.
In January 2022, Snowflake announced the ability to integrate Iceberg tables with their external tables so users could read Iceberg datasets. However, this fell short: any time a user wanted to store augmented data from Iceberg, it would be written as a Snowflake Native Table. Snowflake’s managed Iceberg provides large efficiency gains. Hopefully, the Polaris Catalog will unlock these efficiencies, since Snowflake provides end-to-end tooling for analytical updates to the data and a valuable partnership for other tools to create actions and insights from the data.
While the keynotes were proudly parading AI, the mood on the vendor floor was split between practitioner curiosity and vendor innovation. On the practitioner side, there was excitement about how enterprise organizations can leverage AI. This was not lost on Snowflake CEO Sridhar Ramaswamy, who said: “But, here’s the issue. The bar for AI use in enterprise is much higher than in consumer AI. Consumer AI is not ready for business use.”
Even in organizations with a lower threshold for acceptance, effective AI rollout is a mystery. Two AI journeys are emerging: developing traditional AI use cases, and building applications on large language models (LLMs) that create efficiencies. Snowflake’s keynote announcements underlined both. Organizations looking to develop and roll out AI use cases must have strong data governance. Tightly governed datasets allow organizations to trust that their models will provide repeatable and usable outputs, while also letting them monitor the efficacy of their models and the cost of running them. Using LLMs to create chatbots helps answer questions that in the past would’ve meant digging through multiple locations and skimming PDF and Word documents.
Most organizations I talked to feel stymied by the effort needed to maintain AI pipelines and keep up with all the new AI products that launch. Snowflake’s partnerships with companies like Meta show they want this to be a cohesive universe where their customers can leverage developments in the space. And the space is moving at lightning speed. GPT-4 was released in March 2023, while GPT-4o, released in May 2024, is twice as fast and half as expensive. The constant developments, though hard to follow, likely mean great news for customers looking to use these platforms instead of building their own.
Vendors are also seizing the AI day. Every vendor booth worked the famous two letters into its marketing copy, with the bleeding-edge technologists launching their own copilots to help teams develop faster. The most enlightening new vendors are those getting from text to insights as fast as possible. Some of these solutions are built on schema comprehension, and others involve reviewing query maps and suggesting optimization strategies.
As I left Snowflake Summit 2024, it felt like the work world I’d be reentering would never be the same. After some needed sleep and tuning back into work this week, I’m still focused on the same key problems as before: “this pipeline is broken”, “this metric doesn’t look right to me”, “I’ve got a quick ask to get data from this esoteric system with no API documentation”. The big difference is that I know these challenges are turbulence on the path to building a leading data platform that informs and drives impact for customers across my organization.
Want to know more about how these exciting announcements at Snowflake Summit 2024 can benefit your organization? Reach out. Our team of experts would be happy to chat more with you about Snowflake’s latest features and functionality.