Data mesh and the Snowflake World Tour

Andreas Bowander, Data Consultant, Solita

Published 03 Nov 2025

Reading time 9 min

The Snowflake World Tour came to Stockholm this October, offering a full day of insightful presentations from Snowflake, their customers, and partners. We recently renewed our highest-level Snowflake Elite Partner status, were honoured with two awards at the Snowflake Partner Awards the evening prior, and, of course, took part as an exhibitor. I was fortunate enough to attend. With four parallel presentations across seven breakout session slots, it was impossible to cover everything, so my goal was to learn as much as possible about how companies are adopting data mesh with Snowflake’s help.

Decentralisation is a term that may linger with you after hearing about this sociotechnical framework, as one of its principles emphasises that business domains should own the data they produce and consume. Looking for data mesh insights at a single-vendor conference, for something as historically centralised as the data warehouse, might therefore seem odd, although Snowflake would argue that its platform unifies rather than centralises. Even so, I was surprised that more than half of the sessions included a presentation on data mesh!

What is data mesh?

Let’s explore what data mesh is before digging into the takeaways from the presentations. The concept was introduced by Zhamak Dehghani in 2019 with the blog post How to move beyond a monolithic data lake to a distributed data mesh and later formalised thoroughly in her 2022 book Data Mesh: Delivering Data-Driven Value at Scale. There it is classified as a sociotechnical paradigm with four interrelated principles at its core.

Principle of domain ownership

Domain-driven data ownership makes the domains responsible for the correctness of the data they produce. The transformation needed to curate the data, i.e. the data pipeline, moves into the domain. This way, the data stays aligned with its source while remaining readily available for potential aggregation and cross-domain use.

Principle of data as a product

To increase the usability of the domain data and mitigate the silo effect that could come with decentralisation, product thinking is applied to the modelling and sharing of data. The data product is the most prominent concept to come out of the principles, and several standardisation efforts are ongoing, e.g. the Open Data Product Standard (ODPS) by Bitol and the Data Product Ontology (DPROD) by EKGF. Dehghani laid out eight characteristics as the baseline for a data product to be considered useful (a sketch of what a minimal product descriptor might look like follows the list), namely

  1. Discoverable: Easily explorable by data users
  2. Addressable: Has a distinct identifier serving as entry to all associated information, including documentation and service-level objectives
  3. Understandable: Clearly communicates the entities it encapsulates, their relationships, and adjacent data products
  4. Trustworthy and truthful: Narrows the gap between what users expect of the data and what it actually delivers, fostering trust in the data’s reliability
  5. Natively accessible: Enables various data users to access and read its data using their preferred access methods
  6. Interoperable: Follows standards and harmonisation rules for seamless cross-domain data linking
  7. Valuable on its own: Contains a dataset that holds intrinsic value independently and offers inherent business and customer value
  8. Secure: Ensures secure access with confidentiality-preserving measures
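
To make the characteristics above a little more tangible, here is a minimal sketch of what a data product descriptor could look like, loosely inspired by the standardisation efforts mentioned earlier. Every field name and value is a hypothetical illustration, not taken from ODPS or DPROD.

```python
# Hypothetical data product descriptor; field names are illustrative only,
# not drawn from any published standard.
data_product = {
    "id": "sales.orders.v1",                   # addressable: one distinct identifier
    "domain": "sales",                         # domain ownership
    "owner": "sales-data-team@example.com",
    "description": "Curated order facts, one row per confirmed order.",  # understandable
    "output_ports": [                          # natively accessible
        {"type": "snowflake_view", "name": "SALES_PROD.PRODUCTS.ORDERS_V1"},
    ],
    "slo": {"freshness_hours": 24, "availability": "99.5%"},  # trustworthy and truthful
    "tags": {"sensitivity": "internal", "data_category": "transactional"},  # interoperable, secure
}
```

Publishing such a descriptor to a catalogue is, in essence, what makes the product discoverable.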

Principle of the self-serve data platform

The self-serve platform hosts domain-agnostic infrastructure capabilities built and maintained by a platform team. Existing technologies can be leveraged, provided the platform delivers the characteristics that differentiate a data mesh platform:

  • Serving autonomous domain-oriented teams: The platform must allow teams within the domain to build, share and use data products autonomously in an end-to-end fashion while not being dependent on centralised data teams.
  • Managing autonomous and interoperable data products: The data products coming out of the domain must also be autonomous, upholding data product characteristics while being able to interconnect with other data products in the mesh without intermediate centralised assistance.
  • A continuous platform of operational and analytical capabilities: Domain ownership requires a platform enabling autonomous teams to manage data end-to-end, bridging operational and analytical planes. A data mesh platform must deliver a connected user experience for both application development and data product usage.
  • Designed for a generalist majority: Adopts open conventions for tech interoperability, empowering generalists with accessible tools to drive scalable data initiatives.
  • Favouring decentralised technologies: Data mesh emphasises decentralisation via domain ownership to prevent synchronisation bottlenecks and accelerate change. An effective self-serve platform balances centralised resource management with independent team autonomy for end-to-end data sharing, control, and governance.
  • Domain agnostic: Traditionally, there is often no clear delineation between the team preparing data for analytical use and the team maintaining the infrastructure that supports it. The self-serve platform should offer domain-agnostic capabilities while still allowing for capabilities that not all domains need.

Principle of federated computational governance

For data mesh to function as an ecosystem, domains must adhere to global standards and policies while retaining their autonomy. The key components for achieving federated computational governance are systems thinking and computational policies, combined with a federated operating model. A cross-functional team composed of domain data product owners and representatives from various parts of the organisation, such as legal, compliance, security, and platform, sets guiding values from which global policies are derived. These policies are implemented via the self-serve platform to enable both interoperability between and governance of data products.
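
As a minimal sketch of what a computational policy could look like on Snowflake, assuming hypothetical object names and placeholder credentials: the governance team defines a masking policy once, and domain teams attach it to their own data products.

```python
import snowflake.connector

# Placeholder connection details; substitute your own account and authentication.
cur = snowflake.connector.connect(
    account="myorg-myaccount", user="governance_bot", password="...", role="GOVERNANCE_ADMIN"
).cursor()

# A global policy, defined once by the federated governance team: e-mail
# addresses are readable only for roles explicitly granted access.
cur.execute("""
    CREATE MASKING POLICY IF NOT EXISTS governance.policies.mask_email
      AS (val STRING) RETURNS STRING ->
      CASE WHEN CURRENT_ROLE() IN ('PII_READER') THEN val ELSE '*** MASKED ***' END
""")

# A domain team then attaches the shared policy to a column in its data product.
cur.execute("""
    ALTER TABLE sales_prod.products.orders_v1
      MODIFY COLUMN customer_email SET MASKING POLICY governance.policies.mask_email
""")
```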

Data mesh on Snowflake case studies

Since the principles are intertwined, we’ll consider them collectively rather than separately as we examine how some of Snowflake’s customers, namely those who presented on the topic at Snowflake World Tour Stockholm, have adopted data mesh. We’ll provide an overview rather than detailing individual efforts.

All companies that presented on data mesh have shifted towards empowering domains to take ownership of their data, and all have integrated Snowflake into their self-serve platforms. The domains enjoy varying degrees of autonomy, scaling up and down as needed, bridging operational and analytical planes, and receiving assistance from the platform team when required competencies are missing. In one instance, a central team created group-wide data products alongside the domains; however, this role will diminish as domain competencies grow.

At several companies, resource provisioning in the data platform, such as databases and schemas, is managed via configuration files in YAML format, abstracting away the infrastructure aspect of their self-serve platforms.
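
The sketch below shows what such configuration-driven provisioning could look like. The YAML keys, object names, and sizing are invented for illustration; a real setup would add access control, guardrails, and error handling.

```python
import yaml
import snowflake.connector

# Hypothetical workspace definition a domain team might keep in version control.
WORKSPACE_YAML = """
domain: sales
environments: [dev, test, prod]
warehouse_size: XSMALL
"""

cfg = yaml.safe_load(WORKSPACE_YAML)

# Placeholder connection details; substitute your own account and authentication.
cur = snowflake.connector.connect(
    account="myorg-myaccount", user="provisioner", password="...", role="SYSADMIN"
).cursor()

# One database per domain and environment, plus a right-sized warehouse.
for env in cfg["environments"]:
    cur.execute(f"CREATE DATABASE IF NOT EXISTS {cfg['domain']}_{env}")
cur.execute(
    f"CREATE WAREHOUSE IF NOT EXISTS {cfg['domain']}_wh "
    f"WITH WAREHOUSE_SIZE = '{cfg['warehouse_size']}' AUTO_SUSPEND = 60 AUTO_RESUME = TRUE"
)
```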

Several presenters emphasised treating source-aligned data products as in-domain master data. All companies made their data products discoverable through catalogues, either third-party alternatives or Snowflake’s Internal Marketplace and Direct Sharing. Consumer-aligned and aggregated data products were also published, with the understanding that data products can build upon each other to open up new possibilities.

Upon publication, data products are considered contractual, putting producers and consumers on an equal footing regarding expectations. Centralised compliance and best practices are enforced within domains through a shared information model. As the mesh grows, opportunities for cross-domain analysis will increase, overcoming silos. Interoperability between both data products and domains is crucial, and the rise of the agentic mesh further highlights the importance of well-understood data contracts.

Object tagging in Snowflake can be leveraged to make data products discoverable and to explore the mesh (e.g. which products derive from a given source table). Tags enable cross-domain lineage tracking to prevent duplication of data products, and they aid governance by keeping track of policy relevance and enforcing legal obligations.
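
As a sketch of such a lookup, assuming a hypothetical DATA_PRODUCT tag has already been applied across the mesh, Snowflake’s ACCOUNT_USAGE.TAG_REFERENCES view lists every object carrying a given tag value:

```python
import snowflake.connector

# Placeholder connection details; substitute your own account and authentication.
cur = snowflake.connector.connect(
    account="myorg-myaccount", user="analyst", password="...", role="GOVERNANCE_ADMIN"
).cursor()

# List every object tagged as part of a hypothetical data product, e.g. to
# spot duplicated products or check policy coverage across domains.
cur.execute("""
    SELECT object_database, object_schema, object_name,
           domain  -- Snowflake object type (TABLE, COLUMN, ...), not a mesh domain
    FROM snowflake.account_usage.tag_references
    WHERE tag_name = 'DATA_PRODUCT' AND tag_value = 'sales.orders.v1'
""")
for row in cur.fetchall():
    print(row)
```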

Snowflake services that help knit the mesh

At the intersection of the presentations and the whitepaper How to knit your data mesh on Snowflake, we find several Snowflake capabilities worth elaborating on. While the previously mentioned resources are tech-agnostic, Snowflake’s material naturally focuses on the technical and architectural support its platform provides; that said, it also does a good job of sorting out the principles of data mesh.

Organisational topology

Using Snowflake to abstract away infrastructure provisioning was successfully achieved in more than one case. Building this solely on your own, however, risks recreating services Snowflake already provides. Whether you build or buy, from Snowflake or a third party, an interconnected mesh is achievable either way given proper federation of governance. Organising data mesh domains in Snowflake can be done either at the account level or more granularly as databases or schemas; the latter seemed to be the most popular choice among the presenting companies. The appropriate level for domains in the platform architecture may well depend on how the company is organised. As often discussed in data mesh discourse and reiterated in the whitepaper, Conway’s law states:

Organisations which design systems are constrained to produce designs which are copies of the communication structures of these organisations.

Acknowledging this fact, it is crucial to ensure that both domains and data products are interoperable across the mesh and to avoid creating organisational silos. Snowflake recommends a common data product layer for harmonisation when sharing data products across a mesh of heterogeneous technologies. Not all data from other systems and technologies needs to be replicated into Snowflake, but enough must be replicated in order for the self-serve platform of the ecosystem to maintain consistent governance, access control, and cross-domain interoperability.

Figure 1: Heterogeneous domains, from the whitepaper How to knit your data mesh on Snowflake

Elastic compute

Snowflake’s architecture separates storage and compute, allowing them to scale independently. This flexibility enables elastic scaling of compute resources up or down based on the demands of your data workloads and applications.
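
A small sketch of that elasticity in practice, using a hypothetical warehouse name: compute can be resized around a heavy workload without any change to storage.

```python
import snowflake.connector

# Placeholder connection details; substitute your own account and authentication.
cur = snowflake.connector.connect(
    account="myorg-myaccount", user="platform_bot", password="...", role="SYSADMIN"
).cursor()

# Scale compute up for a demanding batch window...
cur.execute("ALTER WAREHOUSE sales_wh SET WAREHOUSE_SIZE = 'LARGE'")
# ...run the heavy workload here...
# ...then scale back down; the stored data is unaffected either way.
cur.execute("ALTER WAREHOUSE sales_wh SET WAREHOUSE_SIZE = 'XSMALL'")
```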

Snowflake Horizon Catalog

Horizon Catalog breaks down data silos by offering a unified, global repository for structured, semi-structured, and unstructured data, models, apps, and listings across clouds and regions. Its capabilities include:

  • Security: robust network security, identity management, continuous risk monitoring, and centralised role-based access control (RBAC)
  • Compliance: sensitive data detection tools, granular authorisation policies, data quality monitoring, and data lineage visualisation
  • Privacy: synthetic data generation, differential privacy policies, and Snowflake Data Clean Rooms for analytics collaboration
  • Discovery and collaboration: seamless content discovery regardless of format or location, secure collaboration without data movement, and integration with the Snowflake Marketplace
  • AI and openness: automation powered by Snowflake Cortex AI, plus integration with Apache Polaris-compatible open catalogues, enabling global discovery and sharing while maintaining rigorous governance and security across regions and clouds

The Horizon Catalog had its own presentation, titled “Governance and security for data and AI with Horizon Catalog”, and is clearly under active development.

Snowflake Internal Marketplace

The Internal Marketplace, like the public Snowflake Marketplace, enables easy sharing of curated data within your organisation. By leveraging existing role-based access control, it keeps data accessible and secure for authorised users. Data products are made discoverable through listings for cross-domain use, and access and usage statistics can be monitored. Accompanying metadata and sample data make published data products more understandable.
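
Listings build on Snowflake’s data sharing primitives. As a rough illustration, with all object and account names invented, directly sharing a data product view between accounts looks roughly like this; note that only secure views can be shared.

```python
import snowflake.connector

# Placeholder connection details; substitute your own account and authentication.
cur = snowflake.connector.connect(
    account="myorg-myaccount", user="sales_owner", password="...", role="ACCOUNTADMIN"
).cursor()

# Package a curated data product for sharing; orders_v1 must be a secure view.
cur.execute("CREATE SHARE orders_v1_share")
cur.execute("GRANT USAGE ON DATABASE sales_prod TO SHARE orders_v1_share")
cur.execute("GRANT USAGE ON SCHEMA sales_prod.products TO SHARE orders_v1_share")
cur.execute("GRANT SELECT ON VIEW sales_prod.products.orders_v1 TO SHARE orders_v1_share")

# Make the share available to a consuming account in the organisation.
cur.execute("ALTER SHARE orders_v1_share ADD ACCOUNTS = myorg.consumer_account")
```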

Object tagging

Snowflake allows tagging nearly any object with customisable key-value pairs for discovery, tracking, access control, monitoring, and auditing. Tags are inherited down the object hierarchy and enable automatic application of tag-based access restrictions. Data product owners can use tags to annotate objects such as schemas, tables, or columns with metadata, for instance sensitivity levels or data categories. Users can create, modify, or delete custom tags, optionally specifying allowed values. In a data mesh context, federated governance involves defining global standards for tags and policies, with domain owners responsible for applying these to their data products.
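
A minimal sketch of that division of responsibility, with hypothetical names: governance defines a tag with a restricted value set, and a domain owner applies it to a column of their data product.

```python
import snowflake.connector

# Placeholder connection details; substitute your own account and authentication.
cur = snowflake.connector.connect(
    account="myorg-myaccount", user="domain_owner", password="...", role="SALES_ADMIN"
).cursor()

# Governance defines a tag whose values are restricted to an agreed set.
cur.execute("""
    CREATE TAG IF NOT EXISTS governance.tags.sensitivity
      ALLOWED_VALUES 'public', 'internal', 'confidential'
""")

# The domain owner annotates a column of their data product with it.
cur.execute("""
    ALTER TABLE sales_prod.products.orders_v1
      MODIFY COLUMN customer_email SET TAG governance.tags.sensitivity = 'confidential'
""")
```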

Below is a table from the whitepaper summarising relevant Snowflake capabilities by data mesh principle.

Figure 2: Table summarising relevant Snowflake capabilities by data mesh principle

“AI is a data product”

I can’t recall who wrote the line above, but I read it recently in a comment thread, and it resonated with me. It has long been recognised in the data science community, well before generative AI, that model performance is highly dependent on the data a model is fed. It comes as no surprise that the age-old ‘garbage in, garbage out’ principle still applies. The suitability of the data foundation upstream of any AI model is therefore critical, whether the model is consumed by chatting with your data through a language model or through the predictions and classifications of machine learning algorithms.

All the companies presenting on data mesh have explored generative AI to varying degrees. The most detailed example showed how one company used Cortex AI, together with a model trained on existing tags, to assist in table tagging. Snowflake presented on leveraging AI SQL functions across diverse data types, e.g. unstructured text from PDFs or images, and described how semantic views provide context for conversational analytics. For those interested, further details can be found in the respective documentation.
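
As a minimal, hedged example of calling an LLM from SQL (the table, columns, and model choice are illustrative; check which models are available in your region), SNOWFLAKE.CORTEX.COMPLETE can be invoked inline:

```python
import snowflake.connector

# Placeholder connection details; substitute your own account and authentication.
cur = snowflake.connector.connect(
    account="myorg-myaccount", user="analyst", password="...", role="ANALYST"
).cursor()

# Label free-text feedback with an LLM, straight from a SQL query.
cur.execute("""
    SELECT feedback_id,
           SNOWFLAKE.CORTEX.COMPLETE(
             'mistral-large',
             'Answer with one word (positive, neutral or negative): ' || feedback_text
           ) AS sentiment_label
    FROM sales_prod.products.feedback
    LIMIT 10
""")
print(cur.fetchall())
```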

Figure 3: Semantic models, from the presentation “Advanced analytics with Cortex AISQL, native semantic views, and more”

Our experience

Apart from Dehghani’s written work, I also recommend our own learnings on data mesh, particularly in video format, through our free Crash Course on Data mesh. We have helped customers set up distributed Snowflake resource provisioning, scaling to more than a hundred teams, each working individually in their own multi-stage deployment environments. These workspaces are created automatically, access-controlled, monitored for observability of cost, performance, and more, and configured according to set guardrails and governance practices. Customers can onboard easily, either through a user interface or programmatically, and workspace creation supports integration with self-service portals such as ServiceNow. Other tools and services, whether integrated with Snowflake or not, can be automated in the same way.

If you’re considering exploring the adoption of data mesh or using Snowflake, please don’t hesitate to get in touch!
