Data Mesh – Definition & Overview

What is data mesh?

Data mesh is an enterprise data management framework that defines how to manage business-domain-specific data so that business domains can own and operate their data. It empowers domain-specific data producers and consumers to collect, store, and analyze data, and to manage their own data pipelines, without the need for an intermediary data-management team.

Data mesh has its origins in distributed computing, where software components are shared among multiple computers running together as a system. With data mesh, the ownership of data is distributed across different business domains, and each domain is responsible for creating its own data products. Because the people closest to the data manage it, data mesh also makes it easier to put data in context and generate deeper insights, while encouraging closer collaboration among domain owners to create solutions tailored to specific business needs.


How is data mesh defined?

Data mesh is a data platform architecture design approach for implementing a decentralized, distributed architecture for data analytics and data sharing.

Data mesh is a decentralized sociotechnical approach to share, access, and manage analytical data in complex and large-scale environments, within or across organizations.

From “Data Mesh: Delivering Data-Driven Value at Scale” by Zhamak Dehghani, 2022

How does data mesh work? 

In a data mesh architecture, data remains stored across multiple domain-owned sources, and a data formation service makes each data product available as a permissioned table. The data owner may also create and expose APIs that other users can consume. A data catalog stores metadata, such as table names, columns, and user-defined tags, so that consumers can find the data products they need.
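As a rough illustration of the catalog and permissioned tables described above, here is a minimal in-memory sketch. The class names (`TableEntry`, `DataCatalog`), their fields, and the `sales.orders` example are all hypothetical and chosen only for this example; a real data mesh would rely on a managed catalog and access-control service rather than code like this.

```python
from dataclasses import dataclass, field


@dataclass
class TableEntry:
    """Catalog metadata for one data product table (hypothetical schema)."""
    name: str
    owner_domain: str
    columns: list
    tags: set = field(default_factory=set)
    allowed_readers: set = field(default_factory=set)


class DataCatalog:
    """Toy in-memory catalog: stores metadata and read permissions only."""

    def __init__(self):
        self._tables = {}

    def register(self, entry):
        # The owning domain publishes metadata about its data product.
        self._tables[entry.name] = entry

    def grant(self, table, consumer):
        # The owner grants a consumer read access to the table.
        self._tables[table].allowed_readers.add(consumer)

    def search_by_tag(self, tag):
        # Consumers discover data products through user-defined tags.
        return sorted(t.name for t in self._tables.values() if tag in t.tags)

    def can_read(self, table, consumer):
        entry = self._tables.get(table)
        return entry is not None and consumer in entry.allowed_readers


catalog = DataCatalog()
catalog.register(TableEntry(
    name="sales.orders",
    owner_domain="sales",
    columns=["order_id", "customer_id", "total"],
    tags={"orders", "revenue"},
))
catalog.grant("sales.orders", "finance")
```

In this sketch, the finance domain can find `sales.orders` by searching for the `revenue` tag and is allowed to read it, while a domain that has not been granted access is refused.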

What are the data mesh principles?

Data mesh rests on four principles: decentralization via domain ownership, data as a product, a self-serve data platform, and federated computational governance. Together, the four principles define data mesh, and they help companies obtain the value from data and the architectural agility they seek as they grow.

Data mesh principle #1: domain ownership

This principle describes the decentralization of data ownership, i.e., responsibility for the data, to the business domains that are closest to it. Essentially, business domains, rather than a centralized IT function, own their data. However, IT may still play a role in helping business domains harness the power of their data. Domain ownership is critical for companies to scale and to avoid the bottlenecks of a centralized data flow structure.

Data mesh principle #2: data as a product

In a decentralized, domain-owned (or domain-oriented) structure, each domain shares its data with other users and consumers interested in it. Examples of data as a product include a data set for analytics or data for a delivered service. Domain owners may share data as they see fit to produce a desired business outcome. At a minimum, a data product should be discoverable, addressable, understandable, trustworthy, truthful, and secure.
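One way to make these minimum characteristics concrete is to check a data product descriptor against them before it is published. The sketch below is only an assumption about how that might look: the `DataProduct` fields, the mapping from each field to a characteristic, and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor; each field backs one characteristic."""
    name: str
    description: str             # discoverable: consumers can find it
    address: str                 # addressable: a stable location
    column_docs: dict            # understandable: documented columns
    quality_checks_passed: bool  # trustworthy and truthful
    access_policy: str           # secure: who may read it


def missing_characteristics(product):
    """Return the characteristics the descriptor does not yet satisfy."""
    checks = {
        "discoverable": bool(product.description.strip()),
        "addressable": bool(product.address.strip()),
        "understandable": bool(product.column_docs)
        and all(doc.strip() for doc in product.column_docs.values()),
        "trustworthy and truthful": product.quality_checks_passed,
        "secure": bool(product.access_policy.strip()),
    }
    return [name for name, ok in checks.items() if not ok]


orders = DataProduct(
    name="sales.orders",
    description="One row per confirmed customer order.",
    address="mesh://sales/orders",  # made-up address scheme
    column_docs={"order_id": "Unique order key", "total": "Order total"},
    quality_checks_passed=True,
    access_policy="",  # not yet defined
)

gaps = missing_characteristics(orders)
```

Here `gaps` contains only `"secure"`, because the example product has no access policy yet; a domain team could block publication until the list is empty.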

Data mesh principle #3: self-serve data platform

To deliver data as a product and share it with others, business domains must be empowered to do so. The goal of self-service is to remove friction from the end-to-end data journey, from source to consumption. Business domains or individual data owners are then in a position to develop and enhance the data and to define the terms under which it is shared. Platform infrastructure capabilities and automated governance policies make self-service possible.

Data mesh principle #4: federated computational governance

This broad principle spells out the data governance operating model, which is based on federated decision-making and accountability and covers security, legal, and compliance policies, among others. Motivations for this principle include the desire to attain higher-order value from aggregated data and to counter the potential undesirable consequences of a domain-oriented, decentralized infrastructure.
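The "computational" part of this principle is often read as expressing governance policies as code that runs automatically against every domain's data products. The following is a minimal sketch under that assumption; the policy functions, the descriptor layout, and the `marketing.leads` example are hypothetical and not taken from any specific platform.

```python
# Each policy inspects one data product descriptor (a plain dict here)
# and returns a list of violation messages; an empty list means it passes.

def require_owner(product):
    if product.get("owner"):
        return []
    return [f"{product['name']}: no owning domain declared"]


def pii_must_be_masked(product):
    return [
        f"{product['name']}: PII column '{col}' is not masked"
        for col, meta in product.get("columns", {}).items()
        if meta.get("pii") and not meta.get("masked")
    ]


# Policies agreed on by the federated governance group, applied to all domains.
GLOBAL_POLICIES = [require_owner, pii_must_be_masked]


def evaluate(product, policies=GLOBAL_POLICIES):
    """Run every global policy and collect the violations."""
    violations = []
    for policy in policies:
        violations.extend(policy(product))
    return violations


leads = {
    "name": "marketing.leads",
    "owner": "marketing",
    "columns": {
        "email": {"pii": True, "masked": False},
        "campaign": {"pii": False, "masked": False},
    },
}

violations = evaluate(leads)
```

In this example, `violations` reports that the `email` column is unmasked PII. The domain keeps ownership of the data product, while the shared rules are enforced the same way for every domain.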

What are the benefits of data mesh?

  • Decentralizing data ownership and data operations so that business domains can make relevant decisions faster
  • Providing domain teams with the independence to choose the data technology stack that best meets their needs
  • Delivering transparency across cross-functional teams by reducing the likelihood of isolated data teams
  • Facilitating data sovereignty and data residency to ensure alignment with data governance regulations