If you haven’t had someone mention ‘Data Mesh’ to you yet, you probably will soon.
It’s a new approach to managing your data landscape, originally proposed by Zhamak Dehghani of ThoughtWorks. She’s publishing a book that explains the approach in detail, due out in early 2022. But if you’d like to start thinking about it now, read on…
The Data Mesh concept borrows heavily from modern application development methodologies. It recognises that there can be problems with a traditional Data Warehouse approach. In particular, in a large and complex organisation too much is expected of the central data team. They’re supposed to understand everything about the company, manage multiple requests simultaneously, be agile while still producing industrial-strength systems and, increasingly, integrate with downstream operational systems as well. That’s a lot to expect of any one team, and can lead to frustrating bottlenecks in the data management processes.
The solution proposed by Data Mesh is to organise your data into loosely coupled Domains, introduce a Product approach into your development, abstract away some of the infrastructure concerns and adopt a more federated approach to your data governance.
Let’s look at each of these in a little more detail:
Domains are the core concept of the Data Mesh. A Domain is a sensible, self-contained unit of data and services. Importantly, the Domain encompasses both Operational and Analytics concerns. This means that the same logical unit needs to handle transactional updates as well as support aggregate and historical requests (e.g. Add Customer vs Count Customers Added in the last year). Domains are accessed via APIs (though in practice this can mean presenting an SQL interface or even Flat File History as well as RESTful or GraphQL interfaces).
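To make the Add Customer vs Count Customers Added example concrete, here is a minimal sketch of a Domain that serves both kinds of request from one logical unit. The class name `CustomerDomain`, its methods and the in-memory storage are all hypothetical, chosen purely for illustration; a real Domain would sit on top of proper storage and expose this through an API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class CustomerDomain:
    """Hypothetical Domain owning both operational and analytical access."""

    # customer_id -> time the customer was added (in-memory stand-in for real storage)
    _added_at: dict[str, datetime] = field(default_factory=dict)

    def add_customer(self, customer_id: str, when: datetime) -> None:
        """Operational concern: a single transactional update."""
        if customer_id in self._added_at:
            raise ValueError(f"customer {customer_id} already exists")
        self._added_at[customer_id] = when

    def count_customers_added_since(self, since: datetime) -> int:
        """Analytical concern: an aggregate over the Domain's history."""
        return sum(1 for ts in self._added_at.values() if ts >= since)


# Usage: one customer added over a year ago, one added last month.
now = datetime(2021, 11, 1)
domain = CustomerDomain()
domain.add_customer("c-1", now - timedelta(days=400))
domain.add_customer("c-2", now - timedelta(days=30))
added_last_year = domain.count_customers_added_since(now - timedelta(days=365))
```

The point of the sketch is only that consumers ask the same Domain for both the update and the aggregate, rather than going to a separate warehouse team for the second.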
The benefit of the Domain approach is that the knowledge of exactly how the domain operates, what it’s for and who consumes its output is contained within one unit. There’s no need to seek out the appropriate SME when a change is required, as that expert is deeply embedded in the team; they just have a wider responsibility to the organisation now.
To maintain high quality and to ensure that a Domain is actually fit for purpose, Data Mesh proposes treating Domains as Data Products.
The product itself consists of Code, Data, Metadata and Infrastructure. The Domain Data Product Owner is responsible for evolving the product over time and is also accountable for its quality, which includes the trustworthiness of the data, the understandability of the documentation, the security approach and the performance of the services.
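One way to picture the four parts of a Data Product is as a simple descriptor that names its owner and lists what it ships. The sketch below is an assumption for illustration only: the `DataProduct` class, its field names and the `missing_components` helper are hypothetical and are not part of any Data Mesh specification.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor for a Domain treated as a Data Product."""

    name: str
    owner: str  # the accountable Domain Data Product Owner
    code: list[str] = field(default_factory=list)            # e.g. pipeline repositories
    data: list[str] = field(default_factory=list)            # e.g. published datasets
    metadata: dict[str, str] = field(default_factory=dict)   # e.g. documentation, schemas
    infrastructure: list[str] = field(default_factory=list)  # e.g. platform services used

    def missing_components(self) -> list[str]:
        """Return the names of any of the four parts that are still empty."""
        parts = {
            "code": self.code,
            "data": self.data,
            "metadata": self.metadata,
            "infrastructure": self.infrastructure,
        }
        return [part for part, value in parts.items() if not value]


# Usage: a product that has shipped data and code but no documentation yet.
customers = DataProduct(
    name="customers",
    owner="customer-team",
    code=["customer-pipelines"],
    data=["customers_daily"],
    infrastructure=["object-storage"],
)
gaps = customers.missing_components()
```

A check like this is one small thing an owner might automate; it says nothing about trustworthiness or performance, which need their own measures.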
Find out more about this in our Data Product Management Whitepaper.
To prevent each domain taking its own approach on technology, the Data Mesh proposes abstracting Infrastructure away from the Data Product.
In practice that should mean that a Domain Development Team can access scalable storage, compute and query capability, pipeline and orchestration services, monitoring and lineage without having to worry about the actual technical implementations under the covers. This is similar to how Virtual Machines are provisioned: you request an instance appropriate to the task at hand without really worrying about the underlying capacity.
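As a rough illustration of what abstracting Infrastructure away could look like from the Domain team's side, the sketch below has the team program against an interface and request a capability by size. Everything here is hypothetical: the `Storage` interface, the `provision_storage` function and the size names are assumptions, and the in-memory class simply stands in for whatever technology a platform team would really run.

```python
from typing import Protocol


class Storage(Protocol):
    """The only thing a Domain team sees: what storage can do, not how."""

    def write(self, key: str, payload: bytes) -> None: ...
    def read(self, key: str) -> bytes: ...


class InMemoryStorage:
    """Stand-in implementation; the platform team could swap this freely."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def write(self, key: str, payload: bytes) -> None:
        self._blobs[key] = payload

    def read(self, key: str) -> bytes:
        return self._blobs[key]


def provision_storage(size: str) -> Storage:
    """Hypothetical self-serve call: ask for a size, not for a technology."""
    if size not in {"small", "large"}:
        raise ValueError(f"unknown size: {size}")
    # In this sketch every size maps to the same in-memory stand-in.
    return InMemoryStorage()


# Usage: the Domain team never names the underlying implementation.
store = provision_storage("small")
store.write("orders/2021", b"order data")
payload = store.read("orders/2021")
```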
Data Governance is a difficult task in any architecture, but the Data Mesh has the potential to compound the issues. The proposed approach to resolve this is a balanced governance model that recognises both local and global concerns.
Adherence to standards is a responsibility of the Data Domain itself. The evolution of those standards is a wider concern but will be defined by the domain representatives. In practice this is likely to still require some central co-ordination, but the accountability for quality and security has been federated along with the data products themselves.
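The split between global standards and local accountability can be sketched as a shared set of checks that each Domain runs against its own product. The standards named below (`has_owner`, `has_documentation`, `pii_classified`) and the dictionary shape of a product are invented for illustration; in practice the domain representatives would agree what the real standards are.

```python
from typing import Callable

# A standard is a named check agreed globally by the domain representatives.
Standard = Callable[[dict], bool]

GLOBAL_STANDARDS: dict[str, Standard] = {
    "has_owner": lambda product: bool(product.get("owner")),
    "has_documentation": lambda product: bool(product.get("documentation")),
    # The product must state whether it holds PII, whatever the answer is.
    "pii_classified": lambda product: "pii" in product,
}


def failed_standards(product: dict) -> list[str]:
    """Run locally by each Domain: names of the global standards it fails."""
    return [name for name, check in GLOBAL_STANDARDS.items() if not check(product)]


# Usage: this product has an owner and a PII classification but empty documentation.
customer_product = {"owner": "customer-team", "documentation": "", "pii": False}
failures = failed_standards(customer_product)
```

The standards live in one shared place, but it is the Domain that runs them and is accountable for the result.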
Data Mesh is a new concept, and very bold in its ambition. We can clearly see some benefits to adopting this approach, but it’s not for the faint of heart. Just like a transition to Service Oriented Architecture, this leap is a major paradigm shift and is likely to be very disruptive to established ways of working and organisational structure.
If you’d like to discuss Data Mesh more, and find out whether it’s an appropriate solution for you, contact us at firstname.lastname@example.org and we’ll be happy to organise a session to discuss it in more detail.