Oakland

Is Data Mesh a potential Data Mess?

Since we re-emerged from lockdown, the Data & Analytics industry has seen a massive boost in investment as businesses accelerate their data agendas, on both technical and non-technical fronts, to ensure they have the best digital presence. Throw in the emergence of Data Mesh as a potentially new and better way of organising, processing, and delivering data to organisations, and you have all the ingredients for a perfect data storm.

Over the last few weeks, Data Mesh has been the hottest topic at the data events I have attended. Interestingly, though, awareness and understanding vary considerably, even amongst data leaders. At one end are those who know something about Data Mesh, typically because they have read or watched the excellent Zhamak Dehghani of Thoughtworks (https://www.youtube.com/watch?v=L_-fHo0ZkAo) set out her rationale and vision. Then there are those who have heard the term but know little or nothing beyond that, and those still in blissful ignorance that a new “data kid” is arriving on the block! Finally, there are those who still wonder whether Data Mesh is the same as Data Fabric.

So it seems there is an awful lot of talk about the theory of Data Mesh, but much less visibility and debate about what is actually happening in practice in companies and the data teams themselves.

Taking the theory first, Data Mesh is now starting to be better understood. We know that there are four high-level foundational principles, all of which make sense.

1). Domain-driven data ownership

2). Data as a product

3). Self-serve infrastructure as a platform

4). Federated governance

That said, we have been talking about these four things, in whole or in part, on and off and in different guises, for the last 20 years as suggested data best practice for an organisation. Data Mesh shows how they can all be brought together.

So we have four very sensible principles that few who currently work in data would violently disagree with. Overlay on these the view that our existing data technologies, particularly the Data Warehouse and the Data Lake, have not delivered on their original expectations. Dehghani feels that these solutions have failed in their objectives, which is perhaps a harsh statement given the many millions spent, and still being spent, on them across the globe. The criticisms are that they have not created the single version of the truth they were intended to, and that they have become centralised, monolithic structures that incur significant costs to build and maintain rather than the agile solutions they were meant to be. They have also created siloed ways of working and had limited success in providing strong, fresh data insights to business users. These criticisms make it easy to argue for a new technical data solution. And lest we forget, many companies have struggled, and continue to struggle, with the non-technical elements of managing data, such as Governance, Literacy, and Culture. Against that backdrop, Data Mesh does seem to offer a way forward.

However, as Alexander Pope famously wrote, “a little learning is a dangerous thing” (more often quoted as “a little knowledge”). As a data expert, you can immediately sense the same thing happening with Data Mesh in the UK data community. A few of us have read the book and seen the videos, but we have very little practical experience of it, or of whether it can address all the criticisms levelled at the current data landscape.

For example, when you start to scratch beneath the surface, it isn’t easy to find any UK organisation doing Data Mesh in any meaningful way. The companies that are doing it seem to be based in the US rather than the UK. And of those that claim to be doing it, a few are calling what they do Data Mesh (e.g. Netflix) when it isn’t really, while a few others have implemented such a tiny element of it that it is a token gesture hardly warranting discussion.

So if nothing else, Data Mesh is creating a lot of conversation. There is clearly much Data Mesh talk happening here and overseas within the technical data teams and less within the non-technical teams, which is understandable. The excellent Data Mesh Users Slack channel has been created and has helped to flush out some excellent discussion and debate on key topic areas. But are we still at risk of rushing the data community into focusing on a solution that the vast majority of companies are not yet ready for?

Maybe we can learn from those controlling the budgets at the top table: we need to hold our nerve before we rush to make strategic and tactical business cases for Data Mesh. Before we commit our limited budgets and resources, we need to be convinced of its credentials and impact.

Data Warehousing and Data Lakes are far from perfect, but they have served us well thus far in collecting, hosting, processing, analysing, reporting, and storing data. Are we really sure that focusing on Data Mesh will remove all of the ongoing data challenges we continue to face every day, like Data Governance and Data Literacy? Undoubtedly, it offers a way forward and might ease some of the problems in certain areas, but I suspect the underlying data issues will persist regardless.

For sure, Data Mesh is likely to be here to stay. Indeed, it will undoubtedly find a place in our data technology toolkit at some point, as understanding and knowledge of it mature and some companies decide to take the plunge and invest in what is still an unproven solution.

But let’s, as a community of data professionals, not make a mess of it before we have even started. Let’s make sure we understand it and see its practical application first.

Andrew Sharp is the Oakland Group Data Governance Lead