Last time on the Data Knowledge Newsletter, we explored the many factors that have contributed to the problem of data asset entropy in the modern organization. From the explosion of tools to visualize and consume data, to the ever-growing centrality of analytics teams, data proliferation has made data more available than ever. However, those same factors have produced analytics disorder and sprawl, decreasing efficiency and trust across the organization when it comes to actually using that data.
Today we will examine what analytics teams have done to overcome these challenges in the past, and suggest a new model for dealing with entropy using the right tools for Data Asset Management.
Even if entropy has accelerated in recent years, it has always been a problem to address. Data teams often assume their choice of organizational model is a silver bullet for data asset entropy.
Unfortunately, one of the most common strategies for enabling an organization with data can be an accelerant of entropy and disorder: the self-service model. By enabling anyone in the organization to self-serve, operational teams are no longer reliant on tactical support from their data counterparts. This requires far less support from data teams and empowers end users to generate their own assets and insights. Done thoughtfully, it encourages agility and a high degree of individual impact. It can also, however, be a multiplier for entropy.
Entropy grows with every tool and every report or dashboard, and if users are building their own assets—sometimes even duplicating the work already done by other users or teams—sprawl is inevitable. As a result, your organization could end up with thousands of assets, with no one in charge of maintaining the vast majority.
The opposite approach, the centralized control model, is probably the most direct buffer against entropy. If the analytics team owns and manages everything, things naturally can’t get nearly as far out of control. The obvious problem here is that analytics teams then become gatekeepers for the organization’s data, and the level of individual empowerment to make data-directed decisions drops off for the rest of the organization. This also increases information asymmetries between teams.
A nuanced approach between self-service and control is where most innovative teams end up, but no working model in and of itself addresses the root causes of entropy.
Because the models themselves cannot solve entropy, most teams end up employing strategies to identify which 20% of high-profile assets are most useful to the most people and should therefore be curated and maintained, and which 80% are just noise and need to be managed out.
A more directed focus on the 20% has historically led data teams to implement strategies to manage and curate their most important assets:
The 80/20 approach necessitates understanding which data assets make up the less used and less valuable 80%, and how to take them to end of life. Common strategies include:
All of these solutions have a place in data management, but each has its own tradeoffs and shortcomings, and is unlikely on its own to solve the problem of entropy. It is clear that a more comprehensive solution is required.
"Without proper maintenance of those assets, without looking at a software development process or a product process too, and recognizing there is a cost to those things, you end up inevitably in a state of data chaos."
- Jamie Davidson, Co-founder, Omni
We would like to offer a somewhat different approach to solving this inevitable and fundamental problem of entropy: a data asset management solution like Workstream.io.
A data asset management solution is distinct from many of the other solutions we have discussed in this article—from data catalogs, to intranet solutions—in a number of ways. First, and perhaps most crucially, it is purpose built for the problem of entropy in your data environment. Rather than repurposing functionality from another tool, you are using the tool designed to do this work.
Second, a true data asset management solution is integrated with your modern data stack. It is not a new unique destination—and therefore does not contribute to the asset sprawl that is at the root of entropy. It is a single, integrated experience.
Finally, a data asset management solution solves the pain points experienced by both data consumers and data builders, addressing entropy as well as the knowledge asymmetries that both contribute to and result from entropy. These benefits accrue as a greater number of builders and consumers adopt and use the solution, as the breadth of available data and insights increases with every user.
The functionality required for a solution deployed to reduce sprawl in your data environment varies depending on the exact needs of your organization, but we have identified some key pillars that we feel are table stakes for any such solution.
Over the next few weeks we will be diving deeper into each of these pillars of data asset management. Each represents one facet of solving the ever-present problem of entropy. We hope you will continue to dig further into this subject with us as we look to unify your fractured data environment.
If you would like to learn more about Workstream, you can create a free account at app.workstream.io, or set up a demo.
Receive regular updates about Workstream and our research into the past, present, and future of how teams make decisions.