
Data management in the age of AI — why does it always come back to data quality?

Joel Vanhalakka, Sales and Accounts Manager, Solita

Published 02 May 2025

Reading time 8 min

While the business world is obsessed with (or already sick of) AI, there is once again a growing demand for experts in the practices of data and information management. Why is this?

According to Gartner (via Forbes), 85% of all AI projects will fail due to poor or lacking data. And if 75% of managers already don’t trust their own data for decision making (Fersht et al. 2024), how could they trust AI to make decisions for them? Without good-quality data management (including data and business process modelling, data governance, common ways of working, master data management (MDM) and data quality controls), AI can learn the wrong patterns, miss what is important and make unfair or unreliable decisions. Imagine a GPS directing you into a lake because it was trained on bad maps.

The results? Burned budgets and projects that fail to scale into business value. And we have seen this happen before.

Every few years, a new data and technology boom promises revolutionary change, only to remind businesses that poor data management and data quality can kill innovation before it begins. Let’s start by revisiting the big data era.

Big data (2010s): More data, more problems

The early 2010s saw a surge in enterprise systems like ERPs and CRMs, as companies aimed for centralisation. This was followed by a rush to build massive data lakes, believing more data would lead to business gains. By the mid-2010s, it became clear that raw, unstructured data was costly and ineffective without proper management, governance, and quality controls. The hype faded, and companies shifted towards MDM and data governance frameworks to bring meaning and reliability to data.

This was also when our data management community started forming. As one of the original pioneers put it:

“I was reassigned to the master data side for asking too many questions about the business and what the data was actually for during an enterprise application integration (EAI) project.”

The questions were correct, but they were asked in the wrong forum. The EAI project was in an IT-led phase, and questions about data purpose, ownership and business relevance are more meaningful when asked together with business stakeholders than with the IT department alone.

And as data volumes were growing, these were exactly the kinds of questions our clients needed to be asking, too. They also capture the core DNA of our data management community. It isn’t just about storing and processing data, but about understanding what it represents, how it connects to business functions and how it creates value. It isn’t about managing bits in storage, but about managing meanings, relationships, trust and common agreements on what the data represents in the real world.

The enterprise data warehouse (EDW) projects were clearly delivering huge value to our clients, enabling better analytics, reporting, and decision-making by consolidating data from various business systems into centralised repositories. However, as data amounts multiplied, so did the challenges. Businesses struggled with data inconsistency, duplicates and a lack of standardisation, making it difficult to trust or effectively use their data.

That is why we focused more on master data management (MDM) and information process modelling. When core data (like customers, products or suppliers) is scattered, inconsistent and owned by no one, MDM steps in to centralise, standardise and govern it. While MDM tools help consolidate and manage this data, lasting impact comes from clear ownership, accountability and data governance across systems. In practice, MDM also often leads to broader improvements in enterprise-wide data processes and governance, because clean, trusted data doesn’t happen in isolation from the business.
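To make the consolidate-standardise-govern idea a bit more concrete, here is a minimal Python sketch of matching customer records from two source systems into one golden record. The CustomerRecord fields, the email match key and the naive first-record-wins survivorship rule are assumptions made purely for illustration; real MDM platforms use far richer matching, survivorship and stewardship workflows.

```python
# Minimal sketch of the consolidate-and-standardise idea behind MDM.
# All field names and rules are illustrative assumptions, not a
# description of any specific MDM product.

from dataclasses import dataclass


@dataclass
class CustomerRecord:
    source: str    # which system the record came from
    name: str
    email: str
    country: str


def standardise(record: CustomerRecord) -> CustomerRecord:
    """Apply simple formatting rules so records become comparable."""
    return CustomerRecord(
        source=record.source,
        name=" ".join(record.name.split()).title(),
        email=record.email.strip().lower(),
        country=record.country.strip().upper(),
    )


def consolidate(records: list[CustomerRecord]) -> dict[str, CustomerRecord]:
    """Group records by a match key (here: email) and keep one golden record.

    The survivorship rule is deliberately naive: the first record wins,
    and later duplicates only fill in missing fields.
    """
    golden: dict[str, CustomerRecord] = {}
    for rec in map(standardise, records):
        if rec.email not in golden:
            golden[rec.email] = rec
        else:
            kept = golden[rec.email]
            # Fill gaps from the duplicate instead of discarding it blindly.
            if not kept.country and rec.country:
                kept.country = rec.country
    return golden


if __name__ == "__main__":
    crm = CustomerRecord("CRM", "ada  lovelace", "Ada.Lovelace@Example.com", "uk")
    erp = CustomerRecord("ERP", "Ada Lovelace", "ada.lovelace@example.com ", "")
    for email, rec in consolidate([crm, erp]).items():
        print(email, "->", rec)
```

The point is not the code itself, but that someone has to decide and own the rules it encodes, which is exactly where data governance comes in.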

These efforts complemented data warehousing and other IT projects, such as system migrations, by ensuring that critical data was already consolidated, standardised, high-quality, and readily available, making operations and analytics more reliable and efficient. At the same time, our communities around data management and data governance grew.

But, just as the big data hype started to settle, the next major hype wave hit: data science.

Data science (mid-2010s to 2020s): Searching for business value

The next wave, data science and machine learning, promised that predictive models would unlock the future. Companies rushed to hire Data Scientists, expecting them to magically extract game-changing value from their increasingly expensive EDWs and data lakes. However, instead of building cutting-edge models, most of their time was spent cleaning, structuring and preparing data just to make it usable.

As this wave also included a lot of modernisation of analytics and the rise of self-service business intelligence (BI), the data quality issues also became a lot more visible. Again, the lesson was the same: without structured, well-governed data, analytics and AI models fail, forcing organisations to prioritise data quality, governance and master data management.

This era also saw the emergence of low-code/no-code agile MDM tools, allowing organisations to build data management solutions in days or weeks instead of months. While technical execution became faster, success depended on collaboration between business and data teams — co-designing solutions, iterating prototypes, and ensuring real-world usability. One game-changer was the ability to develop data management applications so quickly that business experts could handle their own data from the get-go, reducing reliance on IT for data migrations and data management.

Our approach at the time was simple: we piloted a solution with limited data, and if approved, the business adopted it, migrating and improving data while we moved on to the next iteration. 

In this model, the data team built the applications, while business users owned the data. The result? A high-speed, continuous development cycle, where MDM solutions were deployed, tested, and scaled in real time. Our most common use cases included high-speed delivery of business partner management systems, customer consolidation tools, asset data management, and reference data applications.

So, while Project Data Engineers focused on EDW and data lake modelling, pipelines and infrastructure, our Data Management Experts focused on data governance, data modelling, data quality and metadata, complementing core IT systems and structuring information architectures to align with business strategy. Speed alone wasn’t enough; we also had to move in the right direction and prevent missteps. Poor data models, bad architecture or weak governance led to long-term issues, costly rework and compliance risks.

As our expertise evolved beyond MDM into broader data governance, metadata, and information architectures, we redefined ourselves as the Data Management Cell — not just a name change, but a recognition that our core work was helping organisations make sense of and govern their data effectively.

Data democratisation & GenAI boom (2020s-Present)

The early 2020s were the golden age of data management. The data science trend had left a lasting impression that data quality and data management capabilities were something to invest in and nurture.

Many companies already had robust EDWs, MDM solutions and complementary data tools such as data catalogs. New concepts such as data mesh and data products emerged, aiming to distribute data ownership across business domains and make data more accessible and usable across organisations. At the same time, AI and automation had matured, providing new ways of streamlining workflows, reducing manual data handling and enhancing decision-making.
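As a simplified illustration of the data product idea, the sketch below describes a dataset published by a business domain with an explicit owner, schema and quality expectations. The field names and checks are assumptions for this example rather than a reference to any particular data mesh framework.

```python
# A toy illustration of a "data product": a dataset with an accountable
# owner, a declared schema and quality expectations. Fields and checks
# are assumptions made for the example.

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DataProduct:
    name: str
    owner: str                       # accountable business domain or team
    schema: dict[str, type]          # column name -> expected Python type
    quality_checks: list[Callable[[list[dict]], bool]] = field(default_factory=list)

    def validate(self, rows: list[dict]) -> bool:
        """Check rows against the declared schema and quality expectations."""
        for row in rows:
            for column, expected_type in self.schema.items():
                if not isinstance(row.get(column), expected_type):
                    return False
        return all(check(rows) for check in self.quality_checks)


if __name__ == "__main__":
    orders = DataProduct(
        name="orders",
        owner="sales-domain",
        schema={"order_id": str, "amount": float},
        quality_checks=[lambda rows: len(rows) > 0],  # e.g. never publish an empty batch
    )
    print(orders.validate([{"order_id": "A-1", "amount": 99.9}]))  # True
```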

However, new challenges also started to arise. As data has become part of almost every decision, interaction and process, the ecosystem of tools has grown accordingly in recent years (Modern Data 101 2025). What companies need, and are looking for, are ways to simplify their data ecosystems and focus on what creates value instead of all the bloat.

In addition, core IT systems are constantly evolving. System integrations and migrations are more frequent than ever, driven by version upgrades, acquisitions and, quite simply, the sheer number of systems that need to be integrated. For organisations with MDM as part of their core architecture, these transitions are smoother and less disruptive.

However, organisations without MDM can struggle with major migration efforts, data inconsistencies, and operational risks. At the same time, market uncertainty and the mentioned bloated data landscape are making companies hesitant to invest in new data tools, further delaying much-needed modernisation.

And finally, we arrive at the rise of GenAI, the part of the wave we are now riding. By now, nobody doubts that GenAI and large language models (LLMs) like GPT-4 are transformative technologies. Their potential for automation, augmentation and decision-making is unprecedented; this comes from first-hand experience. However, as I stated at the beginning, AI is only as good as the data it is trained on. Garbage in means garbage out, at scale.

If organisations really want to get value from GenAI and LLMs, they need to start treating data quality and data governance as mission-critical. Not as one-off projects triggered by the next data catastrophe or hype cycle.

Final thoughts

Despite all the advancements in AI, automation and data platforms, the key challenges in data management haven’t changed; they’ve just scaled. Without structure, orchestration and a solid foundation built on the core fundamentals of data management, more technology just means more chaos, faster.

This blog post mentioned only a few of the core trends leading to the data quality push, but the picture below already highlights the common thread across the eras: data quality is important.

[Figure: Evolution of data management and data quality]

So, as the picture shows, by 2030, or perhaps sooner, we hope to see companies finally treating data quality as mission-critical. However, that will still require another wave of successes and failures with the novel tools and concepts available.

Yet, there is a silver lining: GenAI and LLMs are proving to be some of the most powerful tools for data management. They can massively accelerate data processing, enrichment, transformation and structuring, while also enabling faster prototyping and automation. In fact, Gartner predicts that by 2027, the application of GenAI will accelerate time to value for data & analytics governance and master data management programs by 40% — meaning that AI isn’t a replacement for data management, but an enabler (Gartner 2024). LLMs could also remove the need for no-code shortcuts. If AI can generate SQL, Bash, or Python with precision, why not leverage industry-standard languages that offer scalability and control?
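As a concrete, if simplified, example of that last point: the sketch below shows the kind of data quality checks an LLM might help draft in plain SQL and Python. The customers table, its columns and the rules are hypothetical, but the result is standard, reviewable code that a data team can version and schedule instead of a black-box no-code configuration.

```python
# Illustrative example of LLM-assisted data quality checks expressed in
# industry-standard SQL and Python. Table name, columns and rules are
# hypothetical; the value is that the output is reviewable code.

import sqlite3

CHECKS = {
    "missing_email": "SELECT COUNT(*) FROM customers WHERE email IS NULL OR email = ''",
    "duplicate_email": """
        SELECT COALESCE(SUM(cnt - 1), 0) FROM (
            SELECT COUNT(*) AS cnt FROM customers
            WHERE email IS NOT NULL AND email <> ''
            GROUP BY LOWER(email) HAVING COUNT(*) > 1
        )
    """,
    "invalid_country": "SELECT COUNT(*) FROM customers WHERE LENGTH(country) <> 2",
}


def run_checks(conn: sqlite3.Connection) -> dict[str, int]:
    """Run each named check and return the number of offending rows."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in CHECKS.items()}


if __name__ == "__main__":
    # In-memory demo data; in practice this would point at a real database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, email TEXT, country TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [
            ("Ada Lovelace", "ada@example.com", "GB"),
            ("A. Lovelace", "ADA@example.com", "GBR"),  # duplicate email + bad country code
            ("Grace Hopper", "", "US"),                 # missing email
        ],
    )
    for check, failures in run_checks(conn).items():
        print(f"{check}: {failures} row(s) failed")
```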

At the end of the day, it all comes back to data management fundamentals. Without a solid foundation in data modelling, business process modelling and scripting, no AI or tool will fix data chaos. LLMs won’t help unless you know how to use them. The real magic only happens when a skilled information architect teams up with business experts to guide AI with intent. You can’t supercharge your data management work with AI unless your fundamentals are in order.
