Lately, companies and other organisations have been frantically discussing what the breakthrough of machine learning and artificial intelligence will entail – how the organisation's operations, and possibly its entire business model, may have to be reshaped around data and analytics. There is no doubt that any organisation wanting to stay competitive can afford to be left out. Yet analytical algorithms and models are completely useless without correct information that is in the right format and readily available.
Corporate data is typically scattered – who will bring it under control?
Building a business intelligence culture and developing new digital services and business with scattered data is difficult. For a long time, many companies have attempted to solve the problem using data warehousing, and some have reached a decent standard. Even so, the rise of data and analytics into business operations has taken the challenges of data capital management to new heights.
Too many of the current analytics solutions and experiments have been singular gimmicks and hand-coded solutions without a sustainable and scalable base. These have been important learning experiences, as their main aim has been to test and systematically discover the next steps required in a complex environment.
The ultimate challenge for organisations is how to create the kind of sustainable data capability needed to produce business value quickly and continuously, and to develop their operations flexibly.
Two approaches, at opposite ends of the spectrum, have dominated the building of analytical data capability. The first style involves implementations rushed to meet business needs, creating a technical deficit, or at worst doing the wrong thing entirely. The second style has been to use a solid architecture and a method that is expected to enable easier value creation in the future. This expectation has been met in very few cases, and the implementations have been much too slow.
Although this could be compared to a race between a tortoise and a hare, it is very difficult to say if either of them has in fact won anything in this case.
Focusing on data – is traditional data warehousing the solution?
The traditional data warehouse has mostly played the role of the tortoise when it comes to harnessing data capability. Data warehousing is often seen as extraneous “canning” of data that creates no value and takes ages to implement. Much data warehousing has also been done using quick and dirty methods where instead of creating a proper data warehouse, each manager and business need has received their own “can”, a data silo.
There can be many reasons why data warehousing fails or takes too long. First of all, the reason might be operational management: how software was procured and how work was resourced. The gap between the business and the implementation team may have been too wide, and communication between the data warehousing specialists and the users may be funnelled and hence distorted. If calendar days are mostly spent waiting, efficiency goes out the window.
Another potential reason for failure is the wrong tools, which may have steered the development of data warehousing in the wrong direction. These tools are typically designed to offer versatile and seemingly simple functionality for an individual data specialist. The software products, and the software houses selling them, have trained data warehousing professionals to think and work in a way that is no longer sufficient for meeting current business needs effectively. The tools and operating methods have taken on a life of their own, with little or no collaboration or architecture in sight.
To sum it up, I would say that data warehousing has earned a bad reputation, but for the wrong reasons.
The problem here is not data warehousing itself, but the methods that have been used. The focus has been on doing and optimising the wrong things.
Focusing on dev – is quick coding a fast track to happiness?
So where does the swift hare come into this story? An updated, faster way of building data capability has been introduced alongside traditional data warehousing, thanks to cloud platforms and new technologies. Now we make “data pipelines” using existing cloud services combined with tailored code. The best coding talent and data scientists have used this model to create business value quickly, and the success of many companies is built on such capability, often created with heavy investment.
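As a concrete illustration, such a pipeline is at heart an extract–transform–load chain held together by code. Below is a minimal sketch in plain Python; the function and field names are hypothetical, and in a real setup the extract and load steps would call cloud services rather than work on in-memory lists:

```python
# Illustrative "data pipeline": extract raw records, transform them into a
# consistent shape, and load the result. All names here are hypothetical.

def extract():
    # In practice this would read from a cloud storage bucket or an API.
    return [
        {"customer": "acme", "amount": "120.50"},
        {"customer": "globex", "amount": "80.00"},
    ]

def transform(rows):
    # The tailored code lives here: normalise types and clean values.
    return [
        {"customer": r["customer"].upper(), "amount": float(r["amount"])}
        for r in rows
    ]

def load(rows):
    # In practice this would write to a warehouse table; here we just
    # return a lookup table keyed by customer.
    return {r["customer"]: r["amount"] for r in rows}

def run_pipeline():
    return load(transform(extract()))
```

The point of the sketch is not the code itself but its shape: each step is small and composable, which is what makes automation and testing natural in this style of development.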
These circles rarely mention data warehousing, as they wish to clearly distance themselves from it. This is in part due to the poor reputation of data warehousing and in part due to simple ignorance, as this crowd may not include experienced data professionals, allowing the “not invented here” syndrome to thrive. This typically leads to repeating mistakes that were first made 20 years ago. There are some basic principles of information management that have not changed.
This style has introduced many good things in the area of data capability development. Good practices and tools have been brought over from the software development tradition, such as automation and the DevOps culture.
People are the weakest link in creating data capability
Both extreme approaches have the same end result: excessive costs compared to the value gained. In the short term, slowness costs lost business opportunities; in the long term, life-cycle costs run out of control and key-person risks materialise. The basic problem of both approaches is heavy dependency on individuals, their skills and their personally adopted ways of working, along with the constant reinventing of the wheel.
I do not believe we want the new business based on data capability to be an oligarchy. If an algorithm or the ETL process is only understood by a single individual, the business will soon be at the mercy of the organisation’s most capable data wizard – or even an outside consultant. The operations must be transparent and understandable for the business side, developing the skills of the development team instead of a single individual.
Creating sustainable data capability requires adapting operating methods and architecture
The solution for creating sustainable data capability is to combine the best technological aspects of both approaches with a suitable operational culture. The operational focus should shift to optimising the throughput of the whole data pipeline, genuinely enabling agile development.
We at Solita focus on developing software and methods that optimise the building of data capability across the entire chain, serving the needs of both analytics and traditional reporting. The customer need not compromise on systematic information management to serve their business expectations quickly enough. At the same time, we aim to make work more meaningful for every data professional by automating routine tasks.
Agile Data is a path that will shake up the common operating models of the marketplace. We ask that you join us on this journey. You won’t be disappointed!