Dark data, Dirty data

Does your organisation know the value hidden in dark data and, equally, the risks lurking in your dirty data?

It’s 23 March 2003 and the Academy Award for Best Animated Feature has just been won by a Japanese fantasy film, Spirited Away. In the story there are small, unfamiliar characters called Makkuro Kurosuke. In the English dub of the film, they are referred to as ‘Dust Bunnies’. These elusive, sooty creatures are tennis ball-sized, pitch-black, fuzzy-haired beings with two large eyes and long, thin limbs. They move by hovering and can extend stick-like limbs from their bodies to carry out specific tasks or lift ultra-heavy objects.

Hold that thought, park the fantastical imagery (and the 2003 nostalgia), and let's shift our focus for a moment. In organisational data management, two important concepts have become prevalent in recent years: ‘dark’ data and ‘dirty’ data.

Dark data is the data that organisations collect, process and store as an outcome of business-as-usual activities, but fail to identify, categorise, or use for actionable business insights. Dirty data, by contrast, is data that you already know you have, but that is clearly incomplete, incorrect, unreadable, or duplicated across multiple data stores in multiple forms. In short, the kind of data you don’t really want.

Let’s return to Spirited Away for a minute. As the story unfolds, the lead character Chihiro befriends several dust bunnies after learning that if they are not given a job to do, they turn back into soot. The analogy is clear: like the dust bunnies, dark data should be given a job to do, and dirty data should be fixed, and fixed fast.

While storing large amounts of dark data is expensive, the lost revenue opportunity is by far the larger issue. Experience shows that when this data is analysed, categorised, and mined, it reveals hidden treasures in the form of insights that can be turned into actionable, revenue-generating business initiatives.

Dirty data is staggeringly risky for a financial services organisation to use in day-to-day operations because it inherently creates transactional, operational and compliance risks. Research has indicated that inaccuracies could be costing firms between 15% and 25% of annual revenue. With global banking revenues of over US$2.2 trillion, this means that dirty data may be costing the global banking sector somewhere between US$330 billion and US$550 billion a year, or over US$400 billion at the midpoint. Enough said, right?
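For readers who want to check the sums, here is a minimal back-of-envelope sketch. It assumes the 15% to 25% range applies uniformly to the US$2.2 trillion revenue figure quoted above; the variable and function names are purely illustrative and not part of any Aurora tooling.

```python
# Back-of-envelope check of the cost range quoted in the article.
# The figures come from the text; the names are illustrative assumptions.
GLOBAL_BANKING_REVENUE_BN = 2200  # US$2.2 trillion, expressed in US$ billions
LOW_PCT, HIGH_PCT = 15, 25        # share of annual revenue lost to inaccuracies


def cost_bn(revenue_bn: int, pct: int) -> int:
    """Return pct percent of revenue_bn, in US$ billions (integer maths)."""
    return revenue_bn * pct // 100


low = cost_bn(GLOBAL_BANKING_REVENUE_BN, LOW_PCT)
high = cost_bn(GLOBAL_BANKING_REVENUE_BN, HIGH_PCT)
midpoint = (low + high) // 2

print(f"Estimated annual cost: US${low}bn to US${high}bn (midpoint US${midpoint}bn)")
```

Because the revenue figure is stated as "over" US$2.2 trillion, the results are best read as a rough floor for each end of the range rather than a precise estimate.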

In our Client Lifecycle Management (CLM) advisory work, Aurora SDE often encounters situations where the CLM journey can be markedly improved, both tactically and strategically, by understanding and then utilising dark data and/or fixing dirty data. With happier banking customers, increased shareholder revenues, better-utilised staff effort and a massive reduction in compliance and regulatory risk in mind, we strongly recommend an immediate focus on your neglected data dust bunnies.

We also really recommend watching Spirited Away if you haven’t already seen it.
