Nice data, shame about the quality

I have been known to attend one or two hack events. What is the most common feature mentioned? Of course it is data quality.

I was at one event a couple of years ago where one developer’s presentation was purely about the quality of the data supplied. Our friend Pareto would probably say that 80% of the effort goes into cleaning up data and 20% into building something.

The same happened at Accountability Hack, 21–22 November.


This is where I can hear my high horse trotting around the stable, pawing the ground, getting ready to be saddled up.

Clearly there are some transitory problems. Very few organisations are looking at processes involving data from end to end and considering wider data consumers as part of their data ecosystem.

Often this is because data is being generated as the by-product of delivering services (‘exhaust fumes’), not as part of delivering transparency or data. Clearly this needs to change.

If these processes change, will this solve everything? Probably not. The whole point about hack events and external consumers’ needs is that they are often about being innovative and about people wanting to do things differently. Unless there is a data crystal ball lying around, this will always throw up challenges to the way data is presented.

However, maybe some of these issues can be mitigated?

How about an internal data manifesto?

Data is our biggest asset

We are a data-driven organisation/society

All our data is valuable

We should value our data

It is not our data as we hold it in trust for others

We are the custodians of our data for future generations

We must know how society wants to use the data we generate

Data is at the heart of what we do, how we operate and how we benefit society

Vanity, thy name is publisher

Has anyone ever fixed the problem of vanity publishing?

I have come up with a couple of ideas.

There are some fairly obvious ones such as asking:

  • what are your success criteria?
  • how are you benchmarking?
  • what comms objective are you meeting?
  • what’s the business case?

Of course, this does not tend to help with people who have got it fixed in their heads that they must have an X or a Y.

There is another, more punitive approach, which I call the Refund response.

So you spend a lot of time building something which, it turns out, hardly ever gets used. What do you do? Ask for a refund.

How does that work?

I think as follows: work out how much time was spent on the project. Say a week or two. Go back to the commissioner and say, ‘OK, we need that time back, so we cannot do any more work for you for the next two weeks, as we need a refund on the time we spent on your last project.’

I am looking forward to piloting this very soon. Enough said.