How much data does your organisation collect? What is its real value?
Why is it being collected?
Is it being used? For what?
Who can access it and for what reason?
Where does it come from?
All pretty important questions that a data-driven organisation should have answers for. Yet documenting data is one of those things that makes its way to the bottom of everybody’s to-do list.
Documenting datasets isn’t just something to help out your future self (that mug never seems to get a great deal anyway).
Most reports estimate at least 80% of data specialists’ time is spent finding and cleaning data. A lack of data documentation, aka metadata, means you spend half your time explaining why something you created years ago is the way it is. The rest of your time seems to be spent trying to pin down someone to explain their data, only to find out they’re on a sabbatical trekking around the Himalayas.
If your data scientists are spending four days a week simply trying to find, understand and clean data, documenting data suddenly seems like something worth investing in.
And given it’s something you’re required to do to fulfil your legal obligations as a data custodian, it may be worth the bother after all.
A metadata strategy is really just a fancy way of saying that an organisation is trying to make sure its data is well documented. Why? So that it is discoverable, understandable and accessible.
There are various types of metadata you can use to document your data. A good place to start is descriptive metadata, as you can add a lot of value by simply describing the key features of a dataset. Good descriptive metadata provides human-readable descriptions that make your data easier to discover, understand and use. There are a couple of good standards that are worth looking up if you’re embarking on your metadata journey:
And if you’re in healthcare, there is consensus around a new profile, created in collaboration with HDR UK, that is worth checking out.
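To make “descriptive metadata” a little more concrete, here is a minimal sketch in Python of what a record describing a dataset might look like. The class name, the field names and the example values are all hypothetical, chosen purely for illustration; they are not taken from any particular standard or profile, so treat this as a starting point rather than a recommended schema.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class DatasetDescription:
    """Hypothetical descriptive-metadata record (illustrative field names only)."""
    title: str
    abstract: str
    publisher: str
    keywords: list[str] = field(default_factory=list)
    contact: str = ""

    def missing_fields(self) -> list[str]:
        """Return the names of any fields that have been left empty."""
        return [name for name, value in asdict(self).items() if not value]


# Made-up example values, not a real dataset.
record = DatasetDescription(
    title="Outpatient appointments 2019",
    abstract="One row per booked outpatient appointment, including attendance status.",
    publisher="Example Hospital Trust",
    keywords=["outpatients", "appointments"],
)

# Serialise the description so it can be stored or published alongside the data.
print(json.dumps(asdict(record), indent=2))

# Flag anything that still needs documenting (here, the contact field).
print("Still to document:", record.missing_fields())
```

Even a small record like this answers some of the questions at the top of this article, and a simple completeness check makes it obvious what still needs filling in.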
We’ll be writing a series of articles about some of the key features of descriptive metadata over the next few weeks, including attributes such as “Abstract”, “Digital Object Identifier”, “Provenance”, “Coverage”, “Observations”, “Linkage” and “Accessibility”. But each organisation will have different discovery and metadata requirements, so if you want to talk through some of them with the team, please get in touch.