A typology of caveats
We’re continuing to investigate the type of contextual information that statistics need in order to be meaningful and used correctly.
The work started with a review of 17 statistical releases to see what range of contextual information is commonly given. The review confirmed our expectation that many different pieces of information can be (and are) communicated to users.
A methodological change, a new government policy or a change in geographic coverage all affect how a user interprets the data.
At the moment, this context is communicated inconsistently; a piece of information may be included as a table footnote in one statistical release, whilst being explained in a separate quality document in another.
We’ve now compiled a typology of caveats. Of course, the list could potentially be endless and will evolve over time, but our typology covers the most common (and important) caveats surrounding the datasets in our review.
Alongside each caveat type, there is a list of fields which would need to be completed for a user to have the full information about that particular caveat.
So, if we wanted to tell users of the National Travel Survey that in 2013 the survey coverage changed from Great Britain to England only, we would complete the fields of the meth_cov (change in coverage) caveat type as:
Or to share the fact that the homicides of Dr Harold Shipman affected the 2002/03 homicide statistics (which are recorded by the police as offences 1, 4.1, 4.10 or 4.2) we could code:
Ultimately, we hope that all caveats surrounding a dataset will be captured in this way, as machine-readable metadata behind the raw data. Armed with this supporting information, users can be more confident that they’re using the data well.
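As a rough illustration of what machine-readable caveat metadata might look like, here is a minimal Python sketch. It is not the typology itself: the class, the function names and every field name (caveat_type, dataset, effective_from, details, coverage_before, coverage_after) are assumptions made up for this example. Only the meth_cov code and the National Travel Survey coverage change come from the text above.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Caveat:
    # All field names are hypothetical; the typology defines its own fields.
    caveat_type: str      # typology code, e.g. "meth_cov"
    dataset: str          # dataset the caveat applies to
    effective_from: int   # first year affected by the caveat
    details: dict = field(default_factory=dict)  # type-specific fields


def to_metadata(caveats):
    """Serialise caveats to a JSON string that could sit alongside the raw data."""
    return json.dumps([asdict(c) for c in caveats], indent=2)


def caveats_affecting(caveats, dataset, year):
    """Return the caveats for `dataset` that are already in effect in `year`."""
    return [
        c for c in caveats
        if c.dataset == dataset and c.effective_from <= year
    ]


# The coverage change described above, using the assumed field names.
coverage_change = Caveat(
    caveat_type="meth_cov",
    dataset="National Travel Survey",
    effective_from=2013,
    details={"coverage_before": "Great Britain", "coverage_after": "England"},
)

if __name__ == "__main__":
    print(to_metadata([coverage_change]))
```

With records like this, a tool reading the data could call caveats_affecting for a given dataset and year and show the relevant caveats next to the figures, rather than relying on the user to find a footnote or a separate quality document.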
If you have any comments or suggestions on the typology, or on the project in general, then please do get in touch with us at firstname.lastname@example.org.
Update (21 August 2015)
We've also been working together with the Office for National Statistics (ONS) and the UK Statistics Authority on this project.
Update (5 September 2015)
This can now be found in the ONS high level roadmap on Trello.