Callum Irving, Senior Data and Standards Advisor, shares the Geospatial Commission’s ambitions to improve access to better location data by making it Q-FAIR: Findable, Accessible, Interoperable, Reusable and of a Quality that is fit for purpose. Building on the work of its Data Improvement Programme, the Commission is now launching a Q-FAIR benchmarking process that will drive and track progress across the public sector to make better location data available to more people. This blog is the first in a series setting out the successes and lessons of the Data Improvement Programme, which the Commission has run in conjunction with its six Partner Bodies, and its Q-FAIR ambitions for the future.
As I begin to write this blog I am on a train hurtling through the great British countryside at 125mph. Through the wonders of data and technology, I can communicate with colleagues, check my location and estimate my time of arrival. This lets me know how much blog-writing time I have before I reach my destination, and answers the crucial question of whether I have time for another coffee… I do.
This simple use of geospatial data helped me plan my day, but location data also supports much wider decisions about our environment and our world. One of the most important questions decision-makers will always ask is: where?
The UK’s geospatial data and our geospatial capabilities are first class. We have long known the importance of location data to the public, as was aptly demonstrated in 1854 by John Snow’s famous map of cholera cases and the water pump at their centre, and again today, as similar foundational techniques, combined with cutting-edge technology, data analytics and expert practitioners, have helped us navigate the tricky geo-scape of COVID-19.
The UK public sector holds and uses vast quantities of valuable geospatial data that shapes and improves our lives, in often unseen ways. The Geospatial Commission was set up to ensure that the UK realises the potential economic, social and environmental value of this data. Our six Partner Bodies - British Geological Survey, Coal Authority, HM Land Registry, Ordnance Survey, UK Hydrographic Office and Valuation Office Agency - have location data at their core, and this allows them to innovate and plan for the future of that data.
Our future is Q-FAIR
Working closely with our Partner Bodies and Defra, the Commission funded a Data Improvement Programme that sought to tackle some of the shared data challenges faced by each of the organisations. The term ‘data improvement’ is very broad, so we adopted the FAIR principles to help structure and communicate our aims: to understand what is needed to make the data more Findable, Accessible, Interoperable and Reusable (FAIR). This approach also emphasised the importance of ensuring, before anything else, that the data is of appropriate Quality and fit for purpose. In short, improvements to data must be Q-FAIR.
In our Data Improvement Programme, we went beyond the Q-FAIR acronym to establish requirements and recommendations associated with the Q-FAIR principles, which we called the Q-FAIR Framework. This helped us to build a holistic view of improved data supply from the perspectives of the data itself, the organisation creating the data, and the organisations as collective suppliers to the UK.
Applying the Q-FAIR Framework in our Data Improvement Programme therefore gave us a benchmark of our public sector geospatial supply and enabled us to deliver some significant improvements to the foundations of geospatial data held by our Partner Bodies. This has included:
- Developing a harmonised Data Exploration Licence for our Partner Body data assets to enable anyone to freely access their data for research, development and innovation purposes
- Establishing a core set of Partner Body licensing principles for ongoing review using a shared language and definitions, to improve licence compatibility and allow end-users to more easily combine data
- Identifying and selecting standards for Partner Body metadata records and catalogue services to allow greater interoperability between datasets
- Establishing a data-sharing agreement between the Partner Bodies to improve access to, and the interoperability of, their data and structures
- Building an alpha service to test the concept of correlation relationships: a service that would allow an authoritative join between two datasets to be shared and published (a minimal illustrative sketch follows this list)
- Publishing best practice guidance on archive data capture and digitisation, to support further work exploring the use of new technology (such as machine learning and natural language processing) and crowdsourcing as means to extract useful insight from archived data more efficiently
- Improving our understanding of the UK’s coastlines through a project mapping the data landscape of the agencies involved in collecting and using geospatial data in the coastal zone.
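To make the correlation relationship idea above a little more concrete, here is a minimal sketch of how an authoritative join between two datasets might be represented and used. The dataset names, identifier schemes and field values are purely illustrative assumptions and do not describe the actual design of the alpha service.

```python
# Illustrative sketch only: the datasets, identifiers and field names below are
# hypothetical and are not the actual design of the alpha service.

# Dataset A: records keyed by a property identifier (e.g. a UPRN-style key).
property_records = {
    "P-1001": {"address": "1 High Street", "local_authority": "Exampleshire"},
    "P-1002": {"address": "2 High Street", "local_authority": "Exampleshire"},
}

# Dataset B: records keyed by a different organisation's identifier scheme.
title_records = {
    "T-555": {"tenure": "Freehold"},
    "T-556": {"tenure": "Leasehold"},
}

# A published "correlation relationship": an authoritative mapping between the
# two identifier schemes, maintained and shared as data in its own right.
correlation = {
    "P-1001": "T-555",
    "P-1002": "T-556",
}

def joined_record(property_id: str) -> dict:
    """Combine both datasets for one property using the published correlation."""
    title_id = correlation[property_id]
    return {
        **property_records[property_id],
        **title_records[title_id],
        "property_id": property_id,
        "title_id": title_id,
    }

if __name__ == "__main__":
    print(joined_record("P-1001"))
    # {'address': '1 High Street', 'local_authority': 'Exampleshire',
    #  'tenure': 'Freehold', 'property_id': 'P-1001', 'title_id': 'T-555'}
```

The point of publishing the correlation as data in its own right is that anyone combining the two datasets performs the same, authoritative join, rather than each user inventing their own matching logic.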
We will be publishing more on each of these initiatives in the coming weeks along with our initial Q-FAIR benchmark of the state of data supply.
Through the Data Improvement Programme, we saw the clear benefits of using the Q-FAIR Framework as a common language between the Geo6 Partner Bodies, the Geospatial Commission and our wider stakeholders. It helped us to map (if you'll excuse the pun) from the context for data improvements to the identification and realisation of the specific actions needed to bring those improvements about.
Building the ambition
We now intend to expand the scope of our ambitions for improved UK public sector geospatial data. Based on the lessons from the Data Improvement Programme, we are developing our Q-FAIR Framework through a new programme of work. This programme aims to develop a systemic and sustained approach that will advance the supply of geospatial data managed by the public sector, now and in the future. This will be achieved through objective benchmarking against the Q-FAIR principles, linked to business plans, standards and the regulatory framework to bring improvements about. We will start with our six Partner Bodies and build out across the public sector.
This work will not only help us to set our priorities for the future, but will also enable us to track our collective progress, aligned to national and international initiatives such as the UN's Sustainable Development Goals and Integrated Geospatial Information Framework (IGIF).
The right improvements, to the right datasets, at the right time, and with the right connections to other data, have enormous potential. They will allow us to make the right decisions to create economic value, enable the levelling up agenda and serve the country's needs in an intelligent, data-informed way, resulting in more services that are informed by location and tailored for the public, wherever they are in the country. We are looking forward to going on that journey with you!
Sign up to this blog to get an email notification every time we publish a new blog post. For more information about this and other news see our website, or follow us on Twitter and LinkedIn.
5 comments
Comment by Situl Shah posted on
Nice article, especially highlighting the practical value of using Geospatial Data in our everyday lives.
Also, the fact that this feeds into overall Data Improvement further demonstrates the strategic value of GS Data, which, combined with increased quality, helps put the UK at the forefront in this area.
Comment by Hayden Sutherland posted on
Thanks for posting this blog Callum.
We in the Transport & Mobility sector have been following a similar path to data maturity and are keen to ensure our data is also F.A.I.R
https://opentransport.co.uk/2021/04/22/make-transport-mobility-data-fair/
It would also be good to work on common data Findability / Discoverability approaches and definitions across the Data Spectrum.
Comment by Don Keefer posted on
I like the idea of integrating Q into FAIR. Quality is clearly a critical missing piece for improving the general value of data and will be helpful to ensure agency-level data managers are trying to talk the same language. However, as opposed to the fit for purpose (quality) of a data set for demonstrating trustworthiness, the fit for purpose of a data value is application (and hence, scale) dependent. This article seems to be written to the data-set level of fitness. How do you plan to avoid miscommunicating these generalized concepts of "correlation relationships" and "authoritative joins" to data users and stakeholders? (In fact, are you planning to evaluate and ensure conceptual and semantic alignment among any selected and authoritatively-joined data sets? If not, how will you help data users identify when they need to be aware of data that may be misaligned in concepts or scale (i.e., semantics)?) While these high-level ideas are important to data users, they are arguably more relevant to the different agency-level data managers. Will you also be able to help data users understand their application-specific needs, and guide them to identify goal-specific, fit for purpose criteria; to recognize and define the scale dependencies of their application goals; to recognize the potential for misalignments in scale between authorized data and application goals; and, to make sure they assemble, align, and evaluate the fitness of compiled data relative to the project goals they will be addressing? The distinctions between the low-level concepts and the equivalent high-level concepts seem to be missed in the high-level discussions here.
Comment by pauladorman posted on
Thank you, Don,
Some really helpful insight here and food for thought. "The fit for purpose of a data value is application (and hence, scale) dependent." I tend to agree with this comment, and it's something I think about a lot; from my engineering background, a good example of this is LIDAR data and its many varied uses.
" Will you also be able to help data users understand their application-specific needs, and guide them to identify goal-specific, fit for purpose criteria; to recognize and define the scale dependencies of their application goals; to recognize the potential for misalignments in scale between authorized data and application goals; and, to make sure they assemble, align, and evaluate the fitness of compiled data relative to the project goals they will be addressing?"
I'm not sure I follow entirely. The intention is first to ask the question and get users thinking about fit-for-purpose, applications, and the lifecycle of their data beyond its primary and immediate use on a project. That being said, the further data is used from its original project goal and primary use, the less fit for purpose it generally becomes. I agree that a goal-based approach to quality, centred around long-term project goals and outcomes, might be one way of going about it, and you rightly identify the need for some guidance in this area. The question, as always, is what the best approach is, and whether this is something we do at the Geospatial Commission level or suggest through quality management for individual organisations.
Comment by Iain Paton posted on
Q-FAIR is very interesting and a good model, but the Q will inevitably end up unbalancing the FAIR! So there needs to be some way of summarising and visualising data quality in fit-for-purpose terms that complements the metadata. There's also a longer-term linked data/semantic web environment, going beyond data/metadata, that has some niche areas of application (SPARQL/RDF), as well as hybrid solutions that are filling the gap.