The city should work with interested parties to develop the ‘open by default’ policy and practice. While that is being developed, the following aims can be adopted:
A body looking to release data can draw on a number of sources to decide which data to release:
- Public bodies can use Freedom of Information requests as a simple way to gauge demand, and respond by publishing the data for others to use.
- Setting up an online forum to capture demand, for example: https://birmingham.dialogue-app.com/open-data-for-birmingham
- Auditing existing publications of data that form part of statutory returns to central Government
- Auditing data on the public sector body’s website that is currently available only in PDF and PowerPoint reports
- Working with internal stakeholders who are champions of open data and are therefore keen to see their data published
- Working with local lobbying groups who exist to support releasing data and who actively engage with their local communities; in the case of the West Midlands, these are Open Mercia and the West Midlands Open Data Forum
- Building open data release into projects as part of the funding agreement, e.g. Birmingham is beginning to release transport data as part of the EU-funded OpenTransportNetwork project.
The body can also identify relevant and appropriate training to address the different needs that officers and members have around open data.
The task of releasing data is not always smooth or uniform, but there are tools and techniques that can be used to ease the process, build trust and speed up release.
These processes need to support the organisation’s policy and strategy as in the Birmingham example. Birmingham’s open data strategy includes a section “How We Will Do It” that outlines the use of a data platform, extraction processes and the use of standards.
- How to find data (audit your organisation, ask data users, catalogue what you find)
- How to prepare data (remove personal information, remove confidential information e.g. commercial information, data affecting national/local security or enforcement)
- How to check for other sensitive fields e.g. comments, free text, hidden fields
- How to ensure the right licence is in place (3rd party data, licensing)
- How to best describe data (choosing the right file name, linking data, providing metadata)
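The preparation steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the process Birmingham uses: the column names, the keyword hints and the sample row are all hypothetical, and any real list of personal or sensitive fields would need to be agreed with the data owner.

```python
import csv
import io

# Hypothetical column names; agree the real list with the data owner.
PERSONAL_FIELDS = {"name", "email", "phone"}
# Hypothetical keywords suggesting a free-text column that needs manual review.
FREE_TEXT_HINTS = ("comment", "note", "description")


def remove_personal_fields(rows, personal_fields=PERSONAL_FIELDS):
    """Return copies of the rows with the named personal fields dropped."""
    return [
        {key: value for key, value in row.items()
         if key.lower() not in personal_fields}
        for row in rows
    ]


def flag_free_text(fieldnames, hints=FREE_TEXT_HINTS):
    """List columns whose names suggest free text that may hide personal data."""
    return [name for name in fieldnames
            if any(hint in name.lower() for hint in hints)]


# Made-up sample extract, for illustration only.
raw = io.StringIO(
    "name,ward,tenure,officer_comments\n"
    "A. Person,Ward A,Social rent,Called on Tuesday\n"
)
reader = csv.DictReader(raw)
rows = list(reader)

cleaned = remove_personal_fields(rows)
to_review = flag_free_text(reader.fieldnames)
```

Matching on column names is only a first pass. Free-text and hidden fields still need a manual check before release, because personal information can appear in any column.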
To ensure the organisation is willing to release open data, it is important to identify key stakeholders and bring them on board. This can be done by:
- Identifying data controllers, processors and subject matter experts
- Working with such data holders to agree a sign-off process. This incorporates a form that records the metadata, licensing and field descriptions, and a report that identifies the appropriate aggregation level at which to release the data so that any personal or sensitive data remains anonymous.
- An example of such a form can be found in Appendix C
To ensure anonymity, Birmingham has developed a process for location-based data:
- Take raw data in an internal legacy system, where there is a location address
- Determine with the data owner if the release of the most granular data is sensitive
- Run the data through a series of aggregation processes, placing the addresses in ONS statistical areas, Wards, Districts or the whole city area.
- A report is created that determines the number of areas where there are five or fewer locations (i.e. individual instances). In such areas the locations could be triangulated, which could make individuals or specific assets identifiable (e.g. homes with vulnerable inhabitants).
- With agreement of the data owner, the data can be released at the appropriate aggregated level; this mitigates the fears data owners may have over useful data that could be seen as sensitive
- The automation process then uses this aggregation process to regularly extract and load data onto an open data platform or area
An example of the output of the above process is here: https://data.birmingham.gov.uk/dataset/social-housing-aggregated
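The aggregation report described in the process above can be sketched as follows. This is a hedged illustration rather than Birmingham's actual tooling: the level names (`lsoa`, `ward`, `district`), the record layout and the sample records are assumptions, and the threshold of five is taken from the process description. The sketch only produces the report; the decision on the release level remains with the data owner.

```python
from collections import Counter

# Areas with this many locations or fewer are flagged, as in the process above.
THRESHOLD = 5


def flagged_areas(records, level, threshold=THRESHOLD):
    """Return {area: count} for areas at this level with too few locations."""
    counts = Counter(record[level] for record in records)
    return {area: n for area, n in counts.items() if n <= threshold}


def aggregation_report(records, levels, threshold=THRESHOLD):
    """For each aggregation level, count the areas that fall at or below the threshold."""
    return {level: len(flagged_areas(records, level, threshold))
            for level in levels}


# Made-up records: each location already mapped to an area at every level.
# The level names are hypothetical stand-ins for ONS statistical areas,
# wards and districts.
records = [
    {"lsoa": f"L{i % 4}", "ward": f"W{i % 2}", "district": "D1"}
    for i in range(12)
]

report = aggregation_report(records, ["lsoa", "ward", "district"])
```

In this made-up sample, every statistical area holds only three locations and so is flagged, while both wards hold six and are not. A report of this kind gives the data owner the evidence needed to agree the level at which the data can be released.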
Automated publication is preferred: it keeps data up to date, minimises the impact on business resource and, crucially, ensures that data is maintained and kept relevant.
However, this is dependent upon internal ICT approaches and skill levels. If automation is not possible, the relevant information owner can upload the data themselves.
Transport data is best served by automation because of the quantity of data and the fact that much of it can be real-time. It is also easily anonymised, and business models that can exploit it are already in place. Nationally, the Transport Systems Catapult is collating a catalogue of available datasets: http://imdata.co.uk/
Birmingham is using a combination of data extraction tools and geographical information tools to automate the publishing of data and transform it into something more useful. As an example, Birmingham is publishing the data from the car park availability signs to the open data platform. Clicking on the map button visualises the data.
This is an example of good practice and should be part of the Open by default process.
SharePSI best practice: Open Up Public Transport Data
There are already a number of free courses and tools available on the web to learn about open data. In addition, many local councils offer access to internal eLearning portals that often include courses on information governance, business analysis and other complementary skills to make the most of data. These could be used for self-learning or in a more structured approach, e.g. through a training needs analysis for the whole team. Example resources:
The open data release strategy should specify the aim of releasing data under as open a licence as possible to encourage uptake and reuse. The Open Data Institute provide excellent advice on this: http://theodi.org/guides/publishers-guide-open-data-licensing
Working with spatial data in the UK will inevitably mean dealing with Ordnance Survey derived data. Fortunately, the OS have moved to a more permissive model and open licensing in the last few years. Many bodies have already released such data, setting a precedent for what can be released. In addition, a public body can apply for OS exemption to confirm that the data they are releasing does not breach any Public Sector Mapping Agreement rules.
Comprehensive information can be found here: https://www.ordnancesurvey.co.uk/business-and-government/help-and-support/public-sector/guidance/index.html
SharePSI best practice: W3C Best Practice 5 – Provide data licence information
Ideally any open data release will adhere to data standards appropriate to that sector – this encourages reuse and uptake, in particular supporting citizens and businesses to develop tools and techniques that cross geographical borders.
However, each sector is at a different stage of development, and this is a fluctuating area of data development. This should not be a barrier to releasing data. Where no standard exists, release the data and then attempt to work with industry bodies and standardisation projects (e.g. funded by Innovate UK or EU funds) to define standards.
It is worth investigating a number of areas for UK and international data standards:
- Local Government Association, http://opendata.esd.org.uk/
- City level data is moving forward under the smart cities banner:
- City comparison data – http://www.dataforcities.org/
SharePSI best practice: Standards for Geospatial Data
[i] Version accessed and available on 20/07/2016