ISE18/15-16

Subject: information technology and broadcasting, e-Government


What is open data?

  • There are currently many definitions of open data. For example, data.gov.uk[7] defines open data as "data that is published in an open format, is machine readable and is published under a licence that allows for free reuse". Alternatively, the Open Definition[8] defines open data as data that "can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike".[9] For the World Bank, data is open if it is (a) available in a machine readable standard format; and (b) explicitly licensed in a way that permits commercial and non-commercial use and reuse without restrictions.
  • While there is no single agreed definition of open data, a high-quality open dataset generally possesses the following five characteristics:

    (a) availability - a wide range of users is permitted to access the data;

    (b) cost - the data should be accessible free of charge or at no more than a reasonable reproduction cost;

    (c) machine readability - the data must be available in machine readable formats, such as CSV, JSON and XML, that can be processed and analysed by computer;

    (d) rights - limitations on the use, transformation and distribution of the data are minimal; and

    (e) interoperability - the data must permit intermixing with other data to allow for the development of more and better products and services.
  • Open data varies in scope and source. It can be local, regional or global in scope, and it can be obtained from both commercial and government sources. Many organizations collect a broad range of data in order to perform their tasks. Government is particularly significant in this respect because of the large amounts of data, on a range of topics, that it collects in its day-to-day activities.
  • Open data is sometimes related to big data, but they represent two distinct concepts. The fast development of information technology and the increased usage of the Internet in recent years have generated an unprecedented volume of digital data. The vast amount of raw data so generated across society, and collected by commercial and government organizations, is known as "big data". Big data is often defined along three dimensions known as the 3Vs: (a) volume (large amounts of data), (b) variety (high variety in data sources and formats), and (c) velocity (occurrence and analysis of data in real time or near real time).
  • In short, "big data" refers to a dataset that is voluminous, diverse and timely, while open data describes how accessible a dataset is in terms of the five characteristics mentioned above. Furthermore, big data involves processing very large datasets to identify patterns and connections in the data, whereas open data involves publishing machine readable data that people, companies and organizations can use, reuse or distribute with no or minimal restrictions.
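  • As a minimal sketch of what machine readability and interoperability mean in practice, the Python example below parses one dataset published as CSV and another published as JSON, and then intermixes them on a shared field. The district names, figures, field names and function names are all hypothetical and invented for illustration; they are not drawn from any actual open data portal.

```python
import csv
import io
import json

# Hypothetical sample data, invented for illustration only.
FACILITIES_CSV = """district,libraries
Central,3
Eastern,5
"""

POPULATION_JSON = (
    '[{"district": "Central", "population": 240000},'
    ' {"district": "Eastern", "population": 550000}]'
)


def load_csv_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_by_district(csv_text, json_text):
    """Combine the two datasets on their shared 'district' field."""
    population = {
        row["district"]: row["population"] for row in json.loads(json_text)
    }
    merged = []
    for row in load_csv_rows(csv_text):
        district = row["district"]
        libraries = int(row["libraries"])
        # Skip rows that cannot be matched or would divide by zero.
        if district not in population or libraries == 0:
            continue
        merged.append({
            "district": district,
            "libraries": libraries,
            "residents_per_library": population[district] / libraries,
        })
    return merged


if __name__ == "__main__":
    print(json.dumps(merge_by_district(FACILITIES_CSV, POPULATION_JSON), indent=2))
```

    Because both inputs are in standard machine readable formats and share a common key, the two datasets can be combined in a few lines of code. The same figures locked inside a scanned document would have to be re-keyed by hand before any such analysis could be done.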

Value of open data

  • The use of open data is a relatively recent phenomenon, with the earliest open data initiatives launched in the United States ("US") and the United Kingdom ("UK") in the late 2000s.[10] In recent years, open data has been growing in relevance and prevalence, as more than 70 countries have now made their data available.[11] This development is attributable to, among other things, technological advances such as cheaper data storage and the exponential growth of digital information. A heightened focus on data-driven analysis and decision-making has also played a part in its rapid development.
  • It is also believed that the widespread use of open data will deliver a range of benefits to governments, citizens and organizations alike. For example, by releasing public sector information ("PSI") as open data, governments can ensure that their actions, including spending practices, are transparent. This can, in turn, help reduce waste and facilitate government accountability. Releasing more open data also enables the public to better understand how the government operates, thereby improving public trust and boosting community engagement.
  • In addition, releasing PSI as open data stimulates innovations that help improve the efficiency of government services. For example, in the UK, a joint venture of technologists and doctors has used open data to reveal overspending on prescription drugs by the publicly funded National Health Service ("NHS").[12] In 2012, an analysis was conducted to examine the prescriptions written by family doctors in England and the regional patterns in their prescribing of statins (drugs used to lower cholesterol levels). It found wide local variation in the prescribing of the more expensive branded drugs as against the cheaper (but equally effective) generic drugs. The joint venture estimated that, had every doctor prescribed an equally effective generic drug, the drugs bill incurred by the NHS might have been reduced by more than £200 million (HK$2.5 billion).
  • Open data may also generate economic impacts that benefit both consumers and business organizations. According to a study conducted by McKinsey Global Institute in 2013, the use of open data has the potential for generating more than US$3 trillion (HK$23.3 trillion) in economic value globally each year from seven sectors, namely healthcare, transportation, consumer products, consumer finance, education, oil and gas, and electricity. The McKinsey report also identified the ways in which the use of open data would bring value to both the public and private spheres. These include:

    (a) healthcare - enabling service providers to determine the most timely and appropriate treatment for patients and to ensure the cost-effectiveness of care;

    (b) transportation - saving time for individuals, who can use open data to make better decisions about which mode of travel to use and when;

    (c) consumer products - helping (i) manufacturers and retailers to better target customers for marketing and sales by using open data from social media or neighbourhood demographics; and (ii) consumers to make more informed purchasing decisions, as open data enables price transparency and access to other product information (e.g. the provenance of packaged food);

    (d) consumer finance - giving consumers up-to-date information on the costs of a host of financial products, drawn from updates published by third-party data providers on the fees for mortgages, retirement plans, credit cards and other consumer products;

    (e) education - improving instruction by using data on student performance and learning styles to design and personalize lessons suited to individual skills;

    (f) oil and gas - allowing companies to make use of publicly available geological data, as well as projections of oil and gas, to improve their investment decisions about where to explore for new reserves and build downstream facilities; and

    (g) electricity - helping residential and business users to make better decisions about which appliances and equipment to buy or which electricity service to use, by using open data that makes it possible to compare products and services.
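
  • The kind of estimate produced in the NHS prescribing example earlier in this section comes down to simple arithmetic once the prescription data is openly available: for each region, multiply the number of branded items by the price gap between the branded and generic drug. The Python sketch below illustrates that calculation only. It is not the method used by Prescribing Analytics, and the regions, item counts, prices and function names are hypothetical figures invented for illustration.

```python
# Hypothetical prescribing records, invented for illustration; not NHS data.
# Costs are held in pence as integers to avoid floating-point rounding.
RECORDS = [
    {"region": "North", "branded_items": 1000,
     "branded_cost_pence": 2500, "generic_cost_pence": 200},
    {"region": "South", "branded_items": 400,
     "branded_cost_pence": 2500, "generic_cost_pence": 200},
]


def potential_saving_pence(record):
    """Saving if every branded item had been prescribed as the generic."""
    price_gap = record["branded_cost_pence"] - record["generic_cost_pence"]
    return record["branded_items"] * price_gap


def savings_by_region(records):
    """Total the potential saving for each region."""
    totals = {}
    for record in records:
        region = record["region"]
        totals[region] = totals.get(region, 0) + potential_saving_pence(record)
    return totals


if __name__ == "__main__":
    for region, pence in savings_by_region(RECORDS).items():
        print(f"{region}: £{pence / 100:,.2f}")
```

    Breaking the total down by region is what exposes the local variation in prescribing that the analysis highlighted.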

Overseas experience

Concluding remarks


Prepared by Samantha LAU
Research Office
Information Services Division
Legislative Council Secretariat
20 May 2016


Endnotes:

1.Open data is conducive to the build-up of a smart city. For a detailed discussion of smart city, please refer to the Essentials entitled "What is a 'Smart City'?" published by the Research Office in March 2015.

2.OGCIO was set up on 1 July 2004 to provide leadership for the development of information and communications technology within and outside the Government.

3.In 2011, OGCIO launched an 18-month pilot scheme to make available geo-referenced public facilities data and real-time traffic data for free download and value-added reuse by the public. The relevant data was provided via a portal, entitled Data.One.

4.The revamped portal also provides enhanced search capability as well as new functions, such as charting or mapping datasets directly on the portal.

5.The Global Open Data Index is an annual effort undertaken by the Open Knowledge Foundation to measure the state of open government data around the world. The ranking is based on the availability and accessibility of data in 13 key categories, including government statistics, government budget, legislation, government spending, election results and procurement tenders.

6.Taiwan improved on its rankings of 36th in 2013 and 11th in 2014 in the Global Open Data Index.

7.In the United Kingdom, data.gov.uk is a government project aiming to make available non-personal government data as open data.

8.The Open Definition was created by the Open Knowledge Foundation in 2005. The foundation is a global non-profit network that aims to promote open data and content.

9.Generally speaking, attribution means that the data user must credit the data publisher for the original creation, and sharealike means that the data user must license their new creations under identical terms as the original data.

10.In the US, President Barack Obama issued a Memorandum on Transparency and Open Government on his first day in office in January 2009. In the memorandum, Obama committed that his administration would work together to ensure the public trust and to establish a system of transparency, public participation, and collaboration. This was followed in May 2009 by the launch of the US open data portal, Data.gov, with 47 datasets. The UK was also active around this time and Prime Minister David Cameron launched the UK open data portal, data.gov.uk, in 2010.

11.See Economist (2015).

12.See Prescribing Analytics (2016).

13.In 2015, the UK lost to Taiwan the first place it had held in 2013 and 2014.

14.The Open Government Licence is a simple set of terms and conditions under which information providers in the public sector license the use and reuse of their information with minimal restrictions. There is no need to register or apply for an Open Government Licence. Users simply need to ensure that their use of information complies with the Open Government Licence terms.

15.The "Five Stars of Openness" was developed by Tim Berners-Lee, inventor of the World Wide Web. It ranges from one star (simply making data available on the web in any form with an open licence) to five stars (linking the dataset to other existing datasets on the web).

16.According to Professor Nigel Shadbolt, "[f]or data.gov.uk, it wasn't enough just to establish a single point of access and then populate it with datasets." Shadbolt is a member of the UK Government's Public Sector Transparency Board, which was set up by the Prime Minister to drive forward the government's transparency agenda. See Shadbolt (2011).


References:

1.Advisory Panel on Public Sector Information. (2014) What is the Value of Open Data?

2.Capgemini Consulting. (2013) The Open Data Economy Unlocking Economic Value by Opening Government and Public Data.

3.Economist. (2015) Open government data - out of the box.

4.European Commission. (2015) Creating Value through Open Data.

5.Gurin, J. (2014) Open Data Now.

6.McKinsey & Company. (2013) Open data: Unlocking innovation and performance with liquid information.

7.OECD. (2015) Assessing government initiatives on public sector information.

8.Open Data Hong Kong. (2015) Comment on the revamped Data.Gov.HK site.

9.Open Knowledge. (2015) Global Open Data Index.

10.Open Knowledge. (2016) Open Data Handbook.

11.Shadbolt, N. (2011) Open for Business.

12.The World Bank Group. (2016) Open Data Essentials.

13.Prescribing Analytics (2016).

14.W3C. (2016) Data on the Web Best Practices.