Open Data

World Wide Web Consortium

The World Wide Web Consortium (W3C) is the main international standards organization for the World Wide Web. It was founded in 1994 by Tim Berners-Lee after he left the European Organization for Nuclear Research (CERN). The consortium was established at the Massachusetts Institute of Technology Laboratory for Computer Science (MIT/LCS) with support from the European Commission and the Defense Advanced Research Projects Agency (DARPA), which had pioneered the ARPANET, one of the predecessors to the Internet.




Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable.

XML was compiled by a working group of eleven members, supported by a (roughly) 150-member Interest Group. The co-editors of the specification were originally Tim Bray and Michael Sperberg-McQueen. The major design decisions were reached between August and November 1996, and XML 1.0 became a W3C Recommendation on February 10, 1998.
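As a small illustration of what "both human-readable and machine-readable" means in practice, the sketch below parses a short XML document with Python's standard library. The element and attribute names (dataset, title, publisher, year) are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical dataset record; the element and attribute names are made up.
XML_DOC = """<dataset id="example-1">
  <title>Child mortality estimates</title>
  <publisher>Example Agency</publisher>
  <year>2010</year>
</dataset>"""


def parse_dataset(xml_text):
    """Turn the XML text into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),                      # attribute on the root element
        "title": root.findtext("title"),           # text of a child element
        "publisher": root.findtext("publisher"),
        "year": int(root.findtext("year")),        # text converted to a number
    }


record = parse_dataset(XML_DOC)
print(record)
```

The same text a person can read in an editor is turned into a structured record by a program, without any format-specific guesswork.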


The Resource Description Framework (RDF) is a family of W3C specifications originally designed as a metadata data model. It has come to be used as a general method for conceptual description or modeling of information that is implemented in web resources, using a variety of syntax notations and data serialization formats. It is also used in knowledge management applications.
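RDF statements are commonly written as subject, predicate, object triples. The following is a minimal sketch of that idea using plain Python tuples rather than a real RDF library. The example.org URIs are placeholders, and the helper name match is an assumption made for this illustration; the two predicates are taken from the Dublin Core terms vocabulary.

```python
# Each statement is a (subject, predicate, object) triple.
# The example.org URIs are placeholders; DCT is the Dublin Core terms namespace.
DCT = "http://purl.org/dc/terms/"
EX = "http://example.org/"

triples = {
    (EX + "dataset/1", DCT + "title", "Child mortality estimates"),
    (EX + "dataset/1", DCT + "publisher", EX + "agency/a"),
    (EX + "dataset/2", DCT + "publisher", EX + "agency/a"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which resources were published by agency/a?
published = match(triples, p=DCT + "publisher", o=EX + "agency/a")
subjects = [s for s, _, _ in published]
print(subjects)
```

Because every statement has the same three-part shape, data from different sources can be merged by simply taking the union of their triples.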


RSS (RDF Site Summary or Really Simple Syndication) is a web feed that allows users and applications to access updates to websites in a standardized, computer-readable format. These feeds can, for example, allow a user to keep track of many different websites in a single news aggregator.

RDF Site Summary, the first version of RSS, was created by Dan Libby and Ramanathan V. Guha at Netscape. It was released in March 1999 for use on the My.Netscape.Com portal. This version became known as RSS 0.9. In July 1999, Dan Libby of Netscape produced a new version, RSS 0.91, which simplified the format by removing RDF elements and incorporating elements from Dave Winer's news syndication format. Libby also renamed the format from RDF Site Summary to Rich Site Summary.
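To show how an aggregator might read such a feed, here is a minimal sketch that extracts item titles and links from a small feed written in the style of the simplified, non-RDF RSS versions. The feed content is hypothetical, the function name read_items is an assumption, and the feed is not checked against any RSS specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed in the channel/item layout used by non-RDF RSS versions.
FEED = """<rss version="0.91">
  <channel>
    <title>Example News</title>
    <link>http://example.org/</link>
    <description>Hypothetical feed</description>
    <item>
      <title>First post</title>
      <link>http://example.org/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/2</link>
    </item>
  </channel>
</rss>"""


def read_items(feed_text):
    """Return (title, link) pairs for every item in the channel."""
    channel = ET.fromstring(feed_text).find("channel")
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]


items = read_items(FEED)
for title, link in items:
    print(title, link)
```

A news aggregator would repeat this step for many feeds and merge the resulting items into one list.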



Semantic Web

The Semantic Web is an extension of the World Wide Web to make Internet data machine-readable. To enable the encoding of semantics with the data, technologies such as Resource Description Framework (RDF) and Web Ontology Language (OWL) are used. Tim Berners-Lee originally expressed his vision of the Semantic Web in 1999 as follows:

I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A "Semantic Web", which makes this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The "intelligent agents" people have touted for ages will finally materialize.

UNICEF’s Open Data

UNICEF’s Open Data leads the collection, validation, analysis, use and communication of the most statistically sound, internationally comparable data on the situation of children and women around the world. It upholds the quality, integrity and organization of these data and makes them accessible as a global public good on its website.


WHO's Open Data

The World Health Organization (WHO) is a specialized agency of the United Nations responsible for international public health. It works to provide the needed health and well-being evidence through a variety of data collection platforms.




Microformats (sometimes abbreviated μF) are a set of defined HTML classes created to serve as consistent and descriptive metadata about an element, designating it as representing a certain type of data (such as contact information, geographic coordinates, events, blog posts, products, or recipes). They allow software to process the information reliably because each class refers to a specific type of data rather than being arbitrary. Microformats emerged around 2005 and were predominantly designed for use by search engines and by syndication and aggregation tools such as RSS readers.
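The idea of fixed class names carrying meaning can be sketched with Python's standard HTML parser. The markup below uses class names from the classic hCard microformat (vcard, fn, org); the contact details are invented, and the HCardParser class is a simplified, hypothetical reader that does not implement the full microformats parsing rules.

```python
from html.parser import HTMLParser

# Invented contact details marked up with classic hCard class names.
HTML = """<div class="vcard">
  <span class="fn">Jane Doe</span>,
  <span class="org">Example Org</span>
</div>"""


class HCardParser(HTMLParser):
    """Collect the text of elements whose class is a known hCard property.

    Simplification: it does not check that the element sits inside a
    class="vcard" container and it ignores nested markup.
    """

    PROPERTIES = {"fn", "org"}

    def __init__(self):
        super().__init__()
        self.card = {}
        self._current = None  # property name of the element being read

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        hits = [c for c in classes if c in self.PROPERTIES]
        self._current = hits[0] if hits else None

    def handle_data(self, data):
        if self._current and data.strip():
            self.card[self._current] = data.strip()

    def handle_endtag(self, tag):
        self._current = None


parser = HCardParser()
parser.feed(HTML)
parser.close()
card = parser.card
print(card)
```

Because the class names are agreed in advance, the same reader works on any page that uses them, regardless of how the page is styled.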



Open Definition

The Open Definition is a document published by the Open Knowledge Foundation (OKF, previously Open Knowledge International) to define openness in relation to data and content. It specifies what licenses for such material may and may not stipulate in order to be considered open licenses. The definition itself was derived from the Open Source Definition for software. The OKF summarizes the document as: "Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness)."

Linked Data

Linked Data is structured data which is interlinked with other data so it becomes more useful through semantic queries. It builds upon standard Web technologies such as HTTP, RDF and URIs, but rather than using them to serve web pages only for human readers, it extends them to share information in a way that can be read automatically by computers. Part of the vision of linked data is for the Internet to become a global database. The term was coined by Tim Berners-Lee.


DBpedia was created with the goal to extract structured content from the information created in the Wikipedia project. DBpedia allows users to semantically query relationships and properties of Wikipedia resources, including links to other related datasets. DBpedia was initiated in 2007 by Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak and Zachary Ives. Tim Berners-Lee described DBpedia as one of the most famous parts of the decentralized Linked Data effort.



Open Data

On December 7, 2007, a meeting was held in Sebastopol, California, to develop a set of principles of "open public data." Attendees included Tim O’Reilly, Lawrence Lessig, and Aaron Swartz. The result was the publication of the 8 Principles of open public data. Open data is data that can be freely used, re-used and redistributed by anyone, subject only, at most, to the requirement to attribute and share alike.

Data.gov is a U.S. government website launched in late May 2009 by the then Federal Chief Information Officer (CIO) of the United States, Vivek Kundra. Data.gov aims to improve public access to high-value, machine-readable datasets generated by the Executive Branch of the Federal Government. The site is a repository for federal, state, local, and tribal government information, made available to the public.

Data.gov.uk is a UK Government project to make non-personal UK government data available as open data. It was launched in closed beta in September 2009 and publicly launched in January 2010.


Panton Principles

The Panton Principles are a set of principles which were written to promote open science. They were first drafted in July 2009 at the Panton Arms pub in Cambridge. The principles were written by Peter Murray-Rust, Cameron Neylon, Rufus Pollock, and John Wilbanks. They were then refined by the Open Knowledge Foundation and officially launched in February 2010.

World Bank's Open Data

In 2010, the World Bank published its statistical databases and challenged the global community to use the data to create new applications and solutions to help poor people in the developing world. It provides free, open, and easy access to its comprehensive set of data on living standards around the globe.



Open Data Institute

The Open Data Institute (ODI) is a non-profit private company limited by guarantee, based in the United Kingdom. Founded by Sirs Tim Berners-Lee and Nigel Shadbolt in 2012, the ODI’s mission is to connect, equip and inspire people around the world to innovate with data.




Wikidata is a collaboratively edited multilingual knowledge graph. The creation of the project was funded by donations from the Allen Institute for Artificial Intelligence, the Gordon and Betty Moore Foundation, and Google, Inc. The development of the project is mainly driven by Wikimedia Deutschland under the management of Lydia Pintscher. On 7 September 2015, the Wikimedia Foundation announced the release of the Wikidata Query Service, which lets users run queries on the data contained in Wikidata. The service uses SPARQL as the query language.
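As a rough sketch of how a program could address the Wikidata Query Service, the code below builds a request URL for a SPARQL query without sending it. The endpoint address, the format parameter, and the identifiers P31 ("instance of") and Q146 ("house cat") are assumptions that are not stated in the text above, and the function name build_request_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumed public endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Ask for five items that are an instance of (P31) house cat (Q146),
# together with their English labels.
QUERY = """SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 5"""


def build_request_url(query, endpoint=ENDPOINT):
    """Encode the query as URL parameters; no network request is made."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


url = build_request_url(QUERY)
print(url)
```

Fetching this URL with any HTTP client would be the next step; it is left out here so that the sketch runs without network access.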



Open Data Policy

In May 2013, Barack Obama issued an executive order, “Making Open and Machine-Readable the New Default for Government Information,” which established the Open Data Policy. It was accompanied by a supporting memorandum from the Office of Management and Budget, “Open Data Policy: Managing Information as an Asset.” These policies were developed as a way to promote economic growth and create jobs, and they supported his call to create a more participatory, collaborative and transparent government. The White House’s Project Open Data grew out of these documents. It is a collection of code, tools and case studies to help agencies adopt open data programs and share resources and information on open data.


The Digital Accountability and Transparency Act of 2014 (DATA Act) is a law that aims to make information on federal expenditures more easily accessible and transparent. The law requires the U.S. Department of the Treasury to establish common standards for financial data provided by all government agencies and to expand the amount of data that agencies must provide to the government website, USASpending. The goal of the law is to improve the ability of Americans to track and understand how the government is spending their tax dollars.

European Data Portal

The European Data Portal is an initiative of the European Commission launched on November 16, 2015. The Portal was created to gather Public Sector Information of the 28 European Member States and the four EFTA countries (these countries are also referred to as the EU28+). The EU28+ countries publish public data on national data portals and geospatial portals. In order to provide one single access point to all of this data, the European Data Portal was created.


FAIR data are data which meet principles of findability, accessibility, interoperability, and reusability. A March 2016 publication by a consortium of scientists and organizations specified the "FAIR Guiding Principles for scientific data management and stewardship" in Scientific Data, using FAIR as an acronym and making the concept easier to discuss.