The Government Data Portal for Germany GovData.de
Konrad Johannes Reiche | September 12, 2013 | Nancy, France
Open Government Data Government Data
Motives
– Transparency
– Innovation
– Participation
– Efficiency
Core Elements
– Machine-readable data
– Licenses
– Accessibility
Open Government Data Venn Diagram by justgrimes
Open Data Example #1 mundraub.de
Many fruit bushes and fruit trees are unused
Wild fruits, private growers, organizations
Data about these plants is collected and published on http://www.mundraub.de
Data comes from people and administrations who submit their knowledge for public use
by hybrid.moment
Open Data Example #2
Glass Recycling Containers in Berlin
– City
– Private Organizations
The Berlin Cleansing Department (BSR) has to clean around them
Problem: BSR has no information about the location of many containers
Solution: Local administrations do have data about the containers' locations and help the BSR by making this data publicly available
by Andreas Möller, by pixelroiber
Metadata…
…and Harvesting
Data is stored and managed in a distributed manner
Why? Centralized data is hardly feasible and beyond administrative
Heterogeneous data, distributed competence, conflict of interests
Metadata is used to describe the data
Distributed data storage with central metadata portal
Harvesting: Copying metadata into the central portal so that the distributed data becomes accessible there, too
[Diagram: Portal, Datasets, Documents]
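The harvesting idea above can be sketched in a few lines of Python. This is a minimal illustration, not the GovData.de harvester: the catalog URLs, the dataset records, and the function name `harvest` are all hypothetical, and the sources are in-memory stand-ins so that the example runs without network access.

```python
from typing import Dict, List

# Hypothetical stand-ins for the catalogs of two data providers. A real
# harvester would fetch these records over the network instead.
SOURCE_CATALOGS: Dict[str, List[dict]] = {
    "http://berlin.example.invalid": [
        {"name": "glass-recycling-containers",
         "title": "Glass Recycling Containers in Berlin"},
    ],
    "http://bremen.example.invalid": [
        {"name": "fruit-trees", "title": "Public Fruit Trees"},
    ],
}


def harvest(sources: Dict[str, List[dict]]) -> Dict[str, dict]:
    """Copy metadata records from every source into one central index."""
    index: Dict[str, dict] = {}
    for portal_url, records in sources.items():
        for record in records:
            entry = dict(record)  # shallow copy: the source stays untouched
            entry["original_portal"] = portal_url  # where the data still lives
            # Assumption: dataset names are unique across all providers.
            index[entry["name"]] = entry
    return index


central_index = harvest(SOURCE_CATALOGS)
for name, entry in sorted(central_index.items()):
    print(name, "->", entry["original_portal"])
```

Only the metadata is copied. The data itself stays with the provider, and the recorded origin tells the portal where to send users.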
Starting Point
Data Portals of the Länder
Germany is a federal state
– Consisting of 16 states (Länder)
– Administrative power divided
Different Data Providers and Portals
– Bavaria
– Berlin
– Bremen
– Federal Statistical Office (Destatis)
– Hamburg
– Rostock
– Environmental Information Portal (PortalU)
– Geo Data Infrastructure Germany (GDI-DE)
– Rhineland-Palatinate
– and more…
?
DeStatis (David Liuzzo)
Government Data Portal for Germany GovData.de
Launch February 19, 2013
http://www.govdata.de
Prototyped at Fraunhofer FOKUS
Different types of data
– Datasets
– Documents
– Applications
Focus on free licenses
– German Data License (de-dl,…)
– Creative Commons (cc-by,…)
– ...
Quantification in Numbers
February 2013
– Datasets: 1,123
– Documents: 12
– Applications: 25
– Daily visitors: 2,000
March 2013
– Daily visitors: 500
August 2013
– Datasets: 3,797
– Documents: 230
– Applications: 15
– Daily visitors: 300
Open Data Licenses on GovData.de
Building GovData.de Strategy
Repository software: CKAN (Comprehensive Knowledge Archive Network)
– Data catalogue for storing and distributing data
– Developed by the Open Knowledge Foundation (OKFN)
– Prevalent format: JSON
– API offers REST Interface
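To illustrate the JSON-over-REST style of the CKAN API, here is a small Python sketch. It assumes the Action API (version 3) of current CKAN releases with its `package_search` action. The base URL is a placeholder, the helper names are hypothetical, and the response body is a hand-written sample rather than real portal output.

```python
import json
from urllib.parse import urlencode


def build_search_url(base_url: str, query: str, rows: int = 10) -> str:
    """Build a dataset search URL for a CKAN Action API (v3) endpoint."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def extract_titles(response_body: str) -> list:
    """Return the dataset titles from a package_search JSON response."""
    payload = json.loads(response_body)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    return [pkg.get("title", pkg["name"]) for pkg in payload["result"]["results"]]


# Hand-written sample in the shape of an Action API response (not real data).
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [
            {"name": "waste-management-statistics-2013",
             "title": "Waste Management: Disposal and Treatment Facility"},
        ],
    },
})

url = build_search_url("https://ckan.example.invalid/", "waste", rows=5)
titles = extract_titles(SAMPLE_RESPONSE)
print(url)
print(titles)
```

Against a live catalog, the body would come from an HTTP GET on the built URL, for example via `urllib.request.urlopen`.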
Metadata Schema (OGD-Metadata)
– Structure used to standardize and unify metadata by data providers
– https://github.com/fraunhoferfokus/ogd-metadata
– JSON Schema, keep it simple (few fields), e.g. document data origins
– Why the hassle? Different data providers: very heterogeneous data
– Make data accessible: unification needed
– The schema is not a mere tool, but also a means of communication
Metadata Schema − Example
Field        Subfield          Value
Name                           waste-management-statistics-2013
Title                          Waste Management: Disposal and Treatment Facility
Author                         Statistical Office
Maintainer                     Juliane Sanger
Tags                           Hessen, Berlin, Visualization, Classification
…            …                 …
Extras       Terms of Use      ID: cc-by
                               URL: http://creativecommons.org/licenses/by/3.0
             Spatial           Coordinates: [[15.02, 47.16], [15.02, 47.16]]
             Original Portal   http://www.regionalstatistik.de
             …                 …
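As JSON-like data, the example record might look as follows in Python. The key names and the set of required fields are assumptions made for illustration. The authoritative field names are defined in the OGD-Metadata repository, and the rows elided with "…" in the table are left out here.

```python
import json

# Hypothetical choice of required fields, for illustration only.
REQUIRED_FIELDS = ("name", "title", "author", "maintainer")

# The example record from the table; key names are assumed, not normative.
record = {
    "name": "waste-management-statistics-2013",
    "title": "Waste Management: Disposal and Treatment Facility",
    "author": "Statistical Office",
    "maintainer": "Juliane Sanger",
    "tags": ["Hessen", "Berlin", "Visualization", "Classification"],
    "extras": {
        "terms_of_use": {
            "id": "cc-by",
            "url": "http://creativecommons.org/licenses/by/3.0",
        },
        "spatial": {"coordinates": [[15.02, 47.16], [15.02, 47.16]]},
        "original_portal": "http://www.regionalstatistik.de",
    },
}


def missing_fields(rec: dict, required=REQUIRED_FIELDS) -> list:
    """Return the required fields that are absent or empty in a record."""
    return [field for field in required if not rec.get(field)]


print(json.dumps(record, indent=2))
print("missing:", missing_fields(record))
```

A check like this is much weaker than validating against the published JSON Schema. It only shows how a few mandatory fields keep heterogeneous records comparable.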
Architecture GovData.de
Portal (Liferay)
Information Pool (Web Portal + CMS)
User Interface for the Data Catalog
Indexer + Thesaurus
CKAN
CSW/CKAN Harvester
REST Interface
Browser
Apps
Web Sites of Public Authorities
Subject Catalogs (Geo Data, etc.)
Open Data Catalogs (Berlin, Bavaria, Bremen, etc.)
REST Interface
What’s next?
Outlook
Open data has to be understood as a process
Active communication with current and prospective data providers is needed to bring more data, and especially more interesting data, to GovData.de
Quality of metadata plays a crucial role
– Influences the discoverability and searchability
– Needs to be improved constantly
GovData.de and its metadata schema should not be an isolated application
– Schema compatibility with Government Data Austria (data.gv.at)
– DCAT: RDF vocabulary to facilitate interoperability between data catalogs
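To make the DCAT point concrete, the sketch below maps a portal metadata record to a DCAT dataset description in JSON-LD form. The choice of mapped fields, the input key names, and the function name `to_dcat` are assumptions for illustration. It is not a mapping defined by GovData.de.

```python
import json

# Namespace prefixes of the DCAT and Dublin Core Terms vocabularies.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def to_dcat(record: dict) -> dict:
    """Map a portal metadata record to a minimal DCAT dataset in JSON-LD."""
    return {
        "@context": CONTEXT,
        "@type": "dcat:Dataset",
        "dct:identifier": record["name"],
        "dct:title": record["title"],
        "dcat:keyword": list(record.get("tags", [])),
    }


# Hypothetical input record using assumed key names.
example = {
    "name": "waste-management-statistics-2013",
    "title": "Waste Management: Disposal and Treatment Facility",
    "tags": ["Hessen", "Berlin"],
}

dcat_doc = to_dcat(example)
print(json.dumps(dcat_doc, indent=2))
```

Because the output uses a shared vocabulary, another catalog can read the description without knowing the portal's own schema.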