Open data is the principle that data held internally by organizations should be made available to outsiders. This applies mainly to governments. Governments possess large amounts of data: data about the geographies of their countries, anonymized statistics about their populations, about the economy, data about transport infrastructure, about traffic, about weather. There is a growing understanding in developed countries everywhere that governments should make these data sets available, in machine-readable formats, for free reuse by anyone anywhere, without copyright or royalties. The idea is that society will benefit in two ways. First, opening up government data will encourage transparency in government: good governments have nothing to hide. Second, all that data will provide fodder for innovation and entrepreneurship: people will be able to build applications on top of the data, start businesses and create jobs, or if not, at least build useful apps that make people’s lives easier.
That is the theory, and politicians everywhere are falling over themselves proclaiming how much they believe in it. But not everywhere are words being converted into actions. Sadly, the country where I live, the Republic of Ireland, is not a leader in this field. Very little government data is available for unrestricted reuse or in formats that lend themselves to easy reuse. I will demonstrate this with a concrete example from personal experience.
How I didn’t build an app
The usual scenario for doing something with open government data is that you find a dataset you like, such as population statistics or data about traffic in your city, and you build an app (which could be a mobile app or a website) that uses that data in an interesting or useful way. In countries where government data is available openly, people have built amazing things: apps that visualize population statistics, interactive crime maps, apps that can help you find a parking space, websites that explain how your tax money is being spent and so on. All you need is a good idea, some coding skills, and some data.
I had an idea for an app once. The idea was there, the coding skills were there, but there was no data available anywhere that I could use, so it turned to nothing in the end. Let me describe what it was I wanted to build, and why I couldn’t.
My app was to work like a hotel review and rating website, only people would be reviewing rented flats and apartments instead of hotels. Think TripAdvisor for apartments. Ireland has a growing rental market, particularly now when a lot of people are finding it difficult to keep a mortgage going and need to rent instead. It is unfortunately a sellers’ market and landlords often get away with nasty things. I wanted to see if I could tip the balance a little by giving tenants a tool to rate their rental experience and to name and shame, as well as to name and praise, the landlords.
To build an app like that, it helps if you know where the apartments are. There is a government body in Ireland called the Private Residential Tenancies Board (PRTB) which is basically a regulator for the rental market and, among other things, keeps a record of every rented apartment in the country for tax purposes. So I went to their website to see if they had any downloadable datasets I could use. After some fiddling I discovered that indeed they do: you can download county-by-county lists with addresses. But the initial excitement subsided quickly. These lists have a few problems. I didn’t let it bother me that the lists are Excel spreadsheets, which is not great (XML would be better) but usable. More serious is the fact that they come with no licensing information, leaving you uncertain as to how you may use them. But the biggest problem is how the apartments’ locations are given. The lists include postal addresses, and those addresses are messy and inconsistent with each other. You see, the Republic of Ireland is one of those countries that don’t use postcodes and so the only way to say where something is located is by giving a street address. But there are several ways of quoting any one street address and the lists available from the PRTB make no attempt to standardize or normalize them. Even apartments I know to be in the same building appear on their lists with wildly differing addresses which are near-impossible to reconcile automatically. And even if I wanted to use those addresses, I’d still have no reliable way to translate them into geographical coordinates, so I couldn’t even plot them on a map.
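To make the address problem concrete, here is a minimal Python sketch of the kind of clean-up one might attempt. The addresses and the abbreviation table are invented for illustration and are not taken from the PRTB lists. The point is that simple rules catch only the simplest differences.

```python
import re

# Illustrative abbreviation table (an assumption, far from complete).
# Note that "st" could just as well mean "saint", which is part of the problem.
ABBREVIATIONS = {"apt": "apartment", "st": "street", "rd": "road", "co": "county"}


def normalize_address(raw: str) -> str:
    """Lowercase, drop full stops and commas, expand a few abbreviations."""
    text = raw.lower()
    text = re.sub(r"[.,]", " ", text)  # punctuation becomes whitespace
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


# Three invented spellings of what a human would recognize as one apartment.
a = normalize_address("Apt. 4, Riverside House, Main St., Dublin 8")
b = normalize_address("Apartment 4 Riverside House, Main Street, Dublin 8")
c = normalize_address("4 Riverside Hse, Main Street, Kilmainham, Dublin")

print(a == b)  # True: punctuation and abbreviations were the only differences
print(a == c)  # False: missing words and a different locality defeat the rules
```

The third spelling is the realistic one: a dropped word, an unlisted abbreviation and a locality name in place of a district number, none of which a rule table can anticipate in general.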
Ireland’s lip service to open data
So much for my wonderful app idea, then. But let’s assume for a moment that I was able to work around the smaller problems: let’s assume I was able to get permission to use the data for my app, and let’s assume I was able to crunch the Excel spreadsheet into a more machine-friendly shape. I would still have this problem with geography. It is impossible to plot the apartments on a map if they are not geocoded.
A second-best approach would be to convert the postal addresses into geographical coordinates by reconciling them with some other dataset of postal addresses. Is such a dataset available anywhere? One obvious place to look is Ordnance Survey Ireland, a state-owned company whose job it is to map the country. The Ordnance Survey functions as a commercial company and if you’re not part of the civil service, you can buy a licence from them to use their data, but they make no data available openly. No luck there, then. Someone else who works with addresses a lot is the state-owned postal company An Post. They have a technology called GeoDirectory which can convert arbitrary postal addresses into geographical coordinates. Alas, GeoDirectory is a commercial service and only available for a fee. So, no luck there either. (To clarify, I am not accusing the Ordnance Survey or An Post of doing their jobs badly. They are doing their jobs perfectly well in accordance with their job description, which is to recover costs commercially. The problem is systemic: they shouldn’t have been given that job description.)
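For what it is worth, the reconciliation itself would not be hard to code if an open reference dataset existed. The following is only a sketch under that assumption: the gazetteer, its addresses and its coordinates are all made up, and the matching uses approximate string comparison from Python's standard `difflib` module rather than anything the Ordnance Survey or GeoDirectory actually offer.

```python
from difflib import get_close_matches

# Hypothetical reference gazetteer: normalized address -> (latitude, longitude).
# Both the addresses and the coordinates are invented for illustration.
GAZETTEER = {
    "4 riverside house main street dublin 8": (53.3420, -6.3000),
    "12 harbour view quay road galway": (53.2700, -9.0500),
}


def geocode(address, cutoff=0.8):
    """Return the coordinates of the closest gazetteer entry, or None.

    `cutoff` is the minimum similarity (0 to 1) required to accept a match;
    0.8 is an arbitrary choice for this sketch.
    """
    matches = get_close_matches(address.lower(), list(GAZETTEER), n=1, cutoff=cutoff)
    return GAZETTEER[matches[0]] if matches else None


# A slightly misspelled address still resolves to the first entry ...
print(geocode("4 Riverside Hse Main Street Dublin 8"))
# ... while an address with no close counterpart yields None.
print(geocode("1 Nowhere Lane Cork"))
```

The code is the easy part. What is missing is the reference dataset to put where `GAZETTEER` stands.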
The fact that Ireland doesn’t have postcodes is a problem. Luckily, there are plans to introduce postcodes in the near future and the government has recently awarded a contract to some company or other to develop the system. The postcode system is meant to be very detailed, assigning a unique identifier to each individual property. This is good. I expect the system they will eventually develop will consist of a dataset or algorithm where you can input the postcode and obtain the geographical coordinates of the place. Will this dataset or algorithm be available openly? It appears it will not. The minister in charge has announced that it will be available for licensing commercially. There has been no mention of making postcodes available as open data.
All in all, app developers have a hard life in Ireland if they want to include any kind of geocoding in their apps. In the end, you have to rely on “commercially free” services like Google Maps or Bing Maps which do not come with an official stamp of accuracy and which are only partially bilingual (most Irish placenames Google knows about are in English only, while in actual fact every place in Ireland has two names, one in English and one in Irish).
Compare this to the situation in Estonia, probably one of the most computerized countries in Europe, where they have a very centralized and very formalized addressing system. Every addressable premises, from the smallest flat to the largest campus, has a unique identifier in some central database somewhere, and this data is available to everybody who has a bona fide need to use it, both inside the state administration and outside (think banks, delivery companies).
Well, so much for geographical data in Ireland. I suspect the situation is not much better in other areas. Does Met Éireann (Ireland’s weather agency) offer any open data? No. Do any of the many traffic and transport agencies offer any open traffic data? No. Are election results freely available in machine-readable formats? No. Is the data held by the Companies Registration Office machine-readable by outsiders? No.
Over in the UK, the Open Knowledge Foundation maintains an Open Data Index where countries are rated by how openly they make data about themselves available. When you look at the leaders on that list, you notice it is mostly Nordic countries like Norway and Sweden, which have a long tradition of government openness that pre-dates open data, and globally important English-speaking countries like the US and the UK, which have strong IT industries and strong traditions of democracy. Ireland has the prerequisites to be among them. But it is not anywhere near the top of the list. Ireland is doing worse than Switzerland, a country notorious for its secrecy!
It seems that the Republic of Ireland is doing the least amount necessary to live up to the letter, but not the spirit, of various open-data conventions it has signed up to. Ireland is a member of the Open Government Partnership, but the practical effects so far have been minimal. Ireland is required to publish government data openly under the European Union’s Public Sector Information Directive, but it is seriously flouting its responsibilities in that area. It’s one thing to talk the talk, another to walk the walk.
We need a national data spring clean
Not to be complaining all the time: there are little glimmers of hope here and there. The Central Statistics Office has recently published some linked open data from the 2011 census (although the licence is unclear). And at least one county council, Fingal County Council, has had a long-standing open-data policy: they have an open-data portal and even sponsor app development competitions.
But we need more data openness than that in Ireland, even for the government’s own benefit if not for outsiders like me who just want to build apps. Let’s return to my example with the Private Residential Tenancies Board. Because their data is not geocoded, it is difficult to cross-check it with other datasets that may exist elsewhere in the civil service. Imagine the Revenue Commissioners (Ireland’s tax collector) would like to verify that everybody who claims rent tax credit (yes, Ireland has a tax break for people who pay rent) actually lives in rented accommodation. This is time-consuming (and therefore, money-consuming) if the addresses are not geocoded in some standard way on both sides. If they were geocoded, using postcodes or unique address identifiers à la Estonia, the task would be trivial and could even be performed automatically every time somebody submits a claim for the credit.
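To show how trivial the check becomes once both sides share an identifier, here is a small Python sketch. Everything in it is hypothetical: the identifier format, the field names and the records are invented, and it does not describe any system the Revenue Commissioners or the PRTB actually run.

```python
# Hypothetical register of rented properties, keyed by a unique address
# identifier (the "IE-0001" format is invented for this sketch).
registered_tenancies = {"IE-0001", "IE-0002", "IE-0003"}

# Hypothetical tax-credit claims carrying the same kind of identifier.
claims = [
    {"claimant": "A", "address_id": "IE-0002"},
    {"claimant": "B", "address_id": "IE-0009"},
]


def unverified_claims(claims, registered):
    """Return the claims whose address identifier is not in the register."""
    return [claim for claim in claims if claim["address_id"] not in registered]


# Only claimant B's address is missing from the register.
for claim in unverified_claims(claims, registered_tenancies):
    print(claim["claimant"], claim["address_id"])
```

With free-text addresses on both sides, the same check turns into the fuzzy, error-prone matching exercise described earlier.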
So, the state is losing on many fronts by keeping its data in bad shape. Badly organized data causes inefficiency, while closed data is a hindrance to transparency in government and innovation in industry. How can we fix it, then?
In countries and cities that have a successful open-data policy, the agenda is usually driven by a single dedicated individual. They often sit fairly high in the administrative hierarchy and have job titles like ‘Chief Information Officer’ or ‘Head of IT’. An example is Johann Mittheisz, the former Chief Information Officer of Vienna, who was awarded the European Data Innovator prize at this year’s European Data Forum in Athens. In the space of a few years, Mr Mittheisz and his team turned Vienna’s city administration around from a closed data silo to a transparent open-data greenhouse.
The more I think about it, the more I agree that we need somebody like that in Ireland. A senior civil servant or possibly even a minister, but certainly somebody who both “gets” data and has the power to make change happen. Ireland needs a Minister for Data.
Whether we like it or not, an unavoidable data revolution is happening around the world, forcing governments to clean up their data internally and make it available externally. Ireland can choose to either join the leaders, or play catch-up with them later. It is not too late yet to choose the former.