By Leslie Young
PDFs are a fact of life in data journalism. Most of the time, when you request “an electronic file” or “an electronic document” in your Freedom of Information request, the end result is a PDF.
While it’s a step up from simply getting a stack of papers as your response, a PDF response is annoying in a number of ways. It’s hard to work with. Unlike a spreadsheet format (Excel, .CSV, etc.), a PDF can’t be analyzed directly. You often can’t copy and paste or export the data – sometimes Acrobat won’t even recognize the document as text!
Some departments do respond with an Excel file. Some are even nice enough, when I ask, to email me the Excel version of the PDF response that they had previously sent me. They don’t have to do this, and I really appreciate it.
But it seems like the default electronic file format is the PDF, which means that I will spend hours trying to force the information into a friendlier format. It doesn’t stop me from doing the story; it just makes it more difficult.
So why do many departments seem to favour the PDF? I decided to ask the Treasury Board Secretariat, the body charged with administering the federal government’s Access to Information legislation.
Here’s what I asked, followed by what they said.
I would like to know why, when a requester requests an electronic document, the response is usually provided as a PDF.
Why do departments seem to prefer releasing information as PDFs instead of a more open electronic file format, such as an Excel spreadsheet? This is particularly relevant in the case of a request for information from a database, which, since it’s a table filled with numbers, would be more useful to a journalist in Excel or another spreadsheet format.
Our government is committed to openness and transparency which is why we are pursuing the Open Government initiative that will continue to make government data freely available, and currently requires all completed ATI summaries to be posted online within 30 calendar days of being readied. Current Access to Information regulations direct departments to provide information in the format requested wherever possible, and our government continues to update and add to the already hundreds of thousands of data sets and the amount of information available to Canadians online in various formats. Where alternative formats are not available or suitable, the government will respond with a pdf version in order to ensure that requests for information are still carried out effectively.
So it seems all you need to do is ask – very specifically. It’s a valuable lesson. Next time, I will make sure to ask for a .CSV, and see what happens.