The Official Information Act has an accessibility problem.
I wrote recently about asking the government for information, having just published a guide to using the OIA. The OIA is a powerful tool, but it can be limited by how government agencies choose to follow it. One particular limitation that comes up, again and again, is accessibility.
There is a hashtag on Twitter that’s often used by journalists and activists, myself included, to talk about issues with the Official Information Act: #fixtheOIA
Today Nikki Macdonald, senior feature writer at the Dominion Post, started a discussion about a widespread and rather infuriating practice for responding to requests for official information:
Is there a reason – other than being obstructive – that government departments insist on sending OIA data as scanned pdfs, so you have to transcribe everything to be able to do anything with the numbers? #fixtheOIA
— Nikki Macdonald (@Nikki_Macdonald) November 16, 2017
Yeah I have done that in the past, with varying degrees of success. Trouble with this one was I wasn't expecting it to include screeds of data.
— Nikki Macdonald (@Nikki_Macdonald) November 16, 2017
In response to a recent request she made under the OIA, she received a PDF of a scanned document that included a large amount of tabular data.
The issue she mentions is one I have seen many times. It is common practice to take a document that will be released under the OIA, print it, then scan it back into a PDF, then send that PDF instead of the original document.
Printing and scanning documents allows for them to be signed, which is typically done with OIA response letters (though not with documents released alongside them). Scanned PDFs are also difficult for recipients to alter, so the information is kept intact as it was released.
Also, perhaps most importantly, because these PDFs come from scanned documents they don’t contain any text. Instead, they just contain images of text. This means there is absolutely no way the recipient will be able to view information that has been withheld from released documents.
But releasing documents as images of text has a huge drawback too: accessibility.
Most obviously, this will affect requestors who are blind or have impaired vision. If you rely on a screen reader to read these documents, it won’t be able to read a PDF that’s full of images. It’s a pain in the ass for those of us who are fully sighted as well.
There is software that can be used to convert images of text back into text, but it doesn’t always work perfectly, and often these image-based PDFs will include tables of data that are only really useful once converted into a spreadsheet. And, frankly, knowledge of how to use specialised software shouldn’t be a requirement for using the OIA to get usable information.
This isn’t the only accessibility issue with the OIA. Government agencies have also been known to do things like refuse to release documents online for unclear “security reasons”, instead telling the requestor that they must pay to have them posted.
Recently, Sam Warburton has been documenting a particularly trying OIA response on Twitter. First, he was sent documents that were locked by a password, which prevented him from copying text out of them:
Spreadsheets were provided on a USB, but password protected. The only thing I could do was view sheets.
— Sam Warburton (@Economissive) November 13, 2017
He then made a second OIA request asking for all correspondence regarding the first request, so he could get to the bottom of how this bizarre twist had come about. The response was, well, not very accessible. He was sent 249 pages of printed documents, with the excuse being that it was “too big” to send digitally:
— Sam Warburton (@Economissive) November 6, 2017
I think we should be able to expect better. Far better, in fact. And I’ve written the following open letter to the Ombudsman in the hope that this aspect of using the OIA can be improved:
There has been a discussion on the “#fixtheOIA” hashtag on Twitter started by Nikki Macdonald today about the common practice of agencies sending responses to requests for official information in PDF format, where the contents of the PDF must be either transcribed by hand or converted using specialised software before they are useful.
I’ve observed this practice across several agencies, including those I have most commonly made requests to: New Zealand Police, the Department of Corrections, and ACC. It seems as though the response is written up, then it is printed out and the physical copy is signed before being scanned, and finally the scanned copy is sent to the requestor as a PDF. This means that, effectively, the response is delivered in the format of an image of text.
This has obvious accessibility issues for requestors who are blind or have impaired vision, but is also detrimental to sighted requestors.
For example, I recently received such a response from the Department of Corrections which included some long URLs. It is not possible to copy text from these responses without specialised software that many requestors will be unlikely to know much about (OCR – Optical Character Recognition – software can be used to convert images of text into actual text). In this case, I had to type the URLs out by hand. Though there were only three of them in this instance, a lengthier response could easily have made this a significant barrier to accessing the released information.
In the case raised by Nikki Macdonald, I understand she received a PDF response that, unexpectedly, also included large amounts of data. However, because the release was effectively an image of this data rather than a more usable format such as a spreadsheet, it will require a conversion process (either by hand or via OCR software) before it can be used.
I received a similar response in 2015, when I requested documents from the Ministry of Health but did not specify a preferred format. One of the documents I had requested was a large spreadsheet, but it was released to me as a PDF. Though this particular PDF was at least text-based, I still had to use specialised software to convert it to a usable format. I have made the PDF I was sent available online: compliance-of-comp-meds-manufacturers-2-oct-2007.pdf
While this issue can be pre-empted by requesting that the information be released in a specific accessible format, such as an XLSX or CSV spreadsheet for data, this is not always realistic when it’s unclear whether or not the response is likely to include this type of information.
Despite section 16 of the OIA laying out certain responsibilities here, such a request is also not always respected by the agency. I have received multiple datasets as image-based PDFs despite having explicitly requested them as XLSX or CSV spreadsheets.
Here is an example of one response where I had to transcribe data by hand. The full response contains more data, as it continues onto the next page:
I have also assisted other requestors in transcribing information received this way, during which I was reminded that it can be quite a painful and time-consuming process for a requestor who is neither a fast typist nor able to use specialised OCR software.
In cases where OCR software is used to convert an image-based PDF response into a text-based one, which allows responses to be searched for certain keywords, this is often hampered by “Released under the Official Information Act” watermarks that are commonly overlaid on each page of released documents. These watermarks interfere with the software’s character recognition, so that portions of the text must instead be transcribed by hand. Poor quality scans, such as misaligned pages, can also cause issues with using OCR software to convert released information into an accessible format.
OCR software can also incorrectly transcribe parts of text, so it is always necessary to compare its output with the original PDF. This can be a time-consuming process, and is inaccessible to requestors who are blind or have impaired vision.
My impression, from talking with other users of the Official Information Act, is that this is a common and frustrating experience, widely regarded by requestors as being unnecessary and obstructive.
I accept that there may be cases where it would be difficult for an agency to release a text-based document with appropriate redactions in a way that makes clear the reason each piece of information was withheld and ensures the withheld information cannot be accessed. However, there are clearly many cases where information could be released in a more accessible format, and where this would happen if the requestor had known to specify such a format in their request. I would hope that, when information can be released in any of several different formats with similar effort, the most accessible format would be used by default.
For example, after having seen me express frustration on social media about having to transcribe lengthy URLs by hand from an OIA response, the Department of Corrections has agreed that they will now include links in the body of the email sent as a response. Although this only affects a small part of the response, this change would make the URLs significantly more accessible by allowing them to be used directly, without having to transcribe them from the PDF attachment.
I have looked through the Official Information Act and the official information legislation guides on your website for guidance on this accessibility issue.
The closest thing I have found is the relatively recent case note regarding a request from a prisoner who was unable to access information publicly available via the internet, after their request was refused under section 18(d).
Unfortunately, I have not found anything specific to this issue of accessibility. I’d greatly appreciate any advice you might be able to give on making requests in a way that should make a more accessible response more likely.
I would also like to suggest that a guideline on this issue might significantly improve both the experience of making requests for official information and the accessibility of released information.
I think it would be reasonable to expect agencies to release information with an accessibility-first mindset. This would be consistent with standards such as the New Zealand Government Web Toolkit, which recommends the following:
“Choose formats that are both easy to use and most likely to be accessible in the future. Choose openly documented formats over closed formats, or ensure that material released in closed formats is also accompanied by an equivalent in open formats.”
Particularly as many requests are made publicly available, either by the agency, through a service such as FYI.org.nz, or by direct sharing from person to person, it is important to consider issues such as accessibility for the blind and visually impaired for every request.
It is my opinion that accessibility should be considered an important part of information being made “available”, as per the Official Information Act’s Principle of Availability.