The Zen of Open Data

By Chris McDowall 12/10/2010

This morning I was writing code in a programming language called Python. I hit a sticky problem and turned to an arcane feature of the language known as “The Zen of Python” for guidance. There I read the words, “In the face of ambiguity, refuse the temptation to guess,” and I was enlightened.
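For anyone who has not met it, “The Zen of Python” ships with the language as a small module called `this`. The snippet below is only a minimal sketch of how one might pull out the line quoted above. It assumes the standard CPython interpreter, where the module keeps its text ROT13-encoded in an attribute named `s`; the variable names `zen_text` and `quoted` are my own.

```python
import codecs
import this  # importing the module prints the full Zen of Python once

# CPython stores the text ROT13-encoded in the module attribute `s`.
zen_text = codecs.decode(this.s, "rot13")

# Pick out the aphorism about ambiguity.
quoted = [line for line in zen_text.splitlines() if "ambiguity" in line]
print(quoted[0])
```

Typing `import this` at an interactive Python prompt is the usual way to read the whole thing.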

Open data has been on my mind lately. Open data is a philosophy and practice advocating that data should be freely available to everyone, without restrictions. Following the experience I related above, I began to wonder what “The Zen of Open Data” might look like. I wrote something over morning tea that tries to boil down all the stuff I have heard and read on the topic over the past two years and posted it to the New Zealand Open Government Ninjas forum.

I share it below for anyone who might be interested.

Image by Sienna.

The Zen of Open Data, by Chris McDowall

Open is better than closed.
Transparent is better than opaque.
Simple is better than complex.
Accessible is better than inaccessible.
Sharing is better than hoarding.
Linked is more useful than isolated.
Fine grained is preferable to aggregated.
(Although there are legitimate privacy and security limitations.)
Optimise for machine readability — they can translate for humans.
Barriers prevent worthwhile things from happening.
‘Flawed, but out there’ is a million times better than ‘perfect, but unattainable’.
Opening data up to thousands of eyes makes the data better.
Iterate in response to demand.
There is no one true feed for all eternity — people need to maintain this stuff.

Many people inadvertently contributed to this text. One particularly strong influence was a panel discussion between Nat Torkington, Adrian Holovaty, Toby Segaran and Fiona Romeo at Webstock’09.

Responses to “The Zen of Open Data”

  • Chris, I think what you have written is beautiful, and poignant. I submit, for your consideration, the Four Noble Truths of Open Data:
    1. Life means suffering.
    2. The origin of suffering is email attachments.
    3. The cessation of suffering is attainable.
    4. Open data is the path to the cessation of suffering.

  • @Julian A possible amendment to number 2.

    The origins of suffering are email attachments and data embedded in PDFs.

  • Perhaps a slightly expanded version then.

    The Four Noble Truths of Open Data:
    1. Life means suffering.
    2. The origin of suffering is email attachments in proprietary formats.
    3. The cessation of suffering is attainable through open standards and APIs.
    4. Open data is the path to the cessation of suffering.

  • Sorry Chris
    No zen here…. zen would never say that x is better than y
    Zen would say:
    Better for us that the lion’s cage is closed but the dolphin pool is better open, so we can swim with them
    Opaque is more interesting and protective than transparent but transparent saves time imagining
    In order to simplify, one must first understand complexity
    Hoarding can benefit one but sharing may benefit all
    In isolation we may discover the reality of the way things are interlinked (NB: this is the art of meditation)
    It is necessary to understand both granularity and aggregation in order to understand how a substance behaves… etc…

  • @Sally Thanks for your comment.

    As I noted in the post above, the poem’s title is a reference to a feature of the Python programming language named “The Zen of Python”. “The Zen of Python” is misnamed; it should probably be called something like “Fundamental Principles for Programming in Python”. Given that many programmers are familiar with “The Zen of Python”, and seeing as I borrowed so heavily from its structure, I went with it. Think of my poem as a riff on that one rather than a reference to a school of Buddhism.

  • This is really quite awesome and I would very much like to use it in presentations (if that’s cool, and referencing you of course). May I add a few? “Privacy by design is better than privacy by omission.” And I would tweak the thousands of eyes line. I’m not sure I agree it makes the data better – the data is the data – but it does make the “usability/outcomes” of the data clearer or perhaps more evident? If that’s what you mean by “literacy” well, OK then. Anyway, kudos – I was just put on to this by a colleague and it really hits the spot!

  • @Keith. Thanks for your comment. Feel free to use and modify the text as you see fit. All the content on this blog is licensed CC-BY.

    The “Opening data up to thousands of eyes…” line is intended to convey the experience that opening data up to more people will expose errors in your data. If errors are found, they can (hopefully) be fixed. Hence the data gets better.

    I think the work you are doing is brilliant. I have come across your site several times over the last year. Excellent stuff.