Verbalising data

By Aaron Schiff 11/02/2015


I just read An Attempt at Exhausting a Place in Paris, the book in which, over three days in 1974, Georges Perec attempted to record everything that happened in the Place Saint-Sulpice in Paris. Here is a typical passage:

A bus (Globus) three-quarters empty
A lady who has just bought an ugly candleholder goes by
A small bus goes by: Club Reisen Keller
Bus. Japanese.
I’m cold. I order a brandy
A car goes by, its hood covered in dead leaves
A motorcyclist goes by, pushing a very new red Yamaha 125
For the umpteenth time the 79 rue de Rennes auto-driving school car goes by
A little girl with a blue balloon goes by
For the second time a meter maid in slacks goes by
Beginnings of a traffic jam in rue Bonaparte
Lots of people, lots of cars
A man goes by, eating a cake (the reputation of the neighbourhood confectioners is not to be doubted)
A bus: Paris-Sud buses: are they tourists?
The bells of Saint-Sulpice begin ringing, maybe for the wedding. The big doors of the church are open.
Paris-Vision bus
The bridal procession enters the church
Traffic jam in rue du Vieux-Colombier
The buses are at a virtual standstill on the square

In essence the book is a presentation of data: a data verbalisation. Perec could instead have made charts of his observations, but his prose does a very good job of capturing the life and rhythms of the Place. After reading the book you have an overall sense of what happened and of the atmosphere, but the specific facts are a blur.

The book got me thinking about the presentation of data more generally, and about how much accuracy and detail are required. For example, here is a chart of international tourist arrivals to New Zealand during 2014, by country of residence:


What do you care about in this chart and what will you remember from it? If you work for an airline or an airport you might care about all the details and all the numbers. But I bet the average person only cares about the overall picture and will only remember a few key facts (if anything).

Some key facts could instead be verbalised: In 2014, Australia was the most common country of origin for visitors to New Zealand, with around one and a quarter million visitor arrivals. Next was China with just over a quarter of a million arrivals, followed by the US and UK with around 200,000 arrivals each. There were smaller numbers of arrivals from many other countries, bringing the total for the year to around 2.8 million.

Beyond this, unless you are particularly interested in this data, your ability and motivation to understand and remember facts probably diminish rapidly. The advantages of a verbalisation are that it presents the key facts quickly and simply, without requiring the reader to have any special skills in data interpretation. However, it is difficult to write a truly emotionless description of the data, and there is a risk that the writer’s own biases will affect the interpretation. (There is also the question of which facts to include or exclude, but that question must also be confronted when making a chart: for example, I had to decide which countries to put in the “All other” category, and chose those with fewer than 5,000 arrivals.)
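The “All other” rule mentioned above is simple enough to sketch in code. This is only an illustration of the idea, not the script used for the chart: the function name, the label argument and the example figures are all assumptions made up for the sketch, and only the 5,000-arrival cut-off comes from the text.

```python
def lump_small_categories(arrivals, threshold=5000, other_label="All other"):
    """Merge every category with fewer than `threshold` arrivals into one bucket."""
    kept = {}
    other_total = 0
    for country, count in arrivals.items():
        if count < threshold:
            other_total += count  # too small to show on its own
        else:
            kept[country] = count
    if other_total:
        kept[other_label] = other_total
    return kept


# Placeholder figures for illustration only, NOT real arrival counts.
example = {"Country A": 120000, "Country B": 4000, "Country C": 900}
lumped = lump_small_categories(example)
print(lumped)
```

Moving the threshold changes which countries the reader ever sees, which is the same inclusion decision a writer makes when choosing the facts for a verbalisation.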

The verbalisation that I wrote above is efficient at conveying a few facts, but it is also somewhat boring (again, unless you are in the tourism industry). A page filled with such paragraphs might not attract much attention. In contrast, visualisations can generate more interest, while sometimes being less efficient and requiring more effort to understand. As always, trade-offs exist and there is no single best approach; consideration of the audience is essential.
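A plain verbalisation like the tourism paragraph above could, in principle, be generated mechanically. The sketch below is a rough illustration, not how that paragraph was written: the function names and sentence templates are hypothetical, and the figures are the rounded approximations quoted in the prose rather than the exact counts behind the chart.

```python
def round_phrase(n):
    """Describe a count in round, conversational terms."""
    if n >= 1_000_000:
        return f"around {round(n / 1_000_000, 2):g} million"
    return f"around {round(n, -3):,}"


def verbalise(arrivals, total, year, top_n=4):
    """Turn the largest few categories into a short paragraph."""
    ranked = sorted(arrivals.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    leader, leader_count = ranked[0]
    parts = [
        f"In {year}, {leader} was the most common country of origin, "
        f"with {round_phrase(leader_count)} arrivals."
    ]
    followers = ", ".join(f"{c} ({round_phrase(n)})" for c, n in ranked[1:])
    if followers:
        parts.append(f"Next came {followers}.")
    parts.append(f"The total for the year was {round_phrase(total)}.")
    return " ".join(parts)


# Rounded approximations taken from the prose above, not exact data.
approx_2014 = {
    "Australia": 1_250_000,
    "China": 250_000,
    "US": 200_000,
    "UK": 200_000,
}
text = verbalise(approx_2014, total=2_800_000, year=2014)
print(text)
```

Output of this kind is exactly the efficient-but-boring paragraph described above; templates can report facts, but they supply none of the atmosphere.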

Reading Perec’s book, I also realised that verbalisation plus human imagination can be extremely effective and efficient at conveying information. For example:

Umbrellas sweep into the church

With these few words, the skilled writer is able to convey a lot of information about this scene — you can imagine the weather, the mood, and the movement. I’m wondering whether such an approach can be used in the verbalisation or visualisation of data more generally, without straying too far away from the facts and into emotional territory.