Fun with ngrams
As everyone will know by now, Google Labs has created a database of words from Google Books and presented an on-line tool allowing users to create plots of ngrams.
Details of how all this works are briefly outlined in a help page and a research paper in Science.
I have to admit I think it’s a pity that Google–wealthy business that it is–didn’t pay to have the article presented as open-access. (Perhaps in another journal.) Surely it’d better promote their tool if they had?
In any event, Ed Yong has this covered on his blog. Rather than review the work, I’m offering this as an invitation to amuse yourself (i.e. holiday entertainment) in a less rigorous manner as suits the season…!
Below I’ve included a small number of plots that I generated with a little experimenting, to encourage readers to play.
Share your favourites in the comments.
Equal recognition, at last? In words at least –
Less morbid writing as the years go by? –
The rise of new technologies (note the spike for ‘computer’ at about 1906; hence my using no smoothing for this graph) –
A tale of two competitors? –
0 Responses to “Fun with ngrams”
For computer geeks, there’s a list over at ArsTechnica that’s worth looking at:
http://arstechnica.com/tech-policy/news/2010/12/history-of-computingin-handy-graph-form.ars
Although, what’s with leaving out the ZX80 among the computers? (Or, in my case, the TRS-80.)
I’d also question their not trying C++ (in place of C) in the programming languages, but it looks to me as if this isn’t recognised as a word in the database the ngrams are based on. (Or, alternatively, that it’s stripped back to just ‘C’.)
Quickly tried comparing ‘vaccine’ and a number of diseases targeted by vaccines. Results are fairly complex, though:
http://ngrams.googlelabs.com/graph?content=vaccine%2Csmallpox%2Cmeasles%2Cmeningitis%2Crubella&year_start=1800&year_end=2000&corpus=0&smoothing=3
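For anyone who’d rather script their plots than click through the form, the link above looks to follow a simple pattern: a comma-separated list of words, a start and end year, a corpus number and a smoothing value. Here’s a minimal Python sketch that rebuilds that link from its parts. The function name `ngram_url` and its defaults are my own invention, and the parameter names are simply read off the URL above, so treat this as a guess at the pattern rather than a documented interface.

```python
from urllib.parse import urlencode

# Base address taken from the link above; not a documented API endpoint.
BASE_URL = "http://ngrams.googlelabs.com/graph"


def ngram_url(terms, year_start=1800, year_end=2000, corpus=0, smoothing=3):
    """Build a graph link for the given words (hypothetical helper).

    The query parameter names mirror those visible in the link above.
    """
    query = urlencode({
        "content": ",".join(terms),  # words are joined with commas
        "year_start": year_start,
        "year_end": year_end,
        "corpus": corpus,
        "smoothing": smoothing,
    })
    return f"{BASE_URL}?{query}"


url = ngram_url(["vaccine", "smallpox", "measles", "meningitis", "rubella"])
print(url)
```

Swap in your own list of words to get a link you can paste into a browser. Presumably passing `smoothing=0` gives the unsmoothed plot, though I haven’t confirmed that from the URL alone.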