Andreea S. Calude, University of Waikato and David Trye, University of Waikato
Hashtags are a pervasive feature of social media posts and are widely used in search engines.
Anything with the intent of attracting a wide audience usually comes with a memorable hashtag — #MeToo, #FreeHongKong, #LoveWins, #BlackLivesMatter, #COVID19 and #SupremeCourt are just some examples.
First conceived in 2007 by blogger and open source advocate Chris Messina on Twitter, hashtags are now also escaping from social media contexts and appearing regularly in advertising and protest signs, and even in spoken language.
But are hashtags words?
If there is one thing linguists ought to know, it’s words. But when it comes to hashtags, the definition is not straightforward.
In our research, based on a collection of millions of New Zealand English tweets, we argue hashtags are, at best, artificial words.
Problems with words
Let’s first look at how we usually recognise words. The simplest way is by following a native speaker’s intuition.
If you had to identify the words in the previous sentence, you might begin by listing everything separated by spaces: the, simplest, way and so on. But what would you do with “speaker’s”? Is that one word or two?
Laypeople will likely think of it as one word. Grammarians may argue it’s two, or even worse, 1.5 words: you have the speaker part and the possessive case marker (’s), which is technically not a word, but not a non-word either (it is a clitic).
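The space-based approach described above can be sketched in a few lines of Python. This is only an illustration of the naive heuristic, not a method used in the research; the function name split_on_spaces is hypothetical.

```python
import string

# The example sentence from the article.
sentence = "The simplest way is by following a native speaker’s intuition."

def split_on_spaces(text):
    """Naive word finder: split on whitespace, then trim punctuation
    from the edges of each token. Word-internal marks are left alone."""
    edge_punct = string.punctuation + "“”‘’"
    tokens = [tok.strip(edge_punct) for tok in text.split()]
    return [tok for tok in tokens if tok]

tokens = split_on_spaces(sentence)
print(tokens)       # "speaker’s" comes out as a single token
print(len(tokens))  # 10
```

The heuristic counts “speaker’s” as one word simply because it contains no space, so the question of the clitic is never even asked.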
But using spaces as clues for word boundaries is a luxury available only to written languages. What about languages that only have a spoken form, such as Tinrin of New Caledonia?
Phonological cues, such as acoustic “spaces” or short pauses between words, are no more reliable. Many grammar words, such as articles (the, a) and prepositions (to, of, at), are used frequently but are typically unstressed and uttered quickly, receiving virtually no “airtime” in the rush of content words like nouns, verbs and adjectives that carry the most important part of a message.
Just about every criterion proposed for words has its own problems, as described by linguists Laurie Bauer and Martin Haspelmath. Despite their seemingly straightforward nature, words are tricky for linguists.
There are two main theories regarding the linguistic status of hashtags. The first claims hashtags are like compound words. Compounding is essentially a way of making new words by gluing two (or more) existing words together. In English, compounds can be spelled as one word (blackboard, greenhouse), as two words separated by a space (bus stop, apple pie) or with hyphens (forget-me-not).
The second idea is that hashtags are words that arise from a completely different process, unlike anything we have seen before. This hashtagging is a much looser word-formation process, with fewer restrictions. As long as a hashtag symbol is used and no spaces appear between the parts, anything goes — #lovehashtagging, #lazysundayafternoon, #MāoriLanguageWeek.
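The formation rule just described (a hashtag symbol followed by material with no spaces) can be approximated with a regular expression. The sketch below is an assumption made for illustration: the pattern and the function name find_hashtags are hypothetical, and real platforms apply their own, more detailed rules.

```python
import re
import unicodedata

# Assumed pattern: "#" followed by one or more letters, digits or
# underscores. Anything else, including a space, ends the hashtag.
HASHTAG = re.compile(r"#\w+")

def find_hashtags(text):
    """Return every hashtag-like string in text, in order of appearance."""
    # Normalise to NFC so that letters with macrons (such as "ā")
    # are single characters that the \w class can match.
    text = unicodedata.normalize("NFC", text)
    return HASHTAG.findall(text)

print(find_hashtags("Enjoying #lazysundayafternoon during #MāoriLanguageWeek!"))
print(find_hashtags("#lazy sunday afternoon"))  # the space cuts the tag short
```

Nothing in the pattern checks whether the material after the symbol is made of real words, which is the sense in which anything goes.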
Our research argues against both these proposals by rejecting the notion that hashtags should be treated as words. We suggest hashtags are written to look orthographically like words, but their function is much broader, closer to that of keywords in a library catalogue or search engine.
But just because hashtags aren’t words per se, that doesn’t mean they are not linguistically interesting. On the contrary, we found hashtags allow tweeters to express themselves in many creative ways, and they are used for various functions, including humour and language play.
For example, some tweets start with the hashtag #youknowyoure(a)kiwiwhen or contain #growingupkiwi to reference, in a self-deprecating way, stereotypical Kiwi lifestyle qualities or childhood nostalgia.
In a more serious and controversial vein, the hashtag #hakarena pokes fun at the All Blacks’ performance of the haka before rugby matches. It references the Māori tribal dance haka and links it to the Latin American song Macarena in what some consider a derogatory way.
The hashtags we analysed also showed new ways in which tweeters harness lexical resources from different languages. Hybrid hashtags, as we term them, are hashtags that combine one or more words from each of two distinct languages: in our case, English and Māori, the indigenous language of New Zealand. Examples include #kiaora4that and #letssharegoodtereostories.
Far from being a source of linguistic demise, social media language continues to help us understand a bit more of the puzzle of human communication.