Recently we had the International Film Festival where I live. I don’t get out as much as I should, and so for my enlightenment as much as my entertainment I decided to catch a few flicks. If you’ve done similarly, even if the ‘film’ was Godzilla or a bad YouTube video, you may be aware of what happens when the voiceover doesn’t match the movements of the actors’ lips. Utter frustration. Sometimes the synchronisation is so bad that you just begin to ignore the mouth movements. But sometimes the mouth movements are almost, but not quite synchronised to the voice. When this happens the words can sound weird or blurred.
This effect was first described in the 1970s by a Scottish cognitive psychologist called Harry McGurk, who studied what our brains do when we talk to someone else. He was particularly curious about the importance of visual information, like the way a person moves their mouth when they speak to us.
What McGurk found was that we use both the sound of the speech and the visual information of the mouth movements to understand what another person is saying. This might not be obvious, but think about what you do when you have difficulty hearing someone – you watch their mouth for clues about what they're saying.
The less obvious thing that McGurk discovered was that when the spoken words and the mouth movements do not completely line up, we pay more attention to the visual cues. To me that seems odd – if we are struggling to understand spoken, or auditory, information, why would our brains prefer visual cues over the sound of the speech? Why do our brains require vision in order to hear? The short answer is that we don't know, but new neuroscience research comes a bit closer to understanding what the brain does when the McGurk effect happens.
Researchers used subjects who were undergoing brain surgery for severe epilepsy. During this type of surgery patients remain awake, so the researchers showed them videos and measured the neural activity in different regions of their brains.
Subjects watched three types of video: one where the audio and visuals matched, one moderately mismatched (where the McGurk effect would happen), and one severely mismatched. In the first and last types of video, the patients' auditory brain areas showed high activity. Visual areas were also active, but the neural activity carrying sound information was dominant. Interestingly, however, when the McGurk effect happened the visual areas took over, dominating the neural information flow in the brain, while auditory regions became less active.
The downside to the study is that the researchers only looked at four subjects: two males and two females. Furthermore, the brain activity was recorded with slightly different techniques among the participants, which could lead to slightly different measurements, as well as the different pitfalls or limitations associated with each recording technique. Finally, there was no independent assessment of hearing, vision, or speech interpretation in any of the participants, nor was there any indication of whether their epilepsy could have affected neural activity in the auditory or visual brain areas. Any one of these factors could have influenced the results, though to be fair, the authors did address some of these issues.
Despite these potentially confounding elements, the reported results are completely consistent with the behaviours that characterise the McGurk effect, where the visual information seems to win out over the auditory information in the brain. Such consistency with preconceived notions is a very powerful force in our beliefs, and can sway scientists and non-scientists alike. Indeed, this might be one explanation for how a study with so many critical unanswered questions makes its way into the literature. Let me be clear: I am not saying that the findings are flawed, just that there is not enough information to state them with such certainty or to rule out other possibilities. And even though the findings are consistent with our prior ideas, there is still a lot of room for further studies that delve more deeply into the unanswered questions – indeed, the very questions that this study could not answer. For example, could there be some involvement of additional brain regions where auditory and visual information are integrated with one another?
Outstanding questions notwithstanding, this study forms a foundation on which further studies can build: novel hypotheses can be formulated and tested to decipher the neural mechanisms that give rise to our unique, fascinating, and sometimes frustrating behaviours.
Finally, the press release for this paper claimed that the research shows that next-generation hearing aids should have a camera built in. A multi-modal sensory aid – a device that assists with both hearing and vision – sounds like a useful idea. To me it would also be beneficial to be able to hear auditory information without the McGurk interference from visual information. Thus, new sensory aids might benefit from being user-tunable, so that different sensory information dominates in different situations. It would certainly make it less annoying to watch poorly dubbed videos, and quite likely much easier to communicate in a variety of social situations; all-up making it easier to use our human brains for doing human things without impediment.
To experience the McGurk effect yourself, go here: http://www.youtube.com/watch?v=aFPtc8BVdJk First listen with your eyes closed. Then watch again with your eyes open.