Breakfast test: worthwhile research can now take place in the time it takes to eat
Meaningful research into linguistics can now be conducted in the time it takes to have breakfast, thanks to the “transformative” impact of “big data” on the field.
That is the view of Mark Liberman, Christopher H. Browne distinguished professor of linguistics at the University of Pennsylvania, who told a panel discussion that “datasets are no longer the exclusive preserve of the scientific hierarchy” and that “any bright undergraduate with an internet connection can access and interpret the primary data”.
To illustrate his point during a recent event at the British Academy, he detailed how he had conducted his own “breakfast experiment” to ascertain whether there was any truth in the received wisdom that men and older people tend to be more “dysfluent” in their speech.
Professor Liberman performed a rapid statistical analysis over coffee and cornflakes of the number of “ums” and “uhs” in 2,500 hours of recorded and transcribed telephone conversations, classified by age and gender, that are available online.
While “uhs” performed as expected, “ums” seemed to buck the expected trend, leading Professor Liberman to speculate: “Are we seeing a substitution of ‘um’ for ‘uh’, with women leading the way?” Although such quick scans were “not a substitute for serious research”, it took him a mere 60 seconds to access the data, 5 minutes to create the graphs and 45 minutes to post a blog about it on the Language Log website.
Just as the microscope and telescope had opened up whole new worlds to investigate, he argued, thanks to big data “we can now observe linguistic patterns in space, time and cultural context, on a scale three to six orders of magnitude greater than in the past”.
Also speaking at the Language, Linguistics and the Data Explosion discussion, held earlier this month in conjunction with the Philological Society, were Sali Tagliamonte, professor of linguistics at the University of Toronto, and Philip Durkin, principal etymologist and deputy chief editor of the Oxford English Dictionary.
Professor Tagliamonte considered how different kinds of datasets can track patterns in language variation by sex, age, education and place, and what they reveal about the norms and practices of social groups.
Dr Durkin pointed to the immense value of “huge new digital resources, such as Early English Books Online” to scholars compiling historical dictionaries. However, he said, it remained to be seen how future scholars would strike a balance between “traditional reading, human combing of databases, and automated trawling and sketches”.