The Glasgow branch of LDNA attended this year’s Mozilla Festival (better known as MozFest) to help discuss the potential for linguistics to shape future development of the Internet of Things (IoT). The IoT is a catch-all term for the growing number of networked devices with which we interact every day, many of which – such as Amazon’s Alexa – require the ability to respond to spoken language commands. The Mozilla Foundation is increasingly concerned that voice data is controlled by corporate entities, and is attempting to gather its own open source of voice data via the Common Voice project. In collaboration with Glasgow researchers on the AHRC’s Digital Transformations theme, Marc Alexander and I spoke at a session designed to showcase some of the ways in which linguists gather, annotate, and process language data.
MozFest itself is a sprawling event, taking over Ravensbourne College in the North Greenwich area of London, and covering it in semi-impromptu meeting, discussion, coding, and maker spaces. It brings together enthusiasts for technological development, and puts otherwise disparate groups in the same place, allowing them to share ideas. Amongst the discussions I attended were an introduction to an open access academic publication platform, Aletheia; a meditation on the ways different types of media consumers might identify fake news; and an account of a project which encouraged hospital patients to design physical objects as repositories for electronic recordings of important stories and memories.
The atmosphere of the event was very similar to public engagement work the Glasgow team has previously undertaken, for example at the European Researchers’ Night ‘Explorathon’ events. The mixture of different people and different subjects often offers inspiration through the unexpected juxtaposition of projects and ideas. We were, almost certainly, the only linguists present. When you consider, however, the extent to which the future of human–technology interaction is likely to rely on voice and language processing, I hope that our contribution to the field is only just beginning. There is a wealth of established knowledge and groundbreaking research being conducted on language that surely must be useful to the people who are teaching machines to understand us – from the ways in which communicative data is encoded in grammatical structures, to LDNA’s interest in the way concepts can be identified and traced through their textual contexts.
The discussion following the team presentations raised points that had never before occurred to me. Instead of expanding the degree to which computers understand human language, are we racing towards limiting our vocabulary and speech patterns to only those which we already know can be understood? And is that a good thing or a bad thing? Even if an automatic language analysis system could be perfectly trained on the language of today, how would it keep up with changes to grammar and syntax? I hope that we gave some attendees something to think about, and I appreciate the feedback and discussion, which certainly broadened my horizons.