Tag Archives: Sheffield

Talk About Change: LDNA at Festival of the Mind

Last weekend, Linguistic DNA & friends took over the Spiegeltent in Sheffield city centre, as part of the University’s Festival of the Mind. Spiegeltents are a Belgian invention–tents decorated internally with mirrors, creating the perfect space to share myriad reflections.

Over the course of two hours, we hosted a performance of new writing that emerged from collaboration with Our Mel (a Sheffield-based social enterprise dedicated to exploring cultural identity) and novelist Désirée Reynolds. Each of the pieces performed has also been published as part of a limited edition anthology: “Talk About Change: Writing as Resistance”.

The Researchers’ Introduction outlines a little more of the process that culminated in some extraordinary writing (excerpted from the print anthology):

Talk About Change: Writing as Resistance

Funded by the University of Sheffield’s Festival of the Mind, our collaborative workshops used examples of early modern word use (from the Linguistic DNA project and related research) as a starting point to think about language use today. How can the past speak to the present? How might the present speak to the past?

As reflected in the structure of this anthology, the workshops explored four central themes: diversity, feminism, immigration and race. These were selected by Annalisa and Désirée, who also provided the extra focus on “writing as resistance”. In each case, the Linguistic DNA researchers sought to introduce historic material that might prompt conversation about the themes, and perhaps even fuel the resistance. Some input drew on prior research (especially for feminism and immigration sessions, which drew on Iona’s thesis and engaged also with the 500 Reformations project). Just as often, it was a basic excursion into early modern material, with a beginners’ introduction to linguistics and studying meaning (courtesy of Seth).

The most inventive work happened when we brought this material into the open sessions.

Together with all who attended the workshops, we compared the role of diversity in historic texts to its position in modern culture: what once characterised a multiplicity of opinion is now used paradoxically of something individual. We considered aspects of feminist debate before the word feminism existed, exploring how the power of virtue changed as men (mostly) discussed the role of women in sixteenth-century England. Using texts about strangers, we examined parallels between the way people wrote (and complained) about early modern outsiders and modern discourse about immigrants. We reflected on the roots of race, its links to kinship, descent, and community, and the relationship between structures of language and structures of power.

In each session, novelist and creative-writing facilitator Désirée Reynolds recommended other writings to bring out different dimensions of the themes. Wide reading was encouraged, and what you will find in the pages that follow reflects the careful crafting of a range of experience and inspiration drawing on at least five centuries of language use.

It is Writing as Resistance.

It comes from Talking About Change.

If you would like a copy of the anthology (free!), you can register interest (first come, first served) by filling out a short Google form.

(You can also read some words from the Editor, over on the 500 Reformations website.)

Linguistic DNA at SRS 2018: Abstracts

Knowledge, truth and expertise: experiments with Early English Books Online

Wondering what Linguistic DNA is bringing to the Society for Renaissance Studies? Here are the abstracts for two panels of papers, and information about our hands-on demonstration session (drop in).

United by a common interest in data-driven approaches to meaning and a focus on the transcribed portions of Early English Books Online (EEBO-TCP), this interdisciplinary panel brings together new research from the Linguistic DNA project and the Cambridge Concept Lab.

What is EEBO anyway? Contextual study of a universe in print
Iona Hine and Susan Fitzmaurice (University of Sheffield)

Since 2015, the Linguistic DNA team has been developing methods for mapping meaning and change-in-meaning in Early Modern English. Our work begins with the hypothesis that meanings are not equivalent to words, and can be invoked in many different ways. For example, when Early Modern writers discuss processes of democracy, there is no guarantee they will also employ a keyword such as democracy. We adopt a data-driven approach, using measures of frequency and proximity to track associations between words in texts over time. Strong patterns of co-occurrence between words allow us to build groups of words that collectively represent meanings-in-context (textual and historical). We term these groups “discursive concepts”.
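To make "frequency and proximity" concrete, here is a minimal sketch of counting which words appear near each other in a text. It is only an illustration: the function name, the window size and the sample sentence are all assumptions made for this example, and the project's actual measures are not described in the abstract.

```python
from collections import Counter

def cooccurrence_counts(tokens, window=3):
    """Count unordered word pairs that occur within `window` tokens of each other.

    Hypothetical helper for illustration; not the Linguistic DNA implementation.
    """
    pair_counts = Counter()
    for i, left in enumerate(tokens):
        # Look ahead at the next `window` tokens only (proximity).
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                pair_counts[tuple(sorted((left, right)))] += 1
    return pair_counts

# Invented sample sentence, not taken from EEBO-TCP.
tokens = "the people chose their rulers and the people judged their rulers".split()
pair_counts = cooccurrence_counts(tokens, window=3)

# Pairs that keep turning up together (frequency) rise to the top.
strongest_pairs = pair_counts.most_common(5)
```

Run over dated texts, counts like these could be compared period by period to see which associations strengthen or fade. Note that the pair "people" and "rulers" is counted here even though no keyword such as democracy appears.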

The task of modelling discursive concepts in textual data has been absorbing and challenging, both theoretically and practically. Our main dataset, transcriptions of texts from Early English Books Online (EEBO-TCP), contains more than 50 000 texts. These include 9000 single-page broadsheets and 162 volumes that span more than 1000 pages. There are 127 items printed pre-1500, and nearly 7000 from the 1690s. The process of analysis therefore requires us to think carefully about how best to control and report on this variation in data distribution.
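One common way to control for uneven amounts of text per period is to report rates rather than raw counts. The sketch below assumes a simple "hits per million tokens" normalisation; the decade labels are real, but every figure is invented for illustration and the abstract does not say which controls the team adopted.

```python
def per_million(hits, total_tokens):
    """Express a raw count as occurrences per million tokens."""
    return hits * 1_000_000 / total_tokens

# Invented figures for two periods of very different size (not EEBO-TCP data).
decades = {
    "1490s": {"hits": 12, "tokens": 400_000},
    "1690s": {"hits": 900, "tokens": 90_000_000},
}

rates = {
    decade: per_million(counts["hits"], counts["tokens"])
    for decade, counts in decades.items()
}
```

With these made-up numbers the later decade has far more raw hits, yet the earlier decade has the higher rate, which is the kind of distortion that careful reporting needs to expose.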

One particular question that has arisen affects all who attempt to use EEBO: what is in it? To what extent is its material from pre-1500 similar in kind (genre, immediacy, etc.) to that of the messy 1550s (as the English throne shifted speedily between Edward VI and his siblings), the 1610s (era of Shakespeare and the King James Version), or the 1640s (when Civil War raged)? This paper is a sustained reflection on attempts to find out “What’s in EEBO?”

In the beginning was the word?
EEBO-TCP and another universe of meaning
Seth Mehl (University of Sheffield)

When a new idea is conceived, how does it find expression in language? Between 1450 and 1750, the English lexicon expanded dramatically, and literary scholars, philologists, linguists, and historians have sought to document and demonstrate the paths taken by key social and cultural vocabulary, charting the history of what would become key social and cultural ideas, discourses, and concepts. In such cases, the topic and language for investigation has been intuited on the basis of extended qualitative reading, and the objects of investigation tend to be individual words. With the advent of a searchable database of early modern texts, such intuitions can be tested at scale, and the initial object of inquiry can shift from individual words to relationships between sets of words.

What happens when we invert the traditional process, taking the thousands of texts digitised in EEBO-TCP and applying computational techniques to model language change independent of human intuition? Can such techniques indicate meaningful relationships between key words that human researchers had not intuited or observed? To what extent do observations founded on over 1 billion words of early modern English correspond to and diverge from what scholarly readers have already inferred? Is it possible to identify discourses around key ideas even when the apparently related key words are absent? Combining insights from the Keywords Project with tools developed by the Linguistic DNA project, this paper will explore how concept modelling can be applied to re-examine meaning in early modern texts.

Beyond Power Steering:
re-constituting structures of knowledge in 17th-century texts
John Regan (University of Cambridge)

One of the axioms of the Cambridge Concept Lab is that digital means of enquiry should provide qualitatively new kinds of knowledge, if we are to realise their full value. This is to say, that computation should not merely provide ‘power steering for the humanities’, but allow one to discover something different in kind about how knowledge was structured in the past.

Making good on this axiom necessitates judgements on the part of the user of digital technology about how to design one’s modes of address to (for example) natural language data sets such as Early English Books Online-TCP, in order that one is not only adding ‘power steering’ to existing, familiar types of enquiry. It also necessitates making decisions about when to come to rest at results (that is, when to cease enquiry); judgements of where digital data can be said to be producing discrete and unfamiliar forms of knowledge.

This paper will present tentative first signs of what the Cambridge Concept Lab believe are historically-discrete conceptual structures, based on data from the early seventeenth-century portion of EEBO-TCP. Two such structures will be described, one entitled ‘Mutual Dependence’, the other ‘Self-Consistency’. As will be shown, familiar forms of knowledge that are held and expressed in sentences and paragraphs, organised by grammar and understood by readers largely as explicit sense, may be contrasted with this evidence of qualitatively different conceptual structures in the textual record. While this paper does not set out to debunk existing theories of the structuration of knowledge and its transmission in the seventeenth century as have become established through centuries of close reading, it does seek to enrich our understanding of these traditions by attending to conceptual, and not exclusively semantic, thematic or rhetorical, structures.

It appears uncontroversial to assert that concepts are determining with regard to features of language use such as explicit and implicit semantic fields, theme, word order, and syntactic relations at the level of the sentence. Nevertheless, recognising that concepts have lexical and semantic extension is not the same as accepting that the two are identical in kind. This paper’s claims about conceptual structure will be based upon evidence from the early decades of seventeenth-century data from EEBO-TCP.

Our afternoon panel is a little depleted (by ill-health) but features Jose M. Cree (Sheffield) on Neologisms and the English reformation, Lucas van der Deijl (Amsterdam) on The collaborative Dutch translations of Descartes by Jan Hendrik Glazemaker (1620-1682), and a little extra time for discussion.

DROP-IN SESSION

All SRS delegates are very welcome to drop in to our demo workshop, where we will be providing a 10-15-minute introduction to our tools (3:30pm, repeated at 4:30pm) and the opportunity for hands-on experimentation. This is in the Hicks Building, Floor G, room 29. (About 2 minutes’ walk from Jessop West, across the main road and a little uphill.)

Snapshot from campus map, featuring the Hicks Building.

Showcasing Linguistic DNA

On Saturday 11 March (2017), some of the LDNA team took part in a Showcase as part of the University of Sheffield’s Festival of the Arts and Humanities. The event took place at Sheffield’s Millennium Galleries, allowing members of the public to discover different aspects of humanities research presented through exhibitions, activities and short presentations. Visitors found information about literature and archaeological findings, and could try out different instruments or take an implicit bias test brought in by the Philosophy Department. We asked Sheffield postgraduates Nadia and Winnie to reflect on their experience preparing for and staffing a stall as part of their MA work placements.

Winnie writes:

I prepared a handout using data from Ways of Being in a Digital Age (WOBDA). The process—zooming from abstract trios extracted from a dataset to see the patterns they made in a small extract of text—was fascinating. At first I was worried about how well the concept would translate to a non-specialist audience, but then I realised that involves negative preconceptions about what a non-specialist audience is: somehow less interested or capable of critical engagement than a specialist one. I therefore decided not to “aim” anything “at” anyone, but instead tried to summarise trios in a way that made the most sense to me as a newcomer to Linguistic DNA’s methods. I chose a single pair (internet + craving), made up a colour-coded table of the items that formed trios with it, and then put this alongside highlighted examples of trios in a journal abstract.

Snapshot of a table showing associations with 'internet' and 'craving', with example text from a social sciences journal.

Illustrating patterns of association with “internet” + “craving” (from Winnie’s handout).

This turned out to be really useful because people were interested in the project from all kinds of angles, some of which changed how I thought about what LDNA does. Fiddling with data on the placement meant I’d got sidetracked in a sense into thinking about WOBDA as a technical exercise, but the Showcase helped me see the bigger picture. Visitors were intrigued by Linguistic DNA as a name; one person was interested in whether the project was making any claims about genetic hard-wiring. Another, an IT professional, was interested in the double helix visualisation on the website, and said it would make him think about his own designs. I particularly remember a conversation with an artist who was interested in researching discourses around disability. We talked about how to query corpora, which tools were available and easy to use, the advantages and disadvantages of the BNC versus the web as corpus, how the age of the BNC might affect the language it contained, and the difference between collocations and discourse concepts as shown in WOBDA. She was also interested in word clouds; the idea of extracting implicit relationships in language and making them visible seemed to be something that appealed strongly to both adults and children who stopped at the stall.

Nadia writes:

Beyoncé and crew salute military-style. Photo by Asterio Tecson.

Beyoncé Knowles performing in Central Park, July 2011. Image copyright (c) Asterio Tecson; used under Creative Commons license 2.0.

To prepare for this event, I mostly focused on the YouTube data. I prepared an informative and colourful poster with prominent examples, including images of Beyoncé (left) and the video game World of Tanks, to attract visitors and to suggest that we conduct contemporary research. I also searched our data for some prominently occurring words.

Individual associations, courtesy of @ShefEnglish.

Individual associations at the LDNA stand, courtesy of @ShefEnglish on Twitter.

Because we do not yet have representative results for the Militarization 2.0 work, I often pointed to the Linguistic DNA research as the mother project of the YouTube project. The examples proved very useful since they complemented the information given on the posters. The audience was provided with representative examples from the Linguistic DNA project for them to look at and take away. Moreover, people had the chance to play with word cards and group them together according to their own individual word associations (right).

I observed that many people grouped together words with a similar meaning (such as ‘succeed’ and ‘win’), whereas others clustered together words according to very personal associations. An 8-year-old girl was fascinated by the cards, pairing ‘victory’ and ‘win’; we looked at how these words appear in our given examples and the advantages of having a computer that counts words as lemmas. One visitor told us about his aphasia and how it changed and affected his use of language, which made me realise that alongside the Linguistic DNA we are researching, every person has his or her own, very personal linguistic DNA. Another visitor was inspired by the YouTube project and connected language use to social issues, such as the omnipresence of on- and offline violence, providing food for thought for all participants in the conversation.

From my point of view, the event was a success. Many visitors seized the opportunity to have a chat with us, which led to various stimulating encounters and conversations. It was intriguing to see that numerous people were willing to share personal stories and views on language and its importance. The public seemed to engage and identify with our project on many different levels, which confirms how important this kind of research is—not only for the academic community but also for the public.

Also participating in the Showcase were LDNA Research Associates Seth Mehl (below left), who delivered a bitesize talk asking “What can computers teach us about meaning in early English books?”, and Iona Hine (right, during her bitesize talk about “Luther’s Language”).

(Photos courtesy of @DHIShef and D. Clark.)

Text Analytics at Sheffield DH Congress

Earlier in the year (2016), we issued a special call for papers, inviting others to join LDNA panel sessions at the Sheffield Digital Humanities Congress. We were delighted by the responses, and further delighted that the full DHC programme includes plenty of other material relevant to our text analytics interests–and a noticeable body of book historical input too.

As a special privilege for those who follow the LDNA blog, here are two bonus abstracts outlining our conception of each LDNA panel:

TA 1: Between numbers and words

Session 4, Friday 9 September
ft. Hine, Shute, Siirtola et al.

Digitisation of texts facilitates kinds of statistical analysis that were previously difficult and perhaps impossible for humans to carry out. This series of papers explores the interface between statistics and close reading, teasing out how these modes of textual analysis can be applied jointly to explore and analyse the material, lexical and semantic form of constituent texts. We discuss the use of quantitative analysis to reassess hypotheses about the work of compositors in fifteenth-century printing. We scrutinise a blueprint for moving between statistical data and words-in-context within collections too big for human reading (with special attention to concept formation). Lastly, we demonstrate how one newly-enhanced visualisation tool assists exploratory analysis to generate insights about genre and social variables in digital text collections including early modern correspondence and international Englishes.

TA 2: Identifying complex meanings in historical texts

Session 7, Friday 9 September
ft. Mehl, Recchia, Makela, et al.

With recent advances in computational tools and techniques, researchers are moving closer to the goal of identifying and describing complex meanings—semantic, discursive, social, and otherwise—in historical texts. This session approaches that goal from multiple angles. We discuss semantic meaning in terms of distributional semantic techniques, which connect the study of meaning in the humanities with the quantitative study of language in computational linguistics. We discuss discursive meaning via topic modelling techniques, and also explore the theoretical space between distributional semantics and topic modelling. Finally, we discuss social and historical meanings by looking at possibilities for analysing extra-linguistic contexts alongside linguistic data, within carefully annotated, structured data sets.

If that’s whetted your appetite, you will find full abstracts for each paper–and for every paper in the Congress–on the main DHC site.

Last registration date is 7 September.