As those who follow our Twitter account will know, Linguistic DNA’s principal investigator, Susan Fitzmaurice, was among the invited speakers at the recent symposium on Digital Humanities & Conceptual Change (organised by Mikko Tolonen, at the University of Helsinki). It was an opportunity to set out the distinctive approach being taken by our project and the theoretical understanding of concepts that underpins it. What follows is the first of three blog posts based on extracts from the paper, aka the Linguistic DNA ‘manifesto’. Susan writes:
Linguistic DNA’s goal is to understand the ways in which the concepts (or paradigmatic terms) that define modernity emerge in the universe of Early Modern discourse. The methodology we are committed to developing, testing and using, i.e. the bottom-up querying of a universe of printed discourse in English, demands that we take a fresh look at the notion of a concept and its content. So how will we operationalise a concept, and how will we recognise a concept in the data?
Defining the content of a concept from above
Historians and semanticists alike tend to start by identifying a set of key concepts and pursue their investigation by using a paradigmatic approach. For semanticists, this entails identifying a ‘concept’ in onomasiological terms as a bundle of (near-)synonyms that refer to aspects of the semantic space occupied by a concept in order to chart conceptual change in different periods and variation in different lects.
Historians, too, have identified key concepts through keywords or paradigmatic terms, which they then explore through historiography and the inspection of historical documents, seeking the evidence that underpins the emergence of particular terms and the forces and circumstances in which these change (Reinhart Koselleck’s Begriffsgeschichte or Quentin Skinner’s competing discourses). Semanticists and historians alike tend to approach concepts in a primarily semasiological way, for example, Anna Wierzbicka (2010) focuses on the history of evidence, and Naomi Tadmor (1996) uses ‘kin’ as a starting point for exploring concepts based on the meanings of particular words.
Philosophers of science, who are interested in the nature of conceptual change as driven or motivated by scientific inquiry and technological advances, may see concepts and conceptual change differently. For example, Ingo Brigandt (2010) argues that a scientific concept consists of a definition, its ‘inferential role’ or ‘reference potential’ and the epistemic goal pursued by the term’s use in order to account for the rationality of semantic change in a concept. So the change in the meaning of ‘gene’, from the classical gene which is about inheritance in the 1910s and 1920s, to the molecular gene in the 1960s and 1970s which is about characteristics, can be shown to be motivated by the changing nature of the explanatory task required of the term ‘gene’. In such a case, the goal is to explain the way in which the scientific task changes the meaning associated with the terms, rather than exploring the change itself. Thus Brigandt tries to make it explicit that
‘apart from a changing meaning (inferential role) [the concept also has] an epistemic goal which is tied to a concept’s use and which is the property setting the standards for which changes in meaning are rational’ (2010: 24).
His understanding of the pragmatics-driven structure of a concept is a useful basis for the construction of conceptual change as involving polysemy through the processes of invited inference and conversational implicature (cf. Traugott & Dasher, 2002; Fitzmaurice, 2015).
In text-mining and information retrieval work in biomedical language processing, as reported in Genome Biology, concept recognition is used to extract information about gene names from the literature. William Baumgartner et al. (2008) argue that
‘Concepts differ from character strings in that they are grounded in well-defined knowledge resources. Concept recognition provides the key piece of information missing from a string of text—an unambiguous semantic representation of what the characters denote’ (2008: S4).
Admittedly, this is a very narrow definition, but given the range of different forms and expressions that a gene or protein might have in the text, the notion of concept recognition needs to go well beyond the character string and ‘identification of mentions in text’. So they developed ‘mention regularization’ procedures and disambiguation techniques as a basis for concept recognition involving ‘the more complex task of identifying and extracting protein interaction relations’ (Baumgartner et al. 2008: S7-15).
In LDNA, we are interested in investigating what people (in particular periods) would have considered to be emerging and important cultural and political concepts in their own time by exploring their texts. This task involves, not identifying a set of concepts in advance and mining the literature of the period to ascertain the impact made by those concepts, but querying the literature to see what emerges as important. Therefore, our approach is neither semasiological, whereby we track the progress and historical fortunes of a particular term, such as marriage, democracy or evidence, nor is it onomasiological, whereby we inspect the paradigmatic content of a more abstract, yet given, notion such as TRUTH or POLITY, etc. We have to take a further step back, to consider the kind of analysis that precedes the implementation of either a semasiological or an onomasiological study of the lexical material we might construct as a concept (e.g. as indicated by a keyword).
See the next post in this Manifesto series.
Pingback: Defining the content of a concept from below; or: How to recognize a concept in big data (pt. 2 of 3) | Linguistic DNA
Pingback: Operationalising concepts (pt. 3 of 3) | Linguistic DNA