The Analytical Onomasticon Project: An Auto-Ethnographic Vignette

As Russ Wooldridge pointed out many years ago, all too often “the computer disappears into the background” once its results are to hand (http://projects.chass.utoronto.ca/chwp/). This is especially true if those results fall short of expectations. In the following I describe the history of a project whose failure in those terms turned out to be far more important than its impossible success would have been. The moral of this story is that with persistence the futile struggle to conform works of the imagination to finite, algorithmic requirements is, or can be, transformational. To quote Italo Calvino, the encoder plays a game that if played long, hard, and well enough “finds itself invested with an unexpected meaning […] slipped in from another level” (1980/1966, §4).

The following is an ethnographic vignette of my own editorial practices during a digital research project on the Metamorphoses of Ovid that I began in the late 1980s and, with help from research assistants, worked on until 2004. 1 Its original purpose was to support the writing of a conventional book on the Met, but work with my first assistant turned up such compelling problems of mismatch between computational form and poetic meaning that I abandoned the book and devoted myself to them. 2 By putting this experience into circulation here my aim is to keep the theorizing firmly grounded in real machinery and actual poetry while I attempt to describe what happens in the "contact zone" between them, when the strictures of the machine are rigorously applied to the poem. 3

1. For a detailed discussion see McCarty 1993 and 1994. Although no longer in development, the Analytical Onomasticon can be accessed online at http://www.mccarty.org.uk/analyticalonomasticon/.

Ovid's Metamorphoses is a large, elusively structured compilation of teasingly interrelated mytho-historical stories in 15 books, amounting to 12,000 lines of classical Latin hexameter. Since its composition ca. 8 CE the poem has had enormous historical, literary, and artistic influence, making it fundamental within the European cultural tradition. Nevertheless in modern times it has been poorly understood as a poem, often downgraded to little more than a convenient miscellany. When I became interested in it, literary scholarship on the Met was largely preoccupied with realizing "the immortal dream of a universal key", as one exceptional critic noted of the most ambitious attempt. 4 It and other such attempts did not succeed.
I wanted to do better. The bewildering combinatorial complexity of the poem's many narratives suggested to me that the computer might help by modelling how the poem worked to see what patterns might arise. My specific aim was to build a software tool which would allow someone interested in a given story to follow its interconnections with others in order to see how it is affected by its context. Familiarity with the poem left me in no doubt of the extent to which a story's context of relations powerfully informs it when it is read in this way, collocatively. Demonstrating the effectiveness of a collocative reading would thus show that the opposition between miscellany and unified poem misses the whole point of it.
I chose not to focus on the poem as a corpus of words, in the manner of a concordance, which was then the focus of computer-assisted "text-analysis". I thought the relation of lexemes to narrative interrelations too difficult for the technologies then (and perhaps now) available. Nor did I choose to focus on the stories themselves, which are in many cases exceedingly difficult to delimit individually and often nested within each other to a depth of 2 to 5 layers. 5 To delimit them with arbitrary precision would almost certainly prejudice the outcome by embedding too much idiosyncratic interpretation. I chose instead the names of persons, whether human or divine, since names are data and entail the narratives in which the named persons are found. Thus I called the intended tool an Onomasticon, a "book of names". For a variety of reasons, however, the category "name" quickly expanded from proper name (e.g., Tereus) to include all devices of language referring to a person, including personal attributes and effects (e.g., barbarus, arma auxiliaria). Persons could thus be grouped by how they were named and so provide, I thought, a reliable way of exploring their interconnections and so inferring relationships among stories. In consequence the number of names grew exponentially, from a few hundred to ca. 60,000, i.e., an average of 5 per line of poetry. Such density of naming confirmed the potential of the Onomasticon to embrace the entirety of the Metamorphoses.

2. The research assistant, who became my collaborator, was Burton Wright, then a doctoral student (and genuine scholar of Latin poetry, an honor I will not claim) in the Department of Classics, University of Toronto.
3. I'm indebted to Amiria Salmond for suggesting this vignette. For the genre see Wright and McCarthy 2005, 16-17, citing Geertz 1986, 374, whom I paraphrase. For "contact zone" see Pratt 1991 and 1992.
4. Due 1974, 135 on Otis 1966; Tarrant 1976. For later, more favorable scholarship see, e.g., Barchiesi 2002, Fantham 2004, and Knox 2009.
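The idea of grouping persons by how they are named can be pictured with a minimal sketch in Python. It is illustrative only and does not reproduce the Onomasticon's own metalanguage or data: the `Mention` record, its field names, the category labels, and the placeholder person `PersonX` are all assumptions introduced here. Only Tereus, barbarus, and arma auxiliaria come from the text above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One naming of a person (hypothetical record, not the project's format)."""
    person: str    # the person referred to
    lemma: str     # the naming device in dictionary form
    category: str  # assumed label, e.g. "ProperName" or "Attribute"


def persons_by_lemma(mentions):
    """Map each lemma to the set of persons it names."""
    index = defaultdict(set)
    for mention in mentions:
        index[mention.lemma].add(mention.person)
    return dict(index)


def shared_names(index):
    """Keep only the lemmata that name more than one person."""
    return {lemma: persons for lemma, persons in index.items() if len(persons) > 1}


mentions = [
    Mention("Tereus", "Tereus", "ProperName"),
    Mention("Tereus", "barbarus", "Attribute"),
    Mention("Tereus", "arma auxiliaria", "Attribute"),
    # Invented placeholder, present only so that one lemma links two persons.
    Mention("PersonX", "barbarus", "Attribute"),
]

index = persons_by_lemma(mentions)
links = shared_names(index)
print(links)
```

In this toy data the lemma barbarus is the only link between two persons. Following such links from person to person is the kind of collocative exploration the tool was meant to support, here reduced to a dictionary lookup.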
To render these names computationally tractable I had to "tag" each one manually, i.e., insert computer-readable metatext that said for each, e.g., "here is a mention of person Tereus, named by lemma arma auxiliaria in category Attribute". 6 Despite my attempt to avoid interpretation (and so avoid prejudicing the outcome for other readers), a very large majority of tags made interpretation unavoidable: not only is ambiguity essential to poetry, but the poem itself also explicitly plays on ambiguities to subvert every ontology the reader is tempted to construct. Indeed, the poem almost constantly tempts with possibilities of closure only to dodge them. Typically it offers a close analogy from a given story to another, e.g., Actaeon's sighting of Diana in her bath to Semele's sight of Jupiter; then it offers a recognizable but more distant analogy, e.g., Tiresias striking the snake. By such propagation of analogies, each forcing revision of a developing pattern, a continually branching network of similarities and differences spreads out into the poem. The result, however, is not a single network, nor one for each reader, nor one that varies with each interpretative stance, but rather a continual, unresolvable networking that, I am convinced, leapt from the poet's last, triumphal word, vivam ("I will live"), into the European literary and artistic tradition that it so profoundly affected.

5. For a representation of the nested structure, see http://www.mccarty.org.uk/analyticalonomasticon/base/narrative.html (18/4/17).
6. I invented the tagging metalanguage myself for two reasons: the Text Encoding Initiative (http://www.tei-c.org), now the generally agreed-upon encoding methodology, was itself just beginning at the time I began; and, more importantly, I wanted my metalanguage to arise from the necessities of the Met itself as I saw them (see McCarty 1994). I was assured by no less than Michael Sperberg-McQueen that my metalanguage could be algorithmically translated into TEI, should I so desire.

Deciding whether any given candidate is a person meant confronting the vexatious problem of "personification", i.e., the making and un-making of persons in a literary text by assigning to them ontologically anomalous attributes, 7 for example motion to a stone, sometimes in such a way that they become teasingly neither one thing nor the other, e.g., by referring to the earth by the phrase viscera terrae, somewhat in the manner of Jastrow's duck-rabbit, a "seeing as" (see Fig. 1).

Personification is in fact a crucially unresolvable problem, but under the influence of the post-classical device of capitalization (a form of markup, one might say) we are apt to slight it, and so undervalue the poem's de-ontologizing force. An example will show how difficult, indeed reductive, deciding what is who can be. Consider the word bacchus, conventionally naming either the god associated with wine or wine itself. For such a standard mythological character the referent up to the story of Philomela in Book 6 is quite clear. But then, in that story, at the royal feast her father Pandion orders in honor of the tyrant Tereus, who wishes to marry her,

[…] bacchus in auro ponitur […]. (Metamorphoses 6.488f)

Here, suddenly, the unambiguous god becomes wine, de-personified by the final two letters of ponitur, which make the verb passive, and the passage strictly untranslatable and untaggable. At one moment bacchus is the god who in auro ponit ("puts [something] into a gold [cup]"); in the next, bacchus is wine that ponitur ("is poured") into that cup. Thus the crux: what is to be done? In this instance (as in many, many others) I compromised: I decided that for bacchus wine is indicated, but tagged the instance as an attribute of the god so that it would not be lost to the Onomasticon.
But not only was that compromise untrue to the dynamical, experiential reading of these lines, it also flagged the problem of deciding in general what conditions of context are required for a person to remain unaffected by a de-personifying, ontologically disruptive agent. We must consider inter alia how great the separation in syntactic or in semantic relations between name and agent must be and what effects surrounding words and stories might have. The complexities mushroom; I could cite many, many other examples. But my point should be clear enough: that attempting to translate the Metamorphoses into computationally manipulable form subjects the translator to these two familiarly opposed forces: on the one hand, truth to the poem, which must take into account numerous alternative, changing interpretations of the individual passage and of the poem as a whole; on the other hand, the imperative of reduction to algorithmic form. The computer demands an ontology; the poem de-ontologizes everything it touches. Crossing the digital Rubicon cannot be avoided, but much is lost (and gained) by doing so. There's always a trade-off.

Possibly the best overall explanation of how computation fits into the making of knowledge is philosopher of science David Gooding's "Varying the cognitive span". "To digitalize", he writes,

is to represent features of the world, including relationships between them, in a manner that establishes and fixes unambiguous meaning [...] It is a method designed to achieve two things: to preserve the invariance of tokens in a symbol manipulation system and to make the value of the tokens unambiguous. (2003, 279 and 283 n33)

In other words, it is to represent the object of analysis in completely explicit and absolutely consistent form. But, he goes on to explain, digitalization is rendered meaningful to humans both before and after it happens, as in Fig. 2, above.
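What "completely explicit and absolutely consistent" means for a single tag can be shown in a small Python sketch. It is a hedged illustration under stated assumptions, not the Onomasticon's metalanguage: the `Tag` record, the `Category` enumeration with its two members, and the `make_tag` helper are hypothetical names introduced here. The locus and the word bacchus are taken from the passage discussed above.

```python
from dataclasses import dataclass, FrozenInstanceError
from enum import Enum


class Category(Enum):
    """A closed set of categories (assumed labels, for illustration only)."""
    PROPER_NAME = "ProperName"
    ATTRIBUTE = "Attribute"


@dataclass(frozen=True)
class Tag:
    """Hypothetical tag record: one locus, one person, one category."""
    locus: str
    surface: str
    person: str
    category: Category


def make_tag(locus, surface, person, category):
    """Build a tag, refusing anything but a single category from the set."""
    if not isinstance(category, Category):
        raise TypeError("a tag needs exactly one category from the fixed set")
    return Tag(locus, surface, person, category)


# The compromise: bacchus at 6.488 recorded as an attribute of the god.
tag = make_tag("6.488", "bacchus", "Bacchus", Category.ATTRIBUTE)

# Holding both readings at once is not expressible in this scheme.
try:
    make_tag("6.488", "bacchus", "Bacchus",
             {Category.PROPER_NAME, Category.ATTRIBUTE})
except TypeError as err:
    print("rejected:", err)

# Once written, the token does not change with the reader's experience.
try:
    tag.category = Category.PROPER_NAME
except FrozenInstanceError:
    print("the tag is fixed once written")
```

The sketch could of course be extended with a third category for "both", but that would only move the decision elsewhere: whatever the set of categories, each tag must still settle on exactly one member of it.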
In the physical sciences, the individual datum tends to be in itself simple and insignificant; what matters are the patterns detectable in very large amounts of data. In the human sciences patterns are of course equally important, but the objects of study yield relatively small amounts of idiosyncratic and complex data with high probability that the individual datum will prove a significant, if not a revolutionary anomaly, hence crucial to preserve. Preparatory reduction (e.g., by markup) therefore tends to be a far more serious and difficult affair. The loss due to reduction needs to be captured while the data are under the interpreter's cognitively computational microscope, then brought into play when the output of analysis is integrated back into the interpreter's world. Note that this particular "microscope" is indeed a very powerful tool of analysis; it illumines what the digital net does not, perhaps cannot catch.
What difference does digitalization actually make? Careful readers have always been philologically scrupulous, checking the textual data and integrating new insights into their interpretations accordingly. There are two differences. The more obvious, in consequence of the machine's manipulatory affordances, is scale, which is no simple matter. Rather it is transformative: "size is seldom just size", Franco Moretti comments, "a story with a thousand characters is not like a story with fifty characters, only twenty times bigger; it's a different story". 9 But the difference I am emphasizing here is micro- rather than macro-scopic. It is the radically reductive step of translation into digital form, and the driving force its revelations exert on the interpretative act.
Technologically the Onomasticon was a failure, but (unsurprisingly) I like to think of it as the illuminating kind. No question about the naivety of my original intentions. But with the help of that computational microscope, albeit in a hard, indirect way, I learned much about the poem, and crucially for subsequent involvement with computing, I absorbed from the Metamorphoses encouragement to see beyond that dream of a universal ontology to the poet's radically destabilizing design. On the literary side I emerged from the work more persuaded than ever that from first to last (from in nova fert animus to the concluding vivam) the poem is harnessing the reader's ontological hunger to generate ontologizing play, and thus guaranteeing the poet's vivam.
On the technological side, analytical markup, in which much hope was then (and is now) being invested, had already proved to be a dead end from all but the most pragmatic of perspectives, e.g., producing digital editions and facilitating publication. For the literary interpreter markup is, for one thing, simply too laborious even at the relatively modest scale of one poem by one author. (We tend to forget: the computer is a physical machine and we creatures in time, hence a cost/benefit analysis cannot be avoided.) But the failure that proved most rewarding was the equally negative realization that even if it could be completed the Onomasticon would be far too difficult for anyone else, or indeed for me, to change systematically and so, ironically, to adapt to shifting views of Ovid's shifting world. Too much interpretation fossilized (in the true sense, not as an inactive metaphor is said to be) in too many tags involving too many unrecorded if not unrecordable decisions constituted the fatal blow to my ambition of constructing even a theory-minimal device.

9. Moretti 2013, 169, quoted by Katanova et al. 2017.
We agree that no construction can be theory-free, but the making of the Onomasticon points to a somewhat different, redeeming conclusion: that the power of a tool arises from the engineer's agonistic interplay of design against constraint (Wulf 2000). For a computational tool, design in my sense is its responsiveness to the interpretative moves of the user-designer, moment by moment; constraint is provided by the imperative of complete explicitness and absolute consistency. Hence the core failure of analytical markup lies not in its rigidity but in the lack of responsiveness which that rigidity entails, its propositional rather than subjunctive, as-if form. As I just said and repeat here for emphasis, unlike human language, in which "fossilized" metaphors can come back to life at the poet's touch, the units of expression in markup remain fossils from the moment of utterance. But methodologically the Onomasticon also points to Gooding's stage of cognitive integration: as he says, to the "construals" which emerge from modelling, the "flexible, quasi-linguistic messengers between the perceptual and the conceptual [...] [which] assimilate the novel to the familiar" (1986, 208). Much more attention is needed here.
King's College London