Posted: 2021-08-14
Delivered at Queer in AI virtual social at EACL 2021.
In this talk, I tentatively apply the theories in Sara Ahmed’s Queer Phenomenology to the practice of spelling, and in particular “spell check” as a technology. In calling the talk “notes toward…” I’m giving myself a lot of leeway to merely suggest connections, without the responsibility of following up on those connections, or making an argument for the importance of those connections. I hope you’ll forgive me for that.
Also, I think it’s possible you might come away from this talk believing that I’ve claimed that spell check is “bad,” from an ethical and moral standpoint. I don’t intend to make a case for this one way or another, and my intention in giving the talk isn’t to condemn people who make use of spell check. It has virtues separate from the drawbacks I point out here.
I typed the word “transphobia” into Firefox the other day and noticed that it had a red line under it. That red line, of course, is the visual manifestation of Firefox’s “spell check” program. It’s how Firefox draws my attention to what it considers to be a mistake.
I understand that “transphobia” isn’t an everyday word, for most people. For this reason, the string of characters I typed is, I suppose, unusual—queer, even. “Perhaps,” the algorithm says to me, speaking through this red underline, “You meant to type something else.”
The fix is easy: just right click and “add to dictionary.” And voila, the word “transphobia,” once strange, is now made familiar. The dotted line disappears.
According to the persdict.dat file in my Firefox profile, these are the other words I’ve added to the Firefox spell check dictionary:
autoencoders
phoneme-to-grapheme
cartomancy
hermeneutic
intersectional
variational
ontologies
asemantic
hermeneutics
t-SNE
divinatory
transphobic
Altogether a pretty good summary of the academic, artistic and personal concerns I’ve had over the past few years. I’d already added “transphobic” but hadn’t added “transphobia,” for whatever reason. Each of these words is an instance in which my experiences and desires were illegible to the machine, and I had to take action to align the machine to those experiences and desires.
Alexandra Jaffe and Shana Walton point out that the form of the written word is not arbitrary or neutral:
[T]he form of the written word… carries a social and symbolic load: that orthography [is] conventionally associated with social, cultural and linguistic identities and hierarchies. These associations of identity and power explain why no transcriptions of speech—even ‘close phonetic’ ones—are ever neutral or transparent depictions of what someone said or how they sounded. Transcriptions reflect representational choices that encode ideological stances towards any number of aspects of the speech event. […] In fact, orthography is one of the key sites where the very notion of ‘standard language’ is policed…1
Spell check is one of many technologies that enforce the notion of “standard language,” and make us aware of how the language we use may not be “standard.” (In the form of that little red line.)
The problem for a spelling checker is quite simple: Given an input file of text, identify those words which are incorrect. A spelling corrector both detects misspelled words and tries to find the most likely correct word.2
Spell check is one of the earliest applications of artificial intelligence. Peterson’s survey3 cites a program called SPELL for the DEC-10 as the earliest “spelling checker written as an applications program (rather than research).” SPELL was written in 1971 by Ralph Gorin, then a student at the Stanford Artificial Intelligence Lab. Spell check is also perhaps one of the most pervasive and familiar applications of artificial intelligence, being present in some form in nearly every digital writing interface, from word processors to e-mail clients to search engine text fields.
Spell check can be undertaken with any number of computational techniques, from simple counts of unique tokens to dictionary lookups to deep learning language models. Regardless of the approach, however, according to Peterson, “The problem for a spelling checker is quite simple: Given an input file of text, identify those words which are incorrect. A spelling corrector both detects misspelled words and tries to find the most likely correct word.”
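To make Peterson's two tasks concrete, here is a minimal sketch of a checker (detection) and a corrector (suggestion) built on a plain dictionary lookup. It is only one of the many possible techniques: the tiny word list is hypothetical, and edit distance is an assumption swapped in as a stand-in for "most likely." It is not how SPELL or Firefox actually work.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character
                            curr[j - 1] + 1,      # insert a character
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def check(words, dictionary):
    """Detection: return the words that are absent from the dictionary."""
    return [w for w in words if w.lower() not in dictionary]


def correct(word, dictionary):
    """Correction: return the dictionary entry nearest to `word`.

    Sorting first makes tie-breaking deterministic (alphabetical).
    """
    return min(sorted(dictionary),
               key=lambda entry: edit_distance(word.lower(), entry))


# Hypothetical toy dictionary, for illustration only.
DICTIONARY = {"the", "word", "is", "strange", "familiar", "transphobic"}

flagged = check("the word transphobia is strange".split(), DICTIONARY)
suggestion = correct("transphobia", DICTIONARY)
print(flagged)     # ['transphobia']
print(suggestion)  # transphobic
```

Note what `check` returns: a flat list of flagged words, with no record of why each one fell outside the dictionary.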
I’ll come back to that word—“correct”—in a moment.
What I want to talk about first is that underline—the underline that appeared beneath “transphobia.” I’ve been thinking about its shape and its form, and about the concept of the “line” in general.
In Queer Phenomenology, Sara Ahmed’s primary concern is the concept of the line: lines as borders on maps, lines as in “family lines,” lines of work, lifelines. Ahmed writes (partially paraphrasing Judith Butler) that
it is lines that give matter form and that create the impression of surface, boundaries and fixity. […] The lines that allow us to find our way… also make certain things, and not others, available…. When we follow specific lines, some things become reachable and others remain or even become out of reach. […] We might say that we are orientated when we are in line. We are “in line” when we face the direction that is already faced by others.4
The line is also Ahmed’s primary device for investigating the idea of sexual orientation. Later in the book, she writes:
The same-sex orientation thus deviates or is off course: by following this orientation, we leave the “usual way or normal course.” Conversely, heterosexual desire is understood as “on line,” as not only straight, but also as right and normal, while other lines are drawn as simply “not following” this line and hence as being “off line” in the very direction of their desire.5
The word “queer,” as Ahmed notes, comes from the Proto-Indo-European root *terkʷ-, meaning “turn” or “twist.” This, according to Ahmed, “gets translated into a sexual term, a term for a twisted sexuality that does not follow a ‘straight line,’ a sexuality that is bent and crooked.”6 Queer desire, she adds, “reaches objects that are not continuous with the line of normal sexual subjectivity.”7
For this reason, it doesn’t seem like a coincidence to me that text interfaces indicate “incorrect” or “queer” words with wavy, squiggly, or discontinuous underlines. The exact design varies from one piece of software to the next, but the idea is the same. The “turned” and “twisted” line indicates the words that are “queer.”
(The passage here, of course, is from Lewis Carroll’s Jabberwocky.)
“Correct” words—or perhaps “straight” spellings, in the analogy I’m suggesting here—are considered to be the default, and thus require no special marking.
That root, *terkʷ-, also gives us the English words torque, torsion and torment, all derived from Latin. The word tormenta in Latin refers in general to large siege engines like catapults and ballistas. The name comes from the fact that these engines were powered with twisted cords, generally made of animal sinew, which would store up potential energy when twisted. The wavy, squiggly underlines of spell check also remind me of the twisted up cord in this illustration of a catapult’s skein.8
There’s something interesting to me about the idea of a “misspelled” word being “twisted” up, full of stored energy that can later be released. We’ll see some “misspellings” with this kind of energy later.
This connection may help explain Google Docs’ method of marking this same passage, which is to mark the whole thing not as an orthographic mistake, but as a syntactic one. Google here is telling us not that our spelling is queer, but that our syntax is tortured.
As an aside, I think it’s interesting that a fair etymological translation of the phrase tortured syntax in English might be something like “queer straightening up.”
Another way to interpret the squiggly underline of spell check is as a kind of desire line—a physical trace that shows evidence of how physical bodies tend to traverse a space, which is often very different from how the built environment was designed to direct that traversal.
Likewise, the squiggly line under a “misspelled” word is a trace. Sometimes a misspelling is primarily a record of a physical body: human fingers, at a particular moment in time, depressing buttons on a keyboard, following (successfully or unsuccessfully) a kind of score (in the form of a letter sequence). Sometimes a “misspelling” is the result of my linguistic subjectivity—say, the result of my having phonetically transcribed my own dialect. Misspellings are also records of social subjectivity: sometimes when the squiggly line shows up, it’s because I typed the word that I meant—like “transphobia”—even when that word names a concept that isn’t familiar to the culture around me.
Though the word processor marks these misspellings as mistakes, you can also interpret them as signposts. “Deviation,” writes Ahmed, “leaves its own marks on the ground, which can even help generate alternative lines, which cross the ground in unexpected ways. Such lines are indeed traces of desire; where people have taken different routes to get to this point or to that point.”9
Because misspellings are material, correcting those spellings is a way of separating linguistic behavior from its material history, in favor of an abstraction. Ahmed writes about this process more generally in her discussion of sexism in phenomenology, saying:
The object is “brought forth” as a thing that is “itself” only insofar as it is cut off from its own arrival. So it becomes that which we have presented to us, only if we forget how it arrived, as a history that involves multiple forms of contact between others. Objects appear by being cut off from such histories of arrival, as histories that involve multiple generations, and the “work” of bodies…10
Paraphrasing this for language, I might write instead:
The word is “brought forth” as a thing that is “itself” only insofar as it is cut off from its own arrival. So it becomes that which we have presented to us, only if we forget how it arrived… Words appear by being cut off from such histories of arrival, as histories that involve… the “work” of bodies…
This, to me, almost precisely describes the process of collecting large corpora of text as training data for machine learning models. The material history of the text is discarded in favor of an abstraction—a one dimensional straight line of unique token IDs.
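As a rough illustration of that abstraction, the sketch below turns a line of text into a sequence of integer IDs. Lowercasing and whitespace splitting are simplifying assumptions made here for brevity (production pipelines typically use subword tokenizers), and the example sentence is made up.

```python
def build_vocab(tokens):
    """Assign each unique token an integer ID in order of first appearance."""
    vocab = {}
    for tok in tokens:
        vocab.setdefault(tok, len(vocab))
    return vocab


def encode(text, vocab):
    """Map text to a flat list of token IDs (lowercased, whitespace-split)."""
    return [vocab[tok] for tok in text.lower().split()]


corpus = "the word is strange the word is familiar"
vocab = build_vocab(corpus.lower().split())
ids = encode(corpus, vocab)
print(ids)  # [0, 1, 2, 3, 0, 1, 2, 4]
```

Nothing in `ids` records who typed the words, when, or on what keyboard. The one-dimensional list of integers is all that remains.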
Let’s return to Peterson’s definition of spell check for a moment. “The problem for a spelling checker,” Peterson writes, “is quite simple: Given an input file of text, identify those words which are incorrect. A spelling corrector both detects misspelled words and tries to find the most likely correct word.”11
I’d point out here that the word “correct” is not neutral in any sense. It means, etymologically, to “make straight”—or maybe more precisely, to make “right.” In a discussion of the distinction between “right” and “left,” Ahmed points out that
the distinction between right and left is not a neutral one…. [While] the etymology of the word left is “weak and worthless,”… [t]he right is associated with truth, reason, normality and with getting “straight to the point.” […] [T]he right becomes the straight line, and the left becomes the origin of deviation.12
In a sense, to “correct” misspellings is to “straighten” the queer.
Moreover, to distinguish between “correct” and “incorrect” is to set up an invented taxonomy that hinders an understanding of how spelling actually works. Again, I’ll make an analogy about spelling based on Ahmed’s arguments about queerness. She writes:
We can, of course, point to the invented nature of all differences, including the differences that are created by the line that divides the sexes. But what is needed is an even more fundamental critique of the idea that difference only takes a morphological form… and that such morphology is, as it were, given to the world. […] [T]he idea of sexes as “opposites” is what makes heterosexuality as it is conventionally described—itself the negation of the alterity of (other) women.13
I claim that just as the idea of the sexes as “opposites” negates the alterity of individuals within those sexes, the idea of “correct” and “incorrect” spelling as “opposites” negates the alterity of orthographic practices within those categories. In other words, the squiggle does not distinguish between different types of misspelling; it registers only whether or not something is considered to be misspelled. (Likewise, the absence of a squiggle fails to point to the fact that “correct” spelling is itself historically contingent and changes in response to the goals and values of those in power.)
This is important, because spelling itself is expressive and shaped by context. Spelling is something that we do tactically in every situation in which we have to produce text, in order to achieve particular communicative and social ends. Spelling, writes Androutsopoulos in a wonderful paper about expressive non-standard spelling in German fanzines, is “a means to convey sociocultural stances and contextual meanings” that is nonetheless “motivated by creativity, i.e., the playful moment of language use.”14 Androutsopoulos lists the following forms of expressive spelling, but I bet you can think of others:
Spell check levels all of these to one category—“misspelling.” “Correcting” these misspellings, of course, removes vital information from the text!
The genres of nonsense poetry and sound poetry are extreme applications of spelling’s expressiveness. As an example, I include here an excerpt from Baroness Elsa von Freytag-Loringhoven’s amazing sound poem “Duet: Eigasing Rin Jalamund.”15
Aggnntárrr—nnjarrré—knntnirrr —
Eigasing—kjnnquirrr!
Hussa—juss—huss—jalamund —
Mund—avnurrr!
Narré—tnarrr—tarrr
Ornaksin—eigasing—lahilü!
Lihüla—halljei—alsüiii —
Jalamund—mund arrrljö-i-tüüü!
Oöö—ööö—acktasswassknox —
Orljfö—eigasing—ornimächtu!
Jass—hass—wass must—
Mustjuamei—jalamund—mund odajmi!
In these genres (nonsense poetry and sound poetry), the question of “correct” and “incorrect” spelling doesn’t even arise, as all of the words are neologisms. What counts is their visual and auditory materiality.
My own work in computer-generated poetry recently has made use of a grapheme-to-phoneme-to-grapheme model that I designed and trained, called Pincelate. This model predicts articulatory phonetic features from English orthography, and also predicts English orthography from sequences of articulatory phonetic features.
One of the things I’ve made with the model is a sequence of poems called “Compasses,” in which I predict spellings from the midpoint of the line connecting the hidden state vectors of two or more “real” words belonging to supposedly discrete taxonomies. The goal here is to use a “spell check” model not to “correct” language, but instead to find new, queer spellings in between the “straight” spellings.
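The geometric step in that description can be sketched on its own. This is not Pincelate's code or API: the three-dimensional vectors below are made-up stand-ins for hidden states, the word labels are hypothetical, and taking the element-wise mean for more than two words is an assumption about how "midpoint" generalizes. Decoding the resulting vector back into a spelling is the model's job and is not shown.

```python
def midpoint(vectors):
    """Element-wise mean of equal-length vectors.

    For exactly two vectors this is the midpoint of the line segment
    connecting them; for more, it is their centroid.
    """
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical hidden states for two "real" words (made-up numbers).
h_north = [0.5, -1.0, 2.0]
h_east = [1.5, 1.0, 0.0]

h_between = midpoint([h_north, h_east])
print(h_between)  # [1.0, 0.0, 1.0]
```

The vector `h_between` corresponds to neither input word, which is the point: whatever spelling a decoder produces from it sits between the two "straight" spellings.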
On the left is a photograph of how these poems look on the physical page. Still, when I lay out these poems in a word processor, it’s not shy about telling me what it thinks!
Spell check is, I think, an example of what Ahmed calls a “straightening device,” a method to bring under control the queerness of spelling—by which I mean, spelling’s material, contextual, expressive and sensual properties. Spell check, as a technology, draws attention to the “queer” and asks us to either modify or assimilate that queerness in order to conform with its demands of “correctness.”
What I might suggest is a “queer orientation” toward spelling. Ahmed writes:
Queer orientations are those that put within reach bodies that have been made unreachable…. Queer orientations might be those that don’t line up, which by seeing the world “slantwise” allow other objects to come into view. A queer orientation might be one that does not overcome what is “off line,” and hence acts out of line with others.16
In this view, that red squiggle—far from being a sign of inadequacy—is actually a good omen, a desire line that shows you’re on the right track.
The question before us as researchers in machine learning and artificial intelligence is whether or not machine learning and artificial intelligence as a whole will act as a straightening device, or whether we can put it to applications that instead celebrate the full variety of human experience—like, say, how orthography expresses embodied linguistic intention.
1. Jaffe, Alexandra, and Shana Walton. “The Voices People Read: Orthography and the Representation of Non-Standard Speech.” Journal of Sociolinguistics, vol. 4, no. 4, Nov. 2000, pp. 561–87.
2. Peterson, James L. “Computer Programs for Detecting and Correcting Spelling Errors.” Communications of the ACM, vol. 23, no. 12, Dec. 1980, pp. 676–87.
3. Peterson, James L. “Computer Programs for Detecting and Correcting Spelling Errors.” Communications of the ACM, vol. 23, no. 12, Dec. 1980, pp. 676–87.
4. Ahmed, Sara. Queer Phenomenology: Orientations, Objects, Others. Duke University Press, 2006, pp. 15–16.
5. Ibid., p. 70.
6. Ibid., p. 67.
7. Ibid., p. 71.
8. Illustrations from Payne-Gallwey, Sir Ralph. The Crossbow, Mediæval and Modern, Military and Sporting: Its Construction, History and Management, with a Treatise on the Balista and Catapult of the Ancients. Longmans, Green and Company, 1903, pp. 293 and 298.
9. Ahmed, p. 20.
10. Ibid., p. 41.
11. Peterson, James L. “Computer Programs for Detecting and Correcting Spelling Errors.” Communications of the ACM, vol. 23, no. 12, Dec. 1980, pp. 676–87.
12. Ahmed, pp. 13–14.
13. Ibid., p. 99.
14. Androutsopoulos, Jannis K. “Non-Standard Spellings in Media Texts: The Case of German Fanzines.” Journal of Sociolinguistics, vol. 4, no. 4, Nov. 2000, pp. 514–33.
15. Gammel, Irene, and Suzanne Zelazo. “‘Harpsichords Metallic Howl—’: The Baroness Elsa von Freytag-Loringhoven’s Sound Poetry.” Modernism/Modernity, vol. 18, no. 2, Aug. 2011, pp. 255–71.
16. Ahmed, p. 107.