Nothing survives transcription, nothing doesn’t survive transcription

Allison Parrish

Posted: 2023-05-08

This is the text of a lecture I delivered at Iona University’s Data Science Symposium in April, 2023. The text has been lightly edited for clarity and consistency, and I omitted from this text a portion of the talk as delivered that concerned my own work.

Hello everyone. My name is Allison Parrish. I’m a poet and computer programmer and an Assistant Arts Professor at New York University’s Interactive Telecommunications Program/Interactive Media Arts program. My talk today is about transcription—how text comes to be. My goal is to trouble your understanding of what transcriptions are, how transcriptions work, and the stakes of this understanding (with particular reference to large language models).

The most crucial part

The original impetus for this talk came from my colleague Jordan Magnuson’s forthcoming book Game Poems, which I read earlier this year. Jordan sent me a manuscript along with a request for a blurb. The manuscript was excellent, and I was happy to oblige his request—it’s a great book! But there are a few passages in the book where Jordan and I have some disagreements.

In particular, there’s a passage where Jordan is working through the question of what the “material” of videogames is. He likens the platform on which a videogame is published to the venues where a poet might choose to distribute their work, and more specifically the poem’s “material manifestation, how and where letters and words appear on the physical surface” (Magnuson). Jordan allows that “these details [of the poem] matter,” but they are on “the periphery of the art, since the most crucial part of most lyric poems can survive a transcription.” He adds: “If they could not, then my copy of The Complete Poems of Emily Dickinson contains no poetry at all, in which case we are lost.”

This passage occurs as part of an argument that it’s possible to port videogames between platforms without changing the underlying game design (an argument that I ultimately agree with). However, I think that Emily Dickinson is—for reasons I’ll get into shortly—an inopportune choice as an example of a poet whose work “can survive a transcription.” In fact, I think Emily Dickinson’s work is an example of how nothing survives transcription, and that’s the statement that I set out to prove in this talk. But along the way, it occurred to me that the reverse is also true: nothing doesn’t survive transcription, and that the tension between these two seemingly opposite claims actually informs and explains them both.

What is transcription?

By “transcription” I mean the result of adapting some stretch of language from one medium to another, in such a way that the adapted version is understood to have the same “content” as the “original.” Maybe more precisely: a linguistic artifact A is a transcription of a different linguistic artifact B if B precedes A causally and temporally, and A and B are understood to be identical in meaning, though they differ in material form.

The prototypical example of a transcription is a “transcript”—a written artifact that records the “content” of a stretch of language that was spoken out loud. And indeed, I’ll be talking about transcripts of this kind in more detail later. But I think the term “transcription” usefully applies to adaptations of language between any two modalities. For example, producing a typewritten copy of a handwritten manuscript is a kind of transcription. Taking notes on a lecture is a kind of transcription. Under this definition, even my verbal performance of this talk (reading from my speaker notes) is a variety of transcription.

Since we’re gathered together today as students and professionals with the word “digital” in our job titles or the names of our academic programs, we might consider in special detail the nature of transcriptions in a digital context. Examples of digital transcription are many: I include the act of keying in documents to be a kind of transcription, but also applying optical character recognition to a printed text. Automated text-to-speech is a kind of transcription, as is converting a document from one format to another (say, converting a Keynote file to a PowerPoint file). Even cutting and pasting text from one window to another counts, I think, as a kind of transcription.

(My focus in this talk is on language, since that’s my particular interest and expertise. But I think that this definition of “transcription”—and the arguments I’m going to make—can usefully apply to other media as well.)

Folk theory of transcription

There is what I call a “folk theory of transcription,” which is that transcription is, for the most part, a transparent process that mostly “just works,” and that a transcript of a stretch of language, or a digitized version of a text, is more or less the “same thing” as the original. This theory furthermore supposes that a transcription’s failure to reflect any given aspect of the original stems from either (a) the triviality of that aspect, or (b) an insufficiently “close” transcription process.

This folk theory, I think, is what underwrites the claims of large language models, like ChatGPT, which are mostly trained on plain text transcriptions of documents. The claim that these models can produce meaning relies on the assumption that plain text transcriptions of documents contain more or less the same “content” as their originals.

Transcription: From a typeset page, to plain text, to a one-dimensional array of integers

The premise of large language models, after all, is that any text can be represented as a one-dimensional array of integers that correspond to unique identifiers of token types. This is a paradigmatic example of what I’ve elsewhere called “cistextuality”: the insistence that texts are static and that meaning inheres exclusively within the text, without reference to its context. In the cistextual imagination, texts can be perfectly transcribed, and there is, in fact, no crossing—no “trans”—in a “transcription.” This is (as I hope to demonstrate) not a very rich or very useful way of understanding texts.
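To make that premise concrete, here is a toy sketch of what mapping a text to a one-dimensional array of integers looks like. This is my own illustration, not any real model’s tokenizer (production LLMs use subword schemes like byte-pair encoding), but the flattening effect is the same:

```python
def tokenize(text, vocab):
    """Split on whitespace and map each token type to an integer ID."""
    ids = []
    for token in text.split():
        if token not in vocab:
            vocab[token] = len(vocab)  # assign the next unused ID
        ids.append(vocab[token])
    return ids

vocab = {}
# Two renderings that differ in spacing (exactly the kind of material
# detail Dickinson cared about) collapse to the same array of integers:
a = tokenize("I taste a liquor never brewed -", vocab)
b = tokenize("I  taste a liquor   never brewed -", vocab)
print(a)       # [0, 1, 2, 3, 4, 5, 6]
print(a == b)  # True: the variant spacing does not survive
```

Whatever distinguished the two renderings on the page is simply gone by the time the model sees them.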

The many institutions of Dickinson’s manuscript transcriptions

With all of that in mind, I want to go back to Emily Dickinson. Dickinson, for context, is a 19th-century American poet, widely considered to be one of the most important figures in American literature. She wrote nearly two thousand poems, only a handful of which were published during her lifetime. The rest were published posthumously, first by family members who found her trove of poems among her belongings after her death, and later by scholars who came into possession of her manuscripts. You’re most likely familiar with typeset versions of Dickinson’s poems like this one (Dickinson), maybe from your high school English textbook.

“I taste the liquor”

But in reality, the poems in her manuscript looked nothing like this. In fact, Dickinson prepared no manuscript for her poetry that would be considered amenable to typesetting. She wrote poems on “scraps of envelopes, notes, discarded papers”; fragments “collected to house poetry in a tangible way; the poems follow the shape of the paper, and lines shift to fit their borders” (Pipkin).

Some of these scraps were collected into hand-sewn books that Dickinson called “fascicles” (others were not). Throughout her work, Dickinson assiduously followed her own unusual conventions for capitalization, punctuation, spacing, and line breaks. Susan Howe, in The Birth-mark, comments on the fluidity of Dickinson’s form: “The trace of her unapprehended passage through letters disturbs the order of a world where commerce is reality and authoritative editions freeze poems into artifacts” (Howe 19). Her writing “is a pre-meditated immersion in immediacy. Codes are confounded and converted. ‘Authoritative readings’ confuse her non-conformity” (Howe 139).

“The way hope builds his house” Image source

Dickinson’s punctuation is of particular concern to Howe and other scholars. She often eschewed periods, commas, and semicolons in favor of dashes of varying lengths and directions, along with intentional variations in the spacing between words. Her earliest editors simply eliminated these, considering them anomalies, and typeset Dickinson’s poems according to poetic conventions of the time. But since the 1960s, there has been a simmering debate about the precise meaning of these marks and how they should be transcribed. Edith Wylder has proposed, for example, that these marks are “a form of punctuation,” of Dickinson’s invention, directed “to the reader’s inner ear.” The marks, she claims, register tonal modulation, pitch, and breath. Other scholars have suggested that Dickinson’s punctuation is “another means by which [she] takes us backstage to view the struggle of poetic process, a struggle to find the right word” (Wylder 210).

In the face of all this, Susan Howe concludes that there is no editorial approach to Dickinson that could be truly faithful to her poetic innovations. “Words are only frames. No comfortable conclusions. Letters are scrawls, turnabouts, astonishments, strokes, cuts, masks.… [Dickinson’s] manuscripts should be understood as visual productions. […] I think her poems need to be transcribed into type, although increasingly I wonder if this is possible.” I read Howe here as claiming that, in fact, the “crucial part” of Dickinson’s lyric did not survive transcription. Howe underlines the impossibility of survival with a political argument: “The production of meaning will be brought under the control of social authority” (Howe 140). (Howe also doesn’t hesitate to point out that although Dickinson’s poems themselves are in the public domain, you must pay a fee to quote the transcriptions of those texts.)

When I say that “nothing survives transcription,” what I mean is that when transcription happens, the result is a crime scene—a murder, which leaves behind a body. The word we use for a text dataset is, after all, “corpus”: Latin for “body,” cognate with the English word “corpse.” But maybe that’s too violent a metaphor. Instead, we might think of transcription as a kind of hospice: an institution whose role is to guide a stretch of language from this life to the next.

And transcriptions are institutions. We see this with Emily Dickinson’s work. The question of how to transcribe this text has breathed life not just into scholarship like Susan Howe’s, but also into new editions of Dickinson’s work designed to better represent her linguistic innovation, including photographic facsimile editions. The institution of Dickinson’s transcription has also occasioned works of art, such as Jen Bervin’s The Dickinson Composites, a series of large-scale embroideries that reproduce Dickinson’s variant punctuation, leaving out the accompanying words.

The Dickinson Composites Image source

Lorenza Mondada, in a paper about conversational transcripts, notes that “a transcript is an evolving flexible object; it changes as the transcriber engages in listening and looking again at the tape, endlessly checking, revising, reformatting. These changes are not simply cumulative steps towards an increasingly better transcript: they involve adding but also subtracting details for the purposes of a specific analysis, of a particular recipient-oriented presentation” (Mondada). Mondada is talking about individual transcribers working on individual conversations, but I think that this argument applies as well to the institutions of transcription that arise around particular works, or bodies of work, like Emily Dickinson’s. Transcriptions evolve and flex. Nothing survives transcription, but the transcription itself is definitely alive.

Before we move on from Dickinson, I want to quote one of the Dickinson scholars that I read to prepare for this talk. Edith Wylder draws a conclusion much like Howe’s, writing that the “subtle refinements” of Dickinson’s punctuation “are lost in transcription” and that the “failure to transcribe these notations accurately… blur[s] the accent that distinguishes her persona and her story.” But then she makes an appeal: “Surely the infinite resources of modern technology will permit a more accurate transcription of Dickinson’s accentual notations” (Wylder 221–22). Now, Wylder was writing twenty years ago, long before the decidedly limited nature of technology’s resources was widely recognized. But I think this kind of technosolutionism is still prevalent when it comes to how we think about transcriptions.

Even in this crowd of data scientists, I think that if you reach deep down in your heart, you might believe that there is—there must be—something “essential” about a text that is maintained through transcription. You might believe, as Wylder pleads, that through the power of technology we will someday be able to represent any text with a transcription that is 100% faithful to its original. Or that transcription processes are merely “biased,” and that with enough time and skill and diligence, that bias might be tipped true eternally. If this is what you think, I hope I’ve already dissuaded you a little. But to better make my point, I’m going to dip into scholarship surrounding another field of study that involves transcription—conversation analysis.

A lull in the conversation

Conversation Analysis (CA) is a sub-field of sociolinguistics concerned with how conversations are structured. Practitioners in the field draw their conclusions from analysis of detailed conversational transcripts, and as a consequence, the critical literature about transcription is very developed in this field. I want to show a few examples of conversation analysis and relay some scholarship in this field that will, I hope, further illustrate that nothing survives transcription. In particular, I’m interested in dispelling the idea that there is such a thing as an objectively “accurate” transcript that “preserves the essential aspects” of some stretch of language. This also leads into the second clause of this talk’s title: nothing does not survive transcription.

I’m also interested in talking about conversation because, for better or worse, big tech corporations have decided that user interfaces for their products (such as large language models) should mimic open-ended conversation. Instead of, say, typing a search term into a text field, clicking on links, flipping switches or moving sliders, we are now called upon to engage in conversation with a chatbot in order to make our computers do things. So it’s important to understand the limits of computational models of conversation.

Practitioners of conversation analysis make very detailed transcripts from recorded conversations. A typical transcript in conversation analysis might track not just the words that were said, and who said them, but also intonation, emphasis, volume, breathing, and fine-grained information about the timing of conversational turns. These transcripts track when one conversational turn interrupts another, or when two turns overlap, or if the turns are perfectly latched; they also indicate if there was a pause within or between turns. This example transcript (Jefferson, “Glossary of Transcript Symbols with an Introduction”) includes some common conventions for transcripts in CA.

Example transcription

The underlines indicate stress; turns are lined up according to how they overlapped one another, and bracketed portions in adjacent lines indicate that these speech events happened simultaneously. Pauses are indicated with parentheses—we’ll talk more about this below. Compare a transcript like this to your typical automated Zoom transcript, or a transcript of a podcast, and you’ll find the latter lacking this level of detail. (It’s important to note that, e.g., podcast transcripts are much, much more likely to be present in the corpora of large language models than a Jeffersonian conversation transcript like the one above.)

In her paper “The Politics of Transcription,” CA researcher Mary Bucholtz presents a striking example of how the decisions of the transcriber affect how the transcribed discourse is perceived. In this image, we see two versions of a transcription of a panel discussion in the 1990s concerning the acquittal of the police officers charged with brutality against Rodney King. The transcript on the left is from a newspaper; Bucholtz performed the transcription on the right, using common conventions in CA. Bucholtz focuses on the conversational turns of one participant in particular (labelled JM), who is Black:

Snippet of a transcript

You can see that the newspaper reporter’s transcript is a highly editorialized take on the speech event in question, lacking many of the important details captured in the transcript on the right. Bucholtz writes: “It is immediately apparent that JM’s turn has been sizably reduced in the newspaper’s version, due to the omission of discourse markers that he uses to structure his turn… [and] the omission of over sixty lines of speech. Their exclusion reduces not only the space allotted to JM… but the coherence of his discourse as well.” These decisions on the part of the newspaper transcriber make JM appear “less rational and logical” than he does in Bucholtz’s own transcription (Bucholtz 1447–50).

But attention to minute detail isn’t some magic formula for producing transcriptions that are objectively accurate. In the process of transcription, all transcribers are constantly making decisions about what counts as a detail worth recording, and what doesn’t. Some of these decisions are explicit, and are made in an attempt to conform to some methodological or editorial constraint. Others are made implicitly, at a level below our own consciousness. Bucholtz writes:

All transcripts take sides, enabling certain interpretations [and] advancing particular interests.… The choices made in transcription link the transcript to the context in which it is intended to be read.… Transcripts thus testify to the circumstances of their creation and intended use. As long as we seek a transcription practice that is independent of its own history rather than looking closely at how transcripts operate politically, we will perpetuate the erroneous belief that an objective transcription is possible (Bucholtz 1440).

Susan Howe, writing on transcriptions of Dickinson, wrote “The production of meaning will be brought under the control of social authority.” Bucholtz has her own way of phrasing this: “Because transcription is an act of interpretation and representation, it is also an act of power” (Bucholtz 1463).

Bucholtz, Jefferson and Mondada above are all concerned with transcripts of spoken conversation. Again, the inductive leap I’m hoping you’ll make for me is to consider how these arguments apply to all forms of transcription, according to our original definition.

I want to draw attention to one aspect of conversation analysis in particular: the precision with which pauses are transcribed. In the transcriptions I showed a moment ago from Jefferson and Bucholtz, these pauses are indicated as numbers in parentheses (e.g., (1.8)). In conversation analysis, the length of pauses is often recorded at a resolution of one tenth of a second.
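Notation of this kind is regular enough to machine-read. The following is a rough sketch of parsing Jefferson-style pause marks, not a CA standard; the regular expression and the 0.2-second value assigned to untimed micropauses (“(.)”) are assumptions of this illustration:

```python
import re

# Match either a timed pause like "(1.8)" or an untimed micropause "(.)".
PAUSE = re.compile(r"\((\d+\.\d+)\)|\((\.)\)")

def pause_lengths(line):
    """Return the lengths, in seconds, of the pauses in a transcript line."""
    lengths = []
    for timed, micro in PAUSE.findall(line):
        # Assumption of this sketch: treat a micropause as 0.2 seconds.
        lengths.append(float(timed) if timed else 0.2)
    return lengths

print(pause_lengths("S: well (0.4) I suppose (1.8) maybe not"))  # [0.4, 1.8]
print(pause_lengths("yes (.) right"))                            # [0.2]
```

Note what the sketch presumes: that a transcriber has already decided which silences count, and to what precision.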

This attention to the duration of pauses is necessary because—as language speakers and conversationalists—we are incredibly attuned to pauses. Minute variation in conversational pause length can mean the difference between appearing to be thinking about your response, appearing not to have been paying attention, exercising discretion, or simply having nothing to say (McLaughlin and Cody). Gail Jefferson notes that silences longer than 1.2 seconds often indicate disagreement, misunderstanding, or rejection (Jefferson, “Preliminary Notes”). Some cultural conversation styles seek to minimize gaps in talk at all costs, while others allow for a greater tolerance of silence (Mushin and Gardner), an observation most easily made with conversational transcripts that track pauses. Robin Wooffitt, in The Language of Mediums and Psychics, gives a number of striking examples of how conversational pauses after topic-initiating questions in psychic readings are an indication of trouble communicating with the afterlife, as in the transcripts I’ve included here (Wooffitt 137–38). (M: or PP: indicate the psychic, S: indicates the sitter.)

Screenshot of several conversations between mediums and sitters

Wooffitt notes that “whereas positive confirmations are produced without delay, rejections or cautious responses are withheld momentarily,” which then requires conversational damage control on the part of both interlocutors (ibid). All of this leads me to conclude that transcriptions must perform what seems like a paradoxical combination of tasks: they must account for not only what is present (i.e., the words that are said), but also what is not present (i.e., the pauses between them).

I would guess that one of the reasons that “conversations” with machine learning agents feel stilted, or fall into the uncanny valley, is that nothing—the nothing of the conversational pause—doesn’t survive transcription, or at least it doesn’t survive in the conversational transcriptions that language models most commonly have in their data sets. It is certainly possible to transcribe silence in ways that are visible to computation through annotation—text standards like TEI have elaborate methods of doing just that. In an attempt to address this problem in a more automated fashion, machine learning researchers have developed their own systems for annotating existing corpora with conversational pauses (Chang et al.), alongside methods for approximating conversational turn-taking using punctuation already present in the dataset (Żelasko et al.), or proposed methods for extracting conversational structure from multi-modal datasets including video (Han et al.).
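To give a flavor of what such automated annotation involves, here is a hypothetical sketch of inserting gap markers between utterances that carry start/end timestamps (as one might get from a forced aligner). The data structure, the 0.2-second threshold, and the example timings are all inventions of this sketch, not any published system’s:

```python
def annotate_gaps(turns, min_gap=0.2):
    """Insert a {"gap": seconds} marker between consecutive timed turns."""
    annotated = []
    for prev, curr in zip(turns, turns[1:]):
        annotated.append(prev)
        gap = curr["start"] - prev["end"]
        if gap >= min_gap:  # the threshold is itself an editorial choice
            annotated.append({"gap": round(gap, 1)})
    annotated.append(turns[-1])
    return annotated

turns = [
    {"speaker": "PP", "text": "is your mother in spirit?", "start": 0.0, "end": 1.6},
    {"speaker": "S", "text": "no", "start": 3.1, "end": 3.4},
]
# The 1.5-second silence before the sitter's "no" (the kind of pause
# Wooffitt reads as trouble) becomes an explicit annotation:
print(annotate_gaps(turns))
```

Even here, the threshold parameter quietly decides which silences count as “something” and which remain nothing.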

Latency and the art of nothing

But I think that these efforts miss the point. What is considered “nothing” and what is considered “something” is always going to result from choices that the transcriber makes (along with the material properties of the technologies and conventions used to create the transcription). There will always be a nuance—a potentially important nuance—in something as delicate as a conversational pause that cannot be captured or reconstructed in the transcription.

Speaking of missing the point, I’m reminded of what John Cage said about the audience for the first performance of his famous piece 4’33”, which consists of four minutes and thirty-three seconds of silence. The audience, he says, “missed the point. There’s no such thing as silence. What they thought was silence, because they didn’t know how to listen, was full of accidental sounds. You could hear the wind stirring outside during the first movement. During the second, raindrops began pattering the roof, and during the third the people themselves made all kinds of interesting sounds as they talked or walked out” (Cage and Kostelanetz 97).

The appearance of “nothing” in artistic works is often an appeal to look outside systems of transcription. The text of Cage’s piece (at least in certain editions) is simply the word “TACET,” written once for each of the piece’s three movements. But the experience of the piece—the “accidental sounds”—can’t be scored, or abstracted, or understood to be identical with any other experience. For me, the point of 4’33” is that “nothing” is always actually material. “Nothing” exists in real space and real time. In drawing attention to “nothing,” artists draw attention to this irreducible—and untranscribable—materiality.

In computation and machine learning, the experience of “nothing” in time has a name: latency. Latency refers, roughly, to the period of time between when you submit a request to a system and when you receive a response. John Cage composed a piece that consisted of only latency—four minutes and thirty-three seconds of it—but machine learning engineers are always attempting to eliminate latency, or minimize it, or otherwise draw your attention away from latency. In the ChatGPT interface, one method they use to draw attention away from latency is a waiting animation that looks like this:

Animated dot-dot-dot

When machine learning engineers engage in these behaviors, they’re drawing attention away from exactly what Cage was drawing attention to: the physical, material underpinnings of “nothing.” To borrow Cage’s wording—“if we knew how to listen”—the sound we might hear under the churning of this dot-dot-dot would be the sound of the datacenter where the model is running.
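Latency, in the sense defined above, is simple to operationalize: it is just the wall-clock time between request and response. A minimal sketch, where `slow_model` is a stand-in of my own invention rather than any real API:

```python
import time

def slow_model(prompt):
    """A hypothetical stand-in for a model call."""
    time.sleep(0.1)  # simulate the model "thinking"
    return prompt.upper()

# Measure the "nothing" between submitting a request and receiving a response.
start = time.perf_counter()
response = slow_model("hello")
latency = time.perf_counter() - start

print(f"{latency:.1f} seconds of 'nothing' before: {response}")
```

The engineering instinct is to drive that number toward zero; Cage’s instinct was to dwell in it.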

Of course, John Cage isn’t the only artist to have made work specifically to test the edges of what is and is not nothing—and, by extension, what is and is not subject to transcription. A favorite work of mine that falls into this category, and a specifically literary example, is the poet Aram Saroyan’s untitled 1968 book. This work was the third “book” in Lita Hornick’s Kulchur Press series, which had previously featured work by Andy Warhol and Ted Berrigan, among others. The work consists of a wrapped ream of blank, letter-size typing paper.

A ream of paper Image source

This work is obviously experimental, and more than a little tongue-in-cheek. Craig Dworkin, in No Medium, plausibly argues that this work is not a “sculpture,” but “a logical and legitimate extension of [Saroyan’s] writing practice” (Dworkin 17). If this is a literary work, though, we have to wonder: what would a transcription of this work look like? Would this “nothing” meaningfully survive such a transcription?

Another relevant untranscribable literary work is Michael Asher’s contribution to the first issue of Tom Marioni and Kathan Brown’s journal Vision in September 1975. Craig Dworkin describes this work in his book No Medium: “The table of contents lists Asher’s work, indicating that it begins on page 42; pages 42 and 43, however, are joined with an adhesive and so even the absence of any text or image on Asher’s page has been obscured” (Dworkin 19).


Notably, the PDF that I found for this issue of Vision online silently elides pages 42 and 43! This nothing quite literally did not survive the process of (digital) transcription.

We might also consider the work of Glenn Ligon, especially his untitled prints from 2016. These works consist of pages from James Baldwin’s essay “Stranger in the Village,” which have been manipulated and covered with pigment as a way to partially obscure them. These works are textual in nature, but they resist being read.

Glenn Ligon, Untitled, 2016 (detail)

Image source

Kinohi Nishikawa, in his essay “Black Arts of Erasure,” says (quoting art historian Darby English) that imposing a “social referent” on Ligon’s work “limits his pieces ‘to a textualization that conflates their painted aspect with their written one,’ thus ‘reducing each [aspect] to so many marks for a reading’… leaving viewers to wonder why the ‘painting obstructs or upsets’ interpretation. […] Ligon does not follow a teleology of picturing so much as explore the dialectical interplay between word and image—how the meaning of language can shift depending on where and how one is conducted to view it” (Nishikawa).

In other words, the missing text—the “nothing”—in this work is an important part of the work’s poetic machinery. That nothing would not survive a straightforward transcription—especially the kind of one-dimensional transcription employed by large language models. In each of these cases, “nothing” is being deployed by these artists to draw attention to precisely those aspects of literary works that cannot be transcribed—material, social, conceptual—because they exist outside the system of transcription.

Are we lost?

So: nothing survives transcription, in the sense that no text makes it to the far side of the transcription process with its life intact. And also, nothing does not survive transcription: the empty parts of a text, the silent parts, the parts of the text that draw attention to its own materiality, specifically operate outside transcription’s capabilities. And all of us—whether as artists, poets, or everyday conversationalists—draw on the “nothing” that forms the gap between what can be transcribed and what cannot as a productive and creative resource.

But we can also look at this from the other direction and recognize that, although no transcript can be accurate, transcriptions are an important site for linguistic intervention. Transcriptions crack open ontologies. You could say that, in a sense, the very goal of making a transcription in the first place is to make an argument about what cannot be transcribed. Nothing survives transcription, and though we may be “lost” (as Jordan Magnuson fears), at least we’re all lost together in a flowering forest of collaborative interpretation.

Johanna Drucker writes: “The celebration of transparency, in which physicality and materiality are wished away, is a pernicious practice rooted in the worst sort of denial or denigration of our embodied condition” (Drucker 7–8). What I’ve been trying to do in this talk is to convince you that transcription isn’t a “technology” that can be perfected—no matter how hard you try, there will always be an aspect of a text that confounds your abstraction. I have been citing some extreme examples of especially borderline texts in this talk. But I hope I’ve spurred you to look at every text as though its transcription might be an edge case.

I want to close by quoting Renée Gladman, from a passage in the introduction of her book Prose Architectures:

There are dimensions to language that are very difficult to describe with language, and yet it is only in language—in trying to move through it—that one has the privilege of experiencing these dimensions. Language has an energy that eludes verbal expression; this is a reflective energy, language dreaming of itself. I encounter these energies in the space between words, between sentences, in the crossing of passages, through the hum of thinking or imagining that shapes the language I’m reading or writing. The dream is often not the text you’re reading but comes from some other part of the page, some part of the text that is not quite visible. (Gladman)

Gladman cuts to the heart here of the problem I have with the philosophy of large language model maximalists. It’s not just that traces of language in the form of transcription (no matter how large the corpus) are insufficient to tell us what the world is like; it’s that traces of language are insufficient even to tell us what language is like. In fact, that knowledge cannot enter into a statistical model, precisely because it is not data but “the space between words […] some part of the text that is not quite visible.”

Works cited

Bucholtz, Mary. “The Politics of Transcription.” Journal of Pragmatics, vol. 32, no. 10, 2000, pp. 1439–65.
Cage, John, and Richard Kostelanetz. “His Own Music.” Perspectives of New Music, vol. 25, no. 1/2, 1987, pp. 88–106. JSTOR, https://www.jstor.org/stable/833093.
Chang, Shuo-yiin, et al. Turn-Taking Prediction for Natural Conversational Speech. arXiv:2208.13321, arXiv, 28 Aug. 2022. arXiv.org, https://doi.org/10.48550/arXiv.2208.13321.
Dickinson, Emily. Poems by Emily Dickinson. Roberts Brothers, 1890.
Drucker, Johanna. “Entity to Event: From Literal, Mechanistic Materiality to Probabilistic Materiality.” Parallax, vol. 15, no. 4, Nov. 2009, pp. 7–17. Taylor and Francis+NEJM, https://doi.org/10.1080/13534640903208834.
Dworkin, Craig. No Medium. The MIT Press, 2013. DOI.org (Crossref), https://doi.org/10.7551/mitpress/9653.001.0001.
Gladman, Renée. Prose Architectures. First edition, Wave Books, 2017.
Han, Seungju, et al. CHAMPAGNE: Learning Real-World Conversation from Large-Scale Web Videos. arXiv:2303.09713, arXiv, 16 Mar. 2023. arXiv.org, https://doi.org/10.48550/arXiv.2303.09713.
Howe, Susan. The Birth-Mark: Unsettling the Wilderness in American Literary History. Wesleyan University Press / University Press of New England, 1993.
Jefferson, Gail. “Glossary of Transcript Symbols with an Introduction.” Pragmatics & Beyond New Series, edited by Gene H. Lerner, vol. 125, John Benjamins Publishing Company, 2004, pp. 13–31. DOI.org (Crossref), https://doi.org/10.1075/pbns.125.02jef.
---. “Preliminary Notes on a Possible Metric Which Provides for a ‘Standard Maximum’ Silence of Approximately One Second in Conversation.” Conversation: An Interdisciplinary Perspective, Multilingual Matters, 1989, pp. 166–96.
Magnuson, Jordan. Game Poems: Videogame Design as Lyric Practice. Amherst College Press, 2023. ACLS Humanities EBook, https://doi.org/10.3998/mpub.12758539.
McLaughlin, Margaret L., and Michael J. Cody. “Awkward Silences: Behavioral Antecedents and Consequences of the Conversational Lapse.” Human Communication Research, vol. 8, no. 4, 1982, pp. 299–316. Wiley Online Library, https://doi.org/10.1111/j.1468-2958.1982.tb00669.x.
Mondada, Lorenza. “Commentary: Transcript Variations and the Indexicality of Transcribing Practices.” Discourse Studies, vol. 9, no. 6, Dec. 2007, pp. 809–21. DOI.org (Crossref), https://doi.org/10.1177/1461445607082581.
Mushin, Ilana, and Rod Gardner. “Silence Is Talk: Conversational Silence in Australian Aboriginal Talk-in-Interaction.” Journal of Pragmatics, vol. 41, no. 10, Oct. 2009, pp. 2033–52. ScienceDirect, https://doi.org/10.1016/j.pragma.2008.11.004.
Nishikawa, Kinohi. “Black Arts of Erasure.” ASAP/Journal, vol. 7, no. 2, 2022, pp. 296–303. Project MUSE, https://doi.org/10.1353/asa.2022.0027.
Pipkin, Everest. “A Long History of Generated Poetics: Cutups from Dickinson to Melitzah.” Medium, 20 Sept. 2016, https://everestpipkin.medium.com/a-long-history-of-generated-poetics-cutups-from-dickinson-to-melitzah-fce498083233.
Wooffitt, Robin. The Language of Mediums and Psychics: The Social Organization of Everyday Miracles. Ashgate, 2006.
Wylder, Edith. “Emily Dickinson’s Punctuation: The Controversy Revisited.” American Literary Realism, vol. 36, no. 3, 2004, pp. 206–24. JSTOR, https://www.jstor.org/stable/27747139.
Żelasko, Piotr, et al. “What Helps Transformers Recognize Conversational Structure? Importance of Context, Punctuation, and Labels in Dialog Act Recognition.” Transactions of the Association for Computational Linguistics, vol. 9, Oct. 2021, pp. 1163–79. Silverchair, https://doi.org/10.1162/tacl_a_00420.