The cultural capital (Bourdieu, 2002) of poetry has rendered it fertile ground for computational approaches, making poetry generation “an established research field”, recently surveyed by Gonçalo Oliveira (2017). This field spans a spectrum between two perspectives which we find fitting to call machine-as-poet and user-as-poet. The present work is aligned with the latter.
Machine-as-poet approaches encode poeticity (Jakobson, 1981) in a corpus and train a system to reproduce the prosodic and stylistic structure encoded within it (Ghazvininejad et al., 2017; Lau et al., 2018). User-as-poet approaches aim to produce tools that a poet-programmer (Morris, 2012) or operator (James, 2006) might use in their creative endeavors (Uthus et al., 2019; Zhipeng et al., 2019). Both approaches capitalize on recent advances in neural text generation, yet both rely exclusively on auto-regressive language models.
The contribution of this paper is twofold: First, in Section 2, we implement the Metropolis–Hastings sampler introduced by Goyal et al. (2021) and use it to explore constrained composition using a masked language model; our implementation (https://github.com/webis-de/ML4CD-21) uses the base version of English RoBERTa (Liu et al., 2019) off the Hugging Face shelf (Wolf et al., 2020), but the ideas introduced here are agnostic to the choice of masked language model and language. Second, in Section 3, we reflect on useful vantage points from which to critically understand our approach, specifically how it might be enacting oulipian constraints (Symes, 1999; James, 2006).
2 Constrained Composition
Neural text generation is dominated by auto-regressive (causal) language models (Vaswani et al., 2017). Such causal models can be used to generate text by defining a probability p(x) = ∏_t p(x_t | x_{<t}) over sequences x that can be sampled from auto-regressively. In practice, however, they prove unwieldy (Holtzman et al., 2020) and awkward to steer, rendering controllable generation a challenging open question (Weng, 2021) and out of reach for a flexible user-as-poet approach.
Instead, we turn our attention to the findings of Goyal et al. (2021), who develop a tractable sampling scheme for masked language models. Even though they do not explicitly model sequences, masked language models are interpretable as energy-based sequence models that can be sampled from using the stationary distribution of a Metropolis–Hastings Markov chain.
In reality, the distribution being sampled from is p(x_1, …, x_T): a probability over all sequences of length T. This constitutes the first constraint that the operator needs to set in advance, and the only obligatory one. A prompting context c can also be specified so that we sample from p(x | c). However, unlike with auto-regressive models, this prompting context need not be causal but can span any subset of tokens. We refer to the general case of a non-contiguous context as perforated prompting. More generally, we introduce the following distribution over sequences x and a list of poetic constraints C = {φ_1, …, φ_k}:

p(x | c, C) ∝ p(x | c) · ∏_{i=1}^{k} φ_i(x),
which can be read as a product of experts (Hinton, 1999), each constraining an aspect of the sequence. These constraints, be they syntactic (e.g., the lipogram, a constraint which forbids the use of a given letter), semantic, or prosodic (e.g., bouts-rimés, a literary game in which a given sequence of rhymes has to be expanded into a poem), can be enacted in logit space on an arbitrary subset of the sequence's tokens by setting the appropriate logits to −∞ through any of the following operations at every sampling step (see Figure 1 for examples):
Explicit prompting: Restricts the vocabulary of the model to a chosen token.
Implicit prompting: Restricts the vocabulary of the model to satisfy a constraint.
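Both operations amount to the same mechanism: zeroing out logits before the softmax. A minimal sketch follows, with a made-up vocabulary and made-up raw logits standing in for a real masked language model's output at one position.

```python
import math

# Hypothetical vocabulary and raw logits standing in for a real
# masked language model's output at one <mask> position.
NEG_INF = float("-inf")
VOCAB = ["the", "cat", "sat", "mat", "on", "rug", "a"]
logits = [0.5, 1.2, 0.1, 0.3, 0.9, 0.2, 0.0]

def softmax(xs):
    m = max(x for x in xs if x != NEG_INF)
    exps = [0.0 if x == NEG_INF else math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def explicit_prompt(logits, vocab, token):
    """Explicit prompting: keep only the chosen token's logit."""
    return [l if w == token else NEG_INF for l, w in zip(logits, vocab)]

def lipogram(logits, vocab, letter):
    """Implicit prompting: forbid every token containing a given letter."""
    return [NEG_INF if letter in w else l for l, w in zip(logits, vocab)]

p_explicit = softmax(explicit_prompt(logits, VOCAB, "cat"))
p_lipogram = softmax(lipogram(logits, VOCAB, "e"))
print(p_explicit[VOCAB.index("cat")])  # 1.0: all mass on the prompt token
print(p_lipogram[VOCAB.index("the")])  # 0.0: "the" contains an 'e'
```

Any constraint expressible as a per-position token filter (rhyme endings, syllable counts, banned letters) fits this pattern; the renormalized distribution is then used as the proposal at that position.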
Having defined all the constraints, as well as a seeding sequence (which can also consist entirely of <mask> tokens), the operator lets the sampling process run, potentially indefinitely, as it enumerates token combinations from the model's vocabulary. This sampling process is modulated by knowledge gleaned from an immense training corpus, making certain combinations more likely than others.
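Concretely, the operator's setup might look like the following sketch of perforated prompting, where fixed tokens occupy arbitrary (non-causal) positions. All names are illustrative, and the resampling step is a uniform placeholder for the masked-LM proposal described above.

```python
import random

random.seed(1)

VOCAB = ["the", "cat", "sat", "on", "mat", "rug"]
# A perforated prompt: fixed tokens may occupy ANY positions, not just a
# causal prefix.  None marks a free slot that the sampler may rewrite.
template = ["the", None, None, "mat"]

free = [i for i, t in enumerate(template) if t is None]
# Seed the free slots (in the paper's setting these could all be <mask>).
seq = [t if t is not None else random.choice(VOCAB) for t in template]

for _ in range(20):
    i = random.choice(free)        # only free positions are ever proposed
    seq[i] = random.choice(VOCAB)  # placeholder for the masked-LM proposal

print(seq[0], seq[3])  # the prompt tokens survive: the ... mat
```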
The randomness inherent to sampling aligns this work with aleatory poetics, as the operator “deliberately engages with chance as a compositional principle” (James, 2012): prompts make the starting text underdetermined by design, while the final text retains a “determinate final form” (ibid.). Another useful angle is what theorists refer to as computational poetry, whose defining feature is a procedure or algorithm that “precedes and determines the poem” (Morris, 2012) and sees the poet-programmer step back and attend to information flows (ibid.). Given the central role of the machine in our poetic endeavor, this work is also aligned with conceptual poetry, in which the modalities of creation are at least as important as the ideas they attempt to express (Perloff, 2012a). The universe that the generated text is allowed to inhabit and explore has to be predetermined by the operator, so that “all of the planning and decisions are made beforehand and the execution is a perfunctory affair. The idea becomes a machine that makes the text” (Goldsmith, 2007).
Such programmatic principles notably modulate the work of the French OuLiPo movement, who use constraints to understand, “explore and expand the field of literature” (Poucel, 2012). In stark opposition to, and rejection of, the surrealist practice of automatic writing (raising the facetious question: is GPT-3 (Brown et al., 2020) a surrealist?) and the debilitating openness of contemporary writing, oulipian writing sees constraints as “a generative tool that enables a conflict necessary for the renewal of poetic form” (Deming, 2009). Such conflicts fuel a “productive friction between the constraining algorithm and the author’s desire for meaning” (James, 2006), allowing an artist to “maximize their options through minimizing their choice” (Symes, 1999).
Concerns about the mechanistic aspects of the written word can be traced back to Plato’s Phaedrus (Plato, 1972). The field of neural text generation can only exponentiate these anxieties. Without due consideration and reflected usage, the large language models we have come to wield so readily can only calcify our existing biases, and potentially introduce others. The stochastic parrots introduced by Bender et al. (2021) stand to cause damage through an intertextual pastiche of all that is ugly on the internet. This intertextual view of the generated text, alluded to but not named by Bender et al. (2021), adds one final critical layer to our poetic enterprise: that of found poetry (Perloff, 2012b), except that the recontextualized collage of others’ words is mediated within an impenetrable latent space, through the weights of a neural network.
- Bender et al. (2021). On the dangers of stochastic parrots: can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’21), New York, NY, USA, pp. 610–623.
- Bourdieu (2002). The forms of capital. In Readings in Economic Sociology, pp. 280–291.
- Brown et al. (2020). Language models are few-shot learners. In Advances in Neural Information Processing Systems, Vol. 33, pp. 1877–1901.
- Deming (2009). Constraints as opposed to what? A philosophical approach to the values of constrained writing. Poetics Today 30(4), pp. 653–668.
- Ghazvininejad et al. (2017). Hafez: an interactive poetry generation system. In Proceedings of ACL 2017, System Demonstrations, Vancouver, Canada, pp. 43–48.
- Goldsmith (2007). Paragraphs on conceptual writing. Poetry Foundation.
- Gonçalo Oliveira (2017). A survey on intelligent poetry generation: languages, features, techniques, reutilisation and evaluation. In Proceedings of the 10th International Conference on Natural Language Generation, Santiago de Compostela, Spain, pp. 11–20.
- Goyal et al. (2021). Exposing the implicit energy networks behind masked language models via Metropolis–Hastings.
- Hinton (1999). Products of experts. In Ninth International Conference on Artificial Neural Networks (ICANN 99), Vol. 1, pp. 1–6.
- Holtzman et al. (2020). The curious case of neural text degeneration. In 8th International Conference on Learning Representations (ICLR 2020), Addis Ababa, Ethiopia.
- Jakobson (1981). What is poetry? In Volume III: Poetry of Grammar and Grammar of Poetry, pp. 740–750.
- James (2006). Automatism, arbitrariness, and the oulipian author. French Forum 31(2), pp. 111–125.
- James (2012). Aleatory poetics. In R. Greene et al. (Eds.), The Princeton Encyclopedia of Poetry and Poetics (4th ed.), pp. 31–34.
- Lau et al. (2018). Deep-speare: a joint neural model of poetic language, meter and rhyme. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, pp. 1948–1958.
- Liu et al. (2019). RoBERTa: a robustly optimized BERT pretraining approach.
- Morris (2012). Computational poetry. In R. Greene et al. (Eds.), The Princeton Encyclopedia of Poetry and Poetics (4th ed.), p. 288.
- Perloff (2012a). Conceptual poetry. In R. Greene et al. (Eds.), The Princeton Encyclopedia of Poetry and Poetics (4th ed.), p. 292.
- Perloff (2012b). Found poetry. In R. Greene et al. (Eds.), The Princeton Encyclopedia of Poetry and Poetics (4th ed.), pp. 503–504.
- Plato (1972). Phaedrus. Cambridge University Press.
- Poucel (2012). OuLiPo. In R. Greene et al. (Eds.), The Princeton Encyclopedia of Poetry and Poetics (4th ed.), pp. 987–988.
- Symes (1999). Writing by numbers: OuLiPo and the creativity of constraints. Mosaic: An Interdisciplinary Critical Journal 32(3), pp. 87–107.
- Uthus et al. (2019). First steps towards collaborative poetry generation. NeurIPS Workshop on Machine Learning for Creativity and Design 3.0.
- Vaswani et al. (2017). Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS ’17), Red Hook, NY, USA, pp. 6000–6010.
- Weng (2021). Controllable neural text generation. lilianweng.github.io/lil-log.
- Wolf et al. (2020). Transformers: state-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online, pp. 38–45.
- Zhipeng et al. (2019). Jiuge: a human-machine collaborative Chinese classical poetry generation system. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Florence, Italy, pp. 25–30.