Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP

04/17/2021
by Josh Rozner, et al.

Cryptic crosswords, the dominant English-language crossword variety in the United Kingdom, can be solved by expert humans using flexible, creative intelligence and knowledge of language. Cryptic clues read like fluent natural language, but they are adversarially composed of two parts: a definition and a wordplay cipher requiring sub-word or character-level manipulations. As such, they are a promising target for evaluating and advancing NLP systems that seek to process language in more creative, human-like ways. We present a dataset of cryptic crossword clues from a major newspaper that can be used as a benchmark, and we train a sequence-to-sequence model to solve them. We also develop related benchmarks that can guide development of approaches to this challenging task. We show that performance can be substantially improved using a novel curriculum learning approach in which the model is pre-trained on related tasks involving, e.g., unscrambling words, before it is trained to solve cryptics. However, even this curricular approach does not generalize to novel clue types in the way that humans can, and so cryptic crosswords remain a challenge for NLP systems and a potential source of future innovation.
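The curricular pre-training stage described above can be illustrated with a minimal sketch. The helper below is hypothetical (not the authors' code): it builds (input, target) pairs for a word-unscrambling task, framed as sequence-to-sequence text so that the same model can later be fine-tuned on full cryptic clues.

```python
import random

def make_unscramble_example(word, rng=None):
    """Build one (input, target) pair for a word-unscrambling
    pretraining task, a simplified stand-in for the character-level
    manipulations required by cryptic wordplay."""
    rng = rng or random.Random(0)
    letters = list(word)
    rng.shuffle(letters)
    scrambled = "".join(letters)
    # Frame as seq2seq: the model must recover the original word.
    return f"unscramble: {scrambled}", word

src, tgt = make_unscramble_example("cipher")
```

The task prefix (`unscramble:`) is an illustrative convention, mirroring the prompt-style task framing common in text-to-text models; the actual pre-training tasks and formats used in the paper may differ.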


