BERTese: Learning to Speak to BERT

by Adi Haviv, et al.

Large pre-trained language models have been shown to encode large amounts of world and commonsense knowledge in their parameters, leading to substantial interest in methods for extracting that knowledge. In past work, knowledge was extracted by taking manually-authored queries and gathering paraphrases for them using a separate pipeline. In this work, we propose a method for automatically rewriting queries into "BERTese", a paraphrase query that is directly optimized towards better knowledge extraction. To encourage meaningful rewrites, we add auxiliary loss functions that encourage the query to correspond to actual language tokens. We empirically show our approach outperforms competing baselines, obviating the need for complex pipelines. Moreover, BERTese provides some insight into the type of language that helps language models perform knowledge extraction.
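To make the auxiliary-loss idea concrete, here is a minimal sketch of one way a rewritten query could be encouraged to "correspond to actual language tokens": penalize the distance between each continuous query vector and its nearest vocabulary embedding. The abstract does not specify the exact form of the loss, so the function name `nearest_token_loss`, the squared-Euclidean distance, and the toy embedding values are all assumptions made for illustration, not the authors' formulation.

```python
import numpy as np

def nearest_token_loss(query_vecs, vocab_emb):
    """Mean squared distance from each query vector to its nearest vocabulary embedding.

    Hypothetical illustration only; the paper's actual auxiliary loss may differ.
    query_vecs: array of shape (query_len, dim)
    vocab_emb:  array of shape (vocab_size, dim)
    """
    # Pairwise squared Euclidean distances, shape (query_len, vocab_size).
    diffs = query_vecs[:, None, :] - vocab_emb[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # Index of the closest vocabulary entry for each query position.
    nearest_ids = np.argmin(sq_dists, axis=1)
    # Average of the per-position minimum distances.
    loss = float(np.mean(np.min(sq_dists, axis=1)))
    return loss, nearest_ids

# Toy vocabulary of 4 "tokens" with 2-dimensional embeddings (made-up values).
vocab_emb = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# A rewritten query of 2 vectors: the first lies exactly on token 1,
# the second lies slightly off token 2.
query_vecs = np.array([[1.0, 0.0], [0.2, 0.9]])

loss, nearest_ids = nearest_token_loss(query_vecs, vocab_emb)
print(loss, nearest_ids)
```

A loss of zero would mean every query position coincides with a real token embedding, so the rewrite can be read back as text. In a trained system, a term like this would be added to the knowledge-extraction objective rather than used on its own.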




CoCoLM: COmplex COmmonsense Enhanced Language Model

Large-scale pre-trained language models have demonstrated strong knowled...

How to Query Language Models?

Large pre-trained language models (LMs) are capable of not only recoveri...

Intrinsic Knowledge Evaluation on Chinese Language Models

Recent NLP tasks have benefited a lot from pre-trained language models (...

P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts

Recent work (e.g. LAMA (Petroni et al., 2019)) has found that the qualit...

Explaining Question Answering Models through Text Generation

Large pre-trained language models (LMs) have been shown to perform surpr...

BertNet: Harvesting Knowledge Graphs from Pretrained Language Models

Symbolic knowledge graphs (KGs) have been constructed either by expensiv...

Modern Baselines for SPARQL Semantic Parsing

In this work, we focus on the task of generating SPARQL queries from nat...