Do Language Models Plagiarize?
Prior work has shown that language models do not fully understand the context and sensitivity of text and can sometimes memorize phrases or sentences present in their training sets. In this paper, we investigate whether they not only memorize but also plagiarize training samples when generating artificial texts. Our findings show that they, especially GPT-2, reuse particular pieces of text from the training corpus, with or without obfuscation. We have four main results: 1) language models with more capacity plagiarize more; 2) fine-tuned language models demonstrate differing patterns of plagiarism based on characteristics of the auxiliary data; 3) sampling from truncated language modeling distributions tends to heighten the degree of plagiarism compared with temperature sampling; and 4) plagiarism in language models can have serious privacy consequences. Overall, our work implies that future research on neural language models should take precautions against models plagiarizing their training datasets.
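For intuition on the decoding strategies contrasted in result 3), the following is a minimal sketch, not taken from the paper itself, of truncated sampling (top-k / nucleus) versus pure temperature sampling. It assumes the Hugging Face transformers API and the public gpt2 checkpoint; the prompt and all parameter values are illustrative.

```python
# Sketch: truncated sampling vs. temperature sampling with GPT-2.
# Model name, prompt, and parameter values are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The history of machine translation"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# Truncated sampling: cut the distribution down to the nucleus (top-p)
# or the k most likely tokens before sampling. The paper reports that
# this style of decoding tends to heighten plagiarism.
truncated = model.generate(
    input_ids,
    do_sample=True,
    top_p=0.9,          # nucleus (top-p) truncation
    top_k=40,           # top-k truncation
    max_new_tokens=50,
)

# Temperature sampling: rescale the full distribution without truncation.
temperature_only = model.generate(
    input_ids,
    do_sample=True,
    temperature=1.2,
    top_k=0,            # disable top-k truncation
    top_p=1.0,          # disable nucleus truncation
    max_new_tokens=50,
)

print(tokenizer.decode(truncated[0], skip_special_tokens=True))
print(tokenizer.decode(temperature_only[0], skip_special_tokens=True))
```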