StyleKQC: A Style-Variant Paraphrase Corpus for Korean Questions and Commands

03/24/2021 · Won Ik Cho et al., Seoul National University

Paraphrasing is often performed with little concern for controlled style conversion. For questions and commands in particular, style-variant paraphrasing is crucial to tone and manner, which also matters in industrial applications such as dialog systems. In this paper, we address this issue with a corpus construction scheme that simultaneously considers the core content and style of directives, namely intent and formality, for the Korean language. Starting from manually generated natural language queries on six daily topics, we expand the corpus into formal and informal sentences by human rewriting and transferring. We verify the validity and industrial applicability of our approach by checking that classification and inference performance fit well with common fine-tuning approaches, and at the same time we propose a supervised formality transfer task.




1 Introduction

Paraphrasing, the act of expressing the same meaning with different sentences Bhagat and Hovy (2013), is strongly related to text style conversion or transfer Yamshchikov et al. (2020). While prior studies often modify sentiment or offensiveness Logeswaran et al. (2018); dos Santos et al. (2018), from the paraphrasing perspective it must be verified that the core content of the sentence is maintained during conversion. If the sentence meaning stays the same while politeness or formality changes Rao and Tetreault (2018), we can call it paraphrasing or rewriting. Such styles are represented in diverse ways across genres, domains, and languages Jhamtani et al. (2017); Fu et al. (2018); Yang et al. (2019).

We present a scheme for constructing a corpus of style-variant paraphrases of directive sentences, namely questions and commands, targeting the Korean language, where politeness (suffixes) and honorifics play a significant role in conversation Strauss and Eun (2005). We consider topic and speech act as the attributes constituting a directive sentence Cho et al. (2020) and construct a set of formal-style paraphrases from natural language queries that instantiate each topic and speech act. Finally, style-variant paraphrase pairs are obtained by manual conversion to informal sentences with content preservation in mind, and the resulting corpus is to be released publicly as the first open text style transfer dataset in Korean. Our contributions are the following:


  • We present a corpus construction scheme capable of performing multiple tasks while enabling parallel sentence style transfer.

  • We release a Korean corpus with a well-defined sentence formality style, covering questions and commands used in daily life.

2 Related Work

In general, sentence style (in this paper, we view ‘formality’ in Korean as a style, and use ‘conversion’ and ‘transfer’ interchangeably) is handled in terms of tone and manner in writing, though with a subtle difference between the two Brooks (2020). However, previous research on content-preserving style transfer Logeswaran et al. (2018); Tian et al. (2018) does not concern tone alone, in that a change in sentiment may influence the core speaker intent. Furthermore, most approaches adopt the perspective of unsupervised learning dos Santos et al. (2018); Bao et al. (2019), leaving parallel style-variant corpora for supervised learning less explored, even though such corpora could provide robust guidance for today’s generative pre-trained models Radford et al. (2019).

This trend is similarly revealed in previous studies on Korean. Since early approaches followed studies in English and other languages, sentiment- or stance-based style transfer has been prevalent Lee et al. (2019); Choi and Na (2019) (most of this work is not in an internationally readable format, so we summarize the methods used in those papers here). In Hong et al. (2018), a transfer regarding the politeness suffix of sentence enders was considered while maintaining the sentence meaning, mainly between the ‘hay-yo’ and ‘hap-syo’ enders, which differ in degree of formality. However, it dealt only with syntactic change, not with modification of the lexicon, adverbs, or tone and manner of speech, all of which are considered influential in the honorific system Strauss and Eun (2005). In this regard, we believe formality style transfer should be well defined along with content preservation. Furthermore, there is no open dataset for Korean style transfer that can be utilized for research and commercial purposes. We aim to resolve these issues with a straightforward and effective building scheme.

3 Proposed Scheme

We construct a corpus of Korean directives, namely questions and commands, where questions consist of alternative questions (Alt. Q) and wh-questions (wh- Q), and commands consist of prohibitions (PH) and requirements (REQ), following Cho et al. (2020). In other words, we target four types of speech acts and assume sentences that can be uttered to humans or artificial intelligence (AI) agents. Six topics are involved: messenger, calendar, weather and news, smart home, shopping, and entertainment, which come from a recent survey on customer usage Lee et al. (2020). Twelve workers from different backgrounds were recruited. Each was asked to specify two liked topics and one disliked topic, and these preferences were taken into account when creating six subgroups of two people each.

We designed a construction scheme that goes through the following three steps, checking its reliability while generating 5,000 utterances per topic and 7,500 per speech act.

  1. Writing natural language queries

  2. Rewriting paraphrased queries in a formal tone

  3. Converting the formal sentences to informal ones
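The three steps above expand each seed query into a set of parallel formal-informal pairs. A minimal sketch of the resulting record structure (the function and field names are our own illustration, not the released corpus schema):

```python
# Sketch of how one seed query expands into parallel style pairs.
# Field names are illustrative, not the released corpus schema.

def expand_query(query, topic, act, formal_rewrites):
    """One query -> 5 formal rewrites -> 5 (formal, informal) pairs."""
    records = []
    for formal in formal_rewrites:
        informal = None  # filled in later by the human conversion step
        records.append({
            "topic": topic,    # one of 6 topics
            "act": act,        # Alt. Q / wh- Q / PH / REQ
            "query": query,    # core content (style-free)
            "formal": formal,
            "informal": informal,
        })
    return records

# 24 (topic, act) chunks x 125 queries x 5 rewrites = 15,000 pairs,
# i.e., 30,000 sentences in total.
assert 24 * 125 * 5 * 2 == 30000
```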

Query generation

First, query generation is a process in which participants directly suggest the core content of directives to be rewritten in a formal style. Participants were asked to write a natural language query for each of two given speech acts on their assigned topic (in Cho et al. (2020) these were given by the process managers, but here we let the workers create them, to diversify the contents and to benefit from their preferences). Since the query structure differs by speech act type as in Cho et al. (2020), the created queries did not overlap across workers. The queries were checked for suitability, to exclude personally identifiable information or content that could cause social harm. A total of 125 queries were generated for each (topic, act) pair. Example queries for some (topic, act) pairs are shown below. All queries were generated in Korean but are described here in English for demonstration.


  • (Shopping, Alt. Q) The one that has better A/S between Samsung and Apple

  • (Entertainment, Wh-Q) The TV channel number where the news is on at 8:00 p.m.

  • (Messenger, PH) Not to turn on WeChat automatic update

  • (Smart home, REQ) To recharge the wireless vacuum cleaner in the multi-room

No particular style was imposed during generation, but the workers were asked to produce expressions that fit a colloquial, daily-life context. They were also asked to revise knowledge-intensive questions and queries containing multiple contents.

Figure 1: An example of query generation, formal sentence writing, and informal transferring, along with the gloss and translation (PRT particle, NMN nominalizer, ACC accusative, FUT future, DEC declarative, POL polite, IMP imperative). Though not reflected in the English translation, the transfer preserves the overall structure of the formal sentence as well as its core content.

Writing formal paraphrases

Next, the workers in each subgroup exchange the queries generated by each other and rewrite them into formal-style sentences (in this process, the workers also check the validity of each other’s queries, so that any incompleteness the moderator may have missed can be pointed out). We asked for the formal style first because the Korean language has more diverse expressions for formal utterances, regarding indirect speech and honorifics Byon (2006), making paraphrasing easier than with informal ones, which might not come to a worker’s mind at first. The utterances were required to fit a conversation with senior or elderly addressees rather than with friends or juniors.

Rewriting was required for a total of five sentences per query. To make the paraphrases as diverse as possible, the workers were asked to apply the request strategies in Byon (2006) and Cho (2008). We display some excerpts:


  • Softening the commands to requests

  • Mentioning the addressee’s responsibility

  • Lessening the addressee’s burden

  • Asking the availability of the addressee

Some of these characteristics are shared across cultures Brown et al. (1987). Similar patterns are also exhibited in East Asian societies Gu (1990) and in languages with similar syntax such as Japanese Okamoto (1999); Fukada and Asato (2004). However, we faced language-specific considerations regarding functional and lexical expressions and asked the workers to reflect them in the construction. Simultaneously, to keep the sentences natural in a colloquial context, written-style or outdated phrases and words were eschewed.

Converting to informal style

The final process modifies directive sentences written in formal style into informal sentences. Here, the workers convert the other person’s formal sentences, created from the original queries they themselves had generated, checking once again for typos and misunderstandings. ‘Informality’ as defined here differs slightly from being rude or impolite; instead, it means the conversation moves toward a more comfortable and personal relationship Rao and Tetreault (2018).

In this process, we asked the workers to maintain the overall sentence structure, whose diversity was already obtained through the policies for writing formal sentences. This prevents potential overlap between the converted sentences and also guarantees ‘parallel’ data. It can be especially effective in Korean, where indirectness is often distinguished from formality; for instance, a cautious request to a younger brother can be informal but indirect.

Style conversion was performed in various aspects, such as changes in sentence enders, honorifics, and lexicon (such as nation to country). The workers were encouraged to insert or delete phrases where naturalness required it, and to perform at least two word-level modifications. A detailed guideline for the whole process was provided to the workers along with example query-sentence tuples, one of which we exhibit (Figure 1).


The corpus was refined by three native speakers experienced in corpus construction for Korean directive sentences. In this process, typos, awkward sentences, and insufficiently diverse paraphrases were inspected, and the moderators reflected the reviews in the corpus.

4 Experiment

4.1 Task Setting

Through the experiments, we show that the proposed construction scheme provides a corpus that simultaneously enables multiple tasks, which is advantageous from a practical viewpoint.


  • Topic classification

  • Speech act classification

  • Paraphrase detection

  • Sentence style transfer

4.2 Implementation

For each of the 24 (topic, act) chunks, each containing 125 queries, we set aside 80% (100 queries) for training, 4% (5 queries) for validation, and 16% (20 queries) for the test. Of the whole 30,000-sentence dataset, the training set thus contains 24,000 sentences, with 1,200 and 4,800 for dev and test, respectively. The queries were chosen randomly, and all splits preserve equal topic and speech act ratios.
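Because splitting happens at the query level, all ten sentences derived from a query (five formal, five informal) land in the same split, avoiding paraphrase leakage between train and test. A minimal sketch of this split (function name and seed are our own assumptions):

```python
import random

def split_queries(query_ids, seed=0):
    """Split the 125 query IDs of one (topic, act) chunk into
    100/5/20 train/dev/test, so every sentence written for a
    query shares that query's split."""
    ids = list(query_ids)
    random.Random(seed).shuffle(ids)
    return ids[:100], ids[100:105], ids[105:]

train, dev, test = split_queries(range(125))
# Each query contributes 10 sentences (5 formal + 5 informal),
# so across all 24 chunks: 24,000 / 1,200 / 4,800 sentences.
assert (len(train), len(dev), len(test)) == (100, 5, 20)
assert 24 * len(train) * 10 == 24000
assert 24 * len(dev) * 10 == 1200
assert 24 * len(test) * 10 == 4800
```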

Topic (TOPIC) and speech act (ACT) classification are formulated intuitively. There are 5,000 utterances for each topic and 7,500 for each speech act, with the six topics and four speech acts serving as labels.

Paraphrase detection (PARA) requires a sentence pair. In Cho et al. (2020), sentence similarity was defined on a 5-point scale by checking whether the topic or speech act overlaps between the two input sentences, with the highest similarity when the queries are identical (i.e., the sentences are paraphrases). We derived the paraphrase detection task from this by reducing the multi-class problem to a binary one.
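Under this binary formulation, the positive label simply marks pairs originating from the same seed query; everything else is negative. A sketch of that labeling rule (the `query_id` field is our own illustration):

```python
def paraphrase_label(sent_a, sent_b):
    """Binary PARA label: 1 iff both sentences were written
    (or converted) from the same seed query, else 0."""
    return int(sent_a["query_id"] == sent_b["query_id"])

a = {"query_id": 7, "text": "Which has better A/S, Samsung or Apple?"}
b = {"query_id": 7, "text": "Between Samsung and Apple, whose A/S is better?"}
c = {"query_id": 9, "text": "What channel is the 8 p.m. news on?"}
assert paraphrase_label(a, b) == 1  # same query -> paraphrases
assert paraphrase_label(a, c) == 0  # different queries
```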

Finally, we checked whether sentence style transfer (STYLE) works using the pairs within the corpus: 12,000 pairs for training, 600 for validation, and 2,400 for the test. Training converts the formal sentences into informal ones.

Both the sentence classification and paraphrase detection tasks were implemented on top of KcBERT Lee (2020), a BERT-based Devlin et al. (2019) model, and for sentence style transfer, KoGPT2, which is based on GPT2 Radford et al. (2019), was adopted. F1 (macro) and accuracy were used for the classification tasks, and for style transfer we checked the character edit distance (CED). The accuracy reported for style transfer denotes the precision obtained with a style classifier learned on the train set Pang (2019). Experimental settings are provided as supplementary material.
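CED is a character-level Levenshtein distance; a minimal sketch is below (whether and how the paper normalizes the distance, e.g., by reference length, is our assumption left open, so the raw distance is shown):

```python
def char_edit_distance(ref, hyp):
    """Levenshtein distance over characters, via the standard
    dynamic-programming recurrence with a rolling row."""
    m, n = len(ref), len(hyp)
    prev = list(range(n + 1))          # distance from empty prefix
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            cur[j] = min(prev[j] + 1,        # deletion
                         cur[j - 1] + 1,     # insertion
                         prev[j - 1] + cost) # substitution / match
        prev = cur
    return prev[n]

assert char_edit_distance("kitten", "sitting") == 3
```

Operating on characters sidesteps the morpheme-tokenization issue mentioned in Section 4.3, since no Korean morphological analyzer is needed.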

            TOPIC     ACT       PARA      STYLE
Input       Sentence  Sentence  Pair      Sentence
Class #     6         4         2         -
Volume      30,000    30,000    270,000   15,000
F1 Score    92.68     97.75     99.93     -
Accuracy    92.83     97.75     99.93     99.58
CED         -         -         -         0.451

Table 1: Experiment results on the four subtasks.

4.3 Results

In classification and inference, the evaluation results show consistency between the train and test sets (Table 1). Considering that the queries in each split are distinct from one another, we claim that our dataset extends to broader real-world problems, also given its comprehensive coverage of topics and acts of interest in everyday conversation and smart speaker dialogs. Though the baseline scores are quite high for ACT and PARA, this does not undermine one of our goals: providing a solid corpus construction scheme with practical, real-world applicability.

For STYLE, we adopted CED since formality as our ‘style’ concerns changes in suffixes and some lexical items rather than the whole word order and phrase usage (as for other objective measures, morpheme-level tokenization is not yet unified for Korean sentences, which makes evaluation harder). Nonetheless, we found the transfer task still challenging in terms of this objective measure. Instead, we observed its practical validity using a style classifier learned on the train and validation sets. We qualitatively checked that a seq2seq Sutskever et al. (2014) approach with a pre-trained generative model achieves the intended style transfer; details are provided in Appendix A.

4.4 Discussion

We have some notes on the validity of the created dataset. Primarily, although the dataset is the first open corpus proposed for Korean style transfer, the granularity of the style difference within each pair is not provided, unlike in Rao and Tetreault (2018). Also, since our dataset provides style transfer that maintains the overall sentence structure, some sentence pairs show only minor differences, which is sufficient for spoken language processing but less robust for digitized online texts. Finally, since the formality conversion concerns morpho-syntactic and lexical changes rather than the paraphrasing done when writing the formal sentences, the diversity of expression regarding style is limited to sentence formats that are not awkward to utter.

Despite the limitations, we want to emphasize that our approach suggests a reliable and efficient scheme for service providers or task managers aiming at a particular style transfer for various types of sentences. For instance, if one replaces the queries with structured query language (SQL) or canonical statements and uses ‘rudeness’ or ‘Twitter-likeness’ as a style, a parallel dataset can be created in the same way, though with a slightly different guideline. This kind of pair generation was done with rules or back translation in Rao and Tetreault (2018), but we believe human-aided construction is more reliable, and the resulting shortage of data can be covered with pre-trained models for spoken language.

5 Conclusion

In this paper, we construct and disclose the first style-variant Korean paraphrase corpus. Topic, speech act, and paraphrase are simultaneously considered in evaluating the final corpus, whose consistent composition is supported by the evaluation results. The current guideline is specific to formality transfer in Korean, but it can be used to build other parallel style transfer corpora with an extended pool of topics, speech acts, queries, and styles. All the resources are available online.

Ethical Considerations

The corpus construction procedure was based on the documented consent of the workers, and adequate compensation was paid to each of them in every process: query generation, writing formal sentences, and transferring them to informal ones. The participants, recruited from social media and the web, were familiar with smart speakers, and some had experience in corpus construction. Each of the 12 participants was paid 250 WON ($0.22) for writing each query and 200 WON ($0.18) for each sentence. In total, each participant was paid 600,000 WON ($540) for creating 250 queries and writing 2,500 sentences.

Our resource is free from license issues since all materials were created according to the guideline (a kind of template) and checked in post-processing. The outcome of our project contains no personally identifiable information, nor content that could induce social harm.


This research was supported by the Technology Innovation Program (10076583, Development of free-running speech recognition technologies for embedded robot systems) funded by the Ministry of Trade, Industry & Energy (MOTIE, Korea). Also, the corpus construction was possible thanks to the help of twelve passionate participants, namely Kyung Seo Ki, Dongho Lee, Yoon Kyung Lee, Hee Young Park, Yulhee Kim, Seyoung Park, Jiwon An, Jeonghwa Cho, Kihyo Park, Kyuhwan Lee, Soomin Lee, and Minhwa Chung.


  • Bao et al. (2019) Yu Bao, Hao Zhou, Shujian Huang, Lei Li, Lili Mou, Olga Vechtomova, Xinyu Dai, and Jiajun Chen. 2019. Generating sentences from disentangled syntactic and semantic spaces. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 6008–6019.
  • Bhagat and Hovy (2013) Rahul Bhagat and Eduard Hovy. 2013. What is a paraphrase? Computational Linguistics, 39(3):463–472.
  • Brooks (2020) Carellin Brooks. 2020. Building Blocks of Academic Writing. BCcampus.
  • Brown et al. (1987) Penelope Brown, Stephen C Levinson, and Stephen C Levinson. 1987. Politeness: Some universals in language usage, volume 4. Cambridge University Press.
  • Byon (2006) Andrew Sangpil Byon. 2006. The role of linguistic indirectness and honorifics in achieving linguistic politeness in Korean requests. Journal of Politeness Research, 2(2):247–276.
  • Cho et al. (2020) Won Ik Cho, Jong In Kim, Young Ki Moon, and Nam Soo Kim. 2020. Discourse component to sentence (DC2S): An efficient human-aided construction of paraphrase and sentence similarity dataset. In Proceedings of The 12th Language Resources and Evaluation Conference, pages 6819–6826.
  • Cho (2008) Yongkil Cho. 2008. Strategic use of Korean honorifics functions of ‘partner-deference sangdae-nopim’. Dialogue and Rhetoric, 2:155.
  • Choi and Na (2019) Hyeong-Jun Choi and Seung-Hoon Na. 2019. Delete and generate: Korean style transfer based on deleting and generating word n-grams. In Annual Conference on Human and Language Technology, pages 400–403. Human and Language Technology.
  • Devlin et al. (2019) Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186.
  • dos Santos et al. (2018) Cicero dos Santos, Igor Melnyk, and Inkit Padhi. 2018. Fighting offensive language on social media with unsupervised text style transfer. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 189–194.
  • Fu et al. (2018) Zhenxin Fu, Xiaoye Tan, Nanyun Peng, Dongyan Zhao, and Rui Yan. 2018. Style transfer in text: Exploration and evaluation. In AAAI, pages 663–670.
  • Fukada and Asato (2004) Atsushi Fukada and Noriko Asato. 2004. Universal politeness theory: application to the use of Japanese honorifics. Journal of Pragmatics, 36(11):1991–2002.
  • Gu (1990) Yueguo Gu. 1990. Politeness phenomena in modern Chinese. Journal of Pragmatics, 14(2):237–257.
  • Hong et al. (2018) Taesuk Hong, Guanghao Xu, Hwijeen Ahn, Sangwoo Kang, and Jungyun Seo. 2018. Korean text style transfer using attention-based sequence-to-sequence model. In Annual Conference on Human and Language Technology, pages 567–569. Human and Language Technology.
  • Jhamtani et al. (2017) Harsh Jhamtani, Varun Gangal, Eduard Hovy, and Eric Nyberg. 2017. Shakespearizing modern language using copy-enriched sequence to sequence models. In Proceedings of the Workshop on Stylistic Variation, pages 10–19.
  • Lee et al. (2019) Joosung Lee, Yeontaek Oh, hyunjin Byun, and Kyungkoo Min. 2019. Controlled Korean style transfer using BERT. In Annual Conference on Human and Language Technology, pages 395–399. Human and Language Technology.
  • Lee (2020) Junbum Lee. 2020. KcBERT: Korean comments BERT. In Annual Conference on Human and Language Technology. Human and Language Technology.
  • Lee et al. (2020) Jung Hyeon Lee, Hyung Joo Seon, and Hong Joo Lee. 2020. Positioning of smart speakers by applying text mining to consumer reviews: Focusing on artificial intelligence factors. Knowledge Management Research, 21(1):197–210.
  • Logeswaran et al. (2018) Lajanugen Logeswaran, Honglak Lee, and Samy Bengio. 2018. Content preserving text generation with attribute controls. In Advances in Neural Information Processing Systems, pages 5103–5113.
  • Okamoto (1999) Shigeko Okamoto. 1999. Situated politeness: Manipulating honorific and non-honorific expressions in Japanese conversations. Pragmatics, 9(1):51–74.
  • Pang (2019) Richard Yuanzhe Pang. 2019. The daunting task of real-world textual style transfer auto-evaluation. arXiv preprint arXiv:1910.03747.
  • Radford et al. (2019) Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, and Ilya Sutskever. 2019. Language models are unsupervised multitask learners. OpenAI blog, 1(8):9.
  • Rao and Tetreault (2018) Sudha Rao and Joel Tetreault. 2018. Dear sir or madam, may i introduce the GYAFC dataset: Corpus, benchmarks and metrics for formality style transfer. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 129–140.
  • Strauss and Eun (2005) Susan Strauss and Jong Oh Eun. 2005. Indexicality and honorific speech level choice in Korean. Linguistics, 43(3):611–651.
  • Sutskever et al. (2014) Ilya Sutskever, Oriol Vinyals, and Quoc V Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, pages 3104–3112.
  • Tian et al. (2018) Youzhi Tian, Zhiting Hu, and Zhou Yu. 2018. Structured content preservation for unsupervised text style transfer. arXiv preprint arXiv:1810.06526.
  • Yamshchikov et al. (2020) Ivan Yamshchikov, Viacheslav Shibaev, Nikolay Khlebnikov, and Alexey Tikhonov. 2020. Style-transfer and paraphrase: Looking for a sensible semantic similarity metric. arXiv preprint arXiv:2004.05001.
  • Yang et al. (2019) Zhichao Yang, Pengshan Cai, Yansong Feng, Fei Li, Weijiang Feng, Elena Suet-Ying Chiu, and Hong Yu. 2019. Generating classical Chinese poems from vernacular Chinese. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, volume 2019, page 6155. NIH Public Access.