Tracing a Loose Wordhood for Chinese Input Method Engine

12/12/2017
by Xihu Zhang, et al.

Chinese input methods convert pinyin sequences, or input in other Latin-based encoding systems, into Chinese character sentences. For more effective pinyin-to-character conversion, typical Input Method Engines (IMEs) rely on a predefined vocabulary that demands scheduled manual maintenance. To remove this inconvenient vocabulary maintenance, this work focuses on automatic wordhood acquisition, taking full account of the fact that Chinese input is a free human-computer interaction procedure. Instead of strictly defining words, a loose word likelihood is introduced to measure how likely a character sequence is to be a user-recognized word in the context of using an IME. An online algorithm is then proposed that adjusts the word likelihood or generates new words by comparing the user's true input choice with the algorithm's prediction. The experimental results show that the proposed solution adapts quickly to diverse typing and demonstrates performance approaching that of a highly optimized IME with a fixed vocabulary.
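The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea of adjusting a loose word likelihood online. It is not the authors' algorithm. The class name `LooseWordhood`, the additive update, and the `step` and `initial` parameters are all assumptions made for illustration.

```python
class LooseWordhood:
    """Toy table of word likelihoods updated online from user choices.

    Hypothetical illustration only: the additive update rule and the
    parameter values are assumptions, not taken from the paper.
    """

    def __init__(self, step=0.1, initial=0.5):
        self.step = step        # assumed size of each adjustment
        self.initial = initial  # assumed score given to a newly generated word
        self.score = {}         # character sequence -> likelihood in [0, 1]

    def update(self, predicted, chosen):
        """Compare the engine's predicted segments with the user's chosen ones."""
        predicted, chosen = set(predicted), set(chosen)
        for word in chosen:
            if word not in self.score:
                # The user committed an unseen sequence: generate a new word.
                self.score[word] = self.initial
            else:
                # The user confirmed a known sequence: raise its likelihood.
                self.score[word] = min(1.0, self.score[word] + self.step)
        for word in predicted - chosen:
            if word in self.score:
                # The prediction was rejected: lower its likelihood.
                self.score[word] = max(0.0, self.score[word] - self.step)


model = LooseWordhood()
model.update(predicted=["北", "京"], chosen=["北京"])  # new word is generated
model.update(predicted=["北京"], chosen=["北京"])      # prediction confirmed
model.update(predicted=["北京"], chosen=["背景"])      # prediction rejected
```

In this sketch a sequence enters the table only when a user actually commits it, which mirrors the abstract's point that no predefined vocabulary is required. A real engine would also need a decoder that uses these scores to rank candidates.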

research
11/11/2018

Neural-based Pinyin-to-Character Conversion with Adaptive Vocabulary

Pinyin-to-character (P2C) conversion is the core component of pinyin-bas...
research
09/02/2018

Chinese Pinyin Aided IME, Input What You Have Not Keystroked Yet

Chinese pinyin input method engine (IME) converts pinyin into character ...
research
11/25/2019

Korean-to-Chinese Machine Translation using Chinese Character as Pivot Clue

Korean-Chinese is a low resource language pair, but Korean and Chinese h...
research
02/27/2018

A Hybrid Word-Character Model for Abstractive Summarization

Abstractive summarization is the popular research topic nowadays. Due to...
research
03/12/2022

MarkBERT: Marking Word Boundaries Improves Chinese BERT

We present a Chinese BERT model dubbed MarkBERT that uses word informati...
research
05/07/2020

2kenize: Tying Subword Sequences for Chinese Script Conversion

Simplified Chinese to Traditional Chinese character conversion is a comm...
research
09/03/2019

A Smart Sliding Chinese Pinyin Input Method Editor on Touchscreen

This paper presents a smart sliding Chinese pinyin Input Method Editor (...
