Textual Analysis for Studying Chinese Historical Documents and Literary Novels

by Chao-Lin Liu, et al.

We analyzed Chinese historical and literary documents to gain insights into research issues. This paper gives an overview of those studies, which drew on four different sources of text material. We investigated the history of concepts and transliterated words in China with the Database for the Study of Modern China Thought and Literature, which contains historical documents about China dated between 1830 and 1930. We also attempted to disambiguate names that were shared by multiple government officers who served between 618 and 1912 and were recorded in Chinese local gazetteers. To showcase the potential and challenges of computer-assisted analysis of Chinese literature, we explored some interesting yet non-trivial questions about two of the Four Great Classical Novels of China: (1) which monsters attempted to consume the Buddhist monk Xuanzang in the Journey to the West (JTTW), which was published in the 16th century; (2) which was the most powerful monster in JTTW; and (3) which major character smiled the most in the Dream of the Red Chamber, which was published in the 18th century. Similar approaches can be applied to the analysis and study of modern documents, such as the newspaper articles published about the 228 Incident that occurred in Taiwan in 1947.




