SASICM: A Multi-Task Benchmark for Subtext Recognition
Subtext is a kind of deep semantics that can be recovered only after one or more rounds of expression transformation. As a popular way of expressing one's intentions, it is well worth studying. In this paper, we use machine learning to make computers recognize whether a text contains subtext. We build a Chinese dataset whose source data comes from popular social media platforms (e.g., Weibo, Netease Music, Zhihu, and Bilibili). We also build a baseline model, SASICM, for subtext recognition. The F1 score of SASICMg, whose pretrained model is GloVe, is as high as 64.37%, which is higher than that of a BERT-based model, 12.7 points higher on average than methods including support vector machine, logistic regression classifier, maximum entropy classifier, naive Bayes classifier, and decision tree, and 2.39 points higher than BTM. The F1 score of SASICMBERT, whose pretrained model is BERT, is 65.12%, which is 0.75 points higher than that of SASICMg. The [...] SASICMBERT are 71.16 [...] other methods mentioned above.