QuesNet: A Unified Representation for Heterogeneous Test Questions

05/27/2019
by   Yu Yin, et al.
0

Understanding learning materials (e.g. test questions) is a crucial issue in online learning systems, which can promote many applications in education domain. Unfortunately, many supervised approaches suffer from the problem of scarce human labeled data, whereas abundant unlabeled resources are highly underutilized. To alleviate this problem, an effective solution is to use pre-trained representations for question understanding. However, existing pre-training methods in NLP area are infeasible to learn test question representations due to several domain-specific characteristics in education. First, questions usually comprise of heterogeneous data including content text, images and side information. Second, there exists both basic linguistic information as well as domain logic and knowledge. To this end, in this paper, we propose a novel pre-training method, namely QuesNet, for comprehensively learning question representations. Specifically, we first design a unified framework to aggregate question information with its heterogeneous inputs into a comprehensive vector. Then we propose a two-level hierarchical pre-training algorithm to learn better understanding of test questions in an unsupervised way. Here, a novel holed language model objective is developed to extract low-level linguistic features, and a domain-oriented objective is proposed to learn high-level logic and knowledge. Moreover, we show that QuesNet has good capability of being fine-tuned in many question-based tasks. We conduct extensive experiments on large-scale real-world question data, where the experimental results clearly demonstrate the effectiveness of QuesNet for question understanding as well as its superior applicability.

READ FULL TEXT

page 3

page 5

research
01/18/2023

Towards a Holistic Understanding of Mathematical Questions with Contrastive Pre-training

Understanding mathematical questions effectively is a crucial task, whic...
research
03/09/2023

TQ-Net: Mixed Contrastive Representation Learning For Heterogeneous Test Questions

Recently, more and more people study online for the convenience of acces...
research
07/12/2021

MOOCRep: A Unified Pre-trained Embedding of MOOC Entities

Many machine learning models have been built to tackle information overl...
research
06/13/2022

JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding

This paper aims to advance the mathematical intelligence of machines by ...
research
11/16/2021

An Empirical Study of Finding Similar Exercises

Education artificial intelligence aims to profit tasks in the education ...
research
01/26/2019

Language Model Pre-training for Hierarchical Document Representations

Hierarchical neural architectures are often used to capture long-distanc...
research
01/21/2023

Adapting a Language Model While Preserving its General Knowledge

Domain-adaptive pre-training (or DA-training for short), also known as p...

Please sign up or login with your details

Forgot password? Click here to reset