SsciBERT: A Pre-trained Language Model for Social Science Texts

06/09/2022
by Si Shen, et al.

Social science literature records human civilization and studies human social problems. As this literature grows at scale, researchers urgently need ways to quickly find existing research on relevant issues. Previous studies, such as SciBERT, have shown that pre-training on domain-specific texts can improve performance on natural language processing tasks in those fields. However, no pre-trained language model exists for the social sciences, so this paper proposes a model pre-trained on many abstracts published in Social Science Citation Index (SSCI) journals. The models, which are available on GitHub (https://github.com/S-T-Full-Text-Knowledge-Mining/SSCI-BERT), show excellent performance on discipline classification and abstract structure-function recognition tasks with social science literature.

research
03/14/2022

PERT: Pre-training BERT with Permuted Language Model

Pre-trained Language Models (PLMs) have been widely used in various natu...
research
07/08/2015

Generating Navigable Semantic Maps from Social Sciences Corpora

It is now commonplace to observe that we are facing a deluge of online i...
research
03/25/2023

Informed Machine Learning, Centrality, CNN, Relevant Document Detection, Repatriation of Indigenous Human Remains

Among the pressing issues facing Australian and other First Nations peop...
research
03/23/2023

SwissBERT: The Multilingual Language Model for Switzerland

We present SwissBERT, a masked language model created specifically for p...
research
04/02/2020

Mapping Three Decades of Intellectual Change in Academia

Research on the development of science has focused on the creation of mu...
research
01/27/2023

Context Matters: A Strategy to Pre-train Language Model for Science Education

This study aims at improving the performance of scoring student response...
research
08/25/2020

Conceptualized Representation Learning for Chinese Biomedical Text Mining

Biomedical text mining is becoming increasingly important as the number ...
