The ACL OCL Corpus: advancing Open science in Computational Linguistics

05/24/2023
by   Shaurya Rohatgi, et al.

We present ACL OCL, a scholarly corpus built from the ACL Anthology to assist Open scientific research in the Computational Linguistics domain. Compared with the previous ARC and AAN versions, ACL OCL includes structured full texts with logical sections, references to figures, and links to a large knowledge resource (Semantic Scholar). ACL OCL contains 74k scientific papers, together with 210k figures extracted up to September 2022. To observe developments in the computational linguistics domain, we detect the topics of all OCL papers with a supervised neural model. We observe that the "Syntax: Tagging, Chunking and Parsing" topic is shrinking significantly and that "Natural Language Generation" is resurging. Our dataset is open and available to download from HuggingFace at https://huggingface.co/datasets/ACL-OCL/ACL-OCL-Corpus.
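The topic-trend observation above can be illustrated with a minimal sketch. It assumes each paper has already been reduced to a (year, topic) pair; the function name `topic_share_by_year` and the toy records are hypothetical and are not taken from ACL OCL or from the authors' pipeline. The sketch computes each topic's share of the papers published in a given year, which is one plausible way to tell whether a topic is shrinking or resurging.

```python
from collections import Counter, defaultdict


def topic_share_by_year(papers):
    """Return {year: {topic: fraction of that year's papers}}.

    `papers` is an iterable of (year, topic) pairs. This is a hypothetical
    input format chosen for illustration, not the corpus schema.
    """
    counts = defaultdict(Counter)
    for year, topic in papers:
        counts[year][topic] += 1

    shares = {}
    for year, topic_counts in counts.items():
        total = sum(topic_counts.values())
        shares[year] = {t: n / total for t, n in topic_counts.items()}
    return shares


# Toy records invented for illustration only (not real ACL OCL data).
toy_papers = [
    (2000, "Syntax: Tagging, Chunking and Parsing"),
    (2000, "Syntax: Tagging, Chunking and Parsing"),
    (2000, "Natural Language Generation"),
    (2000, "Other"),
    (2020, "Syntax: Tagging, Chunking and Parsing"),
    (2020, "Natural Language Generation"),
    (2020, "Natural Language Generation"),
    (2020, "Other"),
]

shares = topic_share_by_year(toy_papers)
for year in sorted(shares):
    print(year, shares[year])
```

Shares are used here rather than raw counts because the number of papers per year changes over time, so a raw count can rise even while a topic loses ground relative to the rest of the field.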

research
08/03/2020

Elsevier OA CC-By Corpus

We introduce the Elsevier OA CC-BY corpus. This is the first open corpus...
research
04/18/2017

Analysis of Computational Science Papers from ICCS 2001-2016 using Topic Modeling and Graph Theory

This paper presents results of topic modeling and network models of topi...
research
01/25/2021

TDMSci: A Specialized Corpus for Scientific Literature Entity Tagging of Tasks Datasets and Metrics

Tasks, Datasets and Evaluation Metrics are important concepts for unders...
research
10/05/2021

Reddit-TUDFE: practical tool to explore Reddit usability in data science and knowledge processing

This contribution argues that Reddit, as a massive, categorized, open-ac...
research
10/16/2019

NLPExplorer: Exploring the Universe of NLP Papers

Understanding the current research trends, problems, and their innovativ...
research
11/16/2022

Galactica: A Large Language Model for Science

Information overload is a major obstacle to scientific progress. The exp...
research
06/27/2018

hep-th

We apply techniques in natural language processing, computational lingui...
