The General Index of Software Engineering Papers

04/07/2022
by   Zeinab Abou Khalil, et al.

We introduce the General Index of Software Engineering Papers, a dataset of fulltext-indexed papers from the most prominent scientific venues in the field of Software Engineering. The dataset includes both complete bibliographic information and indexed n-grams (sequences of contiguous words after removal of stopwords and non-words, for a total of 577 276 382 unique n-grams in this release) of length 1 to 5 for 44 581 papers retrieved from 34 venues over the 1971-2020 period. The dataset serves use cases in the field of meta-research, making it possible to introspect the output of software engineering research even when access to papers or scholarly search engines is not possible (e.g., due to contractual reasons). The dataset also contributes to making such analyses reproducible and independently verifiable, in contrast to analyses conducted using third-party, non-open scholarly indexing services. The dataset is available as a portable Postgres database dump and is released as open data.
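
To make the n-gram definition above concrete, here is a minimal sketch of extracting n-grams of length 1 to 5 from a piece of text after dropping stopwords and non-words. It is an illustration only, not the pipeline used to build the dataset: the stopword list, the letters-only tokenization rule, and the function names `tokenize` and `ngrams` are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch;
# the list actually used for the dataset is not given in the abstract).
STOPWORDS = {"a", "an", "the", "of", "in", "and", "to", "is", "for"}


def tokenize(text):
    """Lowercase the text, keep runs of ASCII letters only (a crude stand-in
    for non-word removal), and drop stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def ngrams(tokens, max_n=5):
    """Count every contiguous sequence of 1..max_n tokens."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


title = "The evolution of empirical methods in software engineering"
counts = ngrams(tokenize(title))
```

Because stopwords are removed before the n-grams are formed, words that were separated by a stopword in the original text (such as "evolution" and "empirical" here) become contiguous in the indexed n-grams.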


research
08/22/2023

Towards an Understanding of Large Language Models in Software Engineering Tasks

Large Language Models (LLMs) have drawn widespread attention and researc...
research
12/24/2019

The Evolution of Empirical Methods in Software Engineering

Empirical methods like experimentation have become a powerful means to d...
research
02/15/2022

Social Science Theories in Software Engineering Research

As software engineering research becomes more concerned with the psychol...
research
04/13/2019

Open Science in Software Engineering

Open science describes the movement of making any research artefact avai...
research
03/18/2023

Stop Words for Processing Software Engineering Documents: Do they Matter?

Stop words, which are considered non-predictive, are often eliminated in...
research
09/09/2022

Pitfalls and Guidelines for Using Time-Based Git Data

Many software engineering research papers rely on time-based data (e.g.,...
research
07/31/2022

Editorial: Special Issue on Collaborative Aspects of Open Data in Software Engineering

High-quality data has become increasingly important to software engineer...
