Science is facing a reproducibility crisis. Previous work has proposed i...
Social influence is a strong determinant of food consumption, which in t...
Prostate cancer pathology plays a crucial role in clinical management bu...
Generative language models (LMs) have become omnipresent across data sci...
Recent advances in artificial intelligence (AI) have produced highly cap...
Research using YouTube data often explores social and semantic dimension...
Large language models (LLMs) are remarkable data annotators. They can be...
With 60M articles in more than 300 language versions, Wikipedia is the l...
Large Language Models (LLMs) have democratized synthetic data generation...
LLMs have shown impressive few-shot performance across many tasks. Howev...
Wikipedia is a well-known platform for disseminating knowledge, and scie...
Wikipedia, in its role as the world's largest encyclopedia, serves a bro...
Language models (LMs) have recently shown remarkable performance on reas...
Large language models (LLMs) show great potential for synthetic data gen...
Automated audits of recommender systems found that blindly following rec...
According to journalistic standards, direct quotes should be attributed ...
Online social media platforms use automated moderation systems to remove...
A critical component of a successful language generation pipeline is the...
For the quantitative monitoring of international relations, political ev...
In surveys, it is typically up to the individuals to decide if they want...
Differential privacy (DP) is a widely applied paradigm for releasing dat...
A large body of work shows that machine learning (ML) models can leak se...
Automatic evaluation metrics capable of replacing human judgments are cr...
Given that measuring food consumption at a population scale is a challen...
The design of online platforms is both critically important and challeng...
We describe a novel approach to explainable prediction of a continuous v...
There is a widespread belief that the tone of US political language has ...
The use of attributed quotes is the most direct and least filtered pathw...
Named entity linking (NEL) in news is a challenging endeavour due to the...
Candidate generation is a crucial module in entity linking. It also play...
We introduce and tackle the problem of automatically generating short de...
Differential privacy (DP) is a widely used notion for reasoning about pr...
One of the key emerging roles of the YouTube platform is providing creat...
"Wiki rabbit holes" are informally defined as navigation paths followed ...
Emojis come with prepacked semantics making them great candidates to cre...
Currently, publicly available models for website classification do not o...
Every day millions of people read Wikipedia. When navigating the vast sp...
Despite the importance and pervasiveness of Wikipedia as one of the larg...
Structured and grounded representation of text is typically formalized b...
Evaluation in NLP is usually done by comparing the scores of competing s...
Modern pretrained language models are critical components of NLP pipelin...
Understanding the origins of militarized conflict is a complex, yet impo...
Recent research suggests that not all fact checking efforts are equal: w...
The automatic detection of humor poses a grand challenge for natural lan...
Entity linking is an important problem with many applications. Most prev...
Time series with missing data are signals encountered in important setti...
The learning of a new language remains to this date a cognitive task tha...
Political polarization appears to be on the rise, as measured by voting ...
Researchers have suggested that "the Manosphere," a conglomerate of men-...
Wikipedia, the largest encyclopedia ever created, is a global initiative...