Can you Trust the Trend? Discovering Simpson's Paradoxes in Social Data

01/13/2018
by Nazanin Alipourfard, et al.

We investigate how Simpson's paradox affects analysis of trends in social data. According to the paradox, the trends observed in data that has been aggregated over an entire population may be different from, and even opposite to, those of the underlying subgroups. Failure to take this effect into account can lead analysis to wrong conclusions. We present a statistical method to automatically identify Simpson's paradox in data by comparing statistical trends in the aggregate data to those in the disaggregated subgroups. We apply the approach to data from Stack Exchange, a popular question-answering platform, to analyze factors affecting answerer performance, specifically, the likelihood that an answer written by a user will be accepted by the asker as the best answer to his or her question. Our analysis confirms a known Simpson's paradox and identifies several new instances. These paradoxes provide novel insights into user behavior on Stack Exchange.
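
The comparison of aggregate and subgroup trends described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the trend estimator, so an ordinary least-squares slope is used here as a stand-in, and the function names (`slope`, `find_reversal`), the majority-of-subgroups rule, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def slope(xs, ys):
    """Ordinary least-squares slope of y on x (0.0 if x has no variance)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return 0.0
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov / var_x


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def find_reversal(rows):
    """Compare the aggregate trend with the trend inside each subgroup.

    rows: iterable of (group, x, y) tuples.
    Returns (aggregate_slope, slopes_by_group, reversed_flag), where
    reversed_flag is True when more than half of the subgroups trend in
    the direction opposite to the aggregate (an assumed decision rule).
    """
    rows = list(rows)
    aggregate = slope([x for _, x, _ in rows], [y for _, _, y in rows])

    by_group = defaultdict(list)
    for group, x, y in rows:
        by_group[group].append((x, y))

    group_slopes = {
        group: slope([x for x, _ in pts], [y for _, y in pts])
        for group, pts in by_group.items()
    }

    opposite = [
        group for group, s in group_slopes.items()
        if sign(aggregate) != 0 and sign(s) == -sign(aggregate)
    ]
    return aggregate, group_slopes, len(opposite) > len(group_slopes) / 2


# Made-up toy data, not taken from the paper: within each group y falls as
# x grows, but group B has both larger x and larger y than group A.
toy_rows = [
    ("A", 1, 0.3), ("A", 2, 0.2), ("A", 3, 0.1),
    ("B", 4, 0.9), ("B", 5, 0.8), ("B", 6, 0.7),
]

aggregate, per_group, is_reversed = find_reversal(toy_rows)
print("aggregate slope:", round(aggregate, 3))
print("subgroup slopes:", {g: round(s, 3) for g, s in per_group.items()})
print("trend reversal:", is_reversed)
```

In the toy data the aggregate slope is positive while both subgroup slopes are negative, which is the kind of reversal the method is meant to surface. A binary outcome such as answer acceptance would more naturally call for a different trend model than a linear slope; the sketch only shows the comparison step.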

research
10/24/2017

Computational Social Scientist Beware: Simpson's Paradox in Behavioral Data

Observational data about human behavior is often heterogeneous, i.e., ge...
research
12/15/2022

Best-Answer Prediction in Q&A Sites Using User Information

Community Question Answering (CQA) sites have spread and multiplied sign...
research
10/08/2020

From Asking to Answering: Getting More Involved on Stack Overflow

Online knowledge platforms such as Stack Overflow and Wikipedia rely on ...
research
02/06/2019

Modeling and Analysis of Tagging Networks in Stack Exchange Communities

Large Question-and-Answer (Q&A) platforms support diverse knowledge cura...
research
08/21/2022

Friendliness Of Stack Overflow Towards Newbies

In today's modern digital world, we have a number of online Question and...
research
05/08/2018

Using Simpson's Paradox to Discover Interesting Patterns in Behavioral Data

We describe a data-driven discovery method that leverages Simpson's para...
research
01/28/2016

SculptStat: Statistical Analysis of Digital Sculpting Workflows

Targeted user studies are often employed to measure how well artists can...