The Ivory Tower Lost: How College Students Respond Differently than the General Public to the COVID-19 Pandemic

04/21/2020
by Viet Duong, et al.
University of Rochester

Recently, the pandemic of the novel Coronavirus Disease-2019 (COVID-19) has presented governments with extraordinary challenges. In the United States, the country with the highest number of confirmed COVID-19 cases, a nationwide social distancing protocol has been implemented by the President. For the first time in the hundred years since the 1918 flu pandemic, the US population is mandated to stay in their households and avoid public contact. As a result, the majority of public venues and services have ceased operations. Following the closure of the University of Washington on March 7th, more than a thousand colleges and universities in the United States have cancelled in-person classes and campus activities, impacting millions of students. This paper aims to discover the social implications of this unprecedented disruption in our interactive society for both the general public and higher education populations by mining people's opinions on social media. We discover several topics embedded in a large number of COVID-19 tweets that represent the most central issues related to the pandemic, which are of great concern to both college students and the general public. Moreover, we find significant differences between these two groups of Twitter users with respect to the sentiments they expressed towards the COVID-19 issues. To the best of our knowledge, this is the first social media-based study that focuses on the college student community's demographics and responses to prevalent social issues during a major crisis.


I Introduction

First detected in Wuhan, China on December 31st, 2019, the COVID-19 (coronavirus) outbreak grew rapidly in scale and severity, and was officially declared a pandemic on March 11th, 2020 (https://edition.cnn.com/2020/02/06/health/wuhan-coronavirus-timeline-fast-facts/index.html). As of April 13th, the World Health Organization (WHO) reported 1,812,734 confirmed cases of COVID-19 worldwide, including 113,675 deaths (https://who.sprinklr.com/). Due to the novelty and intractability of the virus, the global community, particularly the elderly and those with underlying medical problems (https://www.who.int/health-topics/coronavirus), is at high risk of serious health and safety hazards. However, we suspect that the younger and physically healthier population is just as susceptible to COVID-19, though in a different way. In order to control the spread of the outbreak, non-pharmaceutical interventions and preventive measures such as social distancing and self-isolation have been implemented worldwide out of utmost necessity, which has led to the large-scale shutdown of public gathering places. As members of an active working and learning society, people who spend most of their daily hours at workplaces and educational institutions are highly vulnerable to the impacts of the closure of these facilities.

This is especially true for college students. The response to the COVID-19 pandemic has brought a sudden disruption in the operations of schools, colleges and universities, affecting more than 1.7 billion students in 192 countries (https://en.wikipedia.org/wiki/Impact_of_the_2019-20_coronavirus_pandemic_on_education). Located at the epicenter of the pandemic, with 579,005 confirmed cases, including 22,252 deaths (https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html), the U.S. educational system, which is one of the largest in the world, has taken the biggest hit. Beginning with the University of Washington, which closed its campus on March 7th, 2020 and moved classes online for its 50,000 students, many colleges have responded immediately to the outbreak by cancelling all on-campus activities such as workshops, conferences, and sports, and by moving their in-person classes to online platforms. As of April 14th, 2020, more than 124,000 U.S. public and private schools have closed due to the virus, affecting at least 55.1 million students (https://www.edweek.org/ew/section/multimedia/map-coronavirus-and-school-closures.html). This transition has introduced multiple challenges for students. The foremost concern is how the government and the education system handle the pandemic crisis with the study-from-home approach. Past surveys suggested that students experience severe limitations in subjects that benefit from physical interaction with the materials, and tend to lose the "pacing mechanism" of scheduled lectures, and thus have a higher chance of dropping out than those in traditional settings [9, 22]. More recently, as the COVID-19 pandemic is unfolding, Sahu [29] hinted at other issues related to the closure of schools, apart from online learning, such as international students' travel and students' mental health. This motivates us to provide a more comprehensive study of the student demographic regarding their primary subjects of concern and how they are expressed, particularly amidst the COVID-19 crisis.

In this study, we explore the responses of Twitter users to the COVID-19 pandemic, with a focus on college students. We highlight our findings on the college student demographic by characterizing how their behaviors differ from those of the general public. Such insights can be vital for educators and policy-makers to measure the effectiveness of their ongoing efforts in the global fight against COVID-19 and the protection of our younger population. In addition, we train classification models to identify the demographics of users who posted tweets associated with COVID-19, and to extract the sentiments expressed in their posts. The models can be used on social media platforms to investigate central social problems, with respect to both their universality and degree of impact, and to draw the community's attention towards high-priority targets for addressing such problems.

Our main contributions are threefold: (1) we approach the social issues related to COVID-19 across different demographics using social media, by collecting data from Twitter; (2) we deploy topic modeling methods on a novel dataset to highlight the topical patterns of the ongoing social media discussions during a major crisis, for two different user demographics on Twitter; and (3) our implementation of state-of-the-art transformer models for natural language inference achieves new state-of-the-art performance on the Twitter sentiment classification task, which allows meaningful insights into social media behaviors to be discovered reliably.

II Related Work

Our study draws on the body of research on characterizing the demographics of social media users, along dimensions such as gender [27, 3], age [24, 31], and social class [31]. Methods for inferring Twitter user demographics have typically relied on mining fine-grained linguistic patterns from the user's Twitter biography (short self-descriptive text) and posts, which have proven highly precise for certain attributes when properly constructed [3, 2]. These approaches have also been deployed with relatively strong performance for the college student demographic [12, 13]. Thanks to recent advances in neural networks for sequence and image classification, Wang et al. [34] were able to leverage a multimodal, multi-attribute, and multilingual approach to achieve state-of-the-art accuracy on gender, age, and organization entity classification. Building upon the discoveries of previous works, we design and evaluate our own college student user classification method. This enables us to identify the two pools of users (college students and general public) among college followers on Twitter for the subsequent comparative analysis.

Research on sentiment analysis for Twitter textual data, which tackles the problem of analyzing the messages posted on Twitter in terms of the sentiments they express, has also been performed. Twitter is a very challenging domain for sentiment analysis, mainly due to the length limitation of texts [10]. The majority of past approaches employed traditional machine learning methods such as logistic regression, SVM, MLP, etc., trained on lexicon features and sentiment-specific word embeddings (vector representations of words) [17, 33]. More recent approaches typically opted for sequence learning models trained to learn relevant embeddings for classification from large pretrained word embeddings, particularly the GloVe [26] embeddings for Twitter data. The best-performing models of this kind include Cliche (2017) [6] and Baziotis et al. (2017) [1], which shared first place for sentiment analysis on Twitter (Task 4A [28]) at the International Workshop on Semantic Evaluation 2017 (SemEval-2017). The novelty of our approach to Twitter sentiment analysis lies in the implementation of state-of-the-art transformer methods such as BERT [7] and RoBERTa [18], whose outstanding sentiment classification prowess remains untested for Twitter data in the literature.

III Data Collection and Preprocessing

III-A Data Collection

In this study, we limit the user population to those who follow the official Twitter accounts of colleges in the U.S. News 2020 Ranking of Top 200 National Universities. Relevant users identified as English speakers were collected using the Tweepy API (https://git.io/JvAjh). Since the lists of followers of the colleges under consideration might overlap, simply collecting tweets from these users could create major data redundancy and time complexity problems. Therefore, we extract unique users from the combined results, as well as their personal information and profile images, to obtain a dataset of 12,407,254 unique users. This set of users is relatively large, and 1,641,582 of these users have protected Twitter accounts, which means we are not allowed to collect tweets (Twitter posts) from them. Thus, we randomly sample 100,000 users from the unprotected pool to represent the population of college followers for the subsequent tweet collection and text analysis.

Tweets were collected using the Tweepy API. We retrieved a total of 1,873,022 tweets from the 100,000 sampled users, posted between January 20th, when the first COVID-19 case was confirmed in the U.S., and March 20th, 2020, covering a two-month period in which the nationwide social distancing protocol and school closures were attracting mass concern. We then extracted tweets related to COVID-19 using a list of keywords consisting of "corona", "#Corona", "#coronavirus", "covid-19", "covid19", "coronavirus", "#Covid_19", "chinese virus", and "#ChineseVirus". As a result, we obtain 73,787 unique COVID-19 related tweets, pertaining to 12,776 users, whom we refer to in this study as affected users. In addition, the tweets of the 12,776 affected users that are not related to COVID-19 are kept for the student inference task.
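The keyword filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not say whether matching was case-sensitive or how duplicates were detected, so case-insensitive substring matching and de-duplication on (user, text) pairs are assumed here, and the function names are hypothetical.

```python
import re

# Keyword list quoted in the text. Case-insensitive substring matching is an
# assumption; the paper does not specify the matching rule.
KEYWORDS = [
    "corona", "#Corona", "#coronavirus", "covid-19", "covid19",
    "coronavirus", "#Covid_19", "chinese virus", "#ChineseVirus",
]
PATTERN = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def is_covid_related(text):
    """Return True if the tweet text contains any of the keywords."""
    return PATTERN.search(text) is not None


def filter_covid_tweets(tweets):
    """tweets: iterable of (user_id, text) pairs.

    Returns the unique COVID-19 related tweets and the set of users who
    posted them (the "affected users" of the text).
    """
    seen = set()
    kept = []
    for user_id, text in tweets:
        if is_covid_related(text) and (user_id, text) not in seen:
            seen.add((user_id, text))
            kept.append((user_id, text))
    affected_users = {user_id for user_id, _ in kept}
    return kept, affected_users
```

Tweets that fail the filter are not discarded in the study: those belonging to affected users are retained for the student inference task.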

III-B Text Preprocessing

We develop a text preprocessing pipeline similar to that of Baziotis et al. [1] to ensure that our text dataset is, to a high degree, lexically comparable to natural language, and to include COVID-19 domain-specific word knowledge from a novel dataset. This is done by performing sentiment-aware tokenization, spell correction, word normalization, segmentation (for splitting hashtags) and token annotation. Baziotis et al. implemented a tokenizer with the SentiWordNet corpus [8], which avoids splitting expressions or words that should be kept intact (as one token) and identifies most emoticons, emojis, and expressions such as dates, currencies, acronyms, censored words (e.g. s**t), etc. In addition, we perform spelling correction on the extracted tokens by composing a dictionary of the most commonly seen abbreviations, censored words and elongated words (used for emphasis, e.g. "reallyyy"). The Viterbi algorithm is used for word segmentation, with word statistics (unigrams and bigrams) computed from a recently published Twitter dataset of 50 million English tweets related to COVID-19 [4], to obtain the most probable segmentation. Moreover, all texts are lower-cased, while URLs, emails and mentioned usernames are annotated with common designated tags and removed to retain the natural language elements of the text data. The processed tweets are then annotated by the Stanford CoreNLP English annotator [21], which uses syntactic constituency and dependency tree parsing to extract the appropriate part-of-speech (POS) tags and lemmas (the base/dictionary forms of words) from the tweet tokens.
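To make the hashtag segmentation step concrete, the following is a minimal Viterbi-style sketch. It is a simplification: it uses unigram counts only, whereas the pipeline described above also uses bigram statistics. The function name, the toy counts, and the length-based penalty for unseen words are assumptions made for illustration.

```python
import math


def segment(text, unigram_counts, max_word_len=20):
    """Split `text` into the most probable word sequence under a unigram model.

    Dynamic programming over prefixes: best[i] holds the best log-probability
    of any segmentation of text[:i], back[i] the start of its last word.
    """
    total = sum(unigram_counts.values())

    def log_prob(word):
        count = unigram_counts.get(word)
        if count:
            return math.log(count / total)
        # Unseen words get a penalty that grows with their length
        # (an assumed smoothing scheme, not taken from the paper).
        return math.log(1.0 / (total * 10 ** len(word)))

    n = len(text)
    best = [0.0] + [-math.inf] * n
    back = [0] * (n + 1)
    for i in range(1, n + 1):
        for j in range(max(0, i - max_word_len), i):
            score = best[j] + log_prob(text[j:i])
            if score > best[i]:
                best[i] = score
                back[i] = j

    # Follow the back-pointers from the end of the string.
    words = []
    i = n
    while i > 0:
        words.append(text[back[i]:i])
        i = back[i]
    return words[::-1]
```

With counts estimated from a large in-domain corpus, a hashtag such as `stayathome` (after stripping the `#`) would be split into its component words, which is what allows hashtags to contribute to the later topic and sentiment analysis.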

IV Inference of College Student Demographics

IV-A Extracting Age, Gender and Organization Attributes from Twitter User Profiles

We consider age, gender and organization entity to be highly descriptive attributes, and use them first to obtain a general view of our user samples. According to the National Center for Education Statistics (NCES) (https://nces.ed.gov/), as of Fall 2017, 56.2% of enrolled students were between 19 and 29 years old, 20.1% were under 18, and 56.6% were female. These student demographic statistics are projected by NCES to remain consistent through 2020. Organizational Twitter accounts should not be targeted for student inference because college students are individuals. These attributes are extracted using the M3 (Multilingual, Multimodal, Multi-attribute) deep learning system, which infers the demographics of users from four sources of information in Twitter profiles: the user's name (first and last name in natural language), screen name (Twitter username), biography (short self-descriptive text), and profile image [34]. We identify 1,111 organization entities among the 12,776 affected users, and exclude them from the comparative analysis since they are not individuals. The gender and age attributes are used to verify our classification results.

Fig. 1: The M3 Model for Inferring Gender, Age, and Organization-identity from Image and Text Data [34].

Although the M3 Model is highly robust for gender (0.918 Macro-F1) and organization entity (0.898 Macro-F1) recognition without using tweets, the distinguishing features for these attributes are widely available in the name, profile image, and description of Twitter users, which is not necessarily the case for the college student demographic. Since attributes from the user's name and profile image are not necessarily indicative of college students, we attempt to retrieve the students by matching the word "student" against the Twitter biographies, and find only 248 matches (2.22% of the non-organizational users). We also match keywords directly related to degree status (such as BS, MS, MBA, PhD, etc.) against the Twitter biographies, and find 335 matches (3%). While college students are likely to mention their degree status rather than a current occupation, and many users mention the names of colleges in their biography, these should not be deciding factors because such users might be college alumni or professors rather than actual students. Because the users are already college followers, we expect much higher percentages of college students. Thus, the use of tweets for college student identification is critical to our study.

IV-B Heuristically Identifying College Students Using Tweets

IV-B1 Gold-Standard Annotations

We sample 2,400 random users from the 11,165 non-organizational affected users and include their names, profile images, biographies, and tweets from 1/20 to 3/20/2020. This information is used by human annotators (who are IRB certified for Social-Behavioral-Educational Research) to answer the prompt "Would you think this person is a COLLEGE STUDENT?" with two response options: "Yes" or "No".

IV-B2 Supervised Classification

We encode the user's tweets in the standard Bag of N-grams representation (for 1-grams up to 4-grams), which has been highly effective in text categorization and information retrieval [30], and use it as the feature set for our classifiers. To increase the generality of our Bag of N-grams features, we preprocess the tweets as described above and apply TF-IDF vectorization, a term re-weighting scheme that discounts the influence of common terms. We train a Random Forest classifier and report the accuracy, i.e. the percentage of correctly labeled users, on a held-out 20% of the labeled samples.
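The feature construction can be illustrated with a small, dependency-free sketch. It computes a Bag of N-grams (1 to 4) per user and re-weights it with the textbook, unsmoothed TF-IDF formula; the exact TF-IDF variant used in the study is not stated, so this choice, along with the function names, is an assumption. The resulting sparse vectors are what a classifier such as a Random Forest would be trained on.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=4):
    """All contiguous n-grams of the token list for n_min <= n <= n_max."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs):
    """docs: list of token lists, one per user.

    Returns one {ngram: weight} dict per document, where
    weight = term frequency * log(N / document frequency).
    """
    bags = [Counter(ngrams(doc)) for doc in docs]
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())
    n_docs = len(docs)
    return [
        {gram: tf * math.log(n_docs / doc_freq[gram]) for gram, tf in bag.items()}
        for bag in bags
    ]
```

Under this weighting, an n-gram that occurs in every user's tweets receives a weight of zero, which is the sense in which TF-IDF discounts common terms.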

IV-B3 Using Heuristic to Override the Classifier

Regarding the self-distinguishing attributes of Twitter users in tweets, Bergsma and Van Durme [3] discovered that users most frequently reveal their attributes in the possessive construction, that is, "my X", where X is an attribute, quality or event that they possess (in a linguistic sense). Indeed, we found 306 tweets with the phrase "my class" among the 1,156,947 tweets from non-organizational users. In contrast, phrases like "I have/had (a) class(es)" occur only 16 times. Therefore, we extract this "my X" attribute type for the college student demographic as follows: we first part-of-speech tag our data using the Stanford CoreNLP tagger and then look for "my X" patterns where X is a sequence of tokens terminating in a noun. To calculate the association between the attributes and the college student demographic, we compute the pointwise mutual information (PMI) [5] between each attribute A and student over the set of occurrences, as given in Equation (1). If PMI(A, student) > 0, the observed probability of a student and attribute co-occurring is greater than the probability of co-occurrence that we would expect if student and attribute A were independently distributed.

$$\mathrm{PMI}(A, \mathit{student}) = \log \frac{P(A, \mathit{student})}{P(A)\, P(\mathit{student})} \qquad (1)$$
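As an illustration, the pointwise mutual information between an attribute and the student label can be estimated from co-occurrence counts. In the sketch below, an "occurrence" is assumed to be an (attribute, is_student) pair drawn from annotated users, and the logarithm is taken in base 2; neither detail is specified in the text, and the function name is hypothetical.

```python
import math
from collections import Counter


def pmi_scores(occurrences):
    """occurrences: list of (attribute, is_student) pairs.

    Returns {attribute: PMI(attribute, student)} with probabilities
    estimated as relative frequencies over the occurrences.
    """
    n = len(occurrences)
    if n == 0:
        return {}
    attr_counts = Counter(attr for attr, _ in occurrences)
    joint_counts = Counter(attr for attr, is_student in occurrences if is_student)
    p_student = sum(1 for _, is_student in occurrences if is_student) / n

    scores = {}
    for attr, count in attr_counts.items():
        joint = joint_counts[attr]
        if joint == 0:
            # The attribute never co-occurs with a student.
            scores[attr] = -math.inf
        else:
            scores[attr] = math.log2((joint / n) / ((count / n) * p_student))
    return scores
```

A positive score means the attribute appears with students more often than independence would predict, so ranking by score surfaces candidate student-specific phrases.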

We employ two techniques for selecting distinctive attributes for college students: (1) we rank the attributes by their PMI scores and use a threshold to select the top-ranked attributes; (2) we manually filter the remaining set of attributes to select those that are judged to be discriminative, including phrases closely associated with college students such as "my zoom class", "my professor", "my dorm", etc. We then combine the identified self-distinguishing attributes with a classifier trained on gold-standard annotations through a simple heuristic: if the user has any self-distinguishing "my-X" attribute, we label the user as a college student; otherwise, we trust the output of the classifier. We apply this rule to bootstrap the knowledge learned by the classifier in conjunction with our domain-specific attributes to automatically label the unannotated users. In the end, we verify the performance of our heuristic on the same test set as the classifier.
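The override rule itself is short enough to sketch directly. The version below uses plain lower-cased substring matching in place of the POS-based "my X" pattern extraction, and `classifier_predict` is a hypothetical callable standing in for the trained classifier; both are assumptions for illustration.

```python
# Example attributes quoted in the text; the full curated list is not given.
STUDENT_ATTRIBUTES = ["my zoom class", "my professor", "my dorm"]


def label_user(tweets, classifier_predict, attributes=STUDENT_ATTRIBUTES):
    """Return True if the user is labeled as a college student.

    tweets: list of tweet strings for one user.
    classifier_predict: callable taking the tweets and returning a truthy
        value for "student" (stand-in for the trained classifier).
    """
    text = " ".join(tweets).lower()
    # Heuristic override: any self-distinguishing attribute wins.
    if any(attr in text for attr in attributes):
        return True
    # Otherwise defer to the classifier's decision.
    return bool(classifier_predict(tweets))
```

Because the heuristic can only turn a "No" into a "Yes", it raises recall on students at the cost of any false positives among users who use these phrases without being students.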

IV-B4 Summary of Results

The Random Forest classifier, trained on 1,920 examples, performs quite well with Bag of N-grams features, correctly labeling 78% of the users in the test set. We experiment with our "my-X" attributes, set the PMI threshold to 0.5, and then manually filter out the irrelevant attributes. Applying our heuristic to override the classifier improves the accuracy further to 83%. Therefore, we have firm grounds to use the combined classifier and "my-X" heuristic to label college students among the remaining users, which accounts for an additional 2,575 out of the total of 3,460 college student users (31% of the non-organizational users).

Fig. 2: Gender and Age Distributions of 3,460 College Student Users.

Looking at the age and gender distributions of the college students in our samples (Figure 2), the statistics are very consistent with real-world data in the U.S.: 53.8% of the college students we identified are female, which is very close to the 56.7% female percentage predicted by NCES for 2020. Although age classification is a challenging task for the M3 model (0.522 Macro-F1) and even for humans [34], our results are still within a reasonable margin of NCES's 2020 projection, with 54.1% of the students in the 19-29 age group (vs. 56.7%) and 28.6% under 18 (vs. 21.2%).

V Topical Analysis of COVID-19 Tweets

Fig. 3: Topic-wise and Overall Frequency of the Top 20 Topic Keywords

In order to understand the latent topics of the COVID-19 tweets for college followers, we utilize Latent Dirichlet Allocation (LDA) [14] to label universal topics demonstrated by the users. To reduce the complexity of the LDA analysis corpus, only the lemmas of tokens with POS tags of type noun, verb, adjective, and adverb are kept from the preprocessed tweets, because they carry the most meaningful content related to the topics we are looking to discover. We not only look at individual tokens but also consider highly correlated groups of two and three words. Thus, bigrams and trigrams are computed and added to the corpus. Since certain terms appear frequently in the COVID-19 tweets (e.g. virus, disease, infection, case, test, etc.), we transform our LDA corpus using TF-IDF vectorization to discount their influence. We fine-tune our LDA topic model and arrive at the optimal topic number of 55 and coherence score of 0.373. We also apply the t-SNE dimensionality reduction technique [20] to transform the computed 3126-dimensional document-term topic posterior matrix into 2-dimensional data points. This allows us to observe a distinctive separation of the data points representing the topic clusters, as illustrated in the plot of the 6 most prevalent topics of the COVID-19 tweets (Figure 4).

Fig. 4: t-SNE Clustering of 6 Most Frequently Discussed LDA Topics.

We then label the 6 most frequently discussed topics using the top 20 weighted topic keywords (Figure 3). Evidently, global news is the most popular topic among the tweets, as the numbers of confirmed positive COVID-19 cases and deaths are constantly increasing globally. The presence of political discussions, as well as the controversy related to the Chinese origin of the virus, is very strong, due to the ongoing presidential election campaign in the US, which gives solid evidence that the COVID-19 pandemic is influencing our political picture. The third and fourth most frequent topics involve social distancing and the closure of colleges.

Fig. 5: Student Tweets Contribution towards the Top 6 Topics.

Regarding which of the topics attract the most attention from the college community in comparison with the general public, we find that college students tend to respond more to COVID-19 issues that particularly affect them. In other words, as most universities in the U.S. announced the shutdown of on-campus activities and encouraged students to refrain from crowded travel and commuting during March, college students posted more tweets related to school closure (32.04%) and news from areas close to where they live (33.56%). In addition, they were concerned about social distancing and the controversy over referring to the COVID-19 virus as the "Chinese virus", two very important topics that we will take a closer look at in later analysis.

VI Topic-Based Sentiment Analysis

To expand the scope of our study beyond the topic modeling results, we dive deeper into the posts belonging to each of the 6 most frequently discussed topics. Specifically, for each topic, we separate the college students and the general population into two pools and apply the RoBERTa model to classify and examine the sentiments they expressed. We also use the same topic modeling techniques as described in Section V to provide microscopic explanations of the sentiment results.

VI-A Transformer Models for Sentiment Classification

VI-A1 RoBERTa - Robustly Optimized BERT Pretraining

BERT (Bidirectional Encoder Representations from Transformers) [7] was designed to pretrain deep bidirectional representations of tokens from unlabeled text by jointly conditioning on both left and right context in all layers. This was achieved by using a "masked language model" (MLM), whose pretraining objective is to predict the randomly masked tokens of the input sequence. As a result, the pretrained BERT model can be fine-tuned with just one additional output layer to bring substantial performance gains for a wide range of language inference tasks, including sentiment classification, without extensive task-specific architecture modifications. Recently, Liu et al. [18] provided a replication study of BERT pretraining, and discovered that BERT was significantly undertrained, yet could still match the performance of every model published after it. They presented additional insights on the design choices and training strategies of BERT and introduced alternative BERT-based models (RoBERTa) that record state-of-the-art results on similar tasks. They attributed their success to the use of a larger dataset for pretraining and better design choices for MLM. They reported a 0.948 F1 score for their RoBERTa-BASE model on SST-2, a Stanford Sentiment Treebank (SST) dataset with binary labels (positive and negative) for the sentiment analysis task. In comparison, BERT-BASE "only" achieved a 0.928 F1 score. Therefore, we choose RoBERTa as the pretraining procedure for our Twitter sentiment analysis model, and compare its performance with BERT.

VI-A2 Training and Evaluation

We utilize the transformers library (https://git.io/JfUEh) by Hugging Face [35], which includes RoBERTa-BASE and BERT-BASE in PyTorch [25], and implement the sentiment analysis models with an additional linear layer on top of the pretrained model's outputs. The AdamW optimizer [19] is used to optimize the cross-entropy loss function. We also use the fastai API (https://www.fast.ai/) [15], a deep learning wrapper for PyTorch, which allows us to split the model's layers into groups in order to use discriminative fine-tuning and slanted triangular learning rates [16] for task-specific features (i.e. learning word embeddings, learning context embeddings, and learning sentiment outputs).
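For readers unfamiliar with the two fine-tuning techniques cited from [16], the sketch below writes them out as plain functions. The hyperparameter values (`cut_frac=0.1`, `ratio=32`, a per-group decay factor of 2.6) are the defaults suggested in that work, not values reported for this study, and the function names are hypothetical. In practice the fastai wrapper applies these schedules internally.

```python
import math


def slanted_triangular_lr(t, total_steps, lr_max=0.01, cut_frac=0.1, ratio=32):
    """Learning rate at training step t under a slanted triangular schedule.

    The rate rises linearly for the first cut_frac of training, then decays
    linearly back to lr_max / ratio.
    """
    cut = max(1, math.floor(total_steps * cut_frac))
    if t < cut:
        p = t / cut
    else:
        p = 1 - (t - cut) / (cut * (1 / cut_frac - 1))
    return lr_max * (1 + p * (ratio - 1)) / ratio


def discriminative_lrs(top_lr, n_groups, decay=2.6):
    """One learning rate per layer group, lowest group first.

    The top group trains at top_lr; each group below it trains at the
    rate of the group above divided by `decay`.
    """
    return [top_lr / decay ** (n_groups - 1 - i) for i in range(n_groups)]
```

With three layer groups, as in the grouping described above (word embeddings, context embeddings, sentiment outputs), the lowest group would receive the smallest rate so that general-purpose pretrained features change more slowly than the task-specific output layer.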

We train and evaluate our models on the SemEval-2017 Task 4A dataset for Twitter message sentiment classification on a 3-point scale: Negative, Neutral, and Positive (Table I). In the end, both of our classifier models substantially outperform the top two performers of SemEval-2017 (Table II) on the test dataset. In particular, the model with RoBERTa pretraining achieves a Macro-F1 score above 0.8, which demonstrates its robustness for Twitter sentiment classification and its applicability for exploring the sentiments of our COVID-19 tweets. Since we implement the RoBERTa model to classify ternary sentiment labels, the drop in performance compared to Liu et al. [18] is within expectation. The confusion matrix for our RoBERTa model is given in Table III.

| Dataset | Negative | Neutral | Positive | Total |
|---|---|---|---|---|
| Train | 7,838 (15.6%) | 22,586 (44.9%) | 19,896 (39.5%) | 50,320 |
| Test | 3,959 (32.4%) | 5,894 (48.2%) | 2,365 (19.4%) | 12,218 |

TABLE I: SemEval-2017 Task 4A Dataset Statistics

| Model | Accuracy | Macro-F1 Score |
|---|---|---|
| RoBERTa | 0.806 | 0.806 |
| BERT | 0.757 | 0.757 |
| LSTMs+CNNs [6] | - | 0.685 |
| BiLSTMs+Attention [1] | - | 0.677 |

TABLE II: Performance Comparison between Previous Methods on Twitter Sentiment Classification and Ours

| Predicted \ Actual | negative | neutral | positive |
|---|---|---|---|
| negative | 3315 | 843 | 23 |
| neutral | 622 | 4643 | 458 |
| positive | 22 | 408 | 1884 |

TABLE III: Confusion Matrix of RoBERTa Model
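The RoBERTa scores in Table II can be cross-checked against the confusion matrix in Table III. The short sketch below recomputes accuracy and Macro-F1 from the matrix entries; the helper names are hypothetical. Per-class F1 equals 2·TP / (row sum + column sum), so the result does not depend on whether rows or columns hold the predicted labels.

```python
# Confusion matrix entries copied from Table III (negative, neutral, positive).
MATRIX = [
    [3315, 843, 23],
    [622, 4643, 458],
    [22, 408, 1884],
]


def accuracy(matrix):
    """Fraction of examples on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


def macro_f1(matrix):
    """Unweighted mean of the per-class F1 scores."""
    k = len(matrix)
    f1_scores = []
    for i in range(k):
        tp = matrix[i][i]
        row_total = sum(matrix[i])
        col_total = sum(matrix[r][i] for r in range(k))
        denom = row_total + col_total
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / k


print(round(accuracy(MATRIX), 3))  # 0.806
print(round(macro_f1(MATRIX), 3))  # 0.806
```

Both values round to 0.806, in agreement with the RoBERTa row of Table II.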

VI-B Analysis of Results

VI-B1 A depressing outlook of COVID-19

Overall, a very small percentage of positive sentiments are expressed among the COVID-19 tweets (lightest-colored blocks of Figure 6). In addition, more than one in five users in our sample discussed COVID-19 related issues in a negative light. Considering that 2,281 out of every million people in the U.S. are physically affected by COVID-19, which is already dangerous, the amount of negativity exhibited on Twitter is also very alarming. Evidently, not only is the COVID-19 pandemic a health and safety hazard, it also has gloomy impacts on our society. Moreover, for the topic related to the "Chinese virus" controversy, there is an overwhelming number of negative responses. We can see from Figure 3 that "racist" is the 3rd most frequent keyword of this topic, which suggests that many Twitter users associated calling the coronavirus the "Chinese virus" with racism.

Fig. 6: Sentiment Distributions (%) towards the 6 Most Frequent COVID-19 Topics. Percentage Blocks from Bottom to Top: Negative, Neutral, Positive.

VI-B2 College students respond more negatively to COVID-19

An important trend that outweighs the rest of our results is that a significantly higher percentage of the student population expresses negative sentiments towards the central issues of COVID-19, especially on news related to the spread of the pandemic and social distancing. This shows that our analysis of the data is consistent with our speculation on the impacts of the COVID-19 crisis on our younger population. College students are likely to express negative feelings towards how social distancing and school closure are affecting their work and study environments. Moreover, they tend to be subject to more negative emotions upon receiving news of the outbreak, which might be due to the subsequent implications of these issues for matters closer to their lives.

VI-B3 Negativity among College Students through the Topic Modeling Microscope

We focus on examining the subtopics of School Closing, which is of high concern among college students, and of Social Distancing and China Controversy, where the largest gaps in negative sentiment between college students and the general population are observed (14.5% and 13.8% absolute difference in the percentage of negative tweets, respectively). In addition, only negative and positive tweets are considered because they provide the most meaningful contexts associated with their sentiments. In general, non-neutral tweets on the Social Distancing and School Closing topics express worry about COVID-19, and all the tweets revealing concerns about school closure are negative. Moreover, many students exhibited aggression towards the foreign community, blaming them for the current disruptions in their lives as a result of social distancing. College students also disclosed details of their online learning experience, and mostly showed dislike for remote learning (81.3%). To reflect on the responses of college students to COVID-19 in a more positive light, it is encouraging that our college community remains aware of and vocal about the racism problem related to the "Chinese virus" controversy, which sends a powerful message about the public's intolerance of racist behaviors on social media for the betterment of our society.

These findings shed new light on an emerging direction of racism problems in the U.S. during the COVID-19 outbreak, especially related to the East Asian community, in addition to the existing discussions in the literature that focus primarily on discrimination towards African American, Asian American, and more recently, Muslim populations [32, 23, 11]. Also, in addition to addressing the uneasy feelings and barriers restricting the learning experience among students during the crisis, preventing racially charged hate speech is an important task for educational institutions in protecting their students.

| Subtopic Label | Subtopic Keywords | Negative Tweets |
|---|---|---|
| Showing aggression | asia, people, stay, home, worse_european, everyone, piss, fucking, work, fight | 81.5% |
| Detailing precautions | worry, safe, world, tour, stay, people, cancel, take, wash_hand, precaution | 65.1% |
| Expressing concerns | sick, know, go, work, see, watch, really, think, grocery_store, family | 85.5% |

TABLE IV: Subtopics of Social Distancing

| Subtopic Label | Subtopic Keywords | Negative Tweets |
|---|---|---|
| Detailing current situations | knock, follow_government_instruction, feel, survival_rate_whole, country_panicking, get, people, fuck, right, week, campus | 98.5% |
| Detailing remote study | school, close, shut, student, find, get, campus, live_streaming_instead, email, anymore | 81.3% |
| Expressing concerns | people, due, cancel, go, concern, tell, week, imagine, nasty, break | 100% |

TABLE V: Subtopics of School Closing

| Subtopic Label | Subtopic Keywords | Negative Tweets |
|---|---|---|
| Calling out racism | chinese, president, call, racist, refer, flu, reply, people, fuck, trump | 95.1% |
| Addressing attitudes | people, take, perspective, sick, seriously, asian, friend, know, ass, time | 97.3% |
| Detailing public response | call, guy, get, keep, chinese, think, remember, wuhan, response, piss | 91.7% |

TABLE VI: Subtopics of China Controversy
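Because only negative and positive tweets are considered in this subtopic analysis, a "Negative Tweets" percentage like those in the tables can be read as the share of negative tweets among non-neutral ones. The sketch below shows one way to compute such a column under that reading; the function name, the label strings, and the sample input are hypothetical.

```python
from collections import defaultdict


def negative_share_by_subtopic(tweets):
    """Map each subtopic to its % of negative tweets among non-neutral tweets.

    `tweets` is an iterable of (subtopic, sentiment) pairs, where sentiment
    is one of 'negative', 'neutral', or 'positive'. Neutral tweets are skipped,
    so a subtopic with only neutral tweets does not appear in the result.
    """
    counts = defaultdict(lambda: [0, 0])  # subtopic -> [negative, non-neutral]
    for subtopic, sentiment in tweets:
        if sentiment == "neutral":
            continue
        counts[subtopic][1] += 1
        if sentiment == "negative":
            counts[subtopic][0] += 1
    return {
        subtopic: round(100.0 * negative / total, 1)
        for subtopic, (negative, total) in counts.items()
    }


# Toy input, for illustration only (not taken from the study).
sample = [
    ("Detailing remote study", "negative"),
    ("Detailing remote study", "positive"),
    ("Detailing remote study", "neutral"),
    ("Expressing concerns", "negative"),
    ("Expressing concerns", "neutral"),
]
print(negative_share_by_subtopic(sample))
```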

VII Conclusions and Future Work

We have analyzed 73,787 tweets from 12,776 Twitter college followers who posted tweets related to the COVID-19 pandemic, in terms of the most prominent topics across several social issues. We find significant differences in the sentiments expressed towards those topics between the users who are identified as college students and those of the general population. College students tend to focus their discussions on topics closely surrounding their living environment, such as school closure and local news. Although the percentages of positive COVID-19 tweets are very low for both demographics, college students are shown to be significantly more negative. In addition, a microscopic examination of the positive and negative tweets reveals their overwhelmingly troubled feelings amidst the spread of COVID-19, as well as unfavorable reactions to the disruption in their lives, such as racially charged aggression. Moreover, we discover a shift in the target of racism during COVID-19 towards the East Asian community, which the majority of college students and the general public oppose.

Although high accuracy is achieved by both our demographic and sentiment classification models, future studies may collect larger datasets to achieve even better performance. In addition, this research mainly focuses on high-level attributes of tweets, such as topic models and sentiments, in understanding the characteristics of users who discussed social issues associated with COVID-19. Analysis of more fine-grained linguistic information, such as emotion, hate speech, and racism detection, can be performed to gain further insights into the more specific COVID-19 related issues detailed in our study.

References

  • [1] C. Baziotis, N. Pelekis, and C. Doulkeridis (2017-08) DataStories at SemEval-2017 task 4: deep LSTM with attention for message-level and topic-based sentiment analysis. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), Vancouver, Canada, pp. 747–754. Cited by: §II, §III-B, TABLE II.
  • [2] C. Beller, R. Knowles, C. Harman, S. Bergsma, M. Mitchell, and B. Van Durme (2014) I’m a belieber: social roles via self-identification and conceptual attributes. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pp. 181–186. Cited by: §II.
  • [3] S. Bergsma and B. Van Durme (2013) Using conceptual class attributes to characterize social media users. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 710–720. Cited by: §II, §IV-B3.
  • [4] E. Chen, K. Lerman, and E. Ferrara (2020) COVID-19: the first public coronavirus twitter dataset. arXiv preprint arXiv:2003.07372. Cited by: §III-B.
  • [5] K. W. Church and P. Hanks (1990) Word association norms, mutual information, and lexicography. Computational linguistics 16 (1), pp. 22–29. Cited by: §IV-B3.
  • [6] M. Cliche (2017-08) BB_twtr at SemEval-2017 task 4: twitter sentiment analysis with CNNs and LSTMs. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), Vancouver, Canada, pp. 573–580. Cited by: §II, TABLE II.
  • [7] J. Devlin, M. Chang, K. Lee, and K. Toutanova (2019-06) BERT: pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, Minnesota, pp. 4171–4186. Cited by: §II, §VI-A1.
  • [8] A. Esuli and F. Sebastiani (2006) Sentiwordnet: a publicly available lexical resource for opinion mining.. In LREC, Vol. 6, pp. 417–422. Cited by: §III-B.
  • [9] L. V. Fedynich (2013) Teaching beyond the classroom walls: the pros and cons of cyber learning.. Journal of Instructional Pedagogies 13. Cited by: §I.
  • [10] A. Giachanou and F. Crestani (2016) Like it or not: a survey of twitter sentiment analysis methods. ACM Computing Surveys (CSUR) 49 (2), pp. 1–41. Cited by: §II.
  • [11] J. Guhin (2018) Colorblind islam: the racial hinges of immigrant muslims in the united states. Social Inclusion 6 (2), pp. 87–97. External Links: ISSN 2183-2803 Cited by: §VI-B3.
  • [12] C. L. Hanson, S. H. Burton, C. Giraud-Carrier, J. H. West, M. D. Barnes, and B. Hansen (2013) Tweaking and tweeting: exploring twitter for nonmedical use of a psychostimulant drug (adderall) among college students. Journal of medical Internet research 15 (4), pp. e62. Cited by: §II.
  • [13] L. He, L. Murphy, and J. Luo (2016) Using social media to promote stem education: matching college students with role models. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pp. 79–95. Cited by: §II.
  • [14] M. Hoffman, F. R. Bach, and D. M. Blei (2010) Online learning for latent dirichlet allocation. In Advances in Neural Information Processing Systems 23, J. D. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel, and A. Culotta (Eds.), pp. 856–864. Cited by: §V.
  • [15] J. Howard and S. Gugger (2020) Fastai: a layered api for deep learning. Information 11 (2), pp. 108. Cited by: §VI-A2.
  • [16] J. Howard and S. Ruder (2018-07) Universal language model fine-tuning for text classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, pp. 328–339. Cited by: §VI-A2.
  • [17] P. Korenek and M. Šimko (2014) Sentiment analysis on microblog utilizing appraisal theory. World Wide Web 17 (4), pp. 847–867. Cited by: §II.
  • [18] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov (2019) Roberta: a robustly optimized bert pretraining approach. arXiv preprint arXiv:1907.11692. Cited by: §II, §VI-A1, §VI-A2.
  • [19] I. Loshchilov and F. Hutter (2017) Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101. Cited by: §VI-A2.
  • [20] L. v. d. Maaten and G. Hinton (2008) Visualizing data using t-sne. Journal of machine learning research 9 (Nov), pp. 2579–2605. Cited by: §V.
  • [21] C. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. Bethard, and D. McClosky (2014-06) The Stanford CoreNLP natural language processing toolkit. In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Baltimore, Maryland, pp. 55–60. Cited by: §III-B.
  • [22] G. R. Morrison, S. J. Ross, J. R. Morrison, and H. K. Kalman (2019) Designing effective instruction. John Wiley & Sons. Cited by: §I.
  • [23] S. D. Museus and J. J. Park (2015) The continuing significance of racism in the lives of asian american college students. Journal of College Student Development 56 (6), pp. 551–569. Cited by: §VI-B3.
  • [24] D. Nguyen, R. Gravel, D. Trieschnigg, and T. Meder (2013) "How old do you think I am?" a study of language and age in twitter. In Seventh International AAAI Conference on Weblogs and Social Media. Cited by: §II.
  • [25] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al. (2019) PyTorch: an imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems, pp. 8024–8035. Cited by: §VI-A2.
  • [26] J. Pennington, R. Socher, and C. D. Manning (2014) Glove: global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp. 1532–1543. Cited by: §II.
  • [27] D. Rao, D. Yarowsky, A. Shreevats, and M. Gupta (2010) Classifying latent user attributes in twitter. In Proceedings of the 2nd international workshop on Search and mining user-generated contents, pp. 37–44. Cited by: §II.
  • [28] S. Rosenthal, N. Farra, and P. Nakov (2019) SemEval-2017 task 4: sentiment analysis in twitter. arXiv preprint arXiv:1912.00741. Cited by: §II.
  • [29] P. Sahu (2020) Closure of universities due to coronavirus disease 2019 (covid-19): impact on education and mental health of students and academic staff. Cureus 12 (4). Cited by: §I.
  • [30] F. Sebastiani (2002) Machine learning in automated text categorization. ACM computing surveys (CSUR) 34 (1), pp. 1–47. Cited by: §IV-B2.
  • [31] L. Sloan, J. Morgan, P. Burnap, and M. Williams (2015) Who tweets? deriving the demographic characteristics of age, occupation and social class from twitter user meta-data. PloS one 10 (3). Cited by: §II.
  • [32] J. K. Swim, L. L. Hyers, L. L. Cohen, D. C. Fitzgerald, and W. H. Bylsma (2003) African american college students’ experiences with everyday racism: characteristics of and responses to these incidents. Journal of Black psychology 29 (1), pp. 38–67. Cited by: §VI-B3.
  • [33] D. Tang, F. Wei, N. Yang, M. Zhou, T. Liu, and B. Qin (2014) Learning sentiment-specific word embedding for twitter sentiment classification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1555–1565. Cited by: §II.
  • [34] Z. Wang, S. A. Hale, D. Adelani, P. A. Grabowicz, T. Hartmann, F. Flöck, and D. Jurgens (2019) Demographic inference and representative population estimates from multilingual social media data. In Proceedings of the 2019 World Wide Web Conference. Cited by: §II, Fig. 1, §IV-A, §IV-B4.
  • [35] T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, and J. Brew (2019) HuggingFace’s transformers: state-of-the-art natural language processing. ArXiv abs/1910.03771. Cited by: §VI-A2.