The outbreak of the novel Coronavirus Disease 2019 (COVID-19) has affected millions of people around the world. People’s lives are at risk because COVID-19 is highly infectious from person to person. As of 29 June 2020, globally confirmed cases of COVID-19 had surpassed 10 million and more than 500,000 people had lost their lives to the infection (https://news.google.com/covid19/map?hl=en-AU&gl=AU&ceid=AU%3Aen). Many countries and jurisdictions have taken a series of specific measures to help slow down the spread of COVID-19, such as travel bans, lockdowns, closure of public places (e.g., gyms, restaurants, schools), and requirements that people practise good personal hygiene, keep a physical distance of 1.5 meters, and work or study at home to reduce contact with others. These measures have changed people’s daily lives, and most people have to follow the policies to protect the safety of their communities. However, symptoms such as worry, fear, frustration, depression and anxiety can arise from the long-term restriction of social activity during the pandemic and cause serious mental health issues. Therefore, understanding when and where people experience mental health issues in this special period is important for governments to make the right decisions.
On the other hand, people may spend more time on social media platforms like Twitter to get the latest news, communicate with friends and post their feelings and thoughts during the lockdown period. Such massive collections of personal posts from social media can become invaluable data sources for large-scale sentiment and topic mining to monitor people’s mental health across different events or topics. Hence, much recent work focuses on detecting people’s sentiment due to COVID-19 [24, 10, 15]. For instance, Zhou et al.  adopted Twitter data for large-scale sentiment analysis of people living in New South Wales, Australia during COVID-19. Han et al.  studied public opinion in the early stages of COVID-19 in China by analyzing text data from Sina Weibo. Rajesh et al.  exploited a topic model to generate topics from tweets related to coronavirus and calculated the presence of different emotions. However, little work analyzes topic and sentiment dynamics together, even though such dynamic analysis is important for authorities to understand when and where people experience mental health issues.
In this work, we aim to dynamically identify the popular topics and their associated sentiment polarities during the COVID-19 pandemic. Our research questions are: (1) What are the dynamics of people’s sentiment? (2) Which topics are most discussed by people? (3) How do different topics evolve? (4) What are the sentiment dynamics of the topics? To answer these questions, we propose a novel dynamic topic discovery and sentiment analysis framework which contains multiple modules, including data crawling, data cleaning, topic discovery, sentiment analysis and result visualization. In the proposed framework, we employ the Dynamic Topic Model (DTM) to generate accurate daily topics. To determine the sentiment polarity of each topic and tweet, we utilize a sentiment lexicon tool, VADER (https://github.com/cjhutto/vaderSentiment), to infer the sentiment polarity. We collect 13,746,822 tweets related to COVID-19 posted worldwide between 1 April 2020 and 14 April 2020 to test the effectiveness of the proposed framework. The experimental results show that the proposed framework can generate insightful findings, such as the overall sentiment dynamics among people, topic evolutionary patterns, and the sentiment dynamics of different topics.
2 Related Work
With the spreading of COVID-19 across the world, researchers have proposed to use sentiment analysis based on social media as a tool to monitor people’s mental health. In this section, we review the latest work related to COVID-19 analysis based on social media data.
Rajesh et al. adopted the classic Latent Dirichlet Allocation (LDA) topic model to generate 10 topics from a random sample of 18,000 tweets about coronavirus, then used the NRC sentiment dictionary to calculate the presence of eight emotions: “anger”, “anticipation”, “disgust”, “fear”, “joy”, “sadness”, “surprise” and “trust”. Han et al.  explored public opinion in the early stages of COVID-19 in China by analyzing Sina Weibo texts. The COVID-19-related micro-blogs were generalized into 7 topics and 13 more detailed sub-topics. However, they judged the polarity of each topic from the polarity of its topic words and failed to consider the specific tweets under the topic. Cinelli et al.  analyzed engagement and interest in the COVID-19 topic on different social media platforms such as Twitter, Instagram and Reddit, and provided a differential assessment of the evolution of the global discourse on each platform and among its users. They found that reliable and suspicious information sources have similar spreading patterns. Depoux et al.  confirmed that social media panic spreads faster than COVID-19 itself; therefore, the public rumors, opinions, attitudes and behaviors surrounding COVID-19 need to be quickly detected and responded to. They suggested creating an interactive dashboard to provide real-time alerts of rumors and concerns about the spread of coronavirus worldwide, which would enable public health officials and relevant stakeholders to respond quickly with a proactive and engaging narrative that can mitigate misinformation. Sharma et al.  designed a dashboard to track misinformation in Twitter conversations. The dashboard visualizes the social media discussions around coronavirus and the quality of information shared on the platform, updated over time.
They evaluated sentiment polarity for each tweet based on its textual information and showed the distribution of sentiment in different countries over time.
More recently, Huang et al.  examined the public discussion concerning COVID-19 on Twitter and found that the most influential tweets are still written by regular users, such as news media, government officials, and individual news reporters. They also discovered that “fake news” sites are most likely to be retweeted within their source country and are thus less likely to spread internationally. Zhou et al.  exploited tweets on Twitter to analyse the sentiment dynamics of people living in the state of New South Wales (NSW) in Australia during the pandemic. They first found that the overall polarity of the community since the outbreak was positive, and then analyzed the sentiment dynamics of the NSW local government areas in terms of lockdown, social distancing and JobKeeper.
Different from the above work, which either performed static sentiment analysis or failed to analyze detailed topic-level sentiment, we propose to analyse the dynamics of topic-level sentiment to monitor the evolution of people’s mental states across different topics and events.
3 Proposed Framework
In this section, we introduce the proposed framework and the adopted techniques for detecting topic and sentiment dynamics. Our framework contains three steps. Firstly, we divide the tweets into different topics generated by a dynamic topic model; then we determine the sentiment polarity of each tweet; and finally we summarize the sentiment polarity distribution of each topic. The proposed framework is shown in Figure 1.
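The three steps can be sketched as a minimal pipeline; every function body here is a toy placeholder standing in for the DTM and VADER components described in the following subsections, not the actual implementation.

```python
from collections import Counter

def assign_topic(tweet: str) -> int:
    """Placeholder for DTM inference: map a cleaned tweet to a topic id."""
    return len(tweet) % 3  # toy stand-in for a trained topic model

def sentiment(tweet: str) -> str:
    """Placeholder for VADER: return 'positive', 'negative' or 'neutral'."""
    return "positive" if "good" in tweet else "neutral"  # toy rule

def topic_sentiment(tweets):
    """Step 3: per-(topic, polarity) tweet counts."""
    counts = Counter()
    for t in tweets:
        counts[(assign_topic(t), sentiment(t))] += 1
    return counts

counts = topic_sentiment(["good day", "stay home", "good news today"])
print(sum(counts.values()))  # 3
```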
3.1 Topic Extraction
Topic modelling is the process of learning, recognizing, and extracting high-level semantic topics across a corpus of unstructured text. A popular method is Latent Dirichlet Allocation (LDA), proposed by Blei et al. , which detects topics in a static corpus. The Dynamic Topic Model (DTM)  was later proposed to mine topic evolution over time by extending LDA to allow topic mining over fixed time intervals. Hence, DTM can handle sequential documents and generate topics for different time slices of a corpus. Specifically, the documents within each time slice are modeled with static LDA, and the topics associated with a slice evolve from the topics associated with the previous slice . The dynamics of the topic model are given by Eqs. 1 and 2, and Table 1 explains the notation:

β_{t,k} | β_{t-1,k} ∼ N(β_{t-1,k}, σ²I)   (1)

α_t | α_{t-1} ∼ N(α_{t-1}, δ²I)   (2)

DTM mines the topics of each time slice with LDA, and the parameters α_t and β_{t,k} are chained together in a state space model which evolves with Gaussian noise, yielding a smooth evolution of topics from time to time.
|α_t||the per-document topic distribution at time t.|
|β_{t,k}||the word distribution of topic k at time t.|
|θ_{t,d}||the topic distribution for document d at time t.|
|z_{t,d,n}||the topic assignment of the n-th word in document d at time t.|
|w_{t,d,n}||the n-th word of document d at time slice t.|
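The Gaussian chaining of Eq. 1 can be illustrated with a toy simulation: each slice’s natural parameters are the previous slice’s parameters plus Gaussian noise, and a softmax maps them to a word distribution. This is a sketch of the generative dynamics only, not a fitted model; the vocabulary size, number of slices and noise scale are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
V, T, sigma = 5, 4, 0.1   # vocabulary size, time slices, noise scale

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Eq. 1: beta_{t,k} | beta_{t-1,k} ~ N(beta_{t-1,k}, sigma^2 I)
# (Eq. 2 chains alpha_t in the same way.)
beta = [rng.normal(size=V)]
for t in range(1, T):
    beta.append(beta[-1] + rng.normal(scale=sigma, size=V))

# Each slice's word distribution for one topic evolves smoothly.
word_dists = [softmax(b) for b in beta]
for p in word_dists:
    assert abs(p.sum() - 1.0) < 1e-9  # each slice yields a valid distribution
```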
3.2 Tweet Sentiment Analysis
Sentiment Analysis (SA), also commonly referred to as Opinion Mining (OM), aims to find opinionated information and detect its sentiment polarity. SA techniques are now quite mature, and many tools are openly available, such as Stanford’s CoreNLP , VADER , SentiStrength  and SentiCircles , several of which are specifically designed for social media text. In this work, we employ VADER to identify the sentiment polarity of each tweet. VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon- and rule-based sentiment analysis tool that is specifically attuned to sentiment expressed in social media. It was introduced by Hutto et al. in 2014 and has been widely used in social media text sentiment analysis tasks [7, 5, 9, 23, 18, 24].
VADER classifies sentiment into negative, neutral, and positive categories and produces a compound score, computed by summing the valence scores of each word in the lexicon and normalizing the result to the range (-1, 1), where -1 represents the most extreme negative and 1 the most extreme positive. If the compound score is greater than 0.05, the text is considered positive; if the score is less than -0.05, it is considered negative; and if the score is between -0.05 and 0.05, the polarity is neutral. The biggest advantage of VADER is that it requires neither data preprocessing nor model training, and can be applied directly to raw tweets to generate sentiment polarity. Some examples of tweet sentiment results from VADER are shown in Figure 2. In this work, we use VADER to produce the sentiment polarity of each original tweet, namely positive, negative or neutral, and then combine it with the topic mining results from DTM to analyze topic-level sentiment.
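The compound-score thresholds above can be written as a small helper; the cut-offs follow the convention quoted in the text, and the commented-out call shows how the score would be obtained with the vaderSentiment package.

```python
def label_from_compound(compound: float) -> str:
    """Map a VADER compound score in (-1, 1) to a polarity label."""
    if compound > 0.05:
        return "positive"
    if compound < -0.05:
        return "negative"
    return "neutral"

# With the vaderSentiment package installed, the score would come from:
#   from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
#   compound = SentimentIntensityAnalyzer().polarity_scores(tweet)["compound"]

print(label_from_compound(0.62))   # positive
print(label_from_compound(-0.30))  # negative
print(label_from_compound(0.01))   # neutral
```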
3.3 Measuring Topic Sentiment
After the above two steps, all tweets are clustered into their corresponding topics and marked with a sentiment polarity. The sentiment polarities are aggregated over tweets to estimate the overall sentiment distribution: for each topic per day, we sum the numbers of positive, negative and neutral tweets, so each topic is associated with three sentiment counts.
For each topic on each studied day, we define a sentiment distribution in Eqs. 3 and 4:

R_{k,t} = N_{k,t} / N   (3)

S_{k,t} = (N⁺_{k,t} / N_{k,t}, N⁻_{k,t} / N_{k,t}, N⁰_{k,t} / N_{k,t})   (4)

where T is the total number of studied days and K is the total number of topics; N_{k,t} represents the total number of tweets under topic k on day t; N⁺_{k,t}, N⁻_{k,t} and N⁰_{k,t} respectively denote its positive, negative and neutral tweet counts; and N represents the total number of tweets in our dataset. We observe the daily hot topics about COVID-19 on Twitter (April 1 to April 14) and analyze the sentiment polarity distribution of each topic. The details are presented in Section 4.3.
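The per-topic, per-day aggregation can be sketched directly from labelled (topic, day, polarity) triples; the data below is invented purely for illustration.

```python
from collections import defaultdict

def topic_sentiment_counts(labelled):
    """labelled: iterable of (topic_id, day, polarity) triples.
    Returns {(topic_id, day): {'positive': n, 'negative': n, 'neutral': n}}."""
    counts = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
    for topic, day, polarity in labelled:
        counts[(topic, day)][polarity] += 1
    return dict(counts)

def sentiment_distribution(c):
    """Normalise one topic-day's counts into proportions (as in Eq. 4)."""
    total = sum(c.values())
    return {k: v / total for k, v in c.items()}

data = [(11, "2020-04-01", "positive"), (11, "2020-04-01", "positive"),
        (11, "2020-04-01", "negative"), (49, "2020-04-01", "neutral")]
counts = topic_sentiment_counts(data)
dist = sentiment_distribution(counts[(11, "2020-04-01")])
print(round(dist["positive"], 3))  # 0.667
```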
4 Experimental Study
In this section, we introduce the processes of data collection and data preprocessing, and how the optimal topic number for DTM is determined.
4.1 Data Collection and Preprocessing
The data collection process is as follows. Firstly, we obtain tweet IDs from the publicly available coronavirus Twitter dataset (https://github.com/echen102/covid-19-tweetids) collected by Chen et al.  through the Twitter API (https://developer.twitter.com/en/docs/api-reference-index), which actively tracks tweets based on keywords such as Coronavirus, Covid, Covid19 and Wuhanlockdown, and account names such as CDCemergency, CDCgov, WHO and HHSGov. We collect a total of 13,746,822 tweets with Tweepy (a Python library for the Twitter API) based on the given tweet IDs from April 1 to April 14. After that, the non-English tweets are removed, leaving a total of 8,430,793 English tweets.
Data preprocessing, or cleaning, is the first step of any text mining task . It includes converting all letters to lowercase and removing stop words, non-English letters, URLs, etc. Then, phrase extraction is used to ensure that expressions such as “human_rights” are kept as one token instead of being separated into “human” and “rights”. Additionally, lemmatization is adopted to remove inflectional endings and return the base or dictionary form of each word. After preprocessing, we remove very short tweets (fewer than 6 words) and retain a total of 4,919,471 tweets with 269,391 unique tokens. Figure 3 shows the ratio of raw tweets, English tweets, and finally adopted tweets. Table 2 shows the number of daily tweets used for the experiment.
|Date||April 1||April 2||April 3||April 4||April 5||April 6||April 7|
|Date||April 8||April 9||April 10||April 11||April 12||April 13||April 14|
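The cleaning steps described above can be sketched with the standard library; the stop-word list here is a tiny illustrative subset, and lemmatization and phrase extraction (done with external NLP tools in practice) are omitted for brevity.

```python
import re

STOPWORDS = {"the", "a", "an", "is", "to", "of", "and", "in"}  # illustrative subset

def clean_tweet(text: str, min_words: int = 6):
    """Lowercase, strip URLs and non-English characters, drop stop words,
    and discard tweets shorter than min_words (returns None for those)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^a-z_\s]", " ", text)      # keep English letters only
    tokens = [t for t in text.split() if t not in STOPWORDS]
    return tokens if len(tokens) >= min_words else None

tokens = clean_tweet(
    "Check the latest COVID19 case numbers here https://t.co/abc and stay safe everyone today")
print(tokens)
print(clean_tweet("Stay safe"))  # None (too short)
```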
4.2 Topic Model Setup
The number of topics is a crucial parameter in topic modeling, as it determines whether the generated topics are human-interpretable. Following , to set up the first slice of the Dynamic Topic Model (DTM), we fit LDA on the first day of the dataset to learn the best topic number, employing the Gensim package in Python to train the LDA model. We use topic coherence , which measures the degree of semantic similarity between the high-scoring words of a topic, as the indicator for choosing the best topic number; the coherence score helps distinguish human-understandable topics from artifacts of statistical inference.
where the top most frequently occurring words of each topic are selected, and all pairwise scores of these top words are aggregated into the coherence of the topic.
Figure 4 shows the coherence scores of the LDA model for different topic numbers based on one day of tweets. As we can see, when the topic number is 70, the coherence score reaches its maximum of around 0.39. Therefore, we assume the total number of topics is stable over the study period and set the topic number for DTM to 70.
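Topic coherence can be illustrated with the UMass variant, which needs only document co-occurrence counts. This is a hand-rolled sketch on a toy corpus; the experiments above rely on Gensim’s coherence implementation, and the exact measure used there may differ.

```python
import math
from itertools import combinations

def umass_coherence(top_words, documents):
    """UMass coherence: sum over ordered pairs of log((D(wi, wj) + 1) / D(wi)),
    where D counts the documents containing all of the given words."""
    docsets = [set(d) for d in documents]
    def D(*words):
        return sum(all(w in ds for w in words) for ds in docsets)
    return sum(math.log((D(wi, wj) + 1) / D(wi))
               for wi, wj in combinations(top_words, 2))

docs = [["cat", "dog", "pet"], ["cat", "dog"], ["car", "road"]]
# Words that co-occur often score higher than words that never co-occur.
print(umass_coherence(["cat", "dog"], docs) > umass_coherence(["cat", "road"], docs))  # True
```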
4.4 Overall Sentiment Dynamics
Figure 6 presents the overall sentiment distribution on Twitter during the study period. The number of tweets about COVID-19 is around 350,000 per day, and the daily numbers of positive and negative tweets are similar and both greater than the number of neutral tweets. On most days, the number of positive tweets is slightly higher than that of negative tweets. This shows that, despite the spread of COVID-19, the community showed a dominant positive sentiment during the study period. This observation is consistent with the findings of other researchers who have reported similar conclusions based on country-level sentiment analysis [1, 13, 24].
4.5 Daily Hot Topics
For the daily topics discussed by users, we use DTM to generate 70 topics and then observe their popularity. Table 3 shows the extracted top 10 highest-volume topics with their most relevant words. To find the hottest topics discussed by people in the studied period, we rank the topics by their corresponding tweet numbers. Figure 7 shows the indices of the top 10 topics for each day, sorted by volume from the top row to the bottom row. As we can see, topics 11, 49, and 64 are steadily ranked as the top 3 daily hot topics.
|Topic 64||Topic 49||Topic 11||Topic 26||Topic 31|
|Topic 43 (work)||Topic 16 (social_distancing)||Topic 66 (stop spread)||Topic 56 (face mask)||Topic 40 (health care)|
Figure 8 shows the most significant words associated with the top three topics. Topics 11, 49, and 64 reflect the common concerns discussed by people: staying at home to ensure safety, the latest case reports, and deaths due to the disease.
To further analyse these three topics, we want to know people’s opinions on them. We analyze the proportions of sentiment polarities of the tweets under these topics to dynamically observe people’s sentiment changes as the pandemic spreads; the results are shown in Figures 9 to 11. Overall, sentiment polarity varies from topic to topic.
Topic 11 is related to staying at home, and our results show that the public kept a dominantly positive sentiment towards this measure to prevent infection. This may be because people work or study at home and enjoy more time with their families. They also held a positive belief in the fight against COVID-19 by the government and society.
Topic 49 concerns the latest case reports; the dominant sentiment around this topic was largely negative, although positive sentiment also existed. As the pandemic spread to more countries, the numbers of confirmed cases and deaths continued to rise, and people felt that the actual situation was worse than they had expected.
Topic 64 concerns people’s deaths due to COVID-19 and shows a diametrically opposite result to topic 11: tweets with negative emotions far outnumber the others. People believed that the outbreak could not be effectively controlled because policy makers had ignored the seriousness of the pandemic, so they expressed strong dissatisfaction with this consequence.
5 Conclusions and Future Work
This study conducted a comprehensive analysis of hot topics and their associated sentiment polarity distributions during the COVID-19 period. Instead of a country-level study, the sentiment in this work was analysed at the global level on more than 13 million tweets collected from Twitter over the two weeks from 1 April 2020. The analysis showed that people were concerned about the latest confirmed coronavirus cases, measures to prevent infection, and the attitudes and specific measures of governments towards the pandemic. The overall sentiment polarity was positive, but topic-level sentiment polarity varied from topic to topic. More interesting topics can be explored based on the current study in the future. For example, more specific topics, such as food shortages, job losses, debt, and studying at home, can be analyzed to help policy makers, governments and local communities formulate measures to prevent the spread of negative emotions on social networks.
References

- (2020) Sentiment analysis of social media response on the covid19 outbreak. Brain, Behavior, and Immunity.
- (2006) Dynamic topic models. In Proceedings of the 23rd International Conference on Machine Learning, pp. 113–120.
- (2003) Latent Dirichlet allocation. Journal of Machine Learning Research 3 (Jan), pp. 993–1022.
- (2020) Covid-19: the first public coronavirus Twitter dataset. arXiv preprint arXiv:2003.07372.
- (2017) Anyone can become a troll: causes of trolling behavior in online discussions. In Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, pp. 1217–1230.
- (2020) The covid-19 social media infodemic. arXiv preprint arXiv:2003.05004.
- (2017) Automated hate speech detection and the problem of offensive language. In Eleventh International AAAI Conference on Web and Social Media.
- (2020) The pandemic of social media panic travels faster than the covid-19 outbreak. Oxford University Press.
- (2015) Measuring emotional contagion in social media. PLoS ONE 10 (11), pp. e0142390.
- (2020) Using social media to mine and analyze public opinion related to covid-19 in China. International Journal of Environmental Research and Public Health 17 (8), pp. 2788.
- (2020) Disinformation and misinformation on Twitter during the novel coronavirus outbreak. arXiv preprint arXiv:2006.04278.
- (2014) VADER: a parsimonious rule-based model for sentiment analysis of social media text. In Eighth International AAAI Conference on Weblogs and Social Media.
- (2020) Estimating geographic subjective well-being from Twitter: a comparison of dictionary and data-driven language methods. Proceedings of the National Academy of Sciences 117 (19), pp. 10165–10171.
- (2014) The Stanford CoreNLP natural language processing toolkit. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55–60.
- (2020) Informational flow on Twitter – corona virus outbreak – topic modelling approach. International Journal of Advanced Research in Engineering and Technology (IJARET) 11 (3).
- (2015) Exploring the space of topic coherence measures. In Proceedings of the Eighth ACM International Conference on Web Search and Data Mining, pp. 399–408.
- (2014) SentiCircles for contextual and conceptual semantic sentiment analysis of Twitter. In European Semantic Web Conference, pp. 83–98.
- (2020) Coronavirus on social media: analyzing misinformation in Twitter conversations. arXiv preprint arXiv:2003.12309.
- (2010) Sentiment strength detection in short informal text. Journal of the American Society for Information Science and Technology 61 (12), pp. 2544–2558.
- (2020) Coronavirus disease (covid-19) pandemic. [Online; accessed 2020-05-15].
- (2019) Discovering topic representative terms for short text clustering. IEEE Access 7, pp. 92037–92047.
- (2020) Short text similarity measurement using context-aware weighted biterms. Concurrency and Computation.
- (2016) Cross-modality consistent regression for joint visual-textual sentiment analysis of social multimedia. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, pp. 13–22.
- (2020) Examination of community sentiment dynamics due to covid-19 pandemic: a case study from Australia. CoRR abs/2006.12185.