Recent years have witnessed a paradigm shift in the way people consume news. Online news media have become more popular than traditional newsprint, especially among younger readers (http://news.bbc.co.uk/2/hi/business/8542430.stm). To further engage them, in addition to presenting news, online news platforms also allow readers to comment and share their points of view on the matter reported in stories. Despite concerns about the quality of comments, especially their language and tone, comments are considered to be the most effective tool for increasing reader engagement [1].
Several prior works in media and communication studies have highlighted the importance of discussions in the evolution of a democratic society. In a seminal work, Habermas established the notion of the ‘Public Sphere’, where public opinion is formed via rational-critical debates [2]. Ruiz et al. [3] argued that online news media provide a new manifestation of the public sphere, Public Sphere 2.0, where commenting acts as the facilitator of public debates.
However, the plethora of news websites today has resulted in a gradual decline in the attention a user gives to a particular news story. In an earlier work, Nielsen [4] noted that readers predominantly read online web pages in an F-shaped pattern, i.e., two horizontal stripes at the top of the page followed by a vertical stripe along the page. This implies that the attention of users wanes as they go through an article and that most of their attention is focused on the initial paragraphs. In this context, it is important to understand whether the commenting options in news websites today can facilitate discussions on the news stories and play the role of Public Sphere 2.0.
To investigate this issue, we gather articles and corresponding comments from two popular news websites: The Guardian (theguardian.com) and The New York Times (nytimes.com). We observe that a large number of comments target particular sections of an article, rather than the entire article. Yet, most news media websites allow their readers to comment only on the full article. In this paper, we propose to revamp the commenting UI by automatically placing the most relevant comments against each section of an article. For this, we develop a neural-network-based mechanism to map comments to particular paragraphs. Extensive evaluations show that our proposed methodology outperforms state-of-the-art baselines. Finally, we build a system which allows a reader to check for comments made against any section of an article and to comment on the same. We believe that such a system can help news websites increase reader engagement further.
2 Dataset and Motivation
In recent years, news media sites have seen a huge increase in user engagement through commenting, liking, sharing, etc. However, users do not spend the same amount of time on every part of a news article. Nielsen [4] observed that, for news articles, users mostly focus on the initial paragraphs or a few sentences of a paragraph to get a summary of the article, possibly due to limited time to read the whole story.
To investigate how this influences the commenting behavior, we gathered news articles from two popular news websites, ‘The Guardian’ and ‘The New York Times’. In total, we collected Guardian and NYTimes news articles encompassing various topics like Business, Technology, Politics, Sports and Editorials, and all comments made against these articles (https://tinyurl.com/paragraph2comment).
Figures 0(a) and 0(b) show how the number of comments varies with the number of paragraphs and sentences in an article (the Y-axis is the % distribution). Fig. 0(a) points out that more than comments are posted to the articles having more than 20 paragraphs. Fig. 0(b) shows how the comment distribution varies around the 80-sentence threshold (20 paragraphs) for the two online newspapers. Overall, we see that articles with more paragraphs attract more comments. This suggests that the relation between comments and individual paragraphs is worth studying.
From the collected articles, we randomly selected articles from each media site for manual annotation, where two annotators were asked to give one of five possible relevance scores for a comment with respect to a paragraph. The relevance scores are 1 (strongly irrelevant), 2 (weakly irrelevant), 3 (neutral), 4 (weakly relevant) and 5 (strongly relevant), where relevance is judged by the presence or absence of common words or a common thought between the paragraph and the comment text. Both annotators provided a relevance score for each paragraph-comment pair in all articles. Inter-annotator agreement (Cohen’s κ) was . A relevance score was assigned to a comment-paragraph pair only when both annotators agreed.
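The agreement statistic mentioned above can be illustrated with a minimal sketch. The function name `cohen_kappa` and the toy scores are assumptions made for illustration and are not taken from the annotated dataset; the last line mirrors the rule that a score is kept only when both annotators agree.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Toy relevance scores (1-5) for six paragraph-comment pairs; illustrative only.
annotator_1 = [5, 4, 1, 3, 2, 5]
annotator_2 = [5, 4, 1, 2, 2, 4]
kappa = cohen_kappa(annotator_1, annotator_2)

# Keep a (pair index, score) entry only where both annotators gave the same score.
agreed = [(i, a) for i, (a, b) in enumerate(zip(annotator_1, annotator_2)) if a == b]
```

On these toy labels the two annotators agree on four of the six pairs, and only those four would receive a gold relevance score.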
We observed that around of the comments (in total) were relevant to the whole article as those were not mapped to a particular paragraph. We consider a comment to be related to the entire article if the comment has a relevance score for at least 3 paragraphs or has a relevance score of for all the paragraphs of the article.
However, approximately half of the comments ( and ) of the Guardian and NYTimes articles are directed at particular paragraphs rather than the entire article. Similar to [4], we also observe that the mean relevance of a comment decreases along the article’s length. This indicates that more of the relevant comments relate to the beginning paragraphs of an article, and the trend holds for both Guardian and NYT articles.
Thus, it is an interesting problem to find out how comments are related to individual paragraphs rather than to the whole article. To find this association automatically, we created gold-standard annotated datasets of 1834 and 1114 comments for ‘The Guardian’ and ‘New York Times’ respectively. The detailed statistics of the different annotated labels are provided in Table 1. Using this data (after class balancing using the SMOTE [5] algorithm), we design an automated approach as explained next.
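To make the class-balancing step concrete, the following is a simplified sketch of the interpolation idea behind SMOTE, written in plain Python. It is not the implementation used in this work; the function name `smote_oversample`, the neighbourhood size `k`, the seed and the toy points are all assumptions chosen for illustration.

```python
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sq_dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (nb - b) for b, nb in zip(base, neighbour)])
    return synthetic


# Toy 2-dimensional minority class; real inputs would be feature vectors.
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_oversample(minority_class, n_new=4)
```

Each synthetic point lies on the line segment between two existing minority samples, so under-represented relevance labels can be oversampled without duplicating examples verbatim.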
|Relevance Label||% in The Guardian||% in NY Times|
3 Linking comments to paragraphs
In this paper, we propose an approach to correctly identify paragraph-comment pairs and to encourage users to comment on individual paragraphs, instead of only on the whole article. Our proposed framework is based on deep neural networks. We use two different neural network models, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), whose inputs are paragraph and comment vectors. We use the pre-trained 300-dimensional Google News vectors for each word; if no pre-trained embedding is found for a word, we set it to the zero vector (in the 300-dimensional space). To compute the vector for an entire paragraph or comment, we take the average of the word vectors of all words in the paragraph or comment, respectively. The LSTM or GRU layer is applied on top of the paragraph and comment vectors to obtain a 150-dimensional vector for each (after experimenting with different dimensions, precision and recall were best at 150 dimensions). These two vectors are then merged, and a fully connected layer with 5 units (one per class) and softmax activation is applied on top to obtain the probability of each class. The proposed model is shown in Figure 1(a). No explicit feature extraction using a POS tagger or LIWC is required for these models.
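The input representation described above (averaged word vectors, with a zero vector for out-of-vocabulary words) can be sketched as follows. The function name `average_embedding`, the whitespace-tokenised inputs and the 3-dimensional toy table are assumptions for illustration; the actual system uses the 300-dimensional pre-trained vectors.

```python
def average_embedding(tokens, embeddings, dim):
    """Mean of the word vectors of `tokens`; words without a pre-trained
    embedding contribute a zero vector but still count in the average."""
    if not tokens:
        return [0.0] * dim
    total = [0.0] * dim
    for tok in tokens:
        vec = embeddings.get(tok)  # None for out-of-vocabulary words
        if vec is not None:
            for i, x in enumerate(vec):
                total[i] += x
    return [x / len(tokens) for x in total]


# Toy 3-dimensional table standing in for the 300-dimensional pre-trained vectors.
toy_embeddings = {
    "tax": [1.0, 0.0, 2.0],
    "cut": [3.0, 2.0, 0.0],
}
paragraph_vec = average_embedding(["tax", "cut", "zzz"], toy_embeddings, dim=3)
comment_vec = average_embedding(["tax"], toy_embeddings, dim=3)
```

The resulting paragraph and comment vectors would then be passed to the recurrent layer; that part is omitted here because it depends on the deep learning library used.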
|The Guardian||New York Times|
In addition to the neural network models, we experimented with various traditional machine learning models: Naive Bayes (NB), Decision Tree (DT), Random Forest (RF), K-Nearest Neighbors (K-NN), RBF Support Vector Machine (R-SVM), Logistic Regression (LR) and AdaBoost. We extracted different features for these models, which can be grouped into three different categories.
LIWC Features: A total of 63 psycholinguistic features were extracted using the LIWC tool [8].
Others: Uni-gram, bi-gram, tri-gram features for paragraphs and comments.
After generating the feature matrix, its dimensionality was reduced using Latent Semantic Analysis (LSA) before feeding it into the traditional ML classifiers.
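As an illustration of the uni-gram, bi-gram and tri-gram features in the ‘Others’ category, a minimal counting sketch is given below. The lowercasing and whitespace tokenisation, and the function name `ngram_counts`, are assumptions; the original preprocessing is not specified.

```python
from collections import Counter


def ngram_counts(text, max_n=3):
    """Count word uni-, bi- and tri-grams after lowercasing and
    whitespace tokenisation."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


features = ngram_counts("The tax cut helps the economy")
```

Counts of this kind, computed for paragraphs and comments, would form the columns of the feature matrix before dimensionality reduction.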
After feature extraction on the annotated datasets, the ML classifiers were evaluated using 10-fold cross-validation. The deep learning models were trained for 5 epochs in each fold of the 10-fold cross-validation. Results are reported in terms of macro-, micro- and weighted-averaged precision and recall for the ‘The Guardian’ and ‘New York Times’ datasets. (For the ML classifiers, we computed precision and recall for different combinations of (i) POS Tag and Dependency, (ii) LIWC and (iii) Others features, but due to space constraints only the best results are shown.) Table 2 shows that the LSTM and GRU models outperform the ML classifiers on all metrics, and that the GRU model performs best. Figure 1(b) shows a snapshot of our system, where the top k (here k=3) relevant comments are highlighted when the cursor is placed around the second paragraph of a particular story.
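The three averaging schemes used for reporting can be made explicit with a small sketch. The function name `averaged_precision_recall` and the toy labels are assumptions for illustration; they are not the evaluation code or data behind Table 2.

```python
from collections import Counter


def averaged_precision_recall(y_true, y_pred):
    """Return macro-, micro- and weighted-averaged (precision, recall) pairs."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def ratio(num, den):
        return num / den if den else 0.0

    prec = {c: ratio(tp[c], tp[c] + fp[c]) for c in labels}
    rec = {c: ratio(tp[c], tp[c] + fn[c]) for c in labels}
    support = Counter(y_true)  # number of true instances per class
    n = len(y_true)
    total_tp = sum(tp.values())
    return {
        # Macro: unweighted mean over classes.
        "macro": (sum(prec.values()) / len(labels), sum(rec.values()) / len(labels)),
        # Micro: pool the counts over all classes first.
        "micro": (ratio(total_tp, total_tp + sum(fp.values())),
                  ratio(total_tp, total_tp + sum(fn.values()))),
        # Weighted: mean over classes weighted by class support.
        "weighted": (sum(prec[c] * support[c] for c in labels) / n,
                     sum(rec[c] * support[c] for c in labels) / n),
    }


# Toy gold and predicted relevance labels; illustrative only.
gold = [1, 1, 2, 2, 2, 3]
pred = [1, 2, 2, 2, 3, 3]
scores = averaged_precision_recall(gold, pred)
```

Macro averaging treats all five relevance classes equally, whereas micro and weighted averaging are dominated by the frequent classes, which is why reporting all three is informative for imbalanced labels.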
To check the effectiveness of our system, we showed 20 volunteers the same Guardian news stories on the original website and through our system. At the end, the volunteers were asked to rate which interface was better for commenting on the articles. 17 out of 20 volunteers gave a higher rating to our system’s interface, and the main reason they cited was the ability to see old comments and post new comments against different portions of the articles.
4 Related Works
Here, we briefly survey the prior works on commenting in online news media.
Comment Ranking: Hsu et al. [9] developed a regression model for identifying and ranking comments within a Social Web community based on the community’s expressed preferences. Dalal et al. [10] built a Hodge-decomposition-based rank aggregation technique to rank online comments on the social web.
Comment Recommendation: Bansal et al. [11] proposed ‘Collaborative Correspondence Topic Models’ to recommend comment-worthy blogs or news stories to a particular user (i.e., stories on which she would be interested in leaving comments), where the user’s feature profile is generated by content analysis. Shmueli et al. [12] combined a content-based approach with a collaborative-filtering approach (utilizing users’ co-commenting patterns) for personalized recommendation of stories to users for discussion through comments. Agarwal et al. [13] focused on ranking the comments in an article based on personalized user preferences.
Comment Analysis: Liu [14] ranked interest-based news sections and articles by using a passage retrieval algorithm. Stroud et al. [15] analyzed the demographics, attitudes and behaviors of the user population that comments on different sections. A similar analysis has also been done by Chakraborty et al. [16, 17] for social media posts. Mullick et al. [18, 19] classified online comments into opinion and fact and their respective subcategories. Mullick et al. [20] developed an opinion-detection algorithm for news articles. Almgren et al. [21] compared commenting, sharing and tweeting, and measured user participation in them. Chakraborty et al. [22, 23] utilized these different popularity signals for online news recommendations. Mullick et al. [24] studied topic drift events and their characteristics in online comments.
Our present work is complementary to these earlier works: our focus is to explore paragraph-oriented commenting patterns and to build a model that shows the comments relevant to a paragraph, in order to facilitate more commenting.
To play the role of the Public Sphere, online news websites need to encourage readers to comment on their articles. In this paper, we argued for a revamp of the traditional commenting interface, and for enabling commenting on selected sections of an article. We developed a deep neural network approach to link comments to particular sections. We showed that the Gated Recurrent Unit (GRU) model provides the best results in terms of macro- and micro-level precision and recall. Then, we built a basic user interface to increase user engagement in online comment sections. A few issues remain to be resolved in our framework: for example, how to handle a comment that belongs to multiple paragraphs, how a viewer can select two non-consecutive paragraphs to read the respective comments, and how to show scores for comments. Our immediate next step is to resolve these issues in the model and develop an end-to-end system that shows a user the top K relevant comments (further divided by the sentiment expressed in the comments) while scrolling down the paragraphs. We believe such data-driven selective commenting systems can bring more specific and targeted reader engagement for online publishing houses.
- [1] Park, D., Sachar, S., Diakopoulos, N., Elmqvist, N.: Supporting comment moderators in identifying high quality online news comments. In: ACM CHI. (2016)
- [2] Habermas, J.: Moral consciousness and communicative action. MIT Press (1990)
- [3] Ruiz, C., Domingo, D., Micó, J.L., Díaz-Noci, J., Meso, K., Masip, P.: Public sphere 2.0? The democratic qualities of citizen debates in online newspapers. Intl. Journ. of Press/Politics 16(4) (2011)
- [4] Nielsen, J.: Usability 101: Introduction to usability (2003)
- [5] Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. Journal of AI Research 16 (2002)
- [6] Manning, C., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S., McClosky, D.: The Stanford CoreNLP natural language processing toolkit. In: ACL: Demo. (2014)
- [7] De Marneffe, M.C., Manning, C.D.: Stanford typed dependencies manual. Technical report, Stanford University (2008)
- [8] Tausczik, Y.R., Pennebaker, J.W.: The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology 29(1) (2010)
- [9] Hsu, C.F., Khabiri, E., Caverlee, J.: Ranking comments on the social web. In: IEEE Computational Science and Engineering. Volume 4. (2009)
- [10] Dalal, O., Sengemedu, S.H., Sanyal, S.: Multi-objective ranking of comments on web. In: ACM WWW. (2012)
- [11] Bansal, T., Das, M., Bhattacharyya, C.: Content driven user profiling for comment-worthy recommendations of news and blog articles. In: ACM RecSys. (2015)
- [12] Shmueli, E., Kagian, A., Koren, Y., Lempel, R.: Care to comment?: recommendations for commenting on news stories. In: ACM WWW. (2012)
- [13] Agarwal, D., Chen, B.C., Pang, B.: Personalized recommendation of user comments via factor models. In: ACL EMNLP. (2011)
- [14] Liu, X.: Comment centric news analysis for ranking. Proceedings of the American Society for Information Science and Technology 46(1) (2009)
- [15] Stroud, N.J., Van Duyn, E., Peacock, C.: News commenters and news comment readers. Engaging News Project (2016)
- [16] Chakraborty, A., Sarkar, R., Mrigen, A., Ganguly, N.: Tabloids in the era of social media? Understanding the production and consumption of clickbaits in Twitter. Proc. ACM Hum.-Comput. Interact. 1(CSCW) (2017)
- [17] Chakraborty, A., Messias, J., Benevenuto, F., Ghosh, S., Ganguly, N., Gummadi, K.P.: Who makes trends? Understanding demographic biases in crowdsourced recommendations. In: AAAI ICWSM. (2017)
- [18] Mullick, A., Maheshwari, S., Goyal, P., Ganguly, N., et al.: A generic opinion-fact classifier with application in understanding opinionatedness in various news section. In: WWW Companion. (2017)
- [19] Mullick, A., Ghosh D, S., Maheswari, S., Sahoo, S., Maity, S.K., Goyal, P., et al.: Identifying opinion and fact subcategories from the social web. In: ACM GROUP. (2018)
- [20] Mullick, A., Goyal, P., Ganguly, N.: A graphical framework to detect and categorize diverse opinions from online news. In: PEOPLES. (2016)
- [21] Almgren, S.M., Olsson, T.: Commenting, sharing and tweeting news. Nordicom Review 37(2) (2016) 67–81
- [22] Chakraborty, A., Ghosh, S., Ganguly, N., Gummadi, K.P.: Optimizing the recency-relevancy trade-off in online news recommendations. In: WWW. (2017)
- [23] Chakraborty, A., Patro, G.K., Ganguly, N., Gummadi, K.P., Loiseau, P.: Equality of voice: Towards fair representation in crowdsourced top-k recommendations. In: ACM FAT*. (2019)
- [24] Mullick, A., Bhandari, A., Niranjan, A., Sckhar, N., Garg, S., Bubna, R., Roy, M.: Drift in online social media. (11 2018) 302–307