
Topic Detection and Summarization of User Reviews

A massive number of reviews is generated daily on various platforms. It is impossible for people to read through all of them to obtain useful information. Automatically summarizing customer reviews is thus important for identifying and extracting the essential information and helping users grasp the gist of the data. However, as customer reviews are typically short, informal, and multifaceted, it is extremely challenging to generate topic-wise summaries. While several studies aim to solve this issue, they are heuristic methods developed using customer reviews only. Unlike existing methods, we propose an effective new summarization method that analyzes both reviews and summaries. To do that, we first segment reviews and summaries into individual sentiments. As the sentiments are typically short, we combine sentiments talking about the same aspect into a single document and apply a topic modeling method to identify hidden topics among customer reviews and summaries. Sentiment analysis is employed to distinguish positive and negative opinions within each detected topic. A classifier is also introduced to distinguish the writing pattern of summaries from that of customer reviews. Finally, sentiments are selected to generate the summarization based on their topic relevance, sentiment analysis score, and writing pattern. To test our method, a new dataset comprising product reviews and summaries for 1028 products was collected from Amazon and CNET. Experimental results show the effectiveness of our method compared with other methods.






The number of customer reviews on various platforms is growing rapidly. It is impossible for people to read through all of them to obtain useful information. Automatically summarizing customer reviews is thus important for identifying and extracting the essential information and helping users get the gist of the data. With a summary of previous product reviews, customers can easily understand the features of products, and sellers can learn customers' actual needs from their feedback. Many studies have focused on single-document summarization and shown promising results for summarizing news articles [1, 2], emails [3, 4], product titles [5], etc. However, summarization methods for a single (general) document are not applicable to customer reviews, which are multiple documents written by various customers. Moreover, as customer reviews are typically short, informal texts containing information about multiple aspects of products, it is hard to find the hidden topics among customer reviews and summarize them.

Generally, summarization methods can be classified into two categories: abstractive and extractive [6, 7]. Abstractive summarization aims to generate a short text summary by paraphrasing the content of the original document. Several sentence compression methods have been proposed that compress original sentences to create a summary by using a syntactic parser or a word graph [8, 9, 10]. However, it remains a difficult task because the abstractive approach typically involves many sophisticated natural language techniques such as meaning representation, content organization, sentence compression, and paraphrasing. As such, the quality of the summary generated by an abstractive system is hard to control. Recently, many studies have used neural networks based on an encoder-decoder architecture for abstractive summarization [11, 12, 13]. Still, the quality of the generated summary is the major concern, especially when the approach is applied to customer reviews, which contain noise, ungrammatical text, and conflicting opinions.

Much more effort has been devoted to extractive summarization, which aims to select salient parts of the original document, such as sentence parts or whole sentences, as the summary. Topic models and clustering methods were introduced to find the documents that talk about similar content or topics. Statistical features such as the position of sentences, positive and negative words, and sentence length are used to select important sentences and words from the source text [14, 15]. The encoder-decoder architecture and attention mechanism have also been applied to extractive summarization recently [16]. However, all the above methods focus on summarization of news articles, emails, etc. Only a very limited number of studies focus on summarization of customer reviews [17, 18, 19, 20].

Zhan et al. (2009) proposed an extractive summarization method based on analysis of the internal topic structure of product reviews and tested it on reviews collected for 8 products [17]. Yu et al. (2016) selected important sentences by analyzing their popularity and specificity [18]. The methods proposed by Tan et al. (2017) and Amplayo and Song (2017) use topic modeling [19, 20]. While thousands of reviews are used in the above methods, they are heuristic methods due to the lack of ground-truth summaries: summaries for fewer than 10 products, or only the positive/negative rate, are used to test them. As such, summaries, a critical source of information, are missing from the above methods.

Another line of work relevant to ours is opinion extraction from customer reviews. Opinion extraction differs from general customer review summarization in that it focuses on summarizing selected sentences that are relevant only to a manually designed topic. Hu and Liu (2006) examined the review sentences and designed particular rules to detect product features in the source data and generate the summarization [21]. Ganesan et al. (2010) proposed a graph-based framework for generating summaries from review sentences collected by using 51 queries [22]. Hu et al. (2017) proposed a sentence importance metric based on content and sentiment similarities for selecting important sentences [23]. Like the summarization methods above, these methods are designed to learn opinions from customer reviews only rather than from both reviews and summaries.

Here we present a new topic modeling based summarization method with the following main contributions. Firstly, we created a new Amazon-Cnet dataset with a mapping between Amazon reviews and CNET summaries. Secondly, we provide a unified framework to segment reviews, cluster review sentiments into single documents, model the review topics, and generate the summarization. Lastly, the experimental results and evaluation provide evidence that the proposed method can be a useful tool for review summarization.

The rest of the article is organized as follows: Section 2 describes the complete framework of our method and its detailed steps; experimental settings, results, and performance evaluation are presented and discussed in Section 3, followed by the conclusion and future work in Section 4.


Our goal is to build a summary generator by analyzing both reviews and summaries. We first preprocess the raw text and segment reviews and summaries into individual sentiments, each of which contains information about only one aspect of the product. As the sentiments are typically short, we combine sentiments talking about the same topic or aspect into a single document and apply a topic modeling method to identify hidden topics among customer reviews and summaries. Next, we apply sentiment analysis to the sentiments that belong to the same topic to distinguish positive from negative ones. A classifier is also introduced to distinguish the writing pattern of summaries from that of customer reviews. Finally, sentiments are selected to generate the summarization based on their topic probability, sentiment analysis score, and writing pattern. The complete framework of our approach is shown in Fig. 1. The rest of this section provides details of each step.

Figure 1: Our framework for customer review summarization.

Text preprocessing

A customer review typically contains information about multiple aspects of a product. To obtain the information about each individual aspect, we first break a review into sentences, so that each customer review is converted into a set of sentences. Similarly, we split each summary into its individual sentences.

We note that customer reviews are written in informal and concise phrases. As such, the majority of sentences after parsing are very short (fewer than 8 words). It is hard to learn the topics and sentiments from such short documents directly. Notably, sentences that contain the same noun typically talk about the same aspect of the product. For example, 'The battery last one day long.' and 'It is pretty heavy due to the battery.' both talk about 'battery', which appears in both sentences. We thus combine sentences that contain the same noun to create a longer document for further processing. As a sentence may contain more than one noun, such a sentence will appear in multiple combined documents. Sentences that do not contain any noun are not included in any combined document.
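The grouping step can be illustrated with a minimal sketch. The text does not say which part-of-speech tagger is used, so the `extract_nouns` helper and its toy noun lexicon below are assumptions that stand in for a real tagger; only the grouping logic follows the description above.

```python
from collections import defaultdict

# Toy noun lexicon: an assumption standing in for a real part-of-speech tagger.
TOY_NOUNS = {"battery", "screen", "day"}

def extract_nouns(sentence, nouns=TOY_NOUNS):
    """Return the known nouns that appear in a sentence (hypothetical helper)."""
    tokens = [tok.strip(".,!?'\"").lower() for tok in sentence.split()]
    return {tok for tok in tokens if tok in nouns}

def combine_by_noun(sentences, noun_extractor=extract_nouns):
    """Map each noun to the concatenation of all sentences that contain it."""
    groups = defaultdict(list)
    for sentence in sentences:
        # A sentence with several nouns is added to several groups;
        # a sentence with no noun is added to none.
        for noun in noun_extractor(sentence):
            groups[noun].append(sentence)
    return {noun: " ".join(members) for noun, members in groups.items()}

sentences = [
    "The battery lasts all day.",
    "It is pretty heavy due to the battery.",
    "Love it!",
]
combined = combine_by_noun(sentences)
```

Here the first sentence lands in both the 'battery' and 'day' documents, while the last one, which has no noun in the toy lexicon, is dropped.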

Traditional preprocessing steps are also applied to each sentence and combined document. We first substitute all contractions and specific terms, such as 'e-mail' and 'sd-card', based on a manually built dictionary. For example, the word 'e-mail' is converted to 'email'. We also remove stop words [25] and strip standard suffixes using the Porter stemmer [24].
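A small sketch of the substitution and stop-word steps is given here. The dictionary entries (other than 'e-mail' to 'email') and the stop-word list are illustrative assumptions, not the authors' resources, and the Porter stemming step is left out to keep the sketch free of external dependencies.

```python
import re

# Hypothetical substitution dictionary; only "e-mail" -> "email" comes from the text.
TERM_MAP = {"e-mail": "email", "sd-card": "sdcard", "don't": "do not"}

# Toy stop-word list (assumption); a published list such as those in [25] would be used in practice.
STOP_WORDS = {"the", "a", "is", "to", "my", "not", "do", "i"}

def preprocess(text):
    """Lowercase, apply dictionary substitutions, tokenize, and drop stop words."""
    text = text.lower()
    for term, replacement in TERM_MAP.items():
        text = text.replace(term, replacement)
    tokens = re.findall(r"[a-z0-9]+", text)
    return [tok for tok in tokens if tok not in STOP_WORDS]

tokens = preprocess("I don't use the SD-card to store my e-mail.")
```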

Figure 2: Summarization examples generated by using our method.

Topic modeling on reviews and summaries

As mentioned before, customer reviews typically contain information about multiple aspects of products. Moreover, because user experiences differ, the aspects reviewed by different customers can also vary. Thus, identifying the hidden topics among customer reviews is important for generating the summarization. Summaries, in contrast, are typically written by people with expertise and focus only on the product itself, so the topics covered by customer reviews are quite different from those covered by summaries. For example, the shipping experience, which is an important topic among customer reviews, is typically not mentioned in CNET summaries. We thus identify the hidden topics among both customer reviews and summaries.

In this work, we use latent Dirichlet allocation (LDA), a generative statistical model that has been widely used for topic modeling, to identify hidden topics among the combined documents of customer reviews and of summaries. To do that, we use the implementation from scikit-learn [26]. The model parameters are learnt iteratively for different numbers of topics, ranging from 5 to 40, and the log-likelihood and perplexity are calculated for each number of topics. To determine the optimal number of topics, we identify the value that maximizes the log-likelihood and minimizes the perplexity. Two models, LDAreview and LDAsummary, are trained on Amazon customer reviews and CNET summaries, respectively.
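The model selection loop can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions: the toy documents, the default `CountVectorizer` settings, the candidate topic counts, and the function name `select_num_topics` are illustrative, and the scores are computed on the training documents because the text does not mention a held-out set.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

def select_num_topics(documents, candidate_ks, random_state=0):
    """Fit one LDA model per candidate topic count and return the one with the lowest perplexity."""
    counts = CountVectorizer().fit_transform(documents)
    results = []
    for k in candidate_ks:
        lda = LatentDirichletAllocation(n_components=k, random_state=random_state)
        lda.fit(counts)
        results.append({
            "k": k,
            "log_likelihood": lda.score(counts),   # approximate log-likelihood
            "perplexity": lda.perplexity(counts),  # lower is better
            "model": lda,
        })
    best = min(results, key=lambda r: r["perplexity"])
    return best, results

# Toy combined documents (illustrative only); the paper scans 5 to 40 topics.
documents = [
    "battery life battery charge",
    "screen resolution screen display",
    "battery charge fast",
    "display screen bright",
    "shipping package arrived late",
    "package shipping delay",
]
best, results = select_num_topics(documents, candidate_ks=[2, 3])
```

In scikit-learn, perplexity is derived from the same likelihood bound that `score` returns, so on a fixed set of documents the two criteria should point to the same number of topics.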

To identify the review sentences that talk about the same topic, we predict the topic label of each review sentence using LDAreview and LDAsummary, respectively. We note that review sentences after the parsing step may be too short to be classified. Therefore, a sentence that obtains a zero prediction for all topics is discarded. Review sentences are then grouped into topic sets along with their probability scores, which indicate how likely each sentence is to belong to a particular topic.

Topic understanding and sentiment analysis

While we have split review sentences into topic sets, many sentences that belong to the same topic may express conflicting opinions. For example, the sentences 'The screen has great resolution.' and 'I hope I bought larger screen' could be assigned the same topic label even though they express opposite opinions. To identify the hidden opinions within each topic, we apply sentiment analysis to the sentences belonging to the same set.

For sentiment analysis, we employ VADER (Valence Aware Dictionary for sEntiment Reasoning), a rule-based model that uses lexical features and rules embodying grammatical and syntactical conventions [27]. As VADER is built upon the analysis of social media text snippets collected from Twitter, which have writing patterns similar to those of review data, we believe it is well suited to review sentiment analysis.

Thus, given a review sentence, we obtain a positive sentiment score and a negative sentiment score by using VADER. The opinion of the sentence is then determined by the label of the larger of the two scores.

Summary generation

Customers typically talk about multiple aspects of the product in their reviews. To generate the final summary, we first identify the most salient topics covered by customer reviews. To do that, we count the number of sentences in each topic set and pick the topics with the most sentences as the most salient ones. In our work, the number of salient topics is set to 5.

As mentioned, conflicting opinions can appear in the same topic. To represent the overall opinion of a topic, we select the most popular attitude in each of the most salient topics. For example, when the number of sentences labeled as positive is higher than the number labeled as negative, we consider the topic positive, and the opinion score of each sentence is taken to be its positive sentiment score obtained from sentiment analysis. Otherwise, the topic is negative, and the opinion score of each sentence is taken to be its negative sentiment score.
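The majority-vote rule can be sketched as below. Two points are assumptions rather than statements from the text: ties between a sentence's positive and negative scores are counted as positive, and the per-sentence opinion score is read as the positive score for a positive topic and the negative score otherwise. The numeric scores are invented for illustration and are not actual VADER outputs.

```python
def sentence_opinion(pos, neg):
    """Label a sentence by its larger score (a tie counts as positive, an assumption)."""
    return "positive" if pos >= neg else "negative"

def topic_opinion_scores(scored_sentences):
    """Take (sentence, pos, neg) triples; return the topic label and per-sentence opinion scores."""
    labels = [sentence_opinion(pos, neg) for _, pos, neg in scored_sentences]
    n_pos = labels.count("positive")
    n_neg = labels.count("negative")
    # The topic is positive only when positive sentences strictly outnumber negative ones.
    topic_label = "positive" if n_pos > n_neg else "negative"
    scores = {
        sentence: (pos if topic_label == "positive" else neg)
        for sentence, pos, neg in scored_sentences
    }
    return topic_label, scores

screen_topic = [
    ("The screen has great resolution.", 0.5, 0.0),
    ("I hope I bought larger screen", 0.1, 0.3),
    ("Bright display.", 0.4, 0.0),
]
label, opinion_scores = topic_opinion_scores(screen_topic)
```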

Notably, summaries are typically written in a different style from customer reviews. Therefore, the writing style is one of the most important factors for ranking sentences. Unlike existing work that generates the summary based only on the analysis of reviews, we build a classifier to distinguish the writing pattern of summaries from that of customer reviews. To do that, we create a set of summary sentences and a set of review sentences. We note that the two sets are highly imbalanced. We therefore use meta learning [28], where the majority class is split into multiple subsets, each of similar size to the minority class, and each subset is used to train a base classifier. The final classifier is built upon the decisions made by all base classifiers. By using this classifier, we obtain a summary likelihood for each sentence.
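The partition-and-combine scheme can be sketched in a few lines. The base learner `toy_train_fn` and the averaging rule in `summary_likelihood` are assumptions made for illustration; the text does not state which base classifier is used or how the base decisions are combined.

```python
import random

def split_majority(majority, minority_size, seed=0):
    """Shuffle the majority class and cut it into chunks of roughly the minority-class size."""
    items = list(majority)
    random.Random(seed).shuffle(items)
    n_chunks = max(1, len(items) // minority_size)
    return [items[i::n_chunks] for i in range(n_chunks)]

def train_ensemble(majority, minority, train_fn, seed=0):
    """Train one base classifier per majority chunk, each paired with the full minority class."""
    chunks = split_majority(majority, len(minority), seed)
    return [train_fn(chunk, minority) for chunk in chunks]

def summary_likelihood(classifiers, sentence):
    """Average the base classifiers' scores (averaging is an assumed combination rule)."""
    return sum(clf(sentence) for clf in classifiers) / len(classifiers)

def toy_train_fn(review_chunk, summary_sentences):
    """Hypothetical base learner: share of a sentence's words seen in summaries but not in this review chunk."""
    review_vocab = {w for s in review_chunk for w in s.lower().split()}
    summary_vocab = {w for s in summary_sentences for w in s.lower().split()} - review_vocab

    def classify(sentence):
        words = sentence.lower().split()
        return sum(w in summary_vocab for w in words) / max(1, len(words))

    return classify

reviews = [f"review sentence number {i}" for i in range(10)]
summaries = ["sleek design overall", "solid performance overall", "great value overall"]
classifiers = train_ensemble(reviews, summaries, toy_train_fn)
score = summary_likelihood(classifiers, "sleek design")
```

With 10 review sentences and 3 summary sentences, the majority class is cut into three chunks, so three base classifiers are trained and each one sees a roughly balanced training set.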

The summary of customer reviews is a set of the most important, informative, and representative sentences selected from the most salient topics. The following factors are critical in determining the importance of a sentence:

  • The probability that the sentence belongs to the current topic.

  • The opinion score obtained from sentiment analysis.

  • The summary likelihood.

To select the most important sentence within each topic sentence set, we calculate the importance score for each sentence as follows:

After ranking the sentences within each topic, sentences with the highest score within their corresponding topics are selected as the final summary.
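The selection procedure can be sketched as follows. The exact importance formula is not reproduced in this text, so the product of the three factors used in `importance` is purely an assumption chosen for illustration, as are the field names and the example values.

```python
from collections import defaultdict

def importance(candidate):
    """ASSUMPTION: combine the three factors by multiplication; the original formula may differ."""
    return candidate["topic_prob"] * candidate["opinion"] * candidate["summary_likelihood"]

def select_summary(candidates, top_k=5):
    """Pick the top_k topics with the most sentences, then the highest-scoring sentence in each."""
    by_topic = defaultdict(list)
    for cand in candidates:
        by_topic[cand["topic"]].append(cand)
    # Salient topics are those covering the most sentences.
    salient = sorted(by_topic, key=lambda t: len(by_topic[t]), reverse=True)[:top_k]
    return [max(by_topic[t], key=importance)["sentence"] for t in salient]

candidates = [
    {"sentence": "Battery lasts two days.", "topic": "battery",
     "topic_prob": 0.9, "opinion": 0.6, "summary_likelihood": 0.5},
    {"sentence": "Battery is ok.", "topic": "battery",
     "topic_prob": 0.8, "opinion": 0.2, "summary_likelihood": 0.4},
    {"sentence": "The screen is sharp.", "topic": "screen",
     "topic_prob": 0.7, "opinion": 0.7, "summary_likelihood": 0.9},
]
summary = select_summary(candidates, top_k=2)
```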

Experiments and results

Experimental settings

To evaluate our method, we compare the performance of our system with that of five state-of-the-art systems: TextRank [31], Opinosis [22], Biclique [32], ILPSumm [33], and ParaFuse_doc [12]. Automatic evaluation measures such as ROUGE and its modifications [30] are used to evaluate performance.
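For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is a simplified illustration that uses whitespace tokenization and no stemming; it is not the official ROUGE package [30], whose preprocessing options can give different numbers.

```python
from collections import Counter

def rouge_1(candidate, reference):
    """Return unigram-overlap recall, precision, and F1 between two texts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each word counts at most as often as it occurs in both texts.
    overlap = sum((cand & ref).values())
    recall = overlap / max(1, sum(ref.values()))
    precision = overlap / max(1, sum(cand.values()))
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"recall": recall, "precision": precision, "f1": f1}

scores = rouge_1("the battery lasts long", "the battery life is long")
```

In this example three of the five reference unigrams are matched, giving a recall of 0.6 and a precision of 0.75.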


As there is no relevant dataset available online, we built our own, the Amazon-Cnet dataset. To do that, we first randomly selected 2000 cell phone products associated with more than 10 customer reviews from the Cell Phones and Accessories category of the Amazon Review Dataset [29]. CNET is one of the most popular websites providing professional reviews of electronics such as cell phones. We thus manually crawled the summaries from CNET webpages as ground-truth summaries for those cell phones. In the end, 1028 products have a CNET review webpage and are included in the Amazon-Cnet dataset. Table 1 provides statistics of our dataset.

              Number    # of sentences
Summaries       1028              1385
Reviews        66129            362965
Table 1: Statistics of the Amazon-Cnet dataset used in our experiments. The summaries and customer reviews are for 1028 products.


Our method attained a ROUGE-1 score of 15.43% on the Amazon-Cnet dataset. We note that this score is still much lower than results reported on datasets like Opinosis, given that the Amazon-Cnet dataset is more challenging and closer to real-world usage. By looking at the dataset, we found that reviews typically talk about details of a product, such as the resolution of the screen or the loudness of the speakerphone. In contrast, summaries typically talk about high-level characteristics of the product, such as the smoothness of the mobile system. We also found that many topics within customer reviews are not of interest for summarization. For example, customer service, which is a hot topic among reviews, is not of interest for summarization. Unlike other methods that generate the summarization by analyzing the reviews only, our method also considers the topics among summaries and therefore outperforms all other methods. Figure 2 shows two summarization examples generated by our method.

This is our first, preliminary work on integrating summary information into customer review summarization. As such, there is still much room for improvement. We are going to collect more online review and summarization pairs to train a more comprehensive model. We also note that the user reviews of a product can span a wide time period, whereas the summary of a product is typically posted at a very early stage, when the product is released. For example, the CNET summary for the product 'ZTE Merit' was posted in Aug 2012, and the latest customer review was posted in Mar 2016. It is reasonable to believe that an opinion reported 4 years after the first product release may not be helpful for the summarization. As such, we are also going to investigate the time series issue among customer reviews when creating their summarization.


We present a new method for customer review summarization that utilizes both customer reviews and summaries. Unlike other methods that only consider customer reviews, we identify hidden topics among both customer reviews and summaries. Sentiment analysis is employed to distinguish positive and negative opinions within each detected topic. A classifier is also introduced to distinguish the writing pattern of summaries from that of customer reviews. Finally, sentiments are selected to generate the summarization based on their topic relevance, sentiment analysis score, and writing pattern. A new dataset comprising product reviews and summaries for 1028 products was collected from Amazon and CNET. Experimental results show the effectiveness of our method.

Several challenging issues remain as future work. We are going to collect more data to train a more comprehensive model for review summarization. We shall also investigate the time series among customer reviews for generating summarization.


  • [1] Baralis, E., Cagliero, L., Mahoto, N. and Fiori, A., 2013. GRAPHSUM: Discovering correlations among multiple terms for graph-based summarization. Information Sciences, 249, pp.96-109.
  • [2] Wei, Z. and Gao, W., 2015, August. Gibberish, assistant, or master?: Using tweets linking to news for extractive single-document summarization. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1003-1006).
  • [3] Carenini, G., Ng, R.T. and Zhou, X., 2008, June. Summarizing emails with conversational cohesion and subjectivity. In Proceedings of ACL-08: HLT (pp. 353-361).
  • [4] Paulus, R., Xiong, C. and Socher, R., 2017. A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304.
  • [5] Sun, F., Jiang, P., Sun, H., Pei, C., Ou, W. and Wang, X., 2018, October. Multi-source pointer network for product title summarization. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (pp. 7-16). ACM.
  • [6] Gambhir, M. and Gupta, V., 2017. Recent automatic text summarization techniques: a survey. Artificial Intelligence Review, 47(1), pp.1-66.
  • [7] Pecar, S., 2018, July. Towards opinion summarization of customer reviews. In Proceedings of ACL 2018, Student Research Workshop (pp. 1-8).
  • [8] Filippova, K., 2010, August. Multi-sentence compression: Finding shortest paths in word graphs. In Proceedings of the 23rd International Conference on Computational Linguistics (pp. 322-330). Association for Computational Linguistics.
  • [9] Genest, P.E. and Lapalme, G., 2010, November. Text Generation for Abstractive Summarization. In TAC.
  • [10] Khan, A., Salim, N. and Kumar, Y.J., 2015. A framework for multi-document abstractive summarization based on semantic role labelling. Applied Soft Computing, 30, pp.737-747.
  • [11] Rush, A.M., Chopra, S. and Weston, J., 2015. A neural attention model for abstractive sentence summarization. arXiv preprint arXiv:1509.00685.
  • [12] Nayeem, M.T., Fuad, T.A. and Chali, Y., 2018, August. Abstractive unsupervised multi-document summarization using paraphrastic sentence fusion. In Proceedings of the 27th International Conference on Computational Linguistics (pp. 1191-1204).
  • [13] Cao, Z., Li, W., Li, S. and Wei, F., 2018, July. Retrieve, rerank and rewrite: Soft template based neural summarization. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 152-161).
  • [14] Fattah, M.A. and Ren, F., 2009. GA, MR, FFNN, PNN and GMM based models for automatic text summarization. Computer Speech & Language, 23(1), pp.126-144.
  • [15] Abuobieda, A., Salim, N., Albaham, A.T., Osman, A.H. and Kumar, Y.J., 2012, March. Text summarization features selection method using pseudo genetic-based model. In 2012 International Conference on Information Retrieval & Knowledge Management (pp. 193-197). IEEE.
  • [16] Nallapati, R., Zhai, F. and Zhou, B., 2017, February. Summarunner: A recurrent neural network based sequence model for extractive summarization of documents. In Thirty-First AAAI Conference on Artificial Intelligence.
  • [17] Zhan, J., Loh, H.T. and Liu, Y., 2009. Gather customer concerns from online product reviews–A text summarization approach. Expert Systems with Applications, 36(2), pp.2107-2115.
  • [18] Yu, N., Huang, M. and Shi, Y., 2016. Product review summarization by exploiting phrase properties. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers (pp. 1113-1124).
  • [19] Tan, J., Kotov, A., Pir Mohammadiani, R. and Huo, Y., 2017, November. Sentence retrieval with sentiment-specific topical anchoring for review summarization. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (pp. 2323-2326). ACM.
  • [20] Amplayo, R.K. and Song, M., 2017. An adaptable fine-grained sentiment analysis for summarization of multiple short online reviews. Data & Knowledge Engineering, 110, pp.54-67.
  • [21] Hu, M. and Liu, B., 2006, July. Opinion extraction and summarization on the web. In AAAI (Vol. 7, pp. 1621-1624).
  • [22] Ganesan, K., Zhai, C. and Han, J., 2010, August. Opinosis: A graph based approach to abstractive summarization of highly redundant opinions. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010) (pp. 340-348).
  • [23] Hu, Y.H., Chen, Y.L. and Chou, H.L., 2017. Opinion mining from online hotel reviews–a text summarization approach. Information Processing & Management, 53(2), pp.436-449.
  • [24] Porter, M.F., 1980. An algorithm for suffix stripping. Program, 14(3), pp.130-137.
  • [25] Nothman, J., Qin, H. and Yurchak, R., 2018, July. Stop Word Lists in Free Open-source Software Packages. In Proceedings of Workshop for NLP Open Source Software (NLP-OSS) (pp. 7-12).
  • [26] Hoffman, M., Bach, F.R. and Blei, D.M., 2010. Online learning for latent dirichlet allocation. In advances in neural information processing systems (pp. 856-864).
  • [27] Hutto, C.J. and Gilbert, E., 2014, May. Vader: A parsimonious rule-based model for sentiment analysis of social media text. In Eighth international AAAI conference on weblogs and social media.
  • [28] Chan, P.K. and Stolfo, S.J., 1998, August. Toward Scalable Learning with Non-Uniform Class and Cost Distributions: A Case Study in Credit Card Fraud Detection. In KDD (Vol. 1998, pp. 164-168).
  • [29] He, R. and McAuley, J., 2016, April. Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. In proceedings of the 25th international conference on world wide web (pp. 507-517). International World Wide Web Conferences Steering Committee.
  • [30] Lin, C.Y., 2004. Rouge: A package for automatic evaluation of summaries. In Text summarization branches out (pp. 74-81).
  • [31] Mihalcea, R. and Tarau, P., 2004. Textrank: Bringing order into text. In Proceedings of the 2004 conference on empirical methods in natural language processing (pp. 404-411).
  • [32] Muhammad, A.S., Damaschke, P. and Mogren, O., 2016. Summarizing online user reviews using bicliques. In International Conference on Current Trends in Theory and Practice of Informatics (pp. 569-579).
  • [33] Banerjee, S., Mitra, P. and Sugiyama, K., 2015. Multi-document abstractive summarization using ilp based multi-sentence compression. In Twenty-Fourth International Joint Conference on Artificial Intelligence.