Exploiting Deep Learning for Persian Sentiment Analysis

08/15/2018 ∙ by Kia Dashtipour, et al. ∙ University of Stirling

The rise of social media is enabling people to freely express their opinions about products and services. The aim of sentiment analysis is to automatically determine a subject's sentiment (e.g., positive, negative, or neutral) towards a particular aspect, such as a topic, product, movie, or news item. Deep learning has recently emerged as a powerful machine learning technique to meet the growing demand for accurate sentiment analysis. However, limited work has been conducted on applying deep learning algorithms to languages other than English, such as Persian. In this work, two deep learning models (deep autoencoders and deep convolutional neural networks (CNNs)) are developed and applied to a novel Persian movie reviews dataset. The proposed deep learning models are analyzed and compared with a state-of-the-art shallow multilayer perceptron (MLP) based machine learning model. Simulation results demonstrate the enhanced performance of deep learning over the state-of-the-art MLP.




1 Introduction

In recent years, social media, forums, blogs and other forms of online communication tools have radically affected everyday life, especially how people express their opinions and comments. The extraction of useful information (such as people's opinions about a company's brand) from this huge amount of unstructured data is vital for most companies and organizations [5]. Product reviews are important for business owners, who can make business decisions accordingly once users' opinions towards products and services are automatically classified. The application of sentiment analysis is not limited to product or movie reviews; it can be applied to different fields such as news, politics and sport. For example, in online political debates, sentiment analysis can be used to identify people's opinions on a certain election candidate or political party [27] [19] [20]. In this context, sentiment analysis has been widely applied in different languages using both traditional and advanced machine learning techniques. However, limited research has been conducted to develop models for the Persian language.

Sentiment analysis is a method to automatically process large amounts of data and classify text into positive or negative sentiments [2] [8]. Sentiment analysis can be performed at two levels: the document level or the sentence level. At the document level, it is used to classify the sentiment expressed in the whole document (positive or negative), whereas at the sentence level it identifies the sentiments expressed only in the sentence under analysis [7] [6].

In the literature, deep learning based automated feature extraction has been shown to outperform state-of-the-art manual feature engineering based classifiers such as Support Vector Machines (SVM), Naive Bayes (NB) and Multilayer Perceptrons (MLP). One important technique in deep learning is the autoencoder, which generally reduces the number of feature dimensions under consideration. The aim of dimensionality reduction is to obtain a set of principal variables that improves the performance of the approach. Similarly, CNNs have proven to be very effective in sentiment analysis. However, little work has been carried out to exploit deep learning based feature representation for Persian sentiment analysis [16] [10]. In this paper, we present two deep learning models (deep autoencoders and CNNs) for Persian sentiment analysis. The obtained deep learning results are compared with MLP.

The rest of the paper is organized as follows: Section 2 presents related work, Section 3 presents the methodology and experimental results, and Section 4 concludes the paper.

2 Related Works

In the literature, extensive research has been carried out to build sentiment analysis models using both shallow and deep learning algorithms. For example, the authors in [3] proposed a novel deep learning approach for polarity detection in product reviews, addressing two major limitations of stacked denoising autoencoders: high computational cost and the lack of scalability to high dimensional features. Their experimental results showed the effectiveness of the proposed autoencoders, achieving accuracy of up to 87%. Zhai and Zhang [28] proposed a five-layer autoencoder for learning specific representations of textual data. The autoencoders were generalised using a loss function, with a discriminative loss function derived from label information. The experimental results showed that the model outperformed bag-of-words, denoising autoencoders and other traditional methods, achieving an accuracy of up to 85%. Sun et al. [26] proposed a novel method to extract contextual information from text using a convolutional autoencoder architecture. The experimental results showed that the proposed model outperformed traditional SVM and Naive Bayes models, reporting accuracies of 83.1%, 63.9% and 67.8% respectively.

Su et al. [24] proposed a neural generative autoencoder for learning bilingual word embeddings. The experimental results showed the effectiveness of their approach on English-Chinese, English-German, English-French and English-Spanish tasks (75.36% accuracy). Kim [14] proposed a method to capture the non-linear structure of data using a CNN classifier. The experimental results showed the effectiveness of the method on a multi-domain dataset (movie reviews and product reviews). However, a disadvantage is that only SVM and Naive Bayes classifiers were used to evaluate the performance of the method, and deep learning classifiers were not exploited. Zhang and Komachi [29] proposed an approach using deep learning classifiers to detect polarity in Japanese movie reviews. The approach used a denoising autoencoder and was adapted to other domains such as product reviews. An advantage of the approach is that it does not depend on any particular language and can be applied to various languages using different datasets. AP et al. [1] proposed a CNN based model for cross-language learning of vectorial word representations that are coherent between two languages. The method was evaluated using English and German movie reviews datasets. The experimental results showed that the CNN (83.45% accuracy) outperformed the SVM (65.25% accuracy).

Zhou et al. [30] proposed an autoencoder architecture consisting of an LSTM encoder and decoder in order to capture features in the text and reduce the dimensionality of the data. The LSTM encoder used an interactive scheme to go through the sequence of sentences, and the LSTM decoder reconstructed the sentence vectors. The model was evaluated on different datasets, such as book reviews, DVD reviews and music reviews, achieving accuracies of up to 81.05%, 81.06% and 79.40% respectively. Mesnil et al. [17] proposed an ensemble classification approach to detect polarity in movie reviews. The authors combined several machine learning algorithms, such as SVM, Naive Bayes and RNN, to achieve better results, where autoencoders were used to reduce the dimensionality of the features. The experimental results showed that the combination of unigram, bigram and trigram features (91.87% accuracy) outperformed unigrams (91.56% accuracy) and bigrams (88.61% accuracy).

Scheible and Schütze [22] trained a semi-supervised recursive autoencoder to detect polarity in a movie reviews dataset consisting of 5000 positive and 5000 negative sentiments. The experimental results demonstrated that the proposed approach successfully detected polarity in the movie reviews dataset (83.13% accuracy) and outperformed a standard SVM (68.36% accuracy) model. Dai and Le [4] developed an autoencoder to detect polarity in text using a deep learning classifier; an LSTM was trained on the IMDB movie reviews dataset. The experimental results showed that their proposed approach outperformed SVM. Some of these autoencoder-based approaches are summarized in Table 1.

3 Methodology and Experimental Results

The novel dataset used in this work was collected manually and includes Persian movie reviews from 2014 to 2016. A subset of the dataset (60%) was used to train the neural networks, and the rest (40%) was used to test (30%) and validate (10%) the performance of the trained networks. There are two types of labels in the dataset: positive and negative. The reviews were manually annotated by three native Persian speakers aged between 30 and 50 years.
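The split described above can be sketched as follows. This is an illustrative helper, not the authors' code; it assumes a simple shuffled partition into 60% training, 30% testing and 10% validation:

```python
import random

def split_dataset(reviews, seed=42):
    """Split labelled reviews into 60% train, 30% test, 10% validation."""
    reviews = list(reviews)
    random.Random(seed).shuffle(reviews)   # shuffle reproducibly
    n = len(reviews)
    n_train = int(0.6 * n)
    n_test = int(0.3 * n)
    train = reviews[:n_train]
    test = reviews[n_train:n_train + n_test]
    val = reviews[n_train + n_test:]       # remaining 10%
    return train, test, val

# Hypothetical labelled data standing in for the Persian reviews.
data = [(f"review {i}", "positive" if i % 2 else "negative") for i in range(100)]
train, test, val = split_dataset(data)
# len(train), len(test), len(val) -> 60, 30, 10
```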

After data collection, the corpus was pre-processed using tokenisation, normalisation and stemming techniques. Tokenisation is the process of converting a sentence into single words or tokens; for example, "The movie is great" is changed to "The", "movie", "is", "great" [25]. Some words contain numbers or repeated characters; for example, "great" may be written as "gr8" or "good" as "gooood". Normalisation is used to convert such words into their normal forms [21]. Stemming is the process of converting words into their roots; for example, "going" is changed to "go" [15]. Finally, words were converted into vectors: fastText was used to convert each word into a 300-dimensional vector. fastText is a library for text classification and representation [13] [18] [10].
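The three pre-processing steps can be sketched as follows. This is a toy illustration only: the normalisation map and the suffix-stripping stemmer are hypothetical stand-ins, since a real system would use a Persian-aware tokeniser, normaliser and stemmer, plus fastText for the 300-dimensional word vectors:

```python
import re

# Toy normalisation map for number-laden spellings (illustrative only).
NORMALISATION = {"gr8": "great"}

def normalise(token):
    token = NORMALISATION.get(token, token)
    # Collapse runs of 3+ repeated characters, e.g. "gooood" -> "good".
    return re.sub(r"(.)\1{2,}", r"\1\1", token)

def stem(token):
    # Crude suffix stripper standing in for a real Persian stemmer.
    return token[:-3] if token.endswith("ing") else token

def preprocess(sentence):
    tokens = sentence.lower().split()          # tokenisation
    tokens = [normalise(t) for t in tokens]    # normalisation
    return [stem(t) for t in tokens]           # stemming

preprocess("The movie is gr8 and going gooood")
# -> ['the', 'movie', 'is', 'great', 'and', 'go', 'good']
```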

For classification, MLP, autoencoders and CNNs have been used. Fig. 1 depicts the modelled MLP architecture; the MLP classifier was trained for 100 iterations [9]. Fig. 2 depicts the modelled autoencoder architecture. An autoencoder is a feed-forward deep neural network trained with unsupervised learning and used for dimensionality reduction. It consists of input, output and hidden layers, and is used to compress the input into a latent space from which the output is then reconstructed

[23] [11] [12]. The exploited autoencoder model is depicted in Fig. 2. The autoencoder consists of one input layer, three hidden layers (1500, 512, 1500) and an output layer. The Convolutional Neural Network likewise contains three types of layers (input, hidden and output); the hidden part consists of convolutional layers, pooling layers, fully connected layers and a normalisation layer. The activation h_j of hidden neuron j, with bias b_j, is a weighted sum over the continuous visible nodes v, given by:

h_j = b_j + Σ_i v_i w_ij
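The stated layer widths can be checked with a minimal forward pass through the autoencoder. This is a sketch with random, untrained weights; the tanh activations and the initialisation scale are our assumptions, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer widths follow the text: 300-d fastText input, hidden layers of
# 1500, 512 and 1500 units, and a 300-d reconstruction at the output.
sizes = [300, 1500, 512, 1500, 300]
weights = [rng.standard_normal((m, n)) * 0.01 for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(v):
    """Each hidden neuron j computes h_j = tanh(b_j + sum_i v_i * w_ij)."""
    h = np.asarray(v)
    activations = []
    for W, b in zip(weights, biases):
        h = np.tanh(h @ W + b)    # weighted sum plus bias, squashed
        activations.append(h)
    return activations

acts = forward(rng.standard_normal(300))
code, reconstruction = acts[1], acts[-1]
# code is the compressed 512-d latent representation;
# reconstruction has the same 300-d shape as the input.
```

In the real model, training minimises the reconstruction error so that the 512-dimensional middle layer becomes a compact feature representation of the review.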


The modelled CNN architecture is depicted in Fig. 3 [12] [11]. For CNN modelling, each utterance was represented as a concatenation vector of its constituent words. The network has 11 layers in total: 4 convolutional layers, 4 max pooling layers and 3 fully connected layers. The convolutional layers have filters of size 2 with 15 feature maps each, and each convolutional layer is followed by a max pooling layer with window size 2. The last max pooling layer is followed by fully connected layers of sizes 5000, 500 and 4. For the final layer, softmax activation is used.
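Under the stated sizes, and assuming 'valid' convolutions with stride 1 and non-overlapping pooling (assumptions of ours; the paper does not specify padding or stride), the sequence length after the four conv/pool stages can be traced as:

```python
def conv_len(n, k=2):
    return n - k + 1          # 'valid' 1-D convolution, filter size 2

def pool_len(n, w=2):
    return n // w             # non-overlapping max pooling, window 2

def feature_length(n_words):
    """Sequence length after the 4 conv + 4 max-pool stages in the text."""
    n = n_words
    for _ in range(4):
        n = pool_len(conv_len(n))
    return n

feature_length(100)  # -> 5
```

With 15 feature maps, a 100-word review would thus yield 15 x 5 = 75 values to flatten into the fully connected layers under these assumptions.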

Figure 1: Multilayer Perceptron
Figure 2: Autoencoder
Figure 3: Deep Convolutional Neural Network

To evaluate the performance of the proposed approach, precision (1), recall (2), F-measure (3) and prediction accuracy (4) have been used as performance metrics:

Precision = TP / (TP + FP)    (1)
Recall = TP / (TP + FN)    (2)
F-measure = 2 · Precision · Recall / (Precision + Recall)    (3)
Accuracy = (TP + TN) / (TP + TN + FP + FN)    (4)

where TP denotes true positives, TN true negatives, FP false positives and FN false negatives. The experimental results are shown in Table 1, where it can be seen that the autoencoder outperformed the MLP, and the CNN outperformed the autoencoder with the highest achieved accuracy of 82.86%.
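The four metrics follow directly from the confusion counts; a minimal implementation:

```python
def metrics(tp, tn, fp, fn):
    """Precision, recall, F-measure and accuracy from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, f_measure, accuracy

# Illustrative counts (not from the paper's experiments).
p, r, f, a = metrics(tp=8, tn=8, fp=2, fn=2)
# -> each metric equals 0.8 for these balanced counts
```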

Model        Class     Precision  Recall  F-measure  Accuracy (%)
MLP          Negative  0.78       0.76    0.77
MLP          Positive  0.79       0.81    0.80
MLP          AVG       0.78       0.78    0.78       78.49
Autoencoder  Negative  0.78       0.81    0.79
Autoencoder  Positive  0.82       0.80    0.81
Autoencoder  AVG       0.80       0.80    0.80       80.08
CNN          Negative  0.90       0.78    0.83
CNN          Positive  0.77       0.89    0.82
CNN          AVG       0.84       0.83    0.83       82.86
Table 1: Results: MLP vs. Autoencoder vs. Convolutional Neural Network

4 Conclusion

Sentiment analysis has been used extensively for a wide range of real-world applications, ranging from product reviews and survey feedback to business intelligence and operational improvements. However, the majority of research efforts have been devoted to English only, although information of great importance is also available in other languages. In this work, we focused on developing sentiment analysis models for the Persian language, specifically for Persian movie reviews. Two deep learning models (deep autoencoders and deep CNNs) were developed and compared with a state-of-the-art shallow MLP based machine learning model. Simulation results revealed that our proposed CNN model outperformed both the autoencoders and the MLP. In future, we intend to exploit more advanced deep learning models, such as Long Short-Term Memory (LSTM) and LSTM-CNNs, to further evaluate the performance on our novel Persian dataset.

5 Acknowledgment

Amir Hussain and Ahsan Adeel were supported by the UK Engineering and Physical Sciences Research Council (EPSRC) grant No.EP/M026981/1.


  • [1] AP, S.C., Lauly, S., Larochelle, H., Khapra, M., Ravindran, B., Raykar, V.C., Saha, A.: An autoencoder approach to learning bilingual word representations. In: Advances in Neural Information Processing Systems. pp. 1853–1861 (2014)
  • [2] Cambria, E., Poria, S., Hazarika, D., Kwok, K.: Senticnet 5: discovering conceptual primitives for sentiment analysis by means of context embeddings. In: AAAI (2018)
  • [3] Chen, M., Xu, Z., Weinberger, K., Sha, F.: Marginalized denoising autoencoders for domain adaptation. arXiv preprint arXiv:1206.4683 (2012)
  • [4] Dai, A.M., Le, Q.V.: Semi-supervised sequence learning. In: Advances in Neural Information Processing Systems. pp. 3079–3087 (2015)
  • [5] Dashtipour, K., Gogate, M., Adeel, A., Algarafi, A., Durrani, T., Hussain, A.: Comparative study of persian sentiment analysis based on different feature combinations. In: Communications, Signal Processing, and Systems. CSPS 2017. Lecture Notes in Electrical Engineering, vol 463. pp. 2288–2294. Springer (2017)
  • [6] Dashtipour, K., Gogate, M., Adeel, A., Algarafi, A., Howard, N., Hussain, A.: Persian named entity recognition. In: Cognitive Informatics & Cognitive Computing (ICCI*CC), 2017 IEEE 16th International Conference on. pp. 79–83. IEEE (2017)
  • [7] Dashtipour, K., Hussain, A., Zhou, Q., Gelbukh, A., Hawalah, A.Y., Cambria, E.: Persent: a freely available persian sentiment lexicon. In: International Conference on Brain Inspired Cognitive Systems. pp. 310–320. Springer (2016)
  • [8] Dashtipour, K., Poria, S., Hussain, A., Cambria, E., Hawalah, A.Y., Gelbukh, A., Zhou, Q.: Multilingual sentiment analysis: state of the art and independent comparison of techniques. Cognitive computation 8(4), 757–771 (2016)
  • [9] Gardner, M.W., Dorling, S.: Artificial neural networks (the multilayer perceptron)—a review of applications in the atmospheric sciences. Atmospheric environment 32(14-15), 2627–2636 (1998)
  • [10] Gasparini, S., Campolo, M., Ieracitano, C., Mammone, N., Ferlazzo, E., Sueri, C., Tripodi, G.G., Aguglia, U., Morabito, F.C.: Information theoretic-based interpretation of a deep neural network approach in diagnosing psychogenic non-epileptic seizures. Entropy 20(2),  43 (2018)
  • [11] Gogate, M., Adeel, A., Hussain, A.: Deep learning driven multimodal fusion for automated deception detection. In: Computational Intelligence (SSCI), 2017 IEEE Symposium Series on. pp. 1–6. IEEE (2017)
  • [12] Gogate, M., Adeel, A., Hussain, A.: A novel brain-inspired compression-based optimised multimodal fusion for emotion recognition. In: Computational Intelligence (SSCI), 2017 IEEE Symposium Series on. pp. 1–7. IEEE (2017)
  • [13] Joulin, A., Grave, E., Bojanowski, P., Douze, M., Jégou, H., Mikolov, T.: Fasttext.zip: Compressing text classification models. arXiv preprint arXiv:1612.03651 (2016)
  • [14] Kim, Y.: Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882 (2014)
  • [15] Korenius, T., Laurikkala, J., Järvelin, K., Juhola, M.: Stemming and lemmatization in the clustering of finnish text documents. In: Proceedings of the thirteenth ACM international conference on Information and knowledge management. pp. 625–633. ACM (2004)
  • [16] LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. nature 521(7553),  436 (2015)
  • [17] Mesnil, G., Mikolov, T., Ranzato, M., Bengio, Y.: Ensemble of generative and discriminative techniques for sentiment analysis of movie reviews. arXiv preprint arXiv:1412.5335 (2014)
  • [18] Morabito, F.C., Campolo, M., Ieracitano, C., Ebadi, J.M., Bonanno, L., Bramanti, A., Desalvo, S., Mammone, N., Bramanti, P.: Deep convolutional neural networks for classification of mild cognitive impaired and alzheimer’s disease patients from scalp eeg recordings. In: Research and Technologies for Society and Industry Leveraging a better tomorrow (RTSI), 2016 IEEE 2nd International Forum on. pp. 1–6. IEEE (2016)
  • [19] Ren, J., Jiang, J.: Hierarchical modeling and adaptive clustering for real-time summarization of rush videos. IEEE Transactions on Multimedia 11(5), 906–917 (2009)
  • [20] Ren, J., Jiang, J., Feng, Y.: Activity-driven content adaptation for effective video summarization. Journal of Visual Communication and Image Representation 21(8), 930–938 (2010)
  • [21] Reynolds, D.A.: Comparison of background normalization methods for text-independent speaker verification. In: Fifth European Conference on Speech Communication and Technology (1997)
  • [22] Scheible, C., Schütze, H.: Cutting recursive autoencoder trees. arXiv preprint arXiv:1301.2811 (2013)
  • [23] Semeniuta, S., Severyn, A., Barth, E.: A hybrid convolutional variational autoencoder for text generation. arXiv preprint arXiv:1702.02390 (2017)
  • [24] Su, J., Wu, S., Zhang, B., Wu, C., Qin, Y., Xiong, D.: A neural generative autoencoder for bilingual word embeddings. Information Sciences 424, 287–300 (2018)
  • [25] Sumathy, K., Chidambaram, M.: Text mining: concepts, applications, tools and issues-an overview. International Journal of Computer Applications 80(4) (2013)
  • [26] Sun, X., Li, C., Ren, F.: Sentiment analysis for chinese microblog based on deep neural networks with convolutional extension features. Neurocomputing 210, 227–236 (2016)
  • [27] Tan, S.S., Na, J.C.: Mining semantic patterns for sentiment analysis of product reviews. In: International Conference on Theory and Practice of Digital Libraries. pp. 382–393. Springer (2017)
  • [28] Zhai, S., Zhang, Z.M.: Semisupervised autoencoder for sentiment analysis. In: AAAI. pp. 1394–1400 (2016)
  • [29] Zhang, P., Komachi, M.: Japanese sentiment classification with stacked denoising auto-encoder using distributed word representation. In: Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation. pp. 150–159 (2015)
  • [30] Zhou, H., Chen, L., Shi, F., Huang, D.: Learning bilingual sentiment word embeddings for cross-language sentiment classification. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). vol. 1, pp. 430–440 (2015)