Keywords Extraction and Sentiment Analysis using Automatic Speech Recognition

04/07/2020
by Rachit Shukla, et al.

Automatic Speech Recognition (ASR) is an interdisciplinary subfield of computational linguistics that develops methodologies and technologies enabling computers to recognize and translate spoken language into text. It incorporates knowledge and research from linguistics, computer science, and electrical engineering. Sentiment analysis is the contextual mining of text that identifies and extracts subjective information from source material, helping a business understand the social sentiment around its brand, product, or service while monitoring online conversations. Following the structure of speech, three models are used in speech recognition to perform the match: an acoustic model, a phonetic dictionary, and a language model. A speech recognition program is evaluated on two factors: accuracy (the percentage of errors in converting spoken words to digital data) and speed (the extent to which the program can keep up with a human speaker). For the purpose of converting speech to text (STT), we study the following open-source toolkits: CMU Sphinx and Kaldi. The toolkits use Mel-Frequency Cepstral Coefficients (MFCC) and i-vectors for feature extraction. CMU Sphinx is used with pre-trained Hidden Markov Models (HMM) and Gaussian Mixture Models (GMM) as acoustic models, while Kaldi is used with pre-trained neural networks (NNET). The n-gram language models contain the phonemes or pdf-ids used to generate the most probable hypothesis (transcription) in the form of a lattice. The speech data are stored as .raw or .wav files and transcribed into .txt files. The system then identifies opinions within the text and extracts the following attributes: polarity (whether the speaker expresses a positive or negative opinion) and keywords (the thing being talked about).
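The MFCC features mentioned above are computed on the mel scale, which spaces frequencies the way human hearing does. A minimal sketch of the standard hertz-to-mel conversion (the framing, filterbank, and DCT steps of a full MFCC pipeline are omitted here):

```python
import math

def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in hertz to the mel scale (O'Shaughnessy formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m: float) -> float:
    """Inverse mapping: mel value back to hertz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# MFCC pipelines place triangular filters at equal mel intervals,
# which packs them more densely at low frequencies where hearing
# is more discriminating.
print(hz_to_mel(1000.0))  # ~1000 mel: 1000 Hz is the scale's anchor point
```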
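The role of the n-gram language model in picking the most probable hypothesis can be illustrated with a toy bigram model; the corpus and hypotheses below are illustrative assumptions, and a real decoder would combine these scores with acoustic scores over a lattice:

```python
import math
from collections import Counter

# Toy training corpus (an illustrative assumption, not a real LM).
corpus = "the cat sat on the mat the cat ate".split()

# Maximum-likelihood bigram probabilities from raw counts.
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus[:-1])

def log_prob(hypothesis: list[str]) -> float:
    """Bigram log-probability of a word sequence; -inf for unseen bigrams."""
    total = 0.0
    for prev, word in zip(hypothesis, hypothesis[1:]):
        count = bigrams[(prev, word)]
        if count == 0:
            return float("-inf")
        total += math.log(count / unigrams[prev])
    return total

# Rescoring two competing ASR hypotheses: the LM prefers word
# sequences it has actually seen.
h1 = "the cat sat".split()
h2 = "the mat sat".split()
best = max([h1, h2], key=log_prob)
print(" ".join(best))  # -> "the cat sat"
```

Real toolkits use smoothed n-gram estimates (e.g. Kneser-Ney) rather than returning negative infinity for unseen bigrams, so that novel word sequences remain decodable.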
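The final polarity and keyword step can be sketched with a minimal lexicon-based approach; the word lists and scoring rule here are illustrative assumptions, not the system's actual method:

```python
from collections import Counter

# Illustrative sentiment lexicon and stopword list (assumptions for this sketch).
POSITIVE = {"good", "great", "excellent", "love", "fast"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "slow"}
STOPWORDS = {"the", "is", "was", "a", "an", "and", "but", "very", "i", "it"}

def analyze(transcript: str) -> dict:
    """Score polarity and pick keywords from an ASR transcript."""
    tokens = [t.strip(".,!?").lower() for t in transcript.split()]
    # Net polarity: +1 per positive word, -1 per negative word.
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)
    polarity = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    # Keywords: the most frequent content words (non-stopword, non-sentiment).
    content = [t for t in tokens if t and t not in STOPWORDS | POSITIVE | NEGATIVE]
    keywords = [word for word, _ in Counter(content).most_common(3)]
    return {"polarity": polarity, "keywords": keywords}

print(analyze("The battery is great and the camera is excellent"))
# -> {'polarity': 'positive', 'keywords': ['battery', 'camera']}
```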

