End-to-End Natural Language Understanding Pipeline for Bangla Conversational Agents

07/12/2021
by Fahim Shahriar Khan, et al.

Chatbots are intelligent software systems built to stand in for human interaction. However, existing studies typically provide little support for low-resource languages such as Bangla. Moreover, with the growing popularity of social media, native Bangla speakers increasingly interact in transliterated Bangla (mostly in English). In this paper, we propose a novel approach to building a Bangla chatbot, intended for use as a business assistant, that can communicate in Bangla and in Bangla transliterated in English with consistently high confidence. Since annotated data was not available for this purpose, we had to work through the whole machine learning life cycle (data preparation, machine learning modeling, and model deployment), using the Rasa Open Source Framework, fastText embeddings, Polyglot embeddings, Flask, and other systems as building blocks. Working with the skewed annotated dataset, we try out different setups and pipelines to evaluate which works best, and we offer possible reasons for the observed results. Finally, we present a pipeline for intent classification and entity extraction which achieves reasonable performance (accuracy: 83.02 recall: 83.02
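The abstract reports identical accuracy and recall on a skewed dataset. One possible explanation, which is an assumption here since the text above does not say how recall was averaged, is that recall is weighted by class support: for single-label classification, support-weighted recall always equals accuracy. The minimal sketch below illustrates this with made-up intent names and labels (none of them come from the paper) and contrasts it with macro-averaged recall, which treats rare intents equally.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted intent matches the true intent."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for each true class: correctly predicted / total of that class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / support[c] for c in support}


def weighted_recall(y_true, y_pred):
    """Per-class recall averaged with weights proportional to class support."""
    support = Counter(y_true)
    recalls = per_class_recall(y_true, y_pred)
    total = len(y_true)
    return sum((support[c] / total) * recalls[c] for c in support)


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall; every class counts the same."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Hypothetical, deliberately skewed intent labels (illustration only).
y_true = ["greet"] * 6 + ["order_status"] * 3 + ["refund"] * 1
y_pred = (["greet"] * 5 + ["refund"]       # one "greet" misclassified
          + ["order_status"] * 2 + ["greet"]  # one "order_status" misclassified
          + ["greet"])                        # the only "refund" misclassified

acc = accuracy(y_true, y_pred)
w_rec = weighted_recall(y_true, y_pred)
m_rec = macro_recall(y_true, y_pred)
print(f"accuracy={acc:.4f} weighted_recall={w_rec:.4f} macro_recall={m_rec:.4f}")
```

On skewed data the macro average can fall well below accuracy, because a rare intent that is always missed pulls it down as much as a frequent one would.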

