1. Background and Motivation
Understanding user intent for general Web search engines has been well studied over the last decades (Broder, 2002; Chen, 2009). For the E-Com domain, however, user intent understanding is still in its initial stages (Zhai, 2018). In recent years, with the drastic increase in online shopping and E-Com sales, there is high demand for effective user intent understanding models that optimize search results, which can directly impact business metrics such as gross demand. Furthermore, developing an effective conversational search system that can interact with users in a conversation-like manner has been a long-term goal in search engine design. Recently, with new developments in speech recognition technology (Stolcke, 2018), companies like Amazon have gone further and invested in developing their own socialbots that can hold a coherent, human-like, open-domain conversation with their customers (Agichtein, 2017, 2018). As a result, developing an accurate and efficient Natural Language Understanding (NLU) unit for these socialbots has been in the spotlight in recent years (Ram, 2018).
The new problems in user intent understanding raised by these applications open opportunities for researchers to develop new ideas in these fields. To help bridge these gaps, my Ph.D. dissertation focuses on user intent understanding for both conversational agents and E-Com search engines, and has been carried out in collaboration with Dr. Eugene Agichtein. My dissertation consists of two phases. The first phase focused on developing machine learning models for the NLU component of conversational agents. To this end, my main research objectives were as follows:
Incorporating contextual information into intent classification in an open-domain conversational agent.
Developing a method to utilize the knowledge available from human conversations for a human-machine conversation.
Implementing an entity-aware topic classification model for an open-domain conversational agent.
Developing an online satisfaction prediction mechanism in an open-domain conversational agent.
In the second phase of my dissertation, I plan to address specific issues related to user intent mining and classification for E-Com search engines. User intent understanding for search engines is similar to that for conversational bots. For search engines, however, much more data and user behavior information, such as click rates, is available than for conversational bots. The main research questions guiding my work are as follows:
How efficient is it to develop a joint learning model to simultaneously learn the search query’s product category and user intents?
How can we discover users’ hidden intents encoded in E-Com search logs?
2. Overview of Research Directions
In this section, I describe my research direction and the projects relevant to it. The first subsection gives an overview of the Amazon Alexa Prize competitions (https://developer.amazon.com/alexaprize) and my contributions to them. The second subsection discusses my primary research on user intent understanding for E-Com search engines.
2.1. Amazon Alexa Prize 2017 and 2018
For the Amazon Alexa Prize 2017 (Agichtein, 2017), our team developed EmersonBot, one of the first socialbots that can converse with general users in an open-domain manner. Our approach to this challenge was to treat the conversation as a search task in which the input consists of human utterances.
For the Alexa Prize 2018 (Agichtein, 2018), much like the Alexa Prize 2017, we developed an open-domain socialbot named IrisBot that can converse with real users in multiple domains. We improved IrisBot over EmersonBot in several respects, and I published several papers documenting this work. First, for user intent classification, I proposed CDAC (Contextual Dialogue Act Classifier), a simple yet effective deep learning approach for contextual dialogue act (broad intent) classification. Specifically, we used transfer learning to adapt models trained on human-human conversations to predict dialogue acts in human-machine dialogues (Agichtein, 2019b). Then, for topic classification, we introduced a Concurrent Conversational Entity-aware Topic classifier (ConCET), which incorporates entity-type information together with the utterance content features. Specifically, ConCET utilizes entity information to enrich the utterance representation, combining character, word, and entity-type embeddings into a single representation (Agichtein, 2019a). Finally, we proposed ConvSAT, a conversational satisfaction prediction model specifically designed for open-domain conversational agents. To operate robustly across domains, ConvSAT aggregates multiple representations of the conversation, namely the conversation history, utterance and response content, and system- and user-oriented behavioral signals (Agichtein, 2019c).
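As a rough illustration of what combining character, word, and entity-type embeddings into a single representation can look like, the sketch below mean-pools each embedding type and concatenates the results. This is not the published ConCET architecture: the pooling and concatenation choices, the function names, and the toy embedding tables are all assumptions made for illustration.

```python
from typing import Dict, List

Vector = List[float]


def mean_pool(vectors: List[Vector], dim: int) -> Vector:
    """Average a list of vectors; return a zero vector when the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def utterance_representation(
    tokens: List[str],
    entity_types: List[str],
    word_emb: Dict[str, Vector],
    char_emb: Dict[str, Vector],
    type_emb: Dict[str, Vector],
    dim: int,
) -> Vector:
    """Concatenate pooled word, character, and entity-type embeddings.

    Assumption: simple mean pooling plus concatenation stands in for
    whatever encoders a real model would use.
    """
    word_vec = mean_pool([word_emb[t] for t in tokens if t in word_emb], dim)
    char_vec = mean_pool(
        [char_emb[c] for t in tokens for c in t if c in char_emb], dim
    )
    type_vec = mean_pool([type_emb[e] for e in entity_types if e in type_emb], dim)
    return word_vec + char_vec + type_vec  # list concatenation, length 3 * dim


# Hypothetical toy embedding tables with made-up values (dimension 2).
WORD_EMB = {"play": [1.0, 0.0], "adele": [0.0, 1.0]}
CHAR_EMB = {"a": [1.0, 1.0]}
TYPE_EMB = {"musician": [2.0, 0.0]}

rep = utterance_representation(
    ["play", "adele"], ["musician"], WORD_EMB, CHAR_EMB, TYPE_EMB, dim=2
)
```

The entity-type slot is what lets a downstream topic classifier separate utterances whose words alone are ambiguous, which is the motivation stated above for enriching the utterance representation with entity information.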
2.2. E-Commerce Search Engines
This part of my research was done during my internship at The Home Depot (THD, https://www.homedepot.com/). THD receives billions of search queries every year and collects terabytes of logs of user interactions with its website. My research focused on enhancing its high-level user intent understanding module. To this end, we proposed a hierarchical architecture for user intent classification. In the first layer, a classifier determines whether the user intends to purchase a product or to seek information (product vs. informational). In the next layer, if the intent is purchasing, a second classifier determines whether the query is broad or specific; otherwise, if the intent is information seeking, a different classifier determines the type of information the user is looking for. As a result, the search engine can guide the user to an appropriate web page to handle the request.
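The two-layer routing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the deployed THD system: the classifiers are passed in as plain callables, the keyword rules standing in for trained models are hypothetical, and the information-type labels ("how_to", "other") are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Any callable that maps a query string to a label can act as a classifier.
Classifier = Callable[[str], str]


@dataclass
class IntentPrediction:
    top_level: str     # "product" or "informational"
    second_level: str  # "broad"/"specific", or an information type


def classify_hierarchically(
    query: str,
    top_level_clf: Classifier,
    specificity_clf: Classifier,
    info_type_clf: Classifier,
) -> IntentPrediction:
    """Route the query to the second-layer classifier chosen by the first layer."""
    top = top_level_clf(query)
    if top == "product":
        return IntentPrediction(top, specificity_clf(query))
    return IntentPrediction(top, info_type_clf(query))


# Hypothetical rule-based stand-ins for trained classifiers.
def toy_top_level(query: str) -> str:
    question_words = ("how", "what", "where", "when")
    return "informational" if query.lower().startswith(question_words) else "product"


def toy_specificity(query: str) -> str:
    # Assumption: a digit (model number, size) signals a specific query.
    return "specific" if any(ch.isdigit() for ch in query) else "broad"


def toy_info_type(query: str) -> str:
    return "how_to" if query.lower().startswith("how") else "other"


def predict(query: str) -> IntentPrediction:
    return classify_hierarchically(query, toy_top_level, toy_specificity, toy_info_type)
```

For example, `predict("how to install a ceiling fan")` is routed to the information-type classifier, while `predict("cordless drill")` is routed to the broad-vs-specific classifier. Only one second-layer classifier runs per query, so each one can be trained on the subset of queries it will actually see.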
3. Future Research
For the remaining part of my dissertation, I plan to extend my research to answer a couple of questions raised while working with a large E-Com search engine like that of The Home Depot.
Boosting Search Performance Using Joint Learning.
Joint and multitask learning are crucial for an E-Com search engine from both the engineering and the performance perspectives. From the engineering perspective, only one classifier is deployed instead of several, which reduces overhead and eases maintenance of the system. From the performance perspective, training related intent classification tasks together transfers inductive bias between the tasks. In the E-Com domain, accurate classification of queries helps identify the right product groupings from which relevant products can be retrieved. Additionally, search queries in this domain tend to be short and ambiguous, and the vocabulary tends to change as the catalog evolves. For product search, it might be necessary to identify intents associated with a query at various granularities, such as category intent, accessory intent, and vertical intent. These intent classification tasks are related, so knowledge from one task might help improve the performance of the others. To this end, I plan to design a joint learning model of high-level intent mapping (product vs. informational), product category mapping, and informational type classification to improve the performance of these three individual intent classification tasks.
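One common way to set up such a joint model is a shared query encoder feeding one classification head per task, trained on a weighted sum of per-task losses. The sketch below shows only that loss computation in plain Python; it is an assumed formulation, not a finalized design. The head names, class counts, task weights, and the choice to skip tasks that have no label for a given query are all illustrative assumptions.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]  # one weight row per class


def softmax(logits: Vector) -> Vector:
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Vector, target: int) -> float:
    return -math.log(softmax(logits)[target])


def linear(weights: Matrix, features: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def joint_loss(
    shared_features: Vector,
    heads: Dict[str, Matrix],
    targets: Dict[str, int],
    task_weights: Dict[str, float],
) -> float:
    """Weighted sum of per-task cross-entropy losses over shared features.

    Assumption: a task with no label for this query (e.g. informational type
    for a product query) is skipped.
    """
    total = 0.0
    for task, weights in heads.items():
        if task not in targets:
            continue
        logits = linear(weights, shared_features)
        total += task_weights[task] * cross_entropy(logits, targets[task])
    return total


# Toy example: 3-dimensional shared features and zero-initialized heads.
features = [0.2, -0.1, 0.4]
heads = {
    "high_level_intent": [[0.0] * 3 for _ in range(2)],   # product vs. informational
    "product_category": [[0.0] * 3 for _ in range(3)],    # class count is arbitrary
    "informational_type": [[0.0] * 3 for _ in range(4)],  # class count is arbitrary
}
targets = {"high_level_intent": 0, "product_category": 2}  # a product query
task_weights = {"high_level_intent": 1.0, "product_category": 1.0, "informational_type": 1.0}
loss = joint_loss(features, heads, targets, task_weights)
```

Because every head reads the same `shared_features`, gradients from each task would update the same encoder during training, which is the mechanism by which knowledge from one task can help the others.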
Hidden User Intent Mining and Discovery.
The proposed hierarchical intent architecture is effective; however, there might be additional user intents that remain hidden. A customized and fine-grained intent hierarchy would help an E-Com search engine implement intent classifiers more efficiently and consequently improve user intent understanding. Unfortunately, existing research on this topic in the E-Com setting is very limited. In an effort to advance this research field, researchers at Walmart introduced a clustering model (Zhai, 2018) that extracts similarities in users’ behavior while they explore the website. They suggested five main clusters for their commercial website based only on behavioral data. Because this model relies solely on user behavior data, it loses the information carried by the query semantics. In contrast, in an NLU unit for spoken dialogue systems like Amazon Alexa, query semantics are the primary and dominant source of information. To this end, in this project, inspired by the NLU units of conversational bots, I plan to develop a new model that combines query semantics with behavior data to discover new hidden user intents.
I gratefully acknowledge the financial and computing support from the Amazon Alexa team and The Home Depot data science team during my Ph.D. research.
- Emersonbot: information-focused conversational AI Emory University at the Alexa Prize 2017 challenge. In 1st Proceedings of Alexa Prize.
- Emory IrisBot: an open-domain conversational bot for personalized information access. In 2nd Proceedings of Alexa Prize.
- ConCET: entity-aware topic classification for open-domain conversational agents. In Proceedings of CIKM, pp. 1371–1380.
- Contextual dialogue act classification for open-domain conversational agent. In Proceedings of SIGIR, Paris, pp. 1273–1276.
- Offline and online satisfaction prediction in open-domain conversational systems. In Proceedings of CIKM, pp. 1281–1290.
- A taxonomy of Web search. In ACM SIGIR Forum, pp. 3–10.
- Understanding user’s query intent with Wikipedia. In Proceedings of WWW, pp. 471–480.
- Topic-based evaluation for conversational bots. In NIPS.
- The Microsoft 2017 conversational speech recognition system. In ICASSP, pp. 5934–5938.
- A taxonomy of queries for e-commerce search. In Proceedings of SIGIR, pp. 1245–1248.