Since the early days of AI, research has been inspired by the idea of developing programs that can communicate with users in natural language. With the advent of language technologies able to reach human performance in various tasks, AI chatbots and dialogue systems are starting to mature, and this vision seems nearer than ever. As a result, more organizations are investing in chatbot development and deployment. In the 2019 Gartner CIO Survey, CIOs identified chatbots as the main AI-based application used in their enterprises (https://www.gartner.com/smarterwithgartner/chatbots-will-appeal-to-modern-workers/), with a global market valued in the billions of USD (https://www.mordorintelligence.com/industry-reports/chatbot-market).
In fact, chatbots are one example of the extent to which AI technologies are becoming ever more pervasive, both in addressing global challenges and in day-to-day routines. Public administrations too are adopting chatbots for key actions such as helping citizens request services (https://www.canada.ca/en/employment-social-development/services/my-account/terms-use-chatbot.html) and providing updates and information, for example in relation to COVID-19 [chatbotspandemic] (https://government.economictimes.indiatimes.com/news/digital-india/covid-19-govt-launches-facebook-and-messenger-chatbot/74843125).
However, the expansion of intelligent technologies has been met by growing concerns about possible misuses, motivating a need to develop AI systems that are trustworthy. On the one hand, governments are under pressure to gain or preserve an edge in intelligent technologies, which make intensive use of large amounts of data. On the other hand, there is an increasing awareness of the need for trustworthy AI systems (https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai).
In the context of information-providing chatbots and assistive dialogue systems, especially in the public sector, we believe that trustworthiness demands transparency, explainability, and correctness, and that it requires architectural choices that take data access into account from the very beginning. Arguably, this kind of chatbot should not only use transparent and verifiable methods and be designed to respect relevant data protection regulations, but it should also be able to explain its outputs or recommendations in a manner adapted to the intended (human) user.
We thus propose an architecture for AI dialogue systems where user interaction is carried out in natural language, not only for providing information to the user, but also to answer user queries about the reasons leading to the system output (explainability). The system selects answers based on a transparent reasoning module, built on top of a computational argumentation framework with a rigorous, verifiable semantics (transparency, auditability). Additionally, the system has a modular architecture, so as to decouple the natural language interface, where user data is processed, from the reasoning module, where expert knowledge is used to generate outputs (privacy and data governance).
Our work is positioned at the intersection of two areas: computational argumentation and natural language understanding. While computational argumentation has had significant applications in the context of automated dialogues among software agents, its combination with natural language interaction in socio-technical systems is more recent. The most closely related proposal in this domain is a recent one by Chalaguine and Hunter [Hunter]. With respect to that work, our focus is not on persuading the user but on offering correct information. Accordingly, we put greater emphasis on the correctness and justification of system outputs, and on the system's ability to reason with every relevant user input, as opposed to reacting only to the last input. Our modular architecture separates language understanding from argumentative reasoning, which affords significant generality. In particular, our dialogue system architecture can be applied to multiple domains, without requiring any expensive retraining.
In this article we focus on the system’s architecture and on the knowledge representation and reasoning module. We start with a brief overview of related approaches (Section 2). Next, we give a high-level description of the system architecture (Section 3) and then zoom in on the argumentation module supporting knowledge representation and reasoning and dialogue strategies (Section 4). To illustrate, we sketch a dialogue between chatbot and human in the context of COVID-19 vaccines (Section 5), showing how background knowledge and user data can be formalized and jointly used to provide correct answers, and how the system output can be challenged by the user. Section 6 concludes.
2 Related Work
In the field of computational argumentation, significant work has been devoted to defining and reasoning over argumentation graphs [Baroni2009, CharwatDGWW15, FazzingaFF19], leading to several ways of identifying "robust" arguments or sets of arguments [Dung95, Dung2007]. However, the practical combination of computational argumentation and natural language dialogue systems has not been much explored. Among the few existing approaches, Rosenfeld and Kraus [10.3233/978-1-61499-672-9-320] combine theoretical argumentation with reinforcement learning to develop persuasive agents, while Rach et al. [10.1007/978-981-13-9443-0_12] extract a debate's argument structure and envision the dialogue as a game, structuring the answers as moves along a previously defined scheme. In both cases, the agents are limited in their inputs and outputs to sentences "hard-coded" in the knowledge base.
An interesting approach in this direction is by Chalaguine and Hunter [Hunter], who exploit sentence similarity to retrieve an answer from a knowledge base expressed in the form of a graph. No conversation history is kept; therefore, the answers produced by the system do not take previous user inputs into account. We believe that this approach is inappropriate for complex scenarios where multiple pieces of information must be considered at the same time, since the user would have to include all of them in the same sentence. Moreover, this approach does not involve reasoning, but relevance-based answer retrieval. Our approach, instead, aims to output replies consistent with all the information provided thus far by the user, replies that will not be proven wrong later on. In particular, we enforce the acceptance of some arguments by eliciting specific user input. This can be seen as a practical application of the concepts defined by Baumann and Brewka [BaumannB10]. Concretely, our system relies on an argumentation module that maintains a history of the concepts expressed by the user and performs reasoning over an argumentation graph to compute the answer. It is therefore possible for the user to have multiple pieces of information considered at the same time, to be asked for more information when needed, and to obtain an explanation for previous answers.
3 System Architecture
Our chatbot architecture consists of two core modules: the language module and the argumentation module. The former provides a natural language interface to the user input, while the latter, which relies on computational argumentation, computes the correct replies to provide to the user. In this work, we focus on the argumentation module, leaving the specific implementation of the language module for future work.
We assume the presence of a scenario-specific knowledge base (KB) created by experts, in the form of an argumentation graph (see Section 4) with two kinds of nodes: status arguments and reply arguments. The former encode facts that correspond to the possible user sentences. Each status node is linked to one or more reply arguments it supports, which represent replies to the facts stated by the user. (We point out that our concept of support is a new notion linking status nodes to reply nodes, and that its semantics differs from the standard one [CayrolL05a, FazzingaFF18].) Status nodes may also attack other status or reply nodes, typically because the facts they represent are incompatible with one another. Additionally, a set of natural language sentences is associated with each status node, representing some possible ways a user would express the facts the node encodes. These different representations of facts could be produced by domain experts or crowd-sourced.
The behaviour of the system and the interaction between the modules are illustrated in Figure 1. The language module compares each user sentence against the sentences embedded in the KB. In particular, like Chalaguine and Hunter [Hunter], we propose to use a sentence similarity measure to identify KB sentences matching the user input. Since each KB sentence is associated with a status node, a list of related status nodes can be computed from the list of KB sentences identified by the language module as a match. Accordingly, when a user writes a sentence, a set of status nodes is 'activated', in the sense that they are recognized as matching the user's input. However, differently from Chalaguine and Hunter [Hunter], all the status arguments activated during the chat with the user are stored in a set $S$.
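To make the matching step concrete, the following sketch shows one (deliberately naive) way to compute the activated status nodes. Token-level Jaccard overlap stands in for the learned sentence-similarity measure the system would actually use; the KB dictionary, the node labels, and the threshold are illustrative assumptions, not part of the system described here.

```python
def jaccard(a: str, b: str) -> float:
    # Crude stand-in for a learned sentence-similarity measure
    # (e.g., one based on sentence embeddings).
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def activated_nodes(user_sentence, kb_sentences, threshold=0.5):
    # kb_sentences maps each KB sentence to its status node.
    # A status node is 'activated' when at least one of its
    # associated sentences is similar enough to the user input.
    return {node for sentence, node in kb_sentences.items()
            if jaccard(user_sentence, sentence) >= threshold}

kb = {"i have latex allergy": "s_la",
      "i suffer from bronchial asthma": "s_ba"}
print(activated_nodes("I have a latex allergy", kb))  # {'s_la'}
```

In a real deployment the similarity function would be replaced by a trained model, while the activation logic would stay the same.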
The fundamental principle that characterizes our approach is that a reply $r$ among those supported by $S$ is given to the user only if it is acceptable w.r.t. $S$. This means that the information given by the user needs to support $r$ and defend $r$ from its attacks. If there is no acceptable reply with respect to $S$, the chatbot nevertheless selects a candidate reply $r$, but instead of offering $r$ immediately, it prompts the user in order to acquire new information that could activate new status arguments which, added to $S$, could make $r$ acceptable w.r.t. $S$. This elicitation process aims to guarantee that $r$ is not proven wrong in the continuation of the chat. In fact, the user is asked about all the information that can be in contrast with $r$ (i.e., the arguments that attack $r$), in order to be sure to defeat any potential attackers.
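Assuming the attack relation is encoded simply as a set of (attacker, target) pairs over node labels, this elicitation step admits a compact sketch; all function names and node labels below are our own, for illustration only.

```python
def undefeated_attackers(att, r, S):
    # Attackers of the candidate reply r that no argument in S defeats yet.
    return {a for (a, t) in att
            if t == r and not any((b, a) in att for b in S)}

def elicitation_targets(att, r, S):
    # Status arguments to ask the user about: those that would attack
    # (and hence defeat) a currently undefeated attacker of r.
    return {b for a in undefeated_attackers(att, r, S)
            for (b, t) in att if t == a}

# Toy instance: reply r_v is attacked by s_ba and s_an, which are in
# turn attacked by their negations s_nba and s_nan.
att = {("s_ba", "r_v"), ("s_an", "r_v"),
       ("s_nba", "s_ba"), ("s_nan", "s_an")}
print(elicitation_targets(att, "r_v", {"s_la"}))  # the set {'s_nba', 's_nan'}
```

If the user confirms the elicited facts, adding the corresponding arguments to $S$ leaves no undefeated attacker, and the candidate reply can be given.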
This underlying strategic reasoning marks a significant difference from previous approaches. Another distinguishing feature is our system's ability to provide users with online, on-demand explanations. In particular, besides providing information and getting replies, users can also request an explanation for a given reply $r$. An explanation for $r$ consists of a sequence of natural language sentences built from descriptions of the status nodes of $S$ supporting $r$, and from motivations against other possible conflicting replies that the system discarded.
4 Argumentation Module
The argumentation module is based on a knowledge base expressed as an argumentation graph.
Definition 1 (Argumentation graph)
An argumentation graph is a tuple $G = \langle A_s, A_r, att, sup \rangle$, where $A_s$ and $A_r$ are the arguments of the graph and are called status arguments and reply arguments, respectively, $att \subseteq A_s \times (A_s \cup A_r)$ encodes the attack/defeat relation, and $sup \subseteq A_s \times A_r$ encodes the support relation.
Each argument in the graph is annotated with a set of natural language sentences, as described in the previous section. We say that $a \in A_s$ attacks (resp., supports) a reply node $r \in A_r$ iff $(a, r) \in att$ (resp., $(a, r) \in sup$). By extension, we say that a set $S \subseteq A_s$ attacks (resp., supports) $r$, or equivalently that $r$ is attacked by $S$ (resp., supported by $S$), iff there exists an argument $a \in S$ s.t. $a$ attacks (resp., supports) $r$.
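As a concrete, purely illustrative rendering of this structure, an argumentation graph can be encoded as two sets of node labels plus two relations over label pairs; the class and method names below are our own, not part of the formalism.

```python
from dataclasses import dataclass

@dataclass
class ArgGraph:
    status: set    # status arguments (A_s)
    replies: set   # reply arguments (A_r)
    att: set       # attack pairs: status -> status or reply
    sup: set       # support pairs: status -> reply

    def attackers(self, node):
        # Arguments attacking the given node.
        return {a for (a, t) in self.att if t == node}

    def supported_by(self, S):
        # Reply nodes supported by at least one argument in S.
        return {r for (a, r) in self.sup if a in S}
```

For instance, on a graph where a status node s_ba attacks a reply r_v supported by s_la, attackers("r_v") yields {"s_ba"} and supported_by({"s_la"}) yields {"r_v"}.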
The aim of the argumentation module is to identify the reply nodes to be given in response to the user sentences. To this end, in addition to the KB, each dialogue session relies on dynamically acquired knowledge, expressed as a set of facts, or status arguments, $S \subseteq A_s$. The dialogue strategy is to provide the user with a reply that is supported and defended by $S$. However, differently from other proposals, our system does not simply select a consistent reply at each turn. On the contrary, it strategizes in order to provide only robust replies, possibly delaying replies that need further fact-checking. To that end, the two following definitions distinguish between a consistent and a potentially consistent reply. The former can be given to the user right away, as it cannot possibly be proven wrong in the future. (The implicit assumption here is that the user does not enter conflicting information, and that the language module correctly interprets the user input. Clearly, if this is not the case, the system's output becomes unreliable, but that would not depend on the underlying reasoning framework. The definition of fall-back strategies able to handle such exceptions would be an important extension to the system.) The latter, albeit consistent with the currently known facts, may still be defeated by future user input, and therefore it should be delayed until a successful elicitation process is completed.
The formal definitions are based on the KB and on a representation of the state of the dialogue consisting of two sets: $S$ and $E$. In particular, $S$ contains the status arguments activated during the conversation so far, whereas $E \subseteq S$ contains the arguments in support of the system's possible replies to the user. We recall that an argument $a$ is acceptable w.r.t. a set $S$ iff $S$ defends $a$ from every attack towards $a$.
Definition 2 (Consistent reply)
Given an argumentation graph $G$ and two sets $S$ and $E \subseteq S$, a reply $r \in A_r$ is consistent iff $E$ supports $r$ and $r$ is acceptable w.r.t. $S$.
Definition 3 (Potentially consistent reply)
Given an argumentation graph $G$ and two sets $S$ and $E \subseteq S$, a reply $r \in A_r$ is potentially consistent iff $E$ supports $r$, $S$ does not attack $r$, and $r$ is not acceptable w.r.t. $S$.
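Assuming the attack and support relations are encoded as sets of (source, target) label pairs, Definitions 2 and 3 translate almost literally into code. This is a sketch with names of our own; for brevity it checks support directly against the activated set $S$ rather than a separate subset $E$.

```python
def acceptable(att, r, S):
    # r is acceptable w.r.t. S iff S defends r from every attack:
    # each attacker of r is itself attacked by some argument in S.
    attackers = {a for (a, t) in att if t == r}
    return all(any((b, a) in att for b in S) for a in attackers)

def consistent(att, sup, r, S):
    # Definition 2 (sketch): S supports r and r is acceptable w.r.t. S.
    return any((a, r) in sup for a in S) and acceptable(att, r, S)

def potentially_consistent(att, sup, r, S):
    # Definition 3 (sketch): S supports r, S does not attack r,
    # and r is not (yet) acceptable w.r.t. S.
    return (any((a, r) in sup for a in S)
            and not any((a, r) in att for a in S)
            and not acceptable(att, r, S))
```

On a toy graph where r_v is supported by s_la and attacked by s_ba and s_an (each defeated by its negation), r_v is only potentially consistent when $S$ contains s_la alone, and becomes consistent once the two negations are added.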
Finally, users can challenge the system output. An explanation of a reply $r$ consists of two parts. The first contains the arguments leading to $r$, i.e., those belonging to a set $E \subseteq S$ that supports $r$. The second encodes the why nots, explaining why the chatbot did not give other replies.
Definition 4 (Explanation)
Given an argumentation graph $G$, a set $S$ and a reply $r$, an explanation for $r$ is a pair $\langle P, N \rangle$, where $P$ contains the arguments $a \in S$ s.t. $a$ supports or defends $r$, and $N$ is a set of pairs $\langle r', A' \rangle$, where $r' \neq r$, $r'$ is supported by $S$, and $A' \subseteq S$ contains the arguments attacking $r'$.
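Again assuming a pair-based encoding of the relations, Definition 4 can be sketched as follows; the function name and the choice of returning the why-nots as a dictionary are our own illustrative conventions.

```python
def explanation(att, sup, r, S):
    # Definition 4 (sketch): an explanation for reply r is a pair (P, N).
    # P: activated arguments that support r or defend it from attacks;
    # N: for each other reply r2 supported by S, the activated
    #    arguments that attack r2 (the "why nots").
    attackers_of_r = {a for (a, t) in att if t == r}
    P = {a for a in S
         if (a, r) in sup or any((a, x) in att for x in attackers_of_r)}
    others = {r2 for (a, r2) in sup if a in S and r2 != r}
    N = {r2: {a for a in S if (a, r2) in att} for r2 in others}
    return P, N
```

Each pair in N can then be verbalized by the language module into a "why not" sentence for the corresponding discarded reply.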
In the next section we briefly explain how our strategy provides the user with consistent replies, by means of an example in the context of COVID-19 vaccines.
5 Case Study
Disclaimer. The illustration that follows is based on a (simplistic) representation of the domain knowledge. Its purpose is to show a proof of concept of our approach, not to offer sound advice about vaccines. We base our example on the content of the AIFA website (Italian medicines agency, https://www.aifa.gov.it/en/vaccini-covid-19).
We consider the context of COVID-19 vaccines, where we aim to create a dialogue system able to answer user inquiries about vaccination procedures, vaccine safety, and so on. Figure 2 shows an excerpt of the argumentation graph encoding the KB, in particular the part related to options for getting vaccinated.
Yellow rectangles represent status arguments, blue ovals reply arguments, green solid arrows support relations, pointing to the possible replies to user sentences, and red dotted arrows attack relations. It is worth noting that the graph contains both the positive and the negative version of each status argument. This is a key modeling feature in the context at hand, as it enables the chatbot to properly capture and encode all the information provided by the user about their health conditions.
Let us consider this example: the user writes "Hi, I am Morgan and I suffer from latex allergy, can I get vaccinated?" The language module processes the user sentence and compares it against all the sentences provided by the knowledge base, resulting in a single positive match with the sentence "I have latex allergy", associated with a status node $s_{la}$. At this point, the argumentation module computes the replies, finding that the only reply supported by $S = \{s_{la}\}$ is $r_v$, and that $r_v$ is not a consistent reply, because it is attacked by both $s_{ba}$ (bronchial asthma) and $s_{an}$ (previous anaphylaxis). It is, however, a potentially consistent reply: thus, although we cannot give it to the user yet, we can acquire new information that would make it consistent. To make $r_v$ consistent, $S$ must be augmented with both $s_{\neg ba}$ and $s_{\neg an}$. This means that the user must state that they do not suffer from bronchial asthma and that they had no previous anaphylaxis. Accordingly, our strategy is to query the user on whether they suffer from bronchial asthma and/or whether they had any previous anaphylaxis. Assume at this point that the user replies "I do not suffer from bronchial asthma" and "I have never had any anaphylaxis". Then, we can extend $S$ with the new corroborating bits of information, obtaining $S = \{s_{la}, s_{\neg ba}, s_{\neg an}\}$. Because $r_v$ is now a consistent reply, we can return $r_v$ to the user.
Alternatively, suppose that the user writes that they do suffer from bronchial asthma. In that case, we would have $S = \{s_{la}, s_{ba}\}$, hence $r_v$ would not be a consistent reply. Accordingly, the only consistent reply that could be given to the user would be the reply $r_{ba}$ supported by $s_{ba}$.
Finally, suppose that, upon getting $r_{ba}$ as a reply, the user asks for an explanation. In that case, $P = \{s_{ba}\}$, and $N$ consists of the unique pair $\langle r_v, \{s_{ba}\} \rangle$, meaning that $r_v$ was not given due to $s_{ba}$, that is, due to the fact that the user suffers from bronchial asthma.
6 Conclusion

We presented a new modular dialogue system architecture based on computational argumentation and language technologies. In particular, our system exploits both user input and a knowledge base built by domain experts to perform reasoning, in order to compute answers and identify missing bits of information. We illustrated our proposal with an information-seeking scenario, where a user requests information about COVID-19 vaccines.
Our proposal has multiple advantages over previous approaches. With respect to corpus-based dialogue systems, it can use expert knowledge, which is especially important in domains that require trustworthy, correct, and explainable solutions. Indeed, a remarkable feature of argumentation graphs is their ability to support reasoning over the conflicts between arguments, leading to approving or discarding some responses. We believe that highlighting the reasons why a response cannot be given, along with the facts that rule out other possible responses, is a good way to help the user understand the response and trust the system. Importantly, the architecture is general-purpose and does not require domain-specific training or reference corpora. With respect to prior work on argumentation-based dialogue systems, its major advantage is its ability to reason with multiple elements of user information, in order to provide focused and sound answers, eliciting missing data when needed.
In this paper we focused on the argumentation module, leaving the implementation of the language module for future work. In this regard, we plan to explore the use of recent attention-based neural architectures [attention], representing the user input using BERT-based [BERT] sentence embeddings [SBERT] and comparing them using advanced similarity measures [cross-lingual].
Since our proposal is general and not limited to a specific domain, it will be interesting to test our approach on new scenarios and to consider languages other than English. Another important aspect we plan to address in the future is the management of conflicting information provided by the user, and the possibility of revising previously submitted information.
The research reported in this work was partially supported by the EU H2020 ICT48 project “Humane AI Net” under contract #952026.