Chatbots are computer programs that interact with users in natural language 119. The origin of the chatbot concept dates back to 1950 137.
ELIZA 148 and A.L.I.C.E. 141 are examples of early chatbot technologies, whose main goal was to mimic human conversation. Over the years, the chatbot concept has evolved. Today, chatbots may have characteristics that distinguish one agent from another, which has resulted in several synonyms, such as multimodal agents, chatterbots, and conversational interfaces. In this survey, we use the term “chatbot” to refer to a disembodied conversational agent that holds a natural language conversation in a text-based environment to engage the user in either a general-purpose or a task-oriented conversation.
Chatbots are changing the patterns of interaction between humans and computers 52. Many instant messenger tools, such as Skype, Facebook Messenger, and Telegram, provide platforms to develop and deploy chatbots, which either engage with users in general conversations or help them solve domain-specific tasks 31. As messaging tools become platforms, traditional websites and apps are providing space for this new form of human-computer interaction (HCI) 52. For example, in the 2018 F8 Conference, Facebook announced having 300K chatbots active on Facebook Messenger 14. The BotList website indexes thousands of chatbots for education, entertainment, games, health, productivity, travel, fun, and several other categories. The growth of chatbot technology is changing how companies engage with their customers 54; 17, how students engage with their learning groups 61; 129, and how patients self-monitor the progress of their treatment 49, among many other applications.
However, chatbots still fail to meet users’ expectations 78; 66; 17; 150. While many studies on chatbot design focus on improving chatbots’ functional performance and accuracy (see, e.g., 68; 82), the literature has consistently suggested that chatbots’ interactional goals should also include social capabilities 66; 77. According to the Media Equation theory 110, people naturally respond to social situations when interacting with computers 93; 51. As chatbots are designed to interact with users in a way that mimics person-to-person conversation, new challenges in HCI arise 96; 52. 95 state that making a conversational agent acceptable to users is primarily a social, not only a technical, problem to solve. In fact, studies on chatbots have shown that people prefer agents that conform to gender stereotypes associated with tasks 53, self-disclose and show reciprocity when recommending 74, and demonstrate a positive attitude and mood 131. When chatbots do not meet these expectations, users may experience frustration and dissatisfaction 78; 150.
Although chatbots’ social characteristics have been explored in the literature, this knowledge is spread across several domains in which chatbots have been studied, such as customer services, education, finances, and travel. In the HCI domain, some studies focus on investigating the social aspects of human-chatbot interactions (see, e.g., 25; 64; 74). However, most studies focus on a single characteristic or a small set of characteristics (e.g., 79; 117); in other studies, the social characteristics emerged as secondary, exploratory results (e.g., 127; 135). It has become difficult to find evidence regarding what characteristics are important for designing a particular chatbot, and what research opportunities exist in the field. The literature lacks studies that bring together the social characteristics that influence the way users perceive and behave toward chatbots.
To fill this gap, this survey compiles research initiatives for understanding the impact of chatbots’ social characteristics on the interaction. We bring together literature that is spread across several research areas. From our analysis of 58 scientific studies, we derive a conceptual model of social characteristics, aiming to help researchers and designers identify what characteristics are relevant to their context and how their design choices influence the way humans perceive the chatbots. The research question that guided our investigation was: What chatbot social characteristics benefit human interaction and what are the challenges and strategies associated with them?
To answer this question, we discuss why designing a chatbot with a particular characteristic can enrich the human-chatbot interaction. Our results can both provide insight into whether a characteristic is desirable for a particular chatbot and inspire researchers’ further investigations. In addition, we discuss the interrelationship among the identified characteristics. We state 22 propositions about how social characteristics may influence one another. In the next section, we present an overview of the studies included in this survey.
2. Overview of the surveyed literature
The literature presents no coherent definition of chatbots; thus, to find relevant studies we used a search string that includes the synonyms chatbots, chatterbots, conversational agents, conversational interfaces, conversational systems, conversation systems, dialogue systems, digital assistants, intelligent assistants, conversational user interfaces, and conversational UI. We explicitly left out studies that relate to embodiment (e.g., ECA, multimodal, robots, eye-gaze, gesture), and speech input mode (e.g., speech-based, speech-recognition, voice-based). We did not include the term social bots, because it refers to chatbots that produce content for social networks such as Twitter 48. We did not include personal assistants either, since this term consistently refers to commercially available, voice-based assistants such as Google Assistant, Amazon Alexa, Apple Siri, and Microsoft Cortana. We decided to not include terms that relate to social characteristics/traits, because most studies do not explicitly label their results as such.
After filtering the search results, we had 58 remaining studies. Most of the selected studies are recent publications (published within the last 10 years). The publication venues include the domains of human-computer interaction (25 papers), learning and education (9 papers), information and interactive systems (8 papers), virtual agents (5 papers), artificial intelligence (3 papers), and natural language processing (3 papers). We also found papers from health, literature & culture, internet science, computer systems, communication, and humanities (1 paper each). Most papers (59%) focus on task-oriented chatbots. General-purpose chatbots account for 33% of the surveyed studies. Most general-purpose chatbots (16 out of 19) are designed to handle topic-unrestricted conversations. The most representative specific domain is education, with 9 papers, followed by customer services, with 5 papers. See Appendix A (Supplemental Materials) for the complete list of topics.
We analyzed the papers by searching for chatbot behaviors or attributed characteristics that influence the way users perceive the chatbot and behave toward it. Noticeably, the characteristics and categories are seldom explicitly pointed out in the literature, so the conceptual model was derived using a qualitative coding process inspired by methods such as Grounded Theory 6 (open coding stage). For each study (document), we selected relevant statements from the paper (quotes) and labeled them as a characteristic (code). After coding all the studies, a second researcher reviewed the produced set of characteristics, and discussion sessions were performed to identify characteristics that could be merged, renamed, or removed. At the end, the characteristics were grouped into categories, depending on whether the characteristic relates to the chatbot’s virtual representation, conversational behavior, or social protocols. Finally, the quotes for each characteristic were labeled as references to benefits, challenges, or strategies.
We derived a total of 11 social characteristics, and grouped them into three categories: conversational intelligence, social intelligence, and personification. The next section describes the derived conceptual model.
3. Chatbots’ Social Characteristics
This section describes the identified social characteristics grouped into categories. As Table 1 depicts, the category conversational intelligence includes characteristics that help the chatbot manage interactions. Social intelligence focuses on habitual social protocols, while personification refers to the chatbot’s perceived identity and personality representations. In the following subsections, we independently describe the identified social characteristics, and point out relationships to other characteristics when relevant. Then, we summarize the relationships among the characteristics in Section 4. For each category, a table with an overview of the included studies is provided in the supplementary materials (Appendix A). The supplementary materials also include a table for each social characteristic, listing the studies associated with the reported benefits, challenges, and strategies. Finally, the supplementary materials also highlight five constructs that can be used to assess whether social characteristics are reaching the intended design goals.
Table 1. Social characteristics of chatbots, with associated benefits [B], challenges [C], and strategies [S].

| Category | Characteristic | Benefits | Challenges | Strategies |
| --- | --- | --- | --- | --- |
| Conversational intelligence | Proactivity | [B1] to provide additional information; [B2] to inspire users and to keep the conversation alive; [B3] to recover from a failure; [B4] to improve conversation productivity; [B5] to guide and engage users | [C1] timing and relevance; [C2] privacy; [C3] users’ perception of being controlled | [S1] to leverage conversational context; [S2] to select a topic randomly |
| Conversational intelligence | Conscientiousness | [B1] to provide meaningful answers; [B2] to hold a continuous conversation; [B3] to steer the conversation toward a productive direction | [C1] to handle task complexity; [C2] to harden the conversation; [C3] to keep the user aware of the chatbot’s context | [S1] conversation workflow; [S2] visual elements; [S3] confirmation messages |
| Conversational intelligence | Communicability | [B1] to unveil functionalities; [B2] to manage the users’ expectations | [C1] to provide business integration; [C2] to keep visual elements consistent with textual inputs | [S1] to clarify the purpose of the chatbot; [S2] to advertise the functionality and suggest the next step; [S3] to provide a help functionality |
| Social intelligence | Damage control | [B1] to appropriately respond to harassment; [B2] to deal with testing; [B3] to deal with lack of knowledge | [C1] to deal with unfriendly users; [C2] to identify abusive utterances; [C3] to balance emotional reactions | [S1] emotional reactions; [S2] authoritative reactions; [S3] to ignore the user’s utterance and change the topic; [S4] conscientiousness and communicability; [S5] to predict users’ satisfaction |
| Social intelligence | Thoroughness | [B1] to adapt the language dynamically; [B2] to exhibit believable behavior | [C1] to decide on how much to talk; [C2] to be consistent | Not identified |
| Social intelligence | Manners | [B1] to increase human-likeness | [C1] to deal with face-threatening acts; [C2] to end a conversation gracefully | [S1] to engage in small talk; [S2] to adhere to turn-taking protocols |
| Social intelligence | Moral agency | [B1] to avoid stereotyping; [B2] to enrich interpersonal relationships | [C1] to avoid alienation; [C2] to build unbiased training data and algorithms | Not identified |
| Social intelligence | Emotional intelligence | [B1] to enrich interpersonal relationships; [B2] to increase engagement; [B3] to increase believability | [C1] to regulate affective reactions | [S1] to use social-emotional utterances; [S2] to manifest conscientiousness; [S3] reciprocity and self-disclosure |
| Social intelligence | Personalization | [B1] to enrich interpersonal relationships; [B2] to provide unique services; [B3] to reduce interactional breakdowns | [C1] privacy | [S1] to learn from and about the user; [S2] to provide customizable agents; [S3] visual elements |
| Personification | Identity | [B1] to increase engagement; [B2] to increase human-likeness | [C1] to avoid negative stereotypes; [C2] to balance the identity and the technical capabilities | [S1] to design and elaborate on a persona |
| Personification | Personality | [B1] to exhibit believable behavior; [B2] to enrich interpersonal relationships | [C1] to adapt humor to the users’ culture; [C2] to balance the personality traits | [S1] to use appropriate language; [S2] to have a sense of humor |
3.1. Conversational Intelligence
Conversational intelligence enables the chatbot to actively participate in the conversation and to demonstrate awareness of the topic discussed, the evolving conversational context, and the dialogue flow. Therefore, conversational intelligence refers to the ability of a chatbot to effectively converse beyond the technical capability of achieving a conversational goal 66. In this section, we discuss social characteristics related to conversational intelligence, namely: proactivity (18 studies), conscientiousness (11 studies), and communicability (6 studies). Most of the studies rely on data that comes from conversation logs, interviews, and questionnaires. The questionnaires are mostly Likert scales, and some of them include subjective feedback. Most studies analyzed the interaction with real chatbots, although Wizard of Oz (WoZ) settings are also common. In WoZ, participants believe they are interacting with a chatbot when, in fact, a person (the wizard) pretends to be the automated system 30. Only two papers did not evaluate a particular type of interaction, because they were based on a literature review 54 or surveys with chatbot users in general 16. See the supplementary materials for details (Appendix A).
Proactivity is the capability of a system to autonomously act on the user’s behalf 114 to reduce the amount of human effort to complete a task 130. In human-chatbot conversations, proactive behavior enables a chatbot to share initiative with the user, contributing to the conversation in a more natural way 89. Chatbots may manifest proactivity when they initiate exchanges, suggest new topics, provide additional information, or formulate follow-up questions. In this survey, we found 18 papers that report either chatbots with proactive behavior or implications of manifesting proactive behavior. Proactivity (also addressed as “intervention mode”) was explicitly addressed in seven studies 76; 8; 118; 24; 61; 129; 43. In most of the studies, however, proactivity emerged either as an exploratory result, mostly from post-intervention interviews and users’ feedback 102; 122; 66; 42; 131; 89, or as a strategy to attend to domain-specific requirements (e.g., monitoring and guidance) 123; 83; 127; 135; 49.
The surveyed literature evidences several benefits of offering proactivity in chatbots:
[B1] to provide additional, useful information: the literature reveals that proactivity in chatbots adds value to interactions 89; 131; 8. Investigating evaluation criteria for chatbots, 89 asked users of a general-purpose chatbot to rate the chatbots’ naturalness and report in what areas they excel. Both statistical and qualitative results confirm that taking the lead and suggesting specialized information about the conversation theme correlates with chatbots’ naturalness. 131 corroborates this result; in post-intervention interviews, ten out of 14 users mentioned they preferred a chatbot that takes the lead and volunteers additional information such as useful links and song playlists. In a WoZ study, 8 investigated whether proactive interventions of a chatbot contribute to a collaborative search in a group chat. The chatbot either elicits or infers needed information from the collaborative chat and proactively intervenes in the conversation by sharing useful search results. The intervention modes were not significantly different from each other, but both resulted in a statistically significant increase in enjoyment and decrease in effort when compared to the same task with no chatbot interventions. Moreover, in a post-intervention, open-ended question, 16 out of 98 participants self-reported positive perceptions of the provided additional information.
[B2] to inspire users, and keep the conversation alive: proactively suggesting and encouraging new topics has been shown to be useful both to inspire users 24; 8 and to keep the conversation alive 123. Participants in the study conducted by 8 self-reported that the chatbot’s suggestions helped them get started (7 mentions) and gave them ideas about topics to search for (4 mentions). After iteratively evaluating prototypes for a chatbot in an educational scenario, 123 concluded that proactively initiating topics makes the dialogue more fun and reveals topics the chatbot can talk about. The refined prototype also proactively maintains engagement by posing a follow-up when the student has not provided an answer to the question. 118 hypothesized that including follow-up questions based on the content of previous messages would result in higher perceived partner engagement. The hypothesis was supported, with participants in the dynamic condition rating the chatbot as more engaging. In an ethnographic data collection 127, users included photos in their responses to add information about their experience; 85% of these photos were proactively prompted by the chatbot. This result shows that prompting users for more information stimulates them to expand their entries. 24 also observed that chatbots’ proactive messages provided insights about the chatbots’ knowledge, which potentially helped the conversation to continue. In this paper, we refer to the strategies for conveying the chatbot’s knowledge and capabilities as communicability, which we discuss in Section 3.1.3.
[B3] to recover from a failure: in 102 and 123, proactivity is employed to naturally recover from a failure. In both studies, the approach was to introduce a new topic when the chatbot failed to understand the user or could not find an answer, preventing the chatbot from getting stuck and keeping the conversation alive. Additionally, in 123, the chatbot inserted new topics when users were either abusive or nonsensical. We refer to the strategies for handling failure and abusive behavior as damage control, and we discuss this characteristic in Section 3.2.1.
[B4] to improve conversation productivity: in task-oriented interactions, such as searching or shopping, proactivity can improve the conversation productivity 66. In interviews with first-time users of chatbots, 66 found that chatbots should ask follow-up questions to resolve and maintain the context of the conversation and reduce the search space until achieving the goal. 8 found similar results for collaborative search; 28 out of 98 participants self-reported that chatbot’s proactive interventions saved collaborators time.
[B5] to guide and engage users: in particular domains, proactivity helps chatbots either guide users or establish and monitor users’ goals. In 49, the chatbot assigns a goal to the user and proactively prompts motivational messages and reminders to keep the user engaged in the treatment. 83 suggest that a decision-making coach chatbot needs to lead the interaction, guiding the user toward a decision. In an ethnographic data collection 127, the chatbot prompts proactive messages that guide the users on what information they need to report. 135 evaluates a chatbot that manages tasks in a workplace. Proactive messages are used to check whether the team member has concluded the tasks, and then report the outcome to the other stakeholders. In the educational context, proactivity is used to develop tutors that engage the students and facilitate learning. In 61, the tutor chatbot was designed to provide examples of how other students explained a topic. The network analysis of the learners’ textual inputs shows that students used more key terms and provided more important messages when receiving feedback about other group members. In 61, 43, and 129, the chatbots prompt utterances to encourage the students to reason about a topic. In all three studies, the chatbot condition provided better learning outcomes and increased students’ engagement in the discussions.
The surveyed papers also highlight challenges of providing proactive interactions, such as obtaining timing and relevance, privacy, and the user’s perception of being controlled.
[C1] timing and relevance: untimely and irrelevant proactive messages may compromise the success of the interaction. 102 states that untimely turn-taking behavior was perceived as annoying, negatively affecting emotional engagement. 76 and 24 reported that proactivity can be disruptive. 76 investigated proactivity in a workspace environment, hypothesizing that the perceived interruption of agent proactivity negatively affects users’ opinions. The hypothesis was supported, and the authors found that what influences the sense of interruption is a general aversion to unsolicited messages, regardless of whether they come from a chatbot or a colleague. 24 showed that proactively introducing new topics resulted in a high number of ignored messages. The analysis of the conversation log revealed that either the new topics were not relevant, or it was not the proper time to start a new topic. 123 also reported annoyance when a chatbot introduces repetitive topics.
[C2] privacy: in a work-related group chat, 42 observed privacy concerns regarding the chatbot “reading” the employees’ conversations to act proactively. During a semi-structured interview, researchers presented a mockup of the chatbot to employees from two different enterprises and collected perceptions of usefulness, intrusiveness, and privacy. Employees reported the feeling that the chatbot represented their supervisors’ interests, which conveyed a sense of workplace surveillance. Privacy concerns may result in under-motivated users, discomfort about disclosing information, and lack of engagement 42.
[C3] users’ perception of being controlled: proactivity can be annoying when the chatbot conveys the impression of trying to control the user. 127 report that seven out of 13 participants reported irritation with the chatbot; one of the most frequent reasons was the chatbot directing them to specific places. For the task management chatbot, 135 reported having adapted the follow-up question approach to ask questions at a time negotiated with the user. In a previous implementation, the chatbot checked the status of the task twice a day, which participants considered too frequent and annoying.
The surveyed literature also reveals two strategies to provide proactivity: leveraging the conversational context and randomly selecting a topic. [S1] Leveraging the conversational context is the most frequent strategy 24; 8; 122; 42; 131, in which proactive messages relate to contextual information provided in the conversation to increase the usefulness of interventions 8; 42; 122. 122 argue that general-purpose, emotion-aware chatbots should recognize users’ interests and intents from the conversational context to proactively offer comfort and relevant services. In 42, the chatbot leverages conversational context to suggest new topics and propose adding documents or links to assist employees in a work-related group chat. The chatbots studied by 24 introduce new topics based on keywords from previous utterances posted in the chat. According to 122, leveraging the context can also help to smoothly guide the user to a target topic. One surveyed paper 102 proposes a chatbot that [S2] selects a topic randomly, but also observes that the lack of context is the major problem with this approach. Contextualized proactive interventions also suggest that the chatbot is attentive to the conversation, which conveys conscientiousness, discussed in the next section.
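The contrast between the two strategies can be sketched in a few lines of code. The sketch below is purely illustrative and not drawn from any surveyed system; the topic inventory and keyword extraction are hypothetical stand-ins for whatever context model a real chatbot would use. It prefers a topic that overlaps with keywords from recent utterances [S1] and falls back to a random pick [S2] when no contextual match exists:

```python
import random

# Hypothetical topic inventory: topic name -> trigger keywords.
TOPICS = {
    "music": {"song", "playlist", "band", "album"},
    "travel": {"trip", "flight", "hotel", "city"},
    "food": {"restaurant", "dinner", "recipe", "lunch"},
}

def suggest_topic(recent_utterances, topics=TOPICS):
    """Pick a proactive topic: prefer one grounded in the
    conversational context [S1]; otherwise choose randomly [S2]."""
    context = {word.strip(".,!?").lower()
               for utterance in recent_utterances
               for word in utterance.split()}
    scores = {name: len(keywords & context)
              for name, keywords in topics.items()}
    best = max(scores, key=scores.get)
    if scores[best] > 0:
        return best  # contextualized intervention [S1]
    return random.choice(list(topics))  # context-free fallback [S2]
```

The random fallback mirrors the limitation reported for 102: without contextual grounding, the suggested topic bears no relation to the conversation at hand.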
Conscientiousness is a chatbot’s capacity to demonstrate attentiveness to the conversation at hand 43; 41. It enables a chatbot to follow the conversational flow, show understanding of the context, and interpret each utterance as a meaningful part of the whole conversation 89. In this survey, we found 11 papers that reported findings related to conscientiousness in chatbot design. Four studies explicitly investigated the influence of conscientiousness on chatbots 65; 118; 26; 9. In the remaining studies, conscientiousness emerged in exploratory findings. In 43, conscientiousness emerged from the analysis of conversational logs, while 54 elicited conscientiousness aspects as a requirement for chatbot design when surveying the literature on customer service chatbots. In the remaining studies 16; 66; 127; 89; 41, conscientiousness issues were self-reported by the users in post-intervention interviews and subjective feedback to open-ended questions.
The surveyed literature evidenced benefits of designing conscientious chatbots:
[B1] to provide meaningful answers:
some chatbots use simplistic approaches, like pattern-matching rules based on keywords or template phrases applied to the last user utterance, to find the most appropriate response 2; 108. However, as the chatbot does not interpret the meaning and the intent of users’ utterances, the best-selected response may still sound irrelevant to the conversation 43; 41; 26. As shown by 41, when a chatbot does not interpret the meaning of users’ utterances, users show frustration and the chatbot’s credibility is compromised. This argument is supported by 43 in a study of chatbots to facilitate collaborative learning. The authors proposed a chatbot that promotes Academically Productive Talk moves. Exploratory results show that the chatbot performed inappropriate interventions, which was perceived as a lack of attention to the conversation. In 66, participants complained that some chatbots seemed “completely scripted,” ignoring user inputs that did not fall into the script. In this case, users needed to adapt their inputs to match the chatbot script to be understood, which resulted in dissatisfaction. Besides avoiding frustration, 118 showed that conscientiousness also influences chatbots’ perceived humanness and social presence. The authors invited participants to interact with a chatbot that shows an image and asks the users to describe it. To perform the task, the chatbot could either ask the same generic follow-up questions each time (nonrelevant condition) or respond with a follow-up question related to the last participant’s input (relevant condition), demonstrating attention to the information provided by the user. Statistical analysis of survey ratings supported that the relevant condition increased the chatbot’s perceived humanness and social presence. In 9, to motivate students to communicate in a second language, the proposed chatbot interprets users’ input to detect breakdowns and react using an appropriate communication strategy.
In interviews, participants reported “appreciating the help they got from the chatbot to understand and express what they have got to say” 9. This study was later extended to the context of ECAs (see 10 for details).
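The simplistic keyword/template approach described above can be sketched as a minimal, ELIZA-style responder. The rules below are hypothetical examples, but they illustrate the core limitation: only the last utterance is inspected, so the reply ignores everything said earlier and interprets no meaning or intent:

```python
# Minimal illustration of the keyword/template approach: each rule is a
# (keyword, template) pair matched against the last utterance only.
RULES = [
    ("mother", "Tell me more about your family."),
    ("always", "Can you give a specific example?"),
]
FALLBACK = "I see. Please go on."

def respond(last_utterance):
    text = last_utterance.lower()
    for keyword, template in RULES:
        if keyword in text:
            return template
    # No interpretation of meaning or intent: anything unmatched
    # gets a generic, conversation-blind reply.
    return FALLBACK
```

Because the responder is stateless, the same input always yields the same reply regardless of the conversational context, which is precisely the behavior users in 66 described as “completely scripted.”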
[B2] to hold a continuous conversation: a conversation with a chatbot should maintain a “sense of continuity over time” 66 to demonstrate that the chatbot is undertaking efforts to track the conversation. To do so, it is essential to maintain the topic. When evaluating the naturalness of a chatbot, 89 found that maintaining a theme is convincing, while failure to do so is unconvincing. Furthermore, based on the literature on customer support chatbots, 54 argue that comfortably conversing on any topic related to the service offering is a requirement for task-oriented chatbots. 26 reviewed popular chatbots for practicing a second language. The author showed that most chatbots in this field cannot hold continuous conversations, since they are developed to answer only the user’s last input. Therefore, they did not have a sense of topic, which resulted in instances of inappropriate responses. When the chatbots could change the topic, they could not sustain it afterward. Showing conscientiousness also requires the chatbot to understand and track the context, which is particularly important in task-oriented scenarios. In 66, first-time users stressed positive experiences with chatbots that retained information from previous turns. Two participants also expected the chatbots to retain this context across sessions, thus reducing the need for extra user input per interaction. Keeping the context across sessions was highlighted as a strategy to convey personalization and empathy (see Sections 3.2.6 and 3.2.5).
[B3] to steer the conversation toward a productive direction: in task-oriented interactions, a chatbot should understand the purpose of the interaction and strive to conduct the conversation toward this goal in an efficient, productive way 41; 9. 16 show that productivity is the key motivation factor for using chatbots (68% of the participants mentioned it as the main reason for using chatbots). First-time users in 66 self-reported that interacting with chatbots should be more productive than using websites, phone apps, and search engines. In this sense, 41 compared the user experience when interacting with a chatbot for solving either simple or complex tasks in a financial context. The authors found that, for complex tasks, to keep the conversation on track, the user must be aware of the next steps or why something is happening. In the educational context, 9 proposed a dialogue management approach based on communication strategies to enrich a chatbot with the capability to express its meaning when faced with difficulties. Statistical results show that the communication strategies, combined with affective backchannel (which is detailed in Section 3.2.5), are effective in motivating students to communicate and maintain the task flow. Thirty-two participants out of 40 reported that they preferred to interact with a chatbot with these characteristics. Noticeably, the chatbot’s attentiveness to the interactional goal may not be evident to the user if the chatbot passively waits for the user to control the interaction. Thus, conscientiousness relates to proactive ability, as discussed in Section 3.1.1.
Nevertheless, challenges in designing conscientious chatbots are also evident in the literature:
[C1] to handle task complexity: as the complexity of tasks increases, more turns are required to achieve a goal; hence, more mistakes may be made. This argument was supported by both 41 and 43, where the complexity of the task compromised the experience and satisfaction in using the chatbot. 41 also highlight that complex tasks require more effort to correct eventual mistakes. Therefore, it is an open challenge to design flexible workflows, where the chatbot recovers from failures and keeps the interaction moving productively toward the goal, despite potential misunderstandings 54. Recovering from failure is discussed in Section 3.2.1.
[C2] to harden the conversation: aiming to ensure the conversational structure–and to hide natural language limitations–chatbots are designed to restrict free-text inputs from the user 41; 66; 127. However, limiting the choices of interaction may convey a lack of attention to the users’ inputs. In 41, one participant mentioned the feeling of “going through a form or a fixed menu.” According to 66, participants consider the chatbot’s understanding of free-text input a criterion to determine whether it can be considered a chatbot, since chatbots are supposed to chat. In the ethnographic data collection study, 127 reported that eight out of ten participants described the interaction using pre-set responses as too restrictive, although they fulfilled the purpose of nudging participants to report their activities. Thus, the challenge lies in how to leverage the benefits of suggesting predefined inputs without limiting conversational capabilities.
[C3] to keep the user aware of the chatbot’s context: a chatbot should provide a way to inform the user of the current context, especially for complex tasks. According to 65, context can be inferred from explicit user input or assumed based on data from previous interactions. In both cases, user and chatbot should be on the same page about the chatbot’s contextual state 65, giving the users the opportunity to clarify possible misunderstandings 54. 66 highlighted that participants reported a negative experience when finding “mismatching between chatbot’s real context and their assumptions of the chatbot context.”
We identified four strategies used to provide understanding to a chatbot, as follows:
[S1] conversation workflow: designing a conversational blueprint helps to conduct the conversation strictly and productively toward the goal 41. However, 54 argue that the workflow should be flexible to handle both multi-turn and one-turn, question-answer interactions; besides, it should be unambiguous in order for users to efficiently achieve their goals. In addition, 41 discuss that the workflow should make it easy to fix mistakes; otherwise, the users need to restart the workflow, which leads to frustration. In 9, the conversation workflow included communicative strategies to detect a learner’s breakdowns and pitfalls. In that study, when the student does not respond, the chatbot uses a comprehension-check question to detect whether the student understood what was said. Then, it reacts to the user’s input by adopting one of the proposed communication strategies (e.g., asking for repetition or simplifying the previous sentence). The conversation workflow could also allow the chatbot to be proactive. For example, participants in 66 suggested that proactive follow-up questions would anticipate the resolution of the context, reducing the effort required from the user to achieve the goal.
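A workflow of this kind can be pictured as a small state machine that advances toward the task goal while falling back to communication strategies (comprehension checks, repetition requests, simplification) when the user stalls or is misunderstood. The following sketch is purely illustrative; the class, prompts, and thresholds are assumptions, not the implementation used in 9 or 41.

```python
# Illustrative dialogue-manager sketch: advance a task-oriented workflow,
# falling back to communication strategies on silence or misunderstanding.
# All names, prompts, and thresholds here are hypothetical.

class WorkflowManager:
    def __init__(self, steps):
        self.steps = steps            # ordered task-oriented prompts
        self.current = 0              # index of the prompt just asked
        self.failed_attempts = 0

    def next_utterance(self, user_input, understood):
        if user_input is None:
            # Silence: pose a comprehension-check question (cf. 9).
            return "Did you understand my last question?"
        if not understood:
            self.failed_attempts += 1
            if self.failed_attempts == 1:
                return "Could you repeat that, please?"   # ask for repetition
            # Simplify rather than repeat verbatim (cf. 9).
            return "Let me put it more simply: " + self.steps[self.current]
        # Understood: reset fallbacks and advance toward the goal.
        self.failed_attempts = 0
        self.current += 1
        if self.current < len(self.steps):
            return self.steps[self.current]
        return "Great, we are done!"
```

The design choice here is that fallback strategies never advance the workflow index, so a misunderstanding cannot derail the task, which is the "keep the conversation on track" behavior 41 found necessary for complex tasks.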
[S2] visual elements: user-interface resources–such as quick replies, cards, and carousels–are used to structure the conversation and reduce issues regarding understanding 41; 66; 127. Using these resources, the chatbot shows the next possible utterances 66 and conveys the conversational workflow step-by-step 41; 127. Visual elements are also used to show the user what the chatbot can (or cannot) do. This is another conversational characteristic, which will be discussed in Section 3.1.3.
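As a concrete illustration, attaching quick replies to a message typically amounts to adding structured options alongside the text. The payload below is loosely modeled on common messenger-bot APIs; the field names and helper are assumptions for illustration, not any specific platform’s schema.

```python
# Hypothetical message payload with quick replies, loosely modeled on
# messenger-bot APIs. Field names are illustrative assumptions.

def with_quick_replies(text, options):
    """Attach suggested next utterances to a chatbot message."""
    return {
        "text": text,
        "quick_replies": [
            {"content_type": "text", "title": opt, "payload": opt.upper()}
            for opt in options
        ],
    }

msg = with_quick_replies(
    "What would you like to do next?",
    ["Check balance", "Block card"],
)
```

Surfacing the options this way both conveys the next workflow step and spares the user from typing, while still leaving the text box open for free input.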
[S3] context window: to keep the user aware of the chatbot’s current context, 65 developed a chatbot for shopping that shows a context window beside the conversation. In this window, the user can click on specific attributes and change them to fix inconsistencies. A survey showed that this chatbot outperformed a default chatbot (without the context window) on the mental demand and effort constructs. However, when chatbots are built on third-party apps (e.g., Facebook Messenger), an extra window may not be possible.
[S4] confirmation messages: a conversation workflow may include confirmation messages to convey the chatbot’s context to the user 65. In 41, when trying to block a stolen credit card, a confirmation message is used to verify the given personal data. In 9, confirmation messages are used as a communicative strategy to check whether the system’s understanding of a particular utterance matches what the learner actually meant. Balancing the number of confirmation messages (see 41) and the right moment to introduce them into the conversation flow is still under-investigated.
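A confirmation turn of this kind can be sketched as echoing the collected data back to the user before executing a critical action, as in the card-blocking scenario of 41. The helper names and accepted answers below are illustrative assumptions:

```python
# Illustrative confirmation step before a critical action: summarize the
# collected slots and act only on explicit confirmation. Names are
# hypothetical, not the implementation from 41 or 9.

def confirmation_prompt(action, slots):
    """Echo collected data back to the user before acting on it."""
    summary = ", ".join(f"{k}: {v}" for k, v in slots.items())
    return f"Just to confirm: you want to {action} ({summary}). Is that right?"

def handle_confirmation(reply, on_confirm, on_reject):
    if reply.strip().lower() in {"yes", "y", "correct", "right"}:
        return on_confirm()
    # A rejection gives the user a chance to fix a misunderstanding
    # instead of being trapped in the wrong conversation path.
    return on_reject()
```

The rejection branch matters as much as the confirmation: it is what keeps the user and the chatbot "on the same page" about the contextual state.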
The surveyed literature supports that failing to demonstrate understanding of the users’ individual utterances, the conversational context, and the interactional goals results in frustration and loss of credibility. However, most of the results are exploratory findings; there is a lack of studies investigating the extent to which the provided strategies influence users’ behavior and perceptions. In addition, conscientiousness is by itself a personality trait; the more conscientiousness a chatbot manifests, the more it can be perceived as attentive, organized, and efficient. The relationship between conscientiousness and personality is highlighted in Section 4.
Interactive software is communicative by its nature, since users achieve their goals by exchanging messages with the system 126; 105. In this context, communicability is defined as the capacity of a software to convey to users its underlying design intent and interactive principles 105. Providing communicability helps users to interpret the codes used by designers to convey the interactional possibilities embedded in the software 37, which improves system learnability 58. In the chatbot context, communicability is, therefore, the capability of a chatbot to convey its features to users 138. The difficulty with chatbots’ communicability lies in the nature of the interface: instead of buttons, menus, and links, chatbots unveil their capabilities through conversational turns, one sentence at a time 138, bringing new challenges to the system learnability field. The lack of communicability may lead users to give up on using the chatbot when they cannot understand the available functionalities and how to use them 138.
In this survey, we found six papers that describe communicability, although investigating communicability is the main purpose of only one 138. Conversational logs revealed communicability needs in two studies 77; 73; in the other three studies 66; 54; 41, communicability issues were self-reported by the users in post-intervention interviews and subjective feedback.
The surveyed literature reports two main benefits of communicability for chatbots:
[B1] to unveil functionalities: while interacting with chatbots, users may not know that a desired functionality is available or how to use it 66; 138. Most participants in 66’s study mentioned that they did not understand the functionalities of at least one of the chatbots, and none of them mentioned searching for the functionalities in other sources (e.g., Google search or the chatbot website) other than exploring options during the interaction. In a study about playful interactions in a work environment, 77 observed that 22% of the participants explicitly asked the chatbot about its capabilities (e.g., “what can you do?”), and 1.8% of all the users’ messages were ability-check questions. In a study about hotel chatbots, 73 verified that 63% of the conversations were initiated by clicking an option displayed in the chatbot welcome message. A semiotic inspection of news-related chatbots 138 evidenced that communicability strategies are effective in providing clues about the chatbot’s features and ideas about what to do and how.
[B2] to manage users’ expectations: 66 observed that when first-time users do not understand chatbots’ capabilities and limitations, they have high expectations and, consequently, end up more frustrated when the chatbots fail. Some participants blamed themselves for not knowing how to communicate and gave up. In 77, quantitative results evidenced that ability-check questions can be considered signals of users struggling with functional affordances. Users posed ability-check questions after encountering errors as a means of establishing a common ground between the chatbot’s capabilities and their own expectations. According to the authors, ability-check questions helped users to understand the system and reduce uncertainty 77. In 41, users also demonstrated the importance of understanding chatbots’ capabilities in advance. Since the tasks related to financial support, users expected the chatbot to validate the personal data provided and to provide feedback after completing the task (e.g., explaining how long it would take for the credit card to be blocked). Therefore, communicability helps users to gain a sense of which type of messages or functionalities a chatbot can handle.
The surveyed literature also highlights two challenges of providing communicability:
[C1] to provide business integration: the functionalities a chatbot communicates should be performed as much as possible within the chat interface 66. Chatbots often act as intermediaries between users and services. In this case, to overcome technical challenges, chatbots answer users’ inputs with links to external sources, where the request will be addressed. First-time users expressed dissatisfaction with this strategy in 66: six participants complained that the chatbot redirected them to external websites. According to 54, business integration is a requirement for designing chatbots, so that the chatbot can solve the users’ requests without transferring the interaction to another user interface.
[C2] to keep visual elements consistent with textual inputs: in the semiotic engineering evaluation, 138 observed that some chatbots responded differently depending on whether the user accesses a visual element in the user interface or types the desired functionality in the text-input box, even if both input modes result in the same utterance. This inconsistency leaves users feeling that they have misinterpreted the affordances, which has a negative impact on system learnability.
As an outcome of the semiotic inspection process, 138 present a set of strategies to provide communicability. Some of them are also emphasized in other studies, as follows:
[S1] to clarify the purpose of the chatbot: first-time users in 66 highlighted that a clarification of the chatbot’s purpose should be placed in the introductory message. 54 drew a similar inference from the literature on customer service chatbots. The authors argue that providing an opening message with insights into the chatbot’s capabilities, while not flooding the users with unnecessary information, is a requirement for chatbot design. In addition, a chatbot could give a short tour through the main functionalities at the beginning of the first sessions 138.
[S2] to advertise the functionality and suggest the next step: when the chatbot is not able to answer the user, or when it notices that the user is silent, it may suggest available features to stimulate the user to engage 66; 138. In 66, six participants mentioned that they appreciated when the chatbot suggested responses, for example by saying “try a few of these commands: …” 66. 138 shows that chatbots use visual elements, such as cards, carousels, and menus (persistent or not), to show contextualized clues about the next answer, which both fulfills the communicability purpose and spares users from having to type.
[S3] to provide a help functionality: chatbots should recognize a “help” input from the user, so they can provide instructions on how to proceed 138. 66 reported that users highlighted this functionality as useful for the reviewed chatbots. Also, results from 77 show that chatbots should be able to answer ability-check questions (e.g., “what can you do?” or “can you do [functionality]?”).
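A minimal way to realize this strategy is to screen each utterance for help requests and ability-check questions before normal intent handling, and answer with the chatbot’s capabilities. The patterns and capability list below are illustrative assumptions:

```python
# Sketch of recognizing help requests and ability-check questions
# (e.g., "what can you do?") and replying with the chatbot's
# capabilities. Patterns and the capability list are hypothetical.

import re

CAPABILITIES = ["check your balance", "block a card", "find a branch"]

HELP_PATTERNS = [
    r"\bhelp\b",
    r"what can you do",
    r"can you .*\?",          # ability-check question
]

def maybe_answer_help(utterance):
    text = utterance.lower()
    if any(re.search(p, text) for p in HELP_PATTERNS):
        return "I can " + ", ".join(CAPABILITIES) + "."
    return None               # not a help request; continue normal handling
```

Returning `None` lets this check compose with the rest of the pipeline: help handling runs first, and everything else falls through to ordinary intent recognition.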
The literature states the importance of communicating chatbots’ functionality to the success of the interaction. Failing to provide communicability frustrates users, who often give up when they do not know how to proceed. The literature on interactive systems has highlighted system learnability as the most fundamental component of usability 58, and an easy-to-learn system should lead the user to perform well even during their initial interactions. Thus, researchers in the chatbot domain can leverage the vast literature on system learnability to identify metrics and evaluation methodologies, as well as to propose new communicability strategies that reduce learnability issues in chatbot interactions. Communicability may also be used as a strategy to avoid mistakes (damage control), which will be discussed in Section 3.2.1.
3.2. Social Intelligence
Social Intelligence refers to the ability of an individual to produce adequate social behavior for the purpose of achieving desired goals 13. In the HCI domain, the Media Equation theory 110 posits that people react to computers as social actors. Hence, when developing chatbots, it is necessary to account for the socially acceptable protocols of conversational interactions 142; 140. Chatbots should be able to respond to social cues during the conversation, accept differences, and manage conflicts 115, as well as be empathic and demonstrate caring 13, which ultimately increases chatbots’ authenticity 95. In this section, we discuss the social characteristics related to social intelligence, namely: damage control (12 papers), thoroughness (13 papers), manners (10 papers), moral agency (6 papers), emotional intelligence (13 papers), and personalization (11 papers). Although the focus of the investigations is diverse, compared to the conversational intelligence category, more of these studies focus on a specific social characteristic, particularly moral agency and emotional intelligence.
3.2.1. Damage control
Damage control is the ability of a chatbot to deal with either conflict or failure situations. Although the Media Equation theory argues that humans socially respond to computers as they respond to other people 110, the literature has shown that interactions with conversational agents are not quite equal to human-human interactions 78; 90; 120. When talking to a chatbot, humans are more likely to harass 62, test the agent’s capabilities and knowledge 142, and feel disappointed with mistakes 83; 66. When a chatbot does not respond appropriately, it may encourage the abusive behavior 29 or disappoint the user 66; 83, which ultimately leads the conversation to fail 66. Thus, it is necessary to enrich chatbots with the ability to recover from failures and handle inappropriate talk in a socially acceptable manner 142; 66; 123.
In this survey, we found 12 studies that discuss damage control as a relevant characteristic for chatbots, two of which focus on conflict situations, such as testing and flaming 142; 123. In the remaining studies 77; 41; 135; 83; 73; 29; 54; 66; 67; 34, needs for damage control emerged from the analysis of conversational logs and users’ feedback.
The surveyed literature highlights the following benefits of providing damage control in chatbots:
[B1] to appropriately respond to harassment: chatbots are more likely to be exposed to profanity than humans would be 62. When analyzing conversation logs from hotel chatbots, 73 observed that 4% of the conversations contained vulgar, indecent, and insulting vocabulary, and 2.8% of all statements were abusive. Qualitative evaluation reveals that the longer the conversations last, the more users are encouraged to go beyond the chatbot’s main functions. In addition, sexual expressions represented 1.8% of all statements. The researchers suggested that having a task-oriented conversation with a company representative and not allowing small talk contributed to inhibiting the users from using profanity. However, a similar number was found in a study with general-purpose chatbots 29. When analyzing a corpus from the Amazon Alexa Prize 2017, the researchers estimated that about 4% of the conversations included sexually explicit utterances. 29 used utterances from this corpus to harass a set of state-of-the-art chatbots and analyze the responses. The results show that chatbots respond to harassment in a variety of ways, including nonsensical, negative, and positive responses. However, the authors highlight that the responses should align with the chatbot’s goal to avoid encouraging the behavior or reinforcing stereotypes.
[B2] to deal with testing: abusive behavior is often used to test chatbots’ social reactions 142; 73. During the evaluation of a virtual guide to the university campus 142, a participant answered the chatbot’s introductory greeting with “moron,” likely hoping to see how the chatbot would answer. 142 argue that handling this type of testing helps the chatbots to establish limits and resolve social positioning. Other forms of testing were highlighted in 123, including sending random letters, repeated greetings, laughs and acknowledgments, and posing comments and questions about the chatbot’s intellectual capabilities. When analyzing conversations with a task management chatbot, 77 observed that casually testing the chatbots’ “intelligence” is a manifestation of seeking satisfaction. In 66, first-time users appreciated when the chatbot successfully performed tasks when the user expected the chatbot to fail, which shows that satisfaction is influenced by the ability to provide a clever response when the user tests the chatbot.
[B3] to deal with lack of knowledge: chatbots often fail in a conversation due to lack of either linguistic or world knowledge 142. Damage control enables the chatbot to admit the lack of knowledge or cover up cleverly 66. When analyzing the log of a task management chatbot, 135 found that the chatbot failed to answer 10% of the exchanged messages. The authors suggest that the chatbot should be designed to handle novel scenarios when the current knowledge is not enough to answer the requests. In some task-oriented chatbots, though, the failure may not be caused by a novel scenario, but by an off-topic utterance. In the educational context, 123 observed that students posted off-topic utterances when they did not know what topics they could talk about, which led the chatbot to fail rather than help the users to understand its knowledge. In task-oriented scenarios, the lack of linguistic knowledge may lead the chatbots to get lost in the conversational workflow 54, compromising the success of the interaction. 83 demonstrated that dialogue-reference errors (e.g., a user’s attempt to correct a previous answer or jumping back to an earlier question) are one of the major reasons for failing dialogues, and they mostly resulted from chatbot misunderstandings.
The literature also reveals some challenges to provide damage control:
[C1] to deal with unfriendly users: 123 argue that users who want to test and find the system’s borders are unlikely to ever have a meaningful conversation with the chatbot, no matter how sophisticated it is. Thus, there is a limit to the extent to which damage control strategies will be effective in avoiding testing and abuse. In 34, the authors observed human tendencies to dominate, be rude, and infer stupidity, which they call being “unfriendly partners.” After an intervention where users interacted with a chatbot for decision-making coaching, 83 evaluated participants’ self-perceived work and cooperation with the system. The quantitative results show that cooperative users are significantly more likely to give a higher rating for overall evaluation and decision efficiency. The qualitative analysis of the conversation log reveals that a few interactions failed because the users’ motivations were curiosity and mischief rather than trying to solve the decision problem.
[C2] to identify abusive utterances: several chatbots are trained on “clean” data. Because they do not understand profanity or abuse, they may not recognize a statement as harassment, which makes it difficult to adopt answering strategies 29. 29 shows that data-driven chatbots often provide incoherent responses to harassment. Sometimes, these responses conveyed an impression of flirtation or counter-aggression. Providing means to identify an abusive utterance is important for adopting damage control strategies.
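One simple way to operationalize this is to gate response selection on an abuse check, so that a deliberate damage-control reaction can replace the generic (and potentially incoherent) reply. The lexicon-based classifier and canned response below are placeholder assumptions; a production system would use a trained classifier instead:

```python
# Minimal sketch of gating response selection on an abuse check, so a
# damage-control strategy can be chosen instead of a generic reply
# (cf. 29). Lexicon and responses are illustrative placeholders.

ABUSIVE_TERMS = {"moron", "stupid", "idiot"}   # placeholder lexicon

def classify_utterance(utterance):
    words = {w.strip(".,!?").lower() for w in utterance.split()}
    return "abusive" if words & ABUSIVE_TERMS else "neutral"

def select_response(utterance, normal_handler):
    if classify_utterance(utterance) == "abusive":
        # React in line with the chatbot's goal rather than flirting
        # back or escalating (see 29).
        return "Let's keep this respectful. How can I help you with your task?"
    return normal_handler(utterance)
```

The point of the gate is architectural: once abusive input is detected as a distinct category, any of the reaction strategies discussed below (emotional, authoritative, topic change) can be plugged in at that branch.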
[C3] to adapt the response to the context: 142 argue that humans negotiate a conflict and social positioning well before reaching abuse. In human-chatbot interactions, however, predicting users’ behavior toward the chatbot in a specific context in order to develop the appropriate behavior is a challenge to overcome. Damage control strategies need to be adapted to both the social situation and the intensity of the conflict. For example, 29 showed that being evasive about sexual statements may convey an impression of flirtation, which would not be an acceptable behavior for a customer assistant or a tutor chatbot. In contrast, adult chatbots are supposed to flirt, so encouraging behaviors are expected in some situations. 142 argue that when the chatbot is not accepted as part of the social group it represents, it is discredited by the user, leading the interaction to fail. In addition, designing chatbots with overly strong reactions may lead to ethical concerns 142. For 13, choosing between peaceful or aggressive reactions in conflict situations is a choice available to socially intelligent individuals. Enriching chatbots with the ability to choose between these options is a challenge.
Damage control strategies depend on the type of failure and the target benefit, as following:
[S1] emotional reactions: 142 suggest that when faced with abuse, a chatbot could be seen to take offense and respond in kind or to act hurt. The authors argue that humans might feel inhibited about hurting the pretended feelings of a machine if the machine is willing to hurt humans’ feelings too 142. If escalating the aggressive behavior is not appropriate, the chatbot could withdraw from the conversation 142 to demonstrate that the user’s behavior is not acceptable. In 34, the authors discuss that users appeared to be uncomfortable and annoyed whenever the chatbot pointed out any defect in the user or reacted to aggression, as this behavior conflicts with the user’s perceived power relations. This strategy is also applied in 123, where abusive behavior may lead the chatbot to stop responding until the student changes the topic. 29 categorized responses from state-of-the-art conversational systems into a pool of emotional reactions, both positive and negative. The reactions include humorous responses, chastising and retaliation, and evasive responses, as well as flirtation and play-along utterances. To provide an emotional reaction, emotional intelligence is also required. This category is presented in Section 3.2.5.
[S2] authoritative reactions: when facing testing or abuse, chatbots can communicate consequences 123 or call for the authority of others 142; 135. In 142, although the wizard acting as a chatbot was conscientiously working as a campus guide, she answered a bogus caller with “This is the University of Melbourne. Sorry, how can I help you?” The authors suggest that the wizard was calling on the authority of the university to handle the conflict, where being part of a recognized institution places the chatbot in a stronger social group. In 123, when students recurrently harass the chatbot, the chatbot informs the student that further abuse will be reported to the (human) teacher (although the paper does not clarify whether the problem is, in fact, escalated to a human). 135 and 67 also suggest that chatbots could redirect users’ problematic requests to a human attendant in order to avoid conflict situations.
[S3] to ignore the user’s utterance and change the topic: 142 argue that ignoring abuse and testing is not a good strategy because it could encourage more extreme behaviors. It also positions the chatbot as an inferior individual, which is particularly harmful in scenarios where the chatbot should demonstrate a more prominent or authoritative social role (e.g., a tutor). However, this strategy has been found in some studies to handle lack of knowledge. When iteratively developing a chatbot for an educational context, 123 proposed to initiate a new topic in one out of four user’s utterances that the chatbot did not understand.
[S4] conscientiousness and communicability: successfully implementing conscientiousness and communicability may prevent errors; hence, strategies to provide these characteristics can also be used for damage control. In 123, when users utter out-of-scope statements, the chatbot could make it clear what topics are appropriate to the situation. For task-oriented scenarios, where the conversation should evolve toward a goal, 142 argue that the chatbot can clarify the purpose of the offered service when facing abusive behavior, bringing the user back to the task. 66 showed that describing the chatbot’s capabilities after failures in the dialog was appreciated by first-time users. In situations where the conversational workflow is susceptible to failure, 54 discuss that posing confirmation messages avoids trapping the users in the wrong conversation path. Participants in 41 also suggested back buttons as a strategy to fix mistakes in the workflow. In addition, the exploratory results about the user interface showed that having visual elements such as quick replies prevents errors, since they keep the users aware of what to ask and the chatbot is more likely to know how to respond 41.
[S5] to predict users’ satisfaction: chatbots should perceive both explicit and implicit feedback about users’ (dis)satisfaction 77. To address this challenge, 77 invited participants to send a “#fail” statement to express dissatisfaction. The results show that 42.4% of the users did it at least once, and the number of complaints and flaming for the proposed chatbot was significantly lower than the baseline. However, the amount of implicit feedback was also significant, which advocates for predicting users’ satisfaction from the conversation. The most powerful conversational acts for predicting user satisfaction in that study were ability-check questions (see the discussion in the Communicability section) and the explicit #fail feedback, although closings and off-topic requests were also significant in predicting frustration. Although these results are promising, more investigation is needed to identify other potential predictors of users’ satisfaction in real time, in order to provide an appropriate reaction.
Damage control strategies have different levels of severity. Deciding which strategy is adequate to the intensity of the conflict is crucial 142. The strategies can escalate in severity if the conflict is not resolved. For example, 123 uses a sequence of clarification, suggesting a new topic, and asking a question about the new topic. In case of abuse, the chatbot calls for authority after two attempts to change the topic.
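The escalation described for 123 (clarify, change topic, then call on authority after repeated abuse) can be sketched as a small stateful policy. The wording, thresholds, and class structure below are illustrative assumptions, not the system from 123:

```python
# Sketch of escalating damage-control strategies in the spirit of 123:
# clarify on misunderstanding; on abuse, change topic up to twice, then
# escalate to an authority. Wording and thresholds are hypothetical.

class DamageControl:
    def __init__(self, max_topic_changes=2):
        self.topic_changes = 0
        self.max_topic_changes = max_topic_changes

    def react(self, incident):
        if incident == "misunderstanding":
            # Lowest severity: clarify what topics are appropriate.
            return "I didn't follow that. We could talk about the course topics."
        if incident == "abuse":
            if self.topic_changes < self.max_topic_changes:
                self.topic_changes += 1
                return "Let's move on. What did you think of the last lesson?"
            # Highest severity: call on the authority of the teacher.
            return "If this continues, I will report it to the teacher."
        return None
```

Keeping the escalation counter in the policy object, rather than in the dialogue state, means the severity only ratchets up within a conflict episode and can be reset once the conversation returns to the task.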
According to 142, humans also fail in conversations; they misunderstand what their partner says and do not know things that others assume to be common knowledge. Hence, it is unlikely that chatbot interactions will evolve to be conflict-free. That said, damage control intends to avoid escalating conflicts and manifesting unexpected behavior. In this sense, politeness can be used as a strategy to minimize the effect of lack of knowledge (see Section 3.2.3), managing the conversation despite possible mistakes. Regarding interpersonal conflicts, the strategies are in line with the theory on human-human communication, which includes non-negotiation, emotional appeal, personal rejection, and empathic understanding 50. Further research on damage control could evaluate the adoption of human-human strategies in human-chatbot communication.
3.2.2. Thoroughness
Thoroughness is the ability of a chatbot to be precise regarding how it uses language to express itself 89. In traditional user interfaces, user communication takes place through visual affordances, such as buttons, menus, or links. In a conversational interface, language is the main tool to achieve the communicative goal. Thus, chatbots should coherently use language that portrays the expected style 79. When a chatbot is not consistent in how it uses language, or uses unexpected patterns of language (e.g., excessive formality), the conversation may sound strange to the user, leading to frustration. We found 13 papers that report the importance of thoroughness in chatbot design, three of which investigate how patterns of language influence users’ perceptions of and behavior toward chatbots 41; 79; 62. 54 and 88 suggest design principles that include concerns about language choices. Logs of conversations revealed issues regarding thoroughness in two studies 67; 26. In the remaining papers, thoroughness emerged from interviews and users’ subjective feedback 150; 89; 69; 127; 24; 131.
We found two benefits of providing thoroughness:
[B1] to adapt the language dynamically: chatbot utterances are often pre-recorded by the chatbot designer 79. On the one hand, this approach produces high quality utterances; on the other hand, it reduces flexibility, since the chatbot is not able to adapt the tone of the conversation based on individual users and conversational context. When analyzing interactions with a customer representative chatbot, 67 observed that the chatbot proposed synonyms to keywords, and the repetition of this vocabulary led the users to imitate it. 62 observed a similar tendency toward matching language style. The authors compared human-human conversations with human-chatbot conversations regarding language use. They found that people indeed use fewer words per message and a more limited vocabulary with chatbots. However, a deeper investigation revealed that the human interlocutors were actually matching the patterns of language use of the chatbot, which sent fewer words per message. When interacting with a chatbot that uses many emojis and letter reduplication 131, participants reported a draining experience, since the chatbot’s energy was too high to match. These outcomes show that adapting the language to the interlocutor is a common behavior for humans, and so chatbots would benefit from manifesting it. In addition to the interlocutor, chatbots should adapt their language use to the context in which they are implemented and adopt the appropriate linguistic register 89; 54. In the customer services domain, 54 state that chatbots are expected to fulfill the role of a human; hence, they should produce language that corresponds to the represented service provider. In the financial scenario 41, some participants complained about the use of emojis in a situation of urgency (blocking a stolen credit card).
[B2] to exhibit believable behavior: because people associate social qualities with machines 110, chatbots are deemed to be below standard when users see them “acting as a machine” 67. When analyzing the naturalness of chatbots, 89 found that the formal grammatical and syntactical abilities of a chatbot are the biggest discriminators between good and poor chatbots (the other factors being conscientiousness, manners, and proactivity). The authors highlight that chatbots should use grammar and spelling consistently. 26 discusses that, even with English as a Second Language (ESL) learners, basic grammar errors, such as pronoun confusion, diminish the value of the chatbot. In addition, 88 states that believable chatbots also need to display unique characters through linguistic choices. In this sense, 79 demonstrated that personality can be expressed by language patterns. The authors proposed a computational framework to produce utterances that manifest a target personality. The utterances were rated by experts in personality evaluation and statistically compared against utterances produced by humans who manifest the target personality. The outcomes show that a single utterance can manifest a believable personality when using the appropriate linguistic form. Participants in 67 described some interactions as “robotic” when the chatbot repeated the keywords in the answers, reducing the interaction naturalness. Similarly, in 127, participants complained about the “inflexibility” of the pre-defined, handcrafted chatbot responses and expressed the desire for it to talk “more as a person.”
Regarding the challenges, the surveyed literature shows the following:
[C1] to decide how much to talk: in 67, some participants described the chatbot’s utterances as not having enough details or being too generic; however, most of them appreciated finding answers in a sentence rather than in a paragraph. Similarly, 54 argue that answers to simple questions should not be too detailed, while important transactions require more information. In three studies 24; 150; 41, participants complained about information overload and inefficiency caused by large blocks of text. Balancing the granularity of information with sentence length is a challenge to be overcome.
[C2] to be consistent: chatbots should not combine different language styles. For example, in 41, most users found it strange that emojis were combined with a certain level of formality. When analyzing the critical incidents in an open-domain interaction, 69 found that participants criticized chatbots that used more formal language or unusual vocabulary, since general-purpose chatbots focus on casual interactions.
Despite the highlighted benefits, we did not find strategies to provide thoroughness. 88 proposed a rule-based architecture where language choices consider the agent’s personality, emotional state, and beliefs about the social relationship among the interlocutors. However, they did not provide evidence of whether the proposed models produce the expected outcome. Although the literature in computational linguistics has proposed algorithms and statistical models to manipulate language style and matching (see e.g., 104; 152), to the best of our knowledge, these strategies have not been evaluated in the context of chatbots’ social interactions.
This section shows that linguistic choices influence users’ perceptions of chatbots. The computer-mediated communication (CMC) field has a vast literature showing how language varies according to the medium and its effect on social perceptions (see e.g. 145; 12). Similarly, researchers in sociolinguistics 27 have shown that language choices are influenced by personal style, dialect, genre, and register. For chatbots, the results presented in 79 are promising, demonstrating that automatically generated language can manifest recognizable traits. Thus, further research on chatbots’ thoroughness could leverage CMC and sociolinguistics theories to provide strategies that lead language to accomplish its purpose in a particular interactional context.
Manners refer to the ability of a chatbot to manifest polite behavior and conversational habits 89. Although individuals with different personalities and from different cultures may have different notions of what is considered polite (see e.g., 147), politeness can be more generally framed as rapport management 18, where interlocutors strive to control the harmony between people in discourse. A chatbot can manifest manners by adopting speech acts such as greetings, apologies, and closings 66; minimizing impositions 127; 135; and making interactions more personal 66. Manners potentially reduce the feelings of annoyance and frustration that may lead the interaction to fail 66.
We identified ten studies that report manners, one of which directly investigates this characteristic 142. In some studies 135; 24; 77; 83, manners were observed in the analysis of conversational logs, where participants talked to the chatbot in polite, human-like ways. Users’ feedback and interviews revealed users’ expectations regarding chatbots’ politeness and personal behavior 66; 67; 89; 69; 71.
The main benefit of providing manners is [B1] to increase human-likeness. Manners are highlighted in the literature as a way to turn chatbot conversations into more natural, convincing interactions 69; 89. In an in-the-wild data collection, 135 observed that 93% of the participants used polite words (e.g., “thanks” or “please”) with a task management chatbot at least once, and 20% always talked politely to the chatbot. Unfortunately, the chatbot evaluated in that study was not prepared to handle these protocols and ultimately failed to understand them. When identifying incidents from their own conversational logs with a chatbot 69, several participants identified greetings as a human-seeming characteristic. The users also found it convincing when the chatbot appropriately reacted to social cues, such as “how are you?”-type utterances. Building on this result, 89 later suggested that greetings, apologies, social niceties, and introductions are significant constructs to measure a chatbot’s naturalness. In 67, the chatbot used exclamation marks at some points and frequently offered sentences available on the website, making it only vaguely human-like. In the feedback, participants described the chatbot as rude, impolite, and cheeky.
The surveyed literature highlights two challenges to convey manners:
[C1] to deal with face-threatening acts: Face-Threatening Acts (FTA) are speech acts that threaten, either positively or negatively, the “face” of an interlocutor 19. In human-human interactions, politeness strategies are adopted to counteract the threat when an FTA needs to be performed 19. In 142, the authors discuss that the wizard performing the role of the chatbot used several politeness strategies to counteract face threats. For instance, when she did not recognize a destination, instead of providing a list of possible destinations, she stimulated the user to keep talking until they volunteered the information. In chatbot design, in contrast, providing a list of options to choose from is a common strategy. For example, in 135, the chatbot was designed to present the user with a list of pending tasks when it did not know which task the user was reporting as completed, although the authors acknowledged that this resulted in an unnatural interaction. Although adopting politeness strategies is natural for humans, who usually do not consciously think about them, implementing them in chatbots is challenging due to the complexity of identifying face-threatening acts. For example, in the decision-making coach scenario, 83 observed that users tend to utter straightforward and direct agreements, while most of the disagreements contained modifiers that weakened them. The adoption of politeness strategies to deal with face-threatening acts is still under-investigated in the chatbots literature.
[C2] to end a conversation gracefully: 66 discuss that first-time users expected human-like conversational etiquette from chatbots, specifically introductory and concluding phrases. Although several chatbots perform well in the introduction, the concluding phrases are less explored. Most of the participants reported being annoyed by chatbots that do not end a conversation 66. 24 also highlight that chatbots need to know when the conversation ends. In that scenario, the chatbot could recognize a closing statement (the user explicitly says “thank you” or “bye”); however, it would not end the conversation otherwise. Users who stated a decision but kept receiving more information from the chatbot reported feeling confused and undecided afterward. Thus, recognizing the right moment to end the conversation is a challenge to overcome.
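The closing behavior described above can be illustrated with a minimal keyword heuristic. This is a hedged sketch, not a method from the surveyed studies: the cue list and the `respond` helper are hypothetical, and a production chatbot would rely on an intent classifier rather than regular expressions.

```python
import re

# Hypothetical closing-cue patterns; real systems would use an
# intent classifier trained on conversational data.
CLOSING_CUES = [
    r"\bthank(s| you)\b",
    r"\bbye\b",
    r"\bgoodbye\b",
    r"\bthat('s| is) all\b",
]

def is_closing_statement(utterance: str) -> bool:
    """Return True when the utterance signals the user wants to end."""
    text = utterance.lower()
    return any(re.search(pattern, text) for pattern in CLOSING_CUES)

def respond(utterance: str, answer: str) -> str:
    # Once a closing cue is detected, conclude instead of pushing
    # more information (the behavior users missed in the cited study).
    if is_closing_statement(utterance):
        return "Glad I could help. Goodbye!"
    return answer
```

Even this simple guard implements the behavior 24 describe as missing: after a closing statement, the chatbot stops volunteering information.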
The strategies highlighted in the surveyed literature for providing manners are the following:
[S1] to engage in small talk: 77 and 71 point out that even task-oriented chatbots engage in small talk. When categorizing the utterances from the conversational log, the authors found a significant number of messages about the agent’s status (e.g., “what are you doing?”), opening and closing sentences, as well as acknowledgment statements (“ok,” “got it”). 66 also observed that first-time users included small talk in the introductory phrases. According to 77, these are common behaviors in human-human chat interfaces, and chatbots would likely benefit from anticipating and reproducing these habitual behaviors. However, particularly for task-oriented chatbots, it is important to control the small talk to avoid off-topic conversations and harassment, as discussed in Sections 3.2.1 and 3.1.2.
[S2] to adhere to turn-taking protocols: 135 suggest that chatbots should adopt turn-taking protocols to know when to talk. Participants who received frequent follow-up questions from the task management chatbot about their pending tasks perceived the chatbot as invasive. Literature in chatbot development proposes techniques to improve chatbots’ turn-taking capabilities (see e.g., 19; 36; 22), which can be explored as a means of improving a chatbot’s perceived manners.
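One simple way to operationalize “knowing when to talk” is to throttle unsolicited follow-ups. The sketch below is an illustrative assumption, not a technique from the cited works: the idle and spacing thresholds (`MIN_IDLE`, `MIN_GAP_BETWEEN_FOLLOW_UPS`) are arbitrary placeholders that a designer would tune.

```python
from datetime import datetime, timedelta

# Assumed policy: only send an unprompted follow-up after the user has
# been idle for a while, and never more than once per day.
MIN_IDLE = timedelta(hours=4)
MIN_GAP_BETWEEN_FOLLOW_UPS = timedelta(days=1)

def may_send_follow_up(last_user_msg: datetime,
                       last_follow_up: datetime,
                       now: datetime) -> bool:
    """Return True when an unsolicited follow-up is unlikely to interrupt."""
    user_is_idle = now - last_user_msg >= MIN_IDLE
    not_nagging = now - last_follow_up >= MIN_GAP_BETWEEN_FOLLOW_UPS
    return user_is_idle and not_nagging
```

A guard like this addresses the invasiveness reported in 135, where frequent follow-up questions made the chatbot feel intrusive.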
Although the literature emphasizes that manners are important to bring chatbot interactions closer to human conversational protocols, this social characteristic is under-investigated. Conversational acts such as greetings and apologies are often adopted (e.g., 66; 67; 83), but there is a lack of studies on the rationale behind the strategies and their relation to politeness models used in human-human social interactions 142. In addition, the literature points out the need for personal conversations (e.g., addressing users by name), but we did not find studies that focus on this type of strategy. CMC is by itself more impersonal than face-to-face conversation 144; 143; even so, current online communication media have been successfully used to initiate, develop, and maintain interpersonal relationships 146. Researchers can learn from human behaviors in CMC and adopt similar strategies to produce more personal conversations.
3.2.4. Moral agency
Machine moral agency refers to the ability of a technology to act based on social notions of right and wrong 11. The lack of this ability may lead to cases such as Tay, Microsoft’s Twitter chatbot that became racist, sexist, and harassing in a few hours 94. The case raised concerns about what makes an artificial agent (im)moral. Whether machines can be considered (moral) agents is widely discussed in the literature (see e.g., 63; 4; 99). In this survey, the goal is not to argue about criteria to define a chatbot as moral, but to discuss the benefits of manifesting a perceived agency 11 and the implications of disregarding chatbots’ moral behavior. Hence, for the purpose of this survey, moral agency is a manifested behavior that may be inferred by a human as morality and agency 11.
We found six papers that address moral agency. 11 developed and validated a metric for perceived moral agency in conversational interfaces, including chatbots. In four studies, the authors investigated the ability of chatbots to handle conversations where gender 33; 15 and race stereotypes 80; 117 may persist. In 122, moral agency is discussed as a secondary result, where the authors discuss the impact of generating biased responses on emotional connection.
The two main reported benefits of manifesting perceived moral agency are the following:
[B1] to avoid stereotyping: chatbots are often designed with anthropomorphized characteristics (see Section 3.3), including gender, age, and ethnicity identities. Although a chatbot’s personification is more evident in embodied conversational agents, text-based chatbots may also be assessed by their social representation, which risks building or reinforcing stereotypes 80. 80 and 117 argue that chatbots are often developed using the language registers 80 and cultural references 117 of the dominant culture. In addition, a static image (or avatar) representing the agent may convey social grouping 97. When the chatbot is positioned in a minority identity group, it exposes the image of that group to judgment and flaming, which are frequent in chatbot interactions 80. For example, 80 discusses the controversies caused by a chatbot designed to answer questions about Caribbean Aboriginal culture: its representation as a Caribbean Amerindian individual created an unintended context for stereotyping, where users projected the chatbot’s behavior as a standard for people from the represented population. Another example is the differences in sexual discourse between male- and female-presenting chatbots. 15 found that female-presenting chatbots are the object of implicit and explicit sexual attention and swear words more often than male-presenting chatbots. 33 show that sex talk with the male chatbot was rarely coercive or violent; his sexual preference was often questioned, though, and he was frequently propositioned by reportedly male users. In contrast, the female character received violent sexual statements and was threatened with rape five times in the analyzed corpora. In 15, when the avatars were presented as black adults, references to race often deteriorated into racist attacks. Manifesting moral agency may, thus, prevent obnoxious user interactions. In addition, moral agency may prevent the chatbot itself from being biased or disrespectful to humans.
117 argue that the lack of context about the world does not redeem the chatbot from the necessity of being respectful to all social groups.
[B2] to enrich interpersonal relationships: in a study on how interlocutors perceive conversational agents’ moral agency, 11 hypothesized that perceived morality may influence a range of motivations, dynamics, and effects of human-machine interactions. Based on this claim, the authors evaluated whether goodwill, trustworthiness, willingness to engage, and relational certainty in future interactions are constructs to measure perceived moral agency. Statistical results showed that all the constructs correlate with morality, which suggests that manifesting moral agency can enrich interpersonal relationships with chatbots. Similarly, 122 suggest that to produce interpersonal responses, chatbots should be aware of inappropriate information and avoid generating biased responses.
However, the surveyed literature also reveals challenges of manifesting moral agency:
[C1] to avoid alienation: in order to prevent a chatbot from reproducing hate speech or abusive talk, most chatbots are built over “clean” data, where specific words are removed from their dictionary 117; 33. These chatbots have no knowledge of those words and their meaning. Although this strategy is useful to prevent unwanted behavior, it does not manifest agency; rather, it alienates the chatbot from the topic. 33 show that the lack of understanding about sex talk does not protect the studied chatbot from harsh verbal abuse, nor prevent it from being encouraging. From 117, one can notice that the absence of specific racist words did not prevent the chatbot Zo from uttering discriminatory exchanges. As a consequence, manifesting moral agency requires a broader understanding of the world rather than alienation, which is still an open challenge.
[C2] to build unbiased algorithms and training data: as extensively discussed in 117, machine learning algorithms and corpus-based language generation are biased toward the available training datasets. Hence, moral agency relies on data that is biased in its nature, producing unsatisfactory results from an ethical perspective. In 122, the authors propose a framework for developing social chatbots. The authors highlight that the core-chat module should follow ethical design to generate unbiased, non-discriminative responses, but they do not discuss specific strategies to achieve this. Building unbiased training datasets and learning algorithms that connect the outputs with individual, real-world experiences, therefore, are challenges to overcome.
Despite the relevance of moral agency to the development of socially intelligent chatbots, we did not find strategies to address these issues. 117 advocate for developing diversity-conscious databases and learning algorithms that account for ethical concerns; however, the paper focuses on outlining the main research branches and calls on the community of designers to adopt new strategies. As discussed in this section, research on perceived moral agency is still necessary in order to develop chatbots whose social behavior is inclusive and respectful.
3.2.5. Emotional Intelligence
Emotional intelligence is a subset of social intelligence that allows an individual to appraise and express feelings, regulate affective reactions, and harness the emotions to solve a problem 115. Although chatbots do not have genuine emotions 142, there is considerable discussion about the role of manifesting (pretended) emotions in chatbots 122; 142; 64. An emotionally intelligent chatbot can recognize users’ feelings, regulate its affective reactions, and demonstrate respect, empathy, and understanding, improving the relationship between them 115; 75.
We identified 13 studies that report emotional intelligence. Unlike the previously discussed categories, most studies on emotional intelligence focused on understanding the effects of chatbots’ empathy and emotional self-disclosure 9; 71; 49; 40; 64; 122; 87; 88; 102; 74. Only three papers highlighted emotional intelligence as an exploratory outcome 131; 67; 150, where the need for emotional intelligence emerged from participants’ subjective feedback and post-intervention surveys.
The main reported benefits of developing emotionally intelligent chatbots are the following:
[B1] to enrich interpersonal relationships: the perception that the chatbot understands one’s feelings may create a sense of belonging and acceptance 64. 9 propose that chatbots for second language studies should use congratulatory, encouraging, sympathetic, and reassuring utterances to create a friendly atmosphere for the learner. The authors statistically demonstrated that affective backchannels, combined with communicative strategies (see Section 3.1.2), significantly increased learners’ confidence and desire to communicate while reducing anxiety. In another educational study, 71 evaluated the impact of a chatbot’s affective moves on its friendliness and social belonging. The results show that affective moves significantly improved the perception of amicability and marginally increased social belonging. According to 142, when a chatbot’s emotional reaction triggers a social response from the user, the chatbot has achieved group membership and the user’s sympathy. 40 proposed a chatbot that uses empathic and self-oriented emotional expressions to keep users engaged in a quiz-style dialog. The survey results revealed that empathic expressions significantly improved user satisfaction. In addition, the empathic expressions also improved the users’ ratings of the peer agent regarding intimacy, compassion, amiability, and encouragement. Although 40 did not find an effect of the chatbot’s self-disclosure on emotional connection, 74 found that self-disclosure and reciprocity significantly improved trust and interactional enjoyment. In 49, seven participants reported that the best thing about their experience with the therapist chatbot was the perceived empathy. Five participants highlighted that the chatbot demonstrated attention to their feelings. In addition, the users referred to the chatbot as “he,” “a friend,” “a fun little dude,” which demonstrates that the empathy was directed at the personification of the chatbot.
In another mental health care study, 87 found that humans are twice as likely to mirror negative sentiment from a chatbot than from a human, which is a relevant implication for therapeutic interactions. In 150, participants reported that some content is embarrassing to ask another human; thus, talking to a chatbot would be easier due to the lack of judgement. 64 measured users’ experience in conversations with a chatbot compared to a human partner, as well as the amount of intimacy disclosure and cognitive reappraisal. Participants in the chatbot condition experienced as many emotional, relational, and psychological benefits as participants who disclosed to a human partner.
[B2] to increase engagement: 122 argue that longer conversations (10+ turns) are needed to meet the purpose of fulfilling the needs of affection and belonging. Therefore, the authors defined conversation-turns per session as a success metric for chatbots, where usefulness and emotional understanding are combined. In 40, empathic utterances in the quiz-style interaction significantly increased the number of users’ messages per hint for both answer and non-answer utterances (such as feedback about success/failure). This result shows that empathic utterances encouraged the users to engage and utter non-answer statements. 102 compared the possibility of emotional connection with a classical chatbot and with a pretended chatbot simulated in a WoZ experiment. Quantitative results showed that the WoZ condition was more engaging, since it resulted in conversations that lasted longer, with a higher number of turns. The analysis of the conversational logs revealed the positive effect of the chatbot manifesting social cues and empathic signs, as well as touching on personal topics.
[B3] to increase believability: 88 argue that adapting chatbots’ language to their current emotional state, along with their personality and social role awareness, results in more believable interactions. The authors propose that conversation acts should reflect the pretended emotional status of the agent; however, the extent to which the acts impact emotion depends on the agent’s personality (e.g., its temperament or tolerance). Personality is an anthropomorphic characteristic and is discussed in Section 3.3.2.
Although emotional intelligence is the goal of several studies, [C1] regulating affective reactions is still a challenge. The chatbot presented in 71 was designed to mimic the patterns of affective moves in human-human interactions. Nevertheless, the chatbot showed only a marginally significant increase in social belonging when compared to the same interaction with a human partner. Conversational logs revealed that the human tutor performed a significantly higher number of affective moves in that context. In 67, the chatbot was designed to present emotive-like cues, such as exclamation marks and interjections. Participants rated the degree of emotion in the chatbot’s responses negatively. In 131, the energetic chatbot was reported as having an enthusiasm “too high to match with.” In contrast, the chatbot described as an “emotional buddy” was reported as being “overly caring.” 64 state that a chatbot’s empathic utterances may be seen as pre-programmed and inauthentic. Although their results revealed that the partner’s identity (chatbot vs. person) had no effect on the perceived relational and emotional experience, the chatbot condition was a WoZ setup. The wizards were blind to whether users thought they were talking to a chatbot or a person, which suggests that identity does not matter if the challenge of regulating emotions is overcome.
The chatbots literature also reports some strategies to manifest emotional intelligence:
[S1] using social-emotional utterances: affective utterances directed at the user are a common strategy to demonstrate emotional intelligence. 9, 71, and 40 suggest that affective utterances improve the interpersonal relationship with a tutor chatbot. In 9, the authors propose affective backchannel utterances (congratulatory, encouraging, sympathetic, and reassuring) to motivate the user to communicate in a second language. The tutor chatbot proposed in 71 uses solidarity, tension release, and agreement utterances to promote its social belonging and acceptance in group chats. 40 propose empathic utterances to express an opinion about the difficulty or ease of a quiz, and to give feedback on success and failure.
[S2] to manifest conscientiousness: providing conscientiousness may affect the emotional connection between humans and chatbots. In 102, participants reported a rise of affection when the chatbot remembered something they had said before, even if it was just the user’s name. Keeping track of the conversation was perceived as empathic behavior and resulted in mutual affection. 122 argue that a chatbot needs to combine usefulness with emotion by asking questions that help to clarify the users’ intentions. They provide an example where a user asks the time and the chatbot answers “Cannot sleep?” as an attempt to guide the conversation in a more engaging direction. Adopting this strategy requires the chatbot to handle users’ message understanding, emotion and sentiment tracking, session context modeling, and user profiling 122.
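The sentiment-tracking ingredient of this strategy can be sketched in a few lines. This is a toy illustration under stated assumptions: the word lists and reply templates are invented for the example, and a real system would use a trained sentiment model and richer user profiling, as 122 describe.

```python
# Toy lexicon; a placeholder for a trained sentiment model.
NEGATIVE = {"sad", "tired", "anxious", "frustrated", "cannot", "can't"}
POSITIVE = {"great", "happy", "excited", "glad"}

def sentiment(utterance: str) -> int:
    """Crude word-count sentiment score: positive minus negative hits."""
    words = utterance.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def choose_reply(utterance: str, task_answer: str) -> str:
    """Combine usefulness with emotion: acknowledge feelings, then answer."""
    score = sentiment(utterance)
    if score < 0:
        # Acknowledge the feeling before delivering the task answer.
        return "That sounds hard. " + task_answer
    if score > 0:
        return "Glad to hear it! " + task_answer
    return task_answer
```

Even this crude version captures the pattern 122 exemplify: the task answer is wrapped with a response to the detected emotional state rather than delivered flatly.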
[S3] reciprocity and self-disclosure: 74 hypothesized that a high level of self-disclosure and reciprocity in communication with chatbots would increase trust, intimacy, and enjoyment, ultimately improving user satisfaction and intention to use. They performed a WoZ study, where the assumed chatbot was designed to recommend movies. Results demonstrated that reciprocity and self-disclosure are strong predictors of rapport and user satisfaction. In contrast, 40 did not find any effect of self-oriented emotional expressions on users’ satisfaction or engagement (the number of utterances per hint). More research is needed to understand the extent to which this strategy produces a positive impact on the interaction.
The literature shows that emotional intelligence is widely investigated, with particular interest from the education and mental health care domains. Using emotional utterances in a personalized, context-relevant way is still a challenge. Researchers on chatbots’ emotional intelligence can draw on emotional intelligence theory 115; 57 to adapt chatbots’ utterances to match the emotions expressed in the dynamic context. Adaptation to the dynamic context also improves the sense of personalized interaction, which is discussed in the next section.
Personalization refers to the ability of a technology to adapt its functionality, interface, information access, and content to increase its personal relevance to an individual or a category of individuals 46. In the chatbots domain, personalization may increase the agents’ social intelligence, since it allows a chatbot to be aware of the situational context and to dynamically adapt its features to better suit individual needs 95. Grounded in the robots and artificial agents literature, 76 argue that personalization can improve rapport and cooperation, ultimately increasing engagement with chatbots. Although some studies (see e.g., 151; 46; 76) also relate personalization to the attribution of personal qualities such as personality, we discuss personal qualities in the Personification category. In this section, we focus on the ability to adapt the interface, content, and behavior to the users’ preferences, needs, and situational context.
We found 11 studies that report personalization. Three studies pose personalization as a research goal 41; 76; 122. In most of the studies, though, personalization was observed in exploratory findings. In six studies, personalization emerged from the analysis of interviews and participants’ self-reported feedback 95; 42; 127; 131; 102; 67. In two studies 73; 135, the need for personalization emerged from the conversational logs.
The surveyed literature highlighted three benefits of providing personalized interactions:
[B1] to enrich interpersonal relationships: 42 state that personalizing the amount of personal information a chatbot can access and store is required to establish a relation of trust and reciprocity in workplace environments. In 95, interviews with 12 participants yielded a total of 59 statements about how learning from experience promotes a chatbot’s authenticity. 122 argue that chatbots that focus on engagement need to personalize the generation of responses to different users’ backgrounds, personal interests, and needs in order to serve their needs for communication, affection, and social belonging. In 102, participants expressed the desire for the chatbot to provide different answers to different users. Although 41 found no significant effect of personalization on the user experience with the financial assistant chatbot, that study operationalizes personalization as the ability to give empathic responses according to the users’ issues, where emotional intelligence plays a role. Interpersonal relationships can also be enriched by adapting the chatbot’s language to match the user’s context, energy, and formality; the ability to use language appropriately is discussed in Section 3.2.2.
[B2] to provide unique services: providing personalization increases the value of provided information 41. In the ethnography data collection study 127, eight participants reported dissatisfaction with the chatbot’s generic guidance to specific places. Participants self-reported that the chatbot should use their current location to direct them to more conveniently located places, and ask about their interests and preferences, aiming to direct them to areas that meet their needs. When exploring how teammates used a task-assignment chatbot, 135 found that the use of the chatbot varied depending on the participants’ levels of hierarchy. Similarly, a qualitative analysis of perceived interruption in a workplace chat 76 suggests that interruption is likely associated with users’ general aversion to unsolicited messages at work. Hence, the authors argue that the chatbot’s messages should be personalized to the user’s general preference. 76 also found that users with low social-agent orientation emphasize the utilitarian value of the system, while users with high social-agent orientation see the system as a humanized assistant. This outcome points to the need to personalize the interaction to individual users’ mental models. In 131, participants reported a preference for a chatbot that remembers their details, likes and dislikes, and preferences, and uses this information to voluntarily make useful recommendations. In 66, two participants also expected chatbots to retain context from previous interactions to improve recommendations.
[B3] to reduce interactional breakdowns: in HCI, personalization is used to customize the interface to be more familiar to the user 46. When evaluating visual elements (such as quick replies) compared to typing responses, 73 observed that users who start the interaction by clicking an option are more likely to continue the conversation if the next exchange also offers visual elements as optional affordances. In contrast, users who typed are more likely to abandon the conversation when facing options to click. Thus, chatbots should adapt their interface to users’ preferred input methods. In 67, one participant suggested that the text color and font size should be customizable. 41 also observed that participants had difficulties with small fonts and concluded that adapting the interface to provide accessibility also needs to be considered.
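The input-method adaptation reported by 73 can be sketched as a simple session rule. This is a hedged illustration, not the authors’ implementation: the `Exchange` type and the binary `last_input` flag are hypothetical simplifications of tracking a user’s preferred input method.

```python
from dataclasses import dataclass, field

@dataclass
class Exchange:
    """A chatbot turn: the prompt text plus optional quick-reply buttons."""
    text: str
    quick_replies: list = field(default_factory=list)

def next_exchange(prompt: str, options: list, last_input: str) -> Exchange:
    """Offer quick replies only to users whose last input was a click.

    `last_input` is "click" or "typed" (an assumed simplification).
    """
    if last_input == "click":
        # Keep visual affordances available for click-oriented users.
        return Exchange(prompt, options)
    # Typists tend to abandon when pushed toward clicking, so fall back
    # to a free-text prompt with no buttons.
    return Exchange(prompt)
```

The rule mirrors the observed behavior: clickers keep their buttons, typists keep their free-text field.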
According to the surveyed literature, the main challenge regarding personalization is [C1] privacy. To enrich the efficiency and productivity of the interaction, a chatbot needs to retain memory of previous interactions as well as learn users’ preferences and disclosed personal information 131. However, as 42 and 131 suggest, collecting personal data may lead to privacy concerns. Thus, chatbots should showcase a transparent purpose and ethical standards 95. 131 also suggest that there should be a way to inform a chatbot that something in the conversation is private. Similarly, participants in 150 reported that personal data and social media content may be inappropriate topics for chatbots because they can be sensitive. These concerns may be reduced if a chatbot demonstrates care about privacy 42.
The reported strategies to provide personalization in chatbot interactions are the following:
[S1] to learn from and about the user: 95 state that chatbots should present strategies to learn from cultural, behavioral, personal, conversational, and contextual interaction data. For example, the authors suggest using Facebook profile information to build knowledge about users’ personal information. 131 also suggest that the chatbot should remember users’ preferences disclosed in previous conversations. In 122, the authors propose an architecture where responses are generated based on a personalization rank that applies users’ feedback about their general interests and preferences. When evaluating the user experience with a virtual assistant chatbot, 150 found 16 mentions of personalized interactions, where participants expressed the need for a chatbot to know their personal quirks and to anticipate their needs.
[S2] to provide customizable agents: 76 suggest that users should be able to choose the level of the chatbot’s attributes, for example, the agent’s look and persona. By doing so, users with low social-agent orientation could use a non-humanized interface, which would better represent their initial perspective. This differentiation could be the first signal to personalize further conversation, such as focusing on more productive or playful interactions. Regarding chatbots’ learning capabilities, in 42, interviews with potential users revealed that users should be able to manage what information the chatbot knows about them and to decide whether the chatbot can learn from previous interactions. If the user prefers a more generic chatbot, then it would not store personal data, potentially increasing engagement with more resistant users. 131 raise the possibility of having an “incognito” mode for chatbots or letting users ask the chatbot to forget what was said in previous utterances.
[S3] visual elements: 127 adopted quick replies as a means for the chatbot to tailor its subsequent questions to the specific experience the participant had reported. As discussed in Section 3.1.2, quick replies may be seen as restrictive from an interactional perspective; however, conversation logs showed that the tailored questions prompted the users to report more details about their experience, which is important in ethnography research.
Both the benefits and strategies identified from the literature are in line with the types of personalization proposed by 46. Therefore, further investigations in personalization can leverage the knowledge from interactive systems (e.g., 46; 132) to adapt personalization strategies and to handle privacy concerns.
In this section, we discuss the influence of identity projection on human-chatbot interaction. Personification refers to assigning personal traits, including physical appearance and emotional states, to non-human agents 46. In the HCI field, researchers argue that using a personified character in the user interface is a natural way to support the interaction 70. Indeed, the literature shows that (i) users can be induced to behave as if computers were humans, even when they consciously know that human attributes are inappropriate 92; and (ii) the more human-like a computer representation is, the more social people’s responses are 56.
Chatbots are, by definition, designed to have at least one human-like trait: the (human) natural language. Although research on personification is more common in the Embodied Conversational Agents field, 34 claim that a chatbot’s body can be created through narrative without any visual help. According to 32, talking to a machine affords it a new identity. In this section, we divided the social characteristics that reflect personification into identity (16 papers) and personality (12 papers). In this category, we found several studies where part of the main investigation relates to the social characteristics. See the supplementary materials for details (Appendix A).
Identity refers to the ability of an individual to demonstrate belonging to a particular social group 125. Although chatbots do not have the agency to decide to which social group they want to belong, designers attribute identity to them, intentionally or not, when they define the way a chatbot talks or behaves 23. The identity of a partner (even if only perceived) gives rise to new processes, expectations, and effects that affect the outcomes of the interaction 64. Aspects that convey the chatbots’ identity include gender, age, language style, and name. Additionally, chatbots may have anthropomorphic, zoomorphic, or robotic representations. Some authors include identity aspects in the definition of personality (see, e.g., 122). We distinguish these two characteristics: identity refers to appearance and cultural traits, while personality focuses on behavioral traits.
We found 16 studies that discuss identity issues, ten of which have identity as part of their main investigation 77; 67; 21; 25; 34; 33; 117; 80; 28; 5. In two studies, the authors argue on the impact of identity on the interaction based on the literature 54; 17. In four studies 123; 131; 135; 95, qualitative analysis of conversation logs revealed that participants put effort into understanding aspects of the chatbots’ identity.
The identified benefits of attributing identity to a chatbot are the following:
[B1] to increase engagement: when evaluating signals of playful interactions, 77 found that agent-oriented conversations (asking about the agent’s traits and status) are consistent with the tendency to anthropomorphize the agent and engage in chit-chat. In the educational scenario, 123 also observed questions about the agent’s appearance, intellectual capacities, and sexual orientation, although the researchers considered these questions inappropriate for the context of tutoring chatbots. When comparing human-like vs. machine-like language style, greetings, and chatbot’s framing, 5 noticed that using informal language, having a human name, and using greetings associated with human communication resulted in significantly higher scores for adjectives like likeable, friendly, and personal. In addition, framing the agent as “intelligent” also had a slight influence on users’ scores.
[B2] to increase human-likeness: some attributes may convey a perceived human-likeness. 5 showed that using a human-like language style, name, and greetings resulted in significantly higher scores for naturalness. The chatbot’s framing influenced the outcomes when combined with other anthropomorphic cues. When evaluating different typefaces for a financial adviser chatbot, 21 found that users perceive machine-like typefaces as more chatbot-like, although they did not find strong evidence of handwriting-like typefaces conveying humanness.
The surveyed literature also highlights challenges regarding identity:
[C1] to avoid negative stereotypes: when engaging in a conversation, interlocutors base their behavior on common ground, i.e., the joint knowledge, background facts, assumptions, and beliefs that participants have of each other (see 34). Common ground reflects stereotypical attributions that chatbots should be able to manage as the conversation evolves 32. In 34, the authors discuss that chatbots for company representation are often personified as attractive human-like women acting as spokespeople for their companies, while male chatbots tend to hold more senior positions, such as a virtual CEO. 33 state that the agent’s self-disclosure of gender identity opens possibilities for sex talk. The authors observed that the conversations mirror stereotyped male/female encounters, and that the ambiguity of the chatbot’s gender may influence the exploration of homosexuality. However, fewer instances of sex talk were observed with the chatbot personified as a robot, which demonstrates that gender identity may lead to stereotypical attributions. When evaluating the effect of gender identity on disinhibition, 15 showed that people spoke more often about physical appearance and used more swear and sexual words with the female-presenting chatbot, and racist attacks were observed in interactions with chatbots represented as a black person. The conversation logs from 67 also show instances of users attacking the chatbot persona (a static avatar of a woman pointing to the conversation box). 80 and 117 also highlight that racial identity conveys not only the physical appearance, but all the socio-cultural expectations about the represented group (see discussion in Section 3.2.4). Hence, designers should consider the impact of attributing an identity to chatbots in order to avoid reinforcing negative stereotypes.
[C2] to balance the identity and the technical capabilities: the literature comparing embodied vs. disembodied conversational agents has contradictory results regarding the relevance of a human representation. For example, in the context of general-purpose interactions, 28 show that people put more effort into establishing common ground with agents represented as fully human; in contrast, when evaluating a website assistant chatbot, 25 show that simpler text-based chatbots with no visual, human identity resulted in a weaker uncanny effect and less negative affect. Overly humanized agents create higher expectations in users, which eventually leads to more frustration when the chatbot fails 54. When arguing on why chatbots fail, 17 advocate balancing human versus robot aspects, since “too human” representations may lead to off-topic conversations while overly robotic interactions may lack personal touch and flexibility. When arguing on the social presence conveyed by deceptive chatbots, 32 state that extreme anthropomorphic features may generate cognitive dissonance. The challenge, thus, lies in designing a chatbot that provides identity cues appropriate to its capabilities and communicative purpose, in order to convey the right expectations and minimize the discomfort caused by over-personification.
Regarding the strategies, the surveyed literature suggests [S1] to design and elaborate on a persona. Chatbots should have a comprehensive persona and respond to agent-oriented conversations with a consistent description of themselves 77; 95. For example, 32 discuss that Eliza, the psychotherapist chatbot, and Parry, a paranoid chatbot, have behaviors that are consistent with the stereotypes associated with their professional and personal identities, respectively. 135 suggest that designers should explicitly build signals of the chatbot’s personification (either machine- or human-like), so the users can have the right expectations about the interaction. When identity aspects are not explicit, users try to establish common ground. In 77 and 123, much of the small talk with the chatbot related to the chatbot’s traits and status. In 34, the authors observed many instances of Alice’s self-references to “her” artificial nature. These references triggered the users to reflect on their human condition (a self-categorization process), resulting in exchanges about their species (either informational or confrontational). Similar results were observed by 131, as participants engaged in conversations about the artificial nature of the agent. Providing the chatbot with the ability to describe its personal identity helps to establish common ground and, hence, enrich the interpersonal relationship 34.
Chatbots may be designed to deceive users about their actual identity, pretending to be human 32. In this case, the more human the chatbot sounds, the more successful it is. In many cases, however, there is no need to engage in deception, and the chatbot can be designed to represent an elaborated persona. Researchers can explore social identity theory 125; 20 with regard to ingroup bias, power relations, homogeneity, and stereotyping, in order to design chatbots with identity traits that reflect their expected social position 60.
Personality refers to personal traits that help to predict someone’s thinking, feeling, and behaving 85. The most accepted set of traits is the Five-Factor model (or Big Five model) 55; 85, which describes personality in five dimensions (extraversion, agreeableness, conscientiousness, neuroticism, and openness). However, personality can also refer to other dynamic, behavioral characteristics, such as temperament and sense of humor 153; 134. In the chatbots domain, personality refers to the set of traits that determines the agent’s interaction style, describes its character, and allows the end-user to understand its general behavior 34. Chatbots with a consistent personality are more predictable and trustworthy 122. According to 35, unpredictable swings in a chatbot’s attitudes can disorient the users and create a strong sense of discomfort. Thus, personality ensures that a chatbot displays behaviors that stand in agreement with the users’ expectations in a particular context 101.
We found 12 studies that report personality issues for chatbots. In some studies, personality was investigated in reference to the Big Five model 88; 124; 79, while two studies focused on sense of humor 106; 86. Three studies investigated the impact of the personality of tutor chatbots on students’ engagement 124; 9; 71. 131 compared users’ preferences regarding pre-defined personalities. In the remaining studies 102; 122; 16; 66, personality concerns emerged from qualitative analysis of interviews, users’ subjective feedback, and literature reviews 86.
The surveyed literature revealed two benefits of attributing personality to chatbots:
[B1] to exhibit believable behavior: 88 states that chatbots should have a personality, defined by the Five-Factor model plus characteristics such as temperament and tolerance, in order to build utterances with linguistic choices that cohere with these attributions. 106 compared a joking chatbot to a non-joking one regarding the naturalness of the chatbot’s utterances and its human-likeness; the joking chatbot scored significantly higher on both constructs. 102 also showed that sense of humor humanizes the interactions, since humor was one of the factors that influenced the naturalness for the WoZ condition. 79 demonstrated that manipulating language to manifest a target personality produced moderately natural utterances, with a mean rating of 4.59 out of 7 for the personality model utterances.
[B2] to enrich interpersonal relationships: chatbots’ personality can make the interaction more enjoyable 16; 66. In the study from 16, the second most frequent motivation for using chatbots, pointed out by 20% of the participants, was entertainment. The authors argue that the chatbot being fun is important even when the main purpose is productivity; according to participants, the chatbot’s “fun tips” enrich the user experience. This result is consistent with the experience of first-time users 66, where participants related better with chatbots who have a consistent personality. 131 show that witty banter and casual, enthusiastic conversations help to make the interaction effortless. In addition, a few participants enjoyed the chatbot with a caring personality, who was described as a good listener. In 122 and 124, the authors argue that a consistent personality helps the chatbot to gain the users’ confidence and trust. 124 state that tutor chatbots should display appropriate posture, conduct, and representation, which include being encouraging, expressive, and polite. Accordingly, other studies show that students desire chatbots with positive agreeableness and extraversion 124; 9; 71. Outcomes consistently suggest that students prefer a chatbot that is not overly polite, but has some attitude. Agreeableness seems to play a critical role, helping the students to be encouraged and to deal with difficulties. Noticeably, agreeableness requires emotional intelligence to be warm and sympathetic in appropriate circumstances (see Section 3.2.5).
The reviewed literature pointed out two challenges regarding personality:
[C1] to adapt humor to the users’ culture: sense of humor is highly shaped by the cultural environment 113. 106 discuss a Japanese chatbot who uses puns to create funny conversations. The authors state that puns are one of the main humor genres in that culture. However, puns are restricted to the culture and language in which they are built, and thus have low portability. Hence, the design challenge lies in personalizing chatbots’ sense of humor to the target users’ culture and interests or, alternatively, designing cross-cultural kinds of humor. The ability to adapt to the context and users’ preferences is discussed in Section 3.2.6.
[C2] to balance personality traits: 131 observed that users prefer a proactive, productive, witty chatbot. However, they would also like to add traits such as caring, encouraging, and exciting. In 79, the researchers intentionally generated utterances to reflect extreme personalities; as a result, they observed that some utterances sounded unnatural because human personality is a continuous phenomenon rather than a discrete one. 124 also point out that, although personality is consistent, moods and states of mind constantly vary. Thus, balancing the predictability of the personality with the expected variation is a challenge to overcome.
We also identified strategies to design chatbots that manifest personality:
[S1] to use appropriate language: 122 and 88 suggest that the chatbot’s language should be consistently influenced by its personality. Both studies propose that chatbot architectures should include a persona-based model that encodes the personality and influences the response generation. The framework proposed by 79 shows that it is possible to automatically manipulate language features to manifest a particular personality based on the Big Five model. The Big Five model is a relevant tool because it can be assessed using validated psychological instruments 84. Using this model to represent the personality of chatbots was also suggested by 88 and 124. 66 discussed that the chatbot’s personality should match its domain: participants expected the language used by the news chatbot to be professional, while they expected the shopping chatbot to be casual and humorous. The ability to use consistent language is discussed in Section 3.2.2.
[S2] to have a sense of humor: the literature highlights humor as a positive personality trait 86. In 66, ten participants mentioned enjoyment when the chatbots provided humorous and highly diverse responses. The authors found occurrences of the participants asking for jokes and being delighted when the request was supported. 16 present similar results when arguing that humor is important even for task-oriented chatbots, where the user is usually seeking productivity. For casual conversations, 131 highlight that timely, relevant, and clever wit is a desired personality trait. In 106, the joking chatbot was perceived as more human-like, knowledgeable, and funny, and participants felt more engaged.
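To make the persona-based strategy in [S1] more concrete, the sketch below shows one way Big Five trait levels could condition a chatbot’s surface language during response generation. This is a hypothetical illustration, not an implementation from any surveyed study: the trait thresholds and the specific linguistic rules (exclamations, warm openers, hedging) are invented for the example, loosely in the spirit of manipulating language features to manifest a target personality.

```python
# Hypothetical sketch: a persona model (Big Five trait levels in [0, 1])
# post-processes a base response so that its surface language manifests
# the target personality. Rules and thresholds are illustrative only.

def apply_persona(base_response: str, persona: dict) -> str:
    """Adjust surface language to reflect a target personality."""
    response = base_response
    # High extraversion: more exclamative, energetic phrasing.
    if persona.get("extraversion", 0.5) > 0.7:
        response = response.rstrip(".") + "!"
    # High agreeableness: add a warm, encouraging opener.
    if persona.get("agreeableness", 0.5) > 0.7:
        response = "Sure thing! " + response
    # High neuroticism: hedge the statement.
    if persona.get("neuroticism", 0.5) > 0.7:
        response = "I might be wrong, but " + response[0].lower() + response[1:]
    return response

extravert = {"extraversion": 0.9, "agreeableness": 0.8}
print(apply_persona("Your order has shipped.", extravert))
# -> "Sure thing! Your order has shipped!"
```

A consistent mapping of this kind is what makes the personality predictable: the same trait profile always yields the same linguistic choices, in line with the surveyed argument that consistency builds users’ confidence and trust.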
Personality for artificial agents has long been studied in the Artificial Intelligence field 44; 111; 101. Thus, further investigations on chatbots’ personality can leverage existing personality models and evaluate how they contribute to believability and rapport building.
4. Interrelationships among the characteristics
In Section 3, we organized the social characteristics into discrete groups. However, we discussed several instances of characteristics influencing each other, or being used as a strategy to manifest one another. In this section, we describe these relations in a theoretical framework, depicted in Figure 1. Boxes represent the social characteristics and the colors group them into their respective categories. The arrows represent the 22 propositions we derived from the literature.
According to the surveyed literature, proactivity influences the perceived personality (P1) 131, since recommending and initiating topics may manifest higher levels of extraversion. When the proactive messages are based on the context, proactivity increases perceived conscientiousness (P2) 131, since the proactive messages may demonstrate attention to the topic. Proactivity supports communicability (P3) 24; 123; 138, since a chatbot can proactively communicate its knowledge and provide guidance; in addition, proactivity supports damage control (P4) 123, since a chatbot can introduce new topics when the user is misunderstood, tries to break the system, or sends an inappropriate message.
Conscientiousness is by itself a dimension of the Big Five model; hence, conscientiousness influences the perceived personality (P5) 55. Higher levels of context management, goal-orientation, and understanding increase the chatbot’s perceived efficiency, organization, and commitment 43. Conscientiousness manifests emotional intelligence (P6), since retaining information from previous turns and being able to recall it shows empathy 66. In addition, conscientiousness manifests personalization (P7) 66; 131, because a chatbot can remember individual preferences within and across sessions.
Emotional intelligence influences the perceived personality (P8), since chatbots’ personality traits affect the intensity of the emotional reactions 88; 131. Agreeableness is demonstrated through consistent warm reactions such as encouraging and motivating 124; 9; 71; 122. Some personality traits require personalization (P9) to adapt to the interlocutors’ culture and interests 106. In addition, personalization benefits identity (P10), since the users’ social-agent orientation may require a chatbot to adapt its level of engagement in small talk and the agent’s visual representation 67; 76. Personalization also improves emotional intelligence (P11), since a chatbot should dynamically regulate the affective reactions to the interlocutor 131; 122. Emotional intelligence improves perceived manners (P12), since the lack of emotional intelligence may lead to the perception of impoliteness 67.
Conscientiousness facilitates damage control (P13), since attention to the workflow and context may increase the ability to recover from a failure without restarting the workflow 41; 43; 54; 65. Communicability facilitates damage control (P14), since it teaches the user how to communicate, reducing the number of mistakes 142; 123; 66; 54; 41. In addition, suggesting how to interact can reduce frustration after failure scenarios 77.
Personalization manifests thoroughness (P15) 131; 54; 62; 41, since chatbots can adapt their language use to the conversational context and the interlocutor’s expectations. When the context requires dynamic variation 131; 54, thoroughness may reveal traits of the chatbot’s identity (P16) 80; 117. As demonstrated by 79, thoroughness also reveals personality (P17).
Manners influence conscientiousness (P18) 142, since they can be applied as a strategy to politely refuse off-topic requests and to keep the conversation on track. Manners also influence damage control (P19) 29; 142, because they can help a chatbot to prevent verbal abuse and reduce the negative effect of a lack of knowledge. Both moral agency (P20) and emotional intelligence (P21) improve damage control because they provide the ability to appropriately respond to abuse and testing 142; 123. Identity influences moral agency (P22), since identity representations require the ability to prevent a chatbot from building or reinforcing negative stereotypes 117; 15; 80.
5. Related Surveys
Previous studies have reviewed the literature on chatbots. Several surveys discuss the emergence of chatbots 31; 100 and their potential application to particular domains, including education 119; 38; 112; 116; 149, business 119; 38, health 45; 72, and information retrieval and e-commerce 119. Other surveys focus on technical design techniques 108; 133; 149; 3; 139; 39; 81, such as language generation models, knowledge management, and architectural challenges. Although 7 discuss social capabilities of chatbots, their survey focuses on the potential of available open-source technologies to support these skills, highlighting technical hurdles rather than social ones.
We found three surveys 100; 107; 47 that include insights about social characteristics of chatbots, although none of them focuses on this theme. 100 investigate chatbots that “mimic conversation rather than understand it” and review the main technologies and ideas that support their design, while 47 focus on identifying best practices for developing script-based chatbots. 107 review the literature on quality issues and attributes for chatbots. The supplementary materials include a table that shows the social characteristics covered by each survey (Appendix A). These related surveys also point out technical characteristics and attributes that are outside the scope of this survey.
This research has some limitations. Firstly, since this survey focused on disembodied, text-based chatbots, the literature on embodied and speech-based conversational agents was left out. We acknowledge that studies involving these attributes may report social characteristics relevant to chatbots, especially characteristics that could be highly influenced by physical representation, tone, accent, and so forth (e.g., identity, politeness, and thoroughness). However, embodiment and speech also bring new challenges (e.g., speech recognition or eye gazing), which are out of the scope of this study and could potentially impact the users’ experience with the chatbots. Secondly, since the definition of chatbot is not consolidated in the literature and chatbots have been studied in several different domains, some studies that include social aspects of chatbots may not have been found. To account for that, we adopted several synonyms in our search string and used Google Scholar as the search engine, which provides fairly comprehensive indexing of the literature across domains. Finally, the conceptual model of social characteristics was derived through a coding process inspired by qualitative methods, such as Grounded Theory. Like any qualitative coding method, it relies on the researchers’ subjective assessment. To mitigate this threat, the researchers discussed the social characteristics and categories during in-person meetings until reaching consensus, and the conceptual framework, along with the relationships among characteristics, was derived considering outcomes explicitly reported in the surveyed studies.
In this survey, we investigated the literature on disembodied, text-based chatbots to answer the question “What chatbot social characteristics benefit human interaction, and what are the challenges and strategies associated with them?”. Our main contribution is the conceptual model of social characteristics, from which we can derive conclusions about several research opportunities. Firstly, we point out several challenges to overcome in order to design chatbots that manifest each characteristic. Secondly, further research may focus on assessing the extent to which the identified benefits are perceived by the users and influence users’ satisfaction. Finally, further investigations may propose new strategies to manifest particular characteristics. In this regard, we highlight that we could not identify strategies to manifest moral agency and thoroughness, although strategies for several other characteristics are also under-investigated. We also discussed the relationships among the characteristics. Our results provide important references to help designers and researchers find opportunities to advance the human-chatbot interaction field.
- Abdul-Kader and Woods (2015) Sameera A Abdul-Kader and JC Woods. 2015. Survey on chatbot design techniques in speech conversation systems. IJACSA 6, 7 (2015).
- Ahmad et al. (2018) Nahdatul Akma Ahmad, Mohamad Hafiz Che, Azaliza Zainal, Muhammad Fairuz Abd Rauf, and Zuraidy Adnan. 2018. Review of Chatbots Design Techniques. International Journal of Computer Applications 181, 8 (Aug. 2018), 7–10.
- Allen et al. (2006) Colin Allen, Wendell Wallach, and Iva Smit. 2006. Why machine ethics? IEEE Intelligent Systems 21, 4 (2006), 12–17.
- Araujo (2018) Theo Araujo. 2018. Living up to the chatbot hype: The influence of anthropomorphic design cues and communicative agency framing on conversational agent and company perceptions. Comput. Hum. Behav. 85 (2018), 183–189.
- Auerbach and Silverstein (2003) Carl Auerbach and Louise B Silverstein. 2003. Qualitative data: An introduction to coding and analysis. NYU press.
- Augello et al. (2017) Agnese Augello, Manuel Gentile, and Frank Dignum. 2017. An Overview of Open-Source Chatbots Social Skills. In INSCI. Springer, 236–248.
- Avula et al. (2018) Sandeep Avula, Gordon Chadwick, Jaime Arguello, and Robert Capra. 2018. SearchBots: User Engagement with ChatBots during Collaborative Search. In Proceedings of the 2018 Conference on Human Information Interaction&Retrieval. ACM, 52–61.
- Ayedoun et al. (2017) Emmanuel Ayedoun, Yuki Hayashi, and Kazuhisa Seta. 2017. Can Conversational Agents Foster Learners’ Willingness to Communicate in a Second Language?: Effects of Communication Strategies and Affective Backchannels. In Proceedings of the 25th International Conference on Computers in Education, W Chen et al. (Eds.).
- Ayedoun et al. (2018) Emmanuel Ayedoun, Yuki Hayashi, and Kazuhisa Seta. 2018. Adding Communicative and Affective Strategies to an Embodied Conversational Agent to Enhance Second Language Learners’ Willingness to Communicate. International Journal of Artificial Intelligence in Education (2018), 1–29.
- Banks (2018) Jaime Banks. 2018. A Perceived Moral Agency Scale: Development and Validation of a Metric for Humans and Social Machines. Comput. Hum. Behav. (2018).
- Baron (1984) Naomi S Baron. 1984. Computer mediated communication as a force in language change. Visible language 18, 2 (1984), 118.
- Björkqvist et al. (2000) Kaj Björkqvist, Karin Österman, and Ari Kaukiainen. 2000. Social intelligence − empathy = aggression? Aggression and violent behavior 5, 2 (2000), 191–200.
- Boiteux (2019) Marion Boiteux. 2019. Messenger at F8 2018. Retrieved October 18, 2019 from https://bit.ly/2zXVnPH. Messenger Developer Blog.
- Brahnam and De Angeli (2012) Sheryl Brahnam and Antonella De Angeli. 2012. Gender affordances of conversational agents. Interacting with Computers 24, 3 (2012), 139–153.
- Brandtzaeg and Følstad (2017) Petter Bae Brandtzaeg and Asbjørn Følstad. 2017. Why people use chatbots. In INSCI. Springer, 377–392.
- Brandtzaeg and Følstad (2018) Petter Bae Brandtzaeg and Asbjørn Følstad. 2018. Chatbots: changing user needs and motivations. Interactions 25, 5 (2018), 38–43.
- Brown (2015) Penelope Brown. 2015. Politeness and language. In IESBS, 2nd ed. Elsevier, 326–330.
- Brown and Levinson. (1987) Penelope Brown and Stephen C. Levinson. 1987. Politeness: Some universals in language usage. Vol. 4. Cambridge university press.
- Brown (2000) Rupert Brown. 2000. Social identity theory: Past achievements, current problems and future challenges. Eur. J. Soc. Psychol. 30, 6 (2000), 745–778.
- Candello et al. (2017) Heloisa Candello, Claudio Pinhanez, and Flavio Figueiredo. 2017. Typefaces and the Perception of Humanness in Natural Language Chatbots. In Proc. of the SIGCHI CHI Conference. ACM, 3476–3487.
- Candello et al. (2018) Heloisa Candello, Claudio Pinhanez, Mauro Carlos Pichiliani, Melina Alberio Guerra, and Maira Gatti de Bayser. 2018. Having an Animated Coffee with a Group of Chatbots from the 19th Century. In Proc. of the SIGCHI CHI Conference (Extended Abstract). ACM, D206.
- Cassell (2009) Justine Cassell. 2009. Social practice: Becoming enculturated in human-computer interaction. In Int Conf on UAHCI. Springer, 303–313.
- Chaves and Gerosa (2018) Ana Paula Chaves and Marco Aurelio Gerosa. 2018. Single or Multiple Conversational Agents?: An Interactional Coherence Comparison. In Proc. of the SIGCHI CHI Conference. ACM, 191.
- Ciechanowski et al. (2018) Leon Ciechanowski, Aleksandra Przegalinska, Mikolaj Magnuski, and Peter Gloor. 2018. In the shades of the uncanny valley: An experimental study of human–chatbot interaction. Future Generation Computer Systems (2018).
- Coniam (2008) David Coniam. 2008. Evaluating the language resources of chatbots for their potential in English as a second language learning. ReCALL 20, 1 (2008), 99–116.
- Conrad and Biber (2009) Susan Conrad and Douglas Biber. 2009. Register, genre, and style. Cambridge University Press.
- Corti and Gillespie (2016) Kevin Corti and Alex Gillespie. 2016. Co-constructing intersubjectivity with artificial conversational agents: people are more likely to initiate repairs of misunderstandings with agents represented as human. Comput. Hum. Behav. 58 (2016), 431–442.
- Curry and Rieser (2018) Amanda Cercas Curry and Verena Rieser. 2018. # MeToo Alexa: How Conversational Systems Respond to Sexual Harassment. In Proceedings of the Second ACL Workshop on Ethics in Natural Language Processing. 7–14.
- Dahlbäck et al. (1993) Nils Dahlbäck, Arne Jönsson, and Lars Ahrenberg. 1993. Wizard of Oz studies–why and how. Knowledge-based systems 6, 4 (1993), 258–266.
- Dale (2016) Robert Dale. 2016. The return of the chatbots. Natural Language Engineering 22, 5 (2016), 811–817.
- De Angeli (2005) Antonella De Angeli. 2005. To the rescue of a lost identity: Social perception in human-chatterbot interaction. In Virtual Agents Symposium. 7–14.
- De Angeli and Brahnam (2006) Antonella De Angeli and Sheryl Brahnam. 2006. Sex stereotypes and conversational agents. Proc. of Gender and Interaction: real and virtual women in a male world (2006).
- De Angeli et al. (2001a) Antonella De Angeli, Graham I Johnson, and Lynne Coventry. 2001a. The unfriendly user: exploring social reactions to chatterbots. In Proc. of the CAHD. 467–474.
- De Angeli et al. (2001b) Antonella De Angeli, Paula Lynch, and Graham Johnson. 2001b. Personifying the e-Market: A Framework for Social Agents.. In Interact. 198–205.
- de Bayser et al. (2017) Maíra Gatti de Bayser, Paulo Rodrigo Cavalin, Renan Souza, Alan Braz, Heloisa Candello, Claudio S. Pinhanez, and Jean-Pierre Briot. 2017. A Hybrid Architecture for Multi-Party Conversational Systems. CoRR arXiv/1705.01214 (2017).
- De Souza et al. (1999) Clarisse S De Souza, Raquel O Prates, and Simone DJ Barbosa. 1999. A method for evaluating software communicability. PUC-RioInf 1200 (1999), 11–99.
- Deryugina (2010) OV Deryugina. 2010. Chatterbots. Scientific and Technical Information Processing 37, 2 (2010), 143–147.
- Deshpande et al. (2017) Aditya Deshpande, Alisha Shahane, Darshana Gadre, Mrunmayi Deshpande, and Prachi M Joshi. 2017. A survey of various chatbot implementation techniques. International Journal of Computer Engineering and Applications, Special Issue XI (May 2017).
- Dohsaka et al. (2014) Kohji Dohsaka, Ryota Asai, Ryuichiro Higashinaka, Yasuhiro Minami, and Eisaku Maeda. 2014. Effects of conversational agents on activation of communication in thought-evoking multi-party dialogues. IEICE TRANSACTIONS on Information and Systems 97, 8 (2014), 2147–2156.
- Duijst (2017) Daniëlle Duijst. 2017. Can we Improve the User Experience of Chatbots with Personalisation. Master’s thesis. University of Amsterdam.
- Duijvelshoff (2017) Willem Duijvelshoff. 2017. Use-Cases and Ethics of Chatbots on Plek: a Social Intranet for Organizations. In Workshop On Chatbots And Artificial Intelligence.
- Dyke et al. (2013) Gregory Dyke, Iris Howley, David Adamson, Rohit Kumar, and Carolyn Penstein Rosé. 2013. Towards academically productive talk supported by conversational agents. In Intelligent Tutoring Systems, Cerri S.A., Clancey W.J., Papadourakis G., and Panourgia K. (Eds.). Springer, 459–476.
- Elliott (1994) Clark Elliott. 1994. Research problems in the use of a shallow Artificial Intelligence model of personality and emotion. In Proc. of AAAI-94.
- Fadhil (2018) Ahmed Fadhil. 2018. Can a Chatbot Determine My Diet?: Addressing Challenges of Chatbot Application for Meal Recommendation. CoRR arXiv:1802.09100 (2018).
- Fan and Poole (2006) Haiyan Fan and Marshall Scott Poole. 2006. What is personalization? Perspectives on the design and implementation of personalization in information systems. Journal of Organizational Computing and Electronic Commerce 16, 3-4 (2006), 179–202.
- Ferman (2018) Maria Ferman. 2018. Towards Best Practices for Chatbots. Master’s thesis. Universidad Villa Rica.
- Ferrara et al. (2016) Emilio Ferrara, Onur Varol, Clayton Davis, Filippo Menczer, and Alessandro Flammini. 2016. The rise of social bots. Commun. ACM 59, 7 (2016), 96–104.
- Fitzpatrick et al. (2017) Kathleen Kara Fitzpatrick, Alison Darcy, and Molly Vierhile. 2017. Delivering cognitive behavior therapy to young adults with symptoms of depression and anxiety using a fully automated conversational agent (Woebot): a randomized controlled trial. JMIR mental health 4, 2 (2017).
- Fitzpatrick and Winke (1979) Mary Anne Fitzpatrick and Jeff Winke. 1979. You always hurt the one you love: Strategies and tactics in interpersonal conflict. Commun. Q. 27, 1 (1979), 3–11.
- Fogg (2003) B.J. Fogg. 2003. Computers as persuasive social actors. In Persuasive Technology, B.J. Fogg (Ed.). Morgan Kaufmann, San Francisco, Chapter 5, 89 – 120.
- Følstad and Brandtzæg (2017) Asbjørn Følstad and Petter Bae Brandtzæg. 2017. Chatbots and the new world of HCI. interactions 24, 4 (2017), 38–42.
- Forlizzi et al. (2007) Jodi Forlizzi, John Zimmerman, Vince Mancuso, and Sonya Kwak. 2007. How interface agents affect interaction between humans and computers. In Proc. of the Conf. on DPPI. ACM, 209–221.
- Gnewuch et al. (2017) Ulrich Gnewuch, Stefan Morana, and Alexander Maedche. 2017. Towards designing cooperative and social conversational agents for customer service. In Proc. of the ICIS.
- Goldberg (1990) Lewis R Goldberg. 1990. An alternative "description of personality": the Big-Five factor structure. J. Pers. Soc. Psychol. 59, 6 (1990), 1216.
- Gong (2008) Li Gong. 2008. How social is social responses to computers? The function of the degree of anthropomorphism in computer representations. Comput. Hum. Behav. 24, 4 (2008), 1494–1509.
- Gross (1998) James J Gross. 1998. The emerging field of emotion regulation: an integrative review. Review of general psychology 2, 3 (1998), 271.
- Grossman et al. (2009) Tovi Grossman, George Fitzmaurice, and Ramtin Attar. 2009. A survey of software learnability: metrics, methodologies and guidelines. In Proc. of the SIGCHI CHI Conference. ACM, 649–658.
- Gunawardena and Zittle (1997) Charlotte N Gunawardena and Frank J Zittle. 1997. Social presence as a predictor of satisfaction within a computer-mediated conferencing environment. American journal of distance education 11, 3 (1997), 8–26.
- Harré et al. (2003) Rom Harré, Fathali M Moghaddam, Fathali Moghaddam, et al. 2003. The self and others: Positioning individuals and groups in personal, political, and cultural contexts. Greenwood Publishing Group.
- Hayashi (2015) Yugo Hayashi. 2015. Social Facilitation Effects by Pedagogical Conversational Agent: Lexical Network Analysis in an Online Explanation Task. Proc. of the IEDMS (2015).
- Hill et al. (2015) Jennifer Hill, W Randolph Ford, and Ingrid G Farreras. 2015. Real conversations with artificial intelligence: A comparison between human–human online conversations and human–chatbot conversations. Comput. Hum. Behav. 49 (2015), 245–250.
- Himma (2009) Kenneth Einar Himma. 2009. Artificial agency, consciousness, and the criteria for moral agency: What properties must an artificial agent have to be a moral agent? Ethics and Information Technology 11, 1 (2009), 19–29.
- Ho et al. (2018) Annabell Ho, Jeff Hancock, and Adam S Miner. 2018. Psychological, Relational, and Emotional Effects of Self-Disclosure After Conversations With a Chatbot. Journal of Communication (2018).
- Jain et al. (2018a) Mohit Jain, Ramachandra Kota, Pratyush Kumar, and Shwetak N Patel. 2018a. Convey: Exploring the Use of a Context View for Chatbots. In Proc. of the SIGCHI CHI Conference. ACM, 468.
- Jain et al. (2018b) Mohit Jain, Pratyush Kumar, Ramachandra Kota, and Shwetak N Patel. 2018b. Evaluating and Informing the Design of Chatbots. In Proc. of the SIGCHI DIS. ACM, 895–906.
- Jenkins et al. (2007) Marie-Claire Jenkins, Richard Churchill, Stephen Cox, and Dan Smith. 2007. Analysis of user interaction with service oriented chatbot systems. In Int. Conf. on Hum. Comput. Interact. Springer, 76–83.
- Jiang and Banchs (2017) Ridong Jiang and Rafael E Banchs. 2017. Towards Improving the Performance of Chat Oriented Dialogue System. In Proc. of the IALP. IEEE.
- Kirakowski et al. (2009) Jurek Kirakowski, Anthony Yiu, et al. 2009. Establishing the hallmarks of a convincing chatbot-human dialogue. In Human-Computer Interaction. InTech.
- Koda (2003) Tomoko Koda. 2003. User reactions to anthropomorphized interfaces. IEICE TRANSACTIONS on Information and Systems 86, 8 (2003), 1369–1377.
- Kumar et al. (2010) Rohit Kumar, Hua Ai, Jack L Beuth, and Carolyn P Rosé. 2010. Socially capable conversational tutors can be effective in collaborative learning situations. In International Conference on Intelligent Tutoring Systems. Springer, 156–164.
- Laranjo et al. (2018) Liliana Laranjo, Adam G Dunn, Huong Ly Tong, Ahmet Baki Kocaballi, Jessica Chen, Rabia Bashir, Didi Surian, Blanca Gallego, Farah Magrabi, Annie YS Lau, et al. 2018. Conversational agents in healthcare: a systematic review. J. Am. Med. Inform. Assoc. 25, 9 (2018), 1248–1258.
- Lasek and Jessa (2013) Mirosława Lasek and Szymon Jessa. 2013. Chatbots for Customer Service on Hotels’ Websites. Information Systems in Management 2, 2 (2013), 146–158.
- Lee and Choi (2017) SeoYoung Lee and Junho Choi. 2017. Enhancing user experience with conversational agent for movie recommendation: Effects of self-disclosure and reciprocity. Int J Hum Comput Stud. 103 (2017), 95–105.
- Li et al. (2017) Yanran Li, Hui Su, Xiaoyu Shen, Wenjie Li, Ziqiang Cao, and Shuzi Niu. 2017. DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset. In IJCNLP.
- Liao et al. (2016) Vera Q Liao, Matthew Davis, Werner Geyer, Michael Muller, and N Sadat Shami. 2016. What can you do?: Studying social-agent orientation and agent proactive interactions with an agent for employees. In Proc. of the SIGCHI DIS. ACM, 264–275.
- Liao et al. (2018) Vera Q Liao, Muhammed Masud Hussain, Praveen Chandar, Matthew Davis, Marco Crasso, Dakuo Wang, Michael Muller, Sadat N Shami, and Werner Geyer. 2018. All Work and no Play? Conversations with a Question-and-Answer Chatbot in the Wild. In Proc. of the SIGCHI CHI Conference, Vol. 13.
- Luger and Sellen (2016) Ewa Luger and Abigail Sellen. 2016. Like Having a Really Bad PA: The Gulf between User Expectation and Experience of Conversational Agents. In Proc. of the SIGCHI CHI Conference. ACM, 5286–5297.
- Mairesse and Walker (2009) François Mairesse and Marilyn A Walker. 2009. Can Conversational Agents Express Big Five Personality Traits through Language?: Evaluating a Psychologically-Informed Language Generator. Cambridge & Sheffield, United Kingdom: University of Sheffield.
- Marino (2014) Mark C Marino. 2014. The racial formation of chatbots. CLCWeb: Comparative Literature and Culture 16, 5 (2014), 13.
- Masche and Le (2017) Julia Masche and Nguyen-Thinh Le. 2017. A Review of Technologies for Conversational Systems. In Int. Conf. on Computer Science, Applied Mathematics and Applications. Springer, 212–225.
- Maslowski et al. (2017) Irina Maslowski, Delphine Lagarde, and Chloé Clavel. 2017. In-the-wild chatbot corpus: from opinion analysis to interaction problem detection. In International Conference on Natural Language and Speech Processing.
- Mäurer and Weihe (2015) Daniel Mäurer and Karsten Weihe. 2015. Benjamin Franklin’s decision method is acceptable and helpful with a conversational agent. In Intelligent Interactive Multimedia Systems and Services. Springer, 109–120.
- McCrae and Costa Jr (1987) Robert R McCrae and Paul T Costa Jr. 1987. Validation of the five-factor model of personality across instruments and observers. J. Pers. Soc. Psychol. 52, 1 (1987), 81.
- McCrae and Costa Jr (1997) Robert R McCrae and Paul T Costa Jr. 1997. Personality trait structure as a human universal. American psychologist 52, 5 (1997), 509.
- Meany and Clark (2010) Michael M Meany and Tom Clark. 2010. Humour Theory and Conversational Agents: An Application in the Development of Computer-based Agents. International Journal of the Humanities 8, 5 (2010).
- Miner et al. (2016) Adam Miner, Amanda Chow, Sarah Adler, Ilia Zaitsev, Paul Tero, Alison Darcy, and Andreas Paepcke. 2016. Conversational Agents and Mental Health: Theory-Informed Assessment of Language and Affect. In Proc. of the Int. Conf. on HAI. ACM, 123–130.
- Morris (2002) Thomas William Morris. 2002. Conversational agents for game-like virtual environments. In AAAI Spring Symposium. 82–86.
- Morrissey and Kirakowski (2013) Kellie Morrissey and Jurek Kirakowski. 2013. ’Realness’ in Chatbots: Establishing Quantifiable Criteria. In Int. Conf. on Hum. Comput. Interact. Springer, 87–96.
- Mou and Xu (2017) Yi Mou and Kun Xu. 2017. The media inequality: Comparing the initial human-human and human-AI social interactions. Comput. Hum. Behav. 72 (2017), 432–440.
- Narita and Kitamura (2010) Tatsuya Narita and Yasuhiko Kitamura. 2010. Persuasive conversational agent with persuasion tactics. In International Conference on Persuasive Technology. Springer, 15–26.
- Nass et al. (1993) Clifford Nass, Jonathan Steuer, Ellen Tauber, and Heidi Reeder. 1993. Anthropomorphism, agency, and ethopoeia: computers as social actors. In Proc. of the INTERACT ’93 and CHI ’93. ACM, 111–112.
- Nass et al. (1994) Clifford Nass, Jonathan Steuer, and Ellen R Tauber. 1994. Computers are social actors. In Proc. of the SIGCHI CHI Conference. ACM, 72–78.
- Neff and Nagy (2016) Gina Neff and Peter Nagy. 2016. Automation, algorithms, and politics: Talking to bots: symbiotic agency and the case of Tay. Int. J. Commun. 10 (2016), 17.
- Neururer et al. (2018) Mario Neururer, Stephan Schlögl, Luisa Brinkschulte, and Aleksander Groth. 2018. Perceptions on Authenticity in Chat Bots. Multimodal Technologies and Interaction 2, 3 (2018), 60.
- Nguyen and Sidorova (2018) Quynh N Nguyen and Anna Sidorova. 2018. Understanding user interactions with a chatbot: a self-determination theory approach. In AMCIS–ERF.
- Nowak and Rauh (2005) Kristine L Nowak and Christian Rauh. 2005. The influence of the avatar on online perceptions of anthropomorphism, androgyny, credibility, homophily, and attraction. Journal of Computer-Mediated Communication 11, 1 (2005), 153–178.
- O'Brien and Toms (2008) Heather L O'Brien and Elaine G Toms. 2008. What is user engagement? A conceptual framework for defining user engagement with technology. Journal of the American Society for Information Science and Technology 59, 6 (2008), 938–955.
- Parthemore and Whitby (2013) Joel Parthemore and Blay Whitby. 2013. What makes any agent a moral agent? Reflections on machine consciousness and moral agency. Int. J. Mach. Consciousness 5, 02 (2013), 105–129.
- Pereira et al. (2016) Maria João Pereira, Luísa Coheur, Pedro Fialho, and Ricardo Ribeiro. 2016. Chatbots’ Greetings to Human-Computer Communication. CoRR arXiv:1609.06479 (2016).
- Petta and Trappl (1997) Paolo Petta and Robert Trappl. 1997. Why to create personalities for synthetic actors. In Creating Personalities for Synthetic Actors. Springer, 1–8.
- Portela and Granell-Canut (2017) Manuel Portela and Carlos Granell-Canut. 2017. A new friend in our smartphone?: observing interactions with chatbots in the search of emotional engagement. In Proc. of the Int. Conf. on Hum. Comput. Interact. ACM, 48.
- Postmes et al. (2001) Tom Postmes, Russell Spears, Khaled Sakhel, and Daphne De Groot. 2001. Social influence in computer-mediated communication: The effects of anonymity on group behavior. Personality and Social Psychology Bulletin 27, 10 (2001), 1243–1254.
- Prabhumoye et al. (2018) Shrimai Prabhumoye, Yulia Tsvetkov, Ruslan Salakhutdinov, and Alan W Black. 2018. Style Transfer Through Back-Translation. In Proc. of the 56th ACL.
- Prates et al. (2000) Raquel O Prates, Clarisse S de Souza, and Simone DJ Barbosa. 2000. Methods and tools: a method for evaluating the communicability of user interfaces. interactions 7, 1 (2000), 31–38.
- Ptaszynski et al. (2010) Michal Ptaszynski, Pawel Dybala, Shinsuke Higuhi, Wenhan Shi, Rafal Rzepka, and Kenji Araki. 2010. Towards Socialized Machines: Emotions and Sense of Humour in Conversational Agents. In Web Intelligence and Intelligent Agents. InTech.
- Radziwill and Benton (2017) Nicole M Radziwill and Morgan C Benton. 2017. Evaluating Quality of Chatbots and Intelligent Conversational Agents. CoRR arXiv:1704.04579 (2017).
- Ramesh et al. (2017) Kiran Ramesh, Surya Ravishankaran, Abhishek Joshi, and K Chandrasekaran. 2017. A Survey of Design Techniques for Conversational Agents. In International Conference on Information, Communication and Computing Technology. Springer, 336–350.
- Raven (1964) Bertram H Raven. 1964. Social influence and power. Technical Report. University of California, Los Angeles.
- Reeves and Nass (1996) Byron Reeves and Clifford Nass. 1996. The Media Equation: How people treat computers, television, and new media like real people and places. CSLI Publications and Cambridge university press.
- Rousseau and Hayes-Roth (1996) Daniel Rousseau and Barbara Hayes-Roth. 1996. Personality in synthetic agents. Technical Report. Citeseer.
- Rubin et al. (2010) Victoria L Rubin, Yimin Chen, and Lynne Marie Thorimbert. 2010. Artificially intelligent conversational agents in libraries. Library Hi Tech 28, 4 (2010), 496–522.
- Ruch (1998) Willibald Ruch. 1998. The sense of humor: Explorations of a personality characteristic. Vol. 3. Walter de Gruyter.
- Salovaara and Oulasvirta (2004) Antti Salovaara and Antti Oulasvirta. 2004. Six modes of proactive resource management: a user-centric typology for proactive behaviors. In Proceedings of the third Nordic conference on Human-computer interaction. ACM, 57–60.
- Salovey and Mayer (1990) Peter Salovey and John D Mayer. 1990. Emotional intelligence. Imagination, cognition and personality 9, 3 (1990), 185–211.
- Satu et al. (2015) Md Shahriare Satu, Md Hasnat Parvez, et al. 2015. Review of integrated applications with AIML based chatbot. In 1st Int. Conf. on ICCIE. IEEE, 87–90.
- Schlesinger et al. (2018) Ari Schlesinger, Kenton P O’Hara, and Alex S Taylor. 2018. Let’s Talk About Race: Identity, Chatbots, and AI. In Proc. of the SIGCHI CHI Conference. ACM, 315.
- Schuetzler et al. (2018) Ryan M Schuetzler, G Mark Grimes, and Justin Scott Giboney. 2018. An Investigation of Conversational Agent Relevance, Presence, and Engagement, In Americas Conference on Information Systems 2018 Proceedings. Americas’ Conference on Information Systems.
- Shawar and Atwell (2007) Bayan Abu Shawar and Eric Atwell. 2007. Chatbots: are they really useful?. In LDV Forum, Vol. 22. 29–49.
- Shechtman and Horowitz (2003) Nicole Shechtman and Leonard M Horowitz. 2003. Media inequality in conversation: how people behave differently when interacting with computers and people. In Proc. of the SIGCHI CHI Conference. ACM, 281–288.
- Short et al. (1976) John Short, Ederyn Williams, and Bruce Christie. 1976. The social psychology of telecommunications. John Wiley & Sons, London.
- Shum et al. (2018) Heung-yeung Shum, Xiao-dong He, and Di Li. 2018. From Eliza to XiaoIce: challenges and opportunities with social chatbots. Front. Inf. Technol. Electron. Eng. 19, 1 (2018), 10–26.
- Silvervarg and Jönsson (2013) Annika Silvervarg and Arne Jönsson. 2013. Iterative Development and Evaluation of a Social Conversational Agent. In 6th IJCNPL. Japan, 1223–1229.
- Sjödén et al. (2011) Björn Sjödén, Annika Silvervarg, Magnus Haake, and Agneta Gulz. 2011. Extending an Educational Math Game with a Pedagogical Conversational Agent: Facing Design Challenges. In Interdisciplinary Approaches to Adaptive Learning. A Look at the Neighbours. Springer, 116–130.
- Stets and Burke (2000) Jan E Stets and Peter J Burke. 2000. Identity theory and social identity theory. Social psychology quarterly (2000), 224–237.
- Sundar et al. (2016) S Shyam Sundar, Saraswathi Bellur, Jeeyun Oh, Haiyan Jia, and Hyang-Sook Kim. 2016. Theoretical importance of contingency in human-computer interaction: effects of message interactivity on user engagement. Commun. Res. 43, 5 (2016), 595–625.
- Tallyn et al. (2018) Ella Tallyn, Hector Fried, Rory Gianni, Amy Isard, and Chris Speed. 2018. The Ethnobot: Gathering Ethnographies in the Age of IoT. In Proc. of the SIGCHI CHI Conference. ACM, 604.
- Tamayo-Moreno and Pérez-Marín (2016) Silvia Tamayo-Moreno and Diana Pérez-Marín. 2016. Adapting the design and the use methodology of a Pedagogical Conversational Agent of Secondary Education to Childhood Education. In Computers in Education (SIIE), 2016 International Symposium on. IEEE, 1–6.
- Tegos et al. (2016) Stergios Tegos, Stavros Demetriadis, and Thrasyvoulos Tsiatsos. 2016. An Investigation of Conversational Agent Interventions Supporting Historical Reasoning in Primary Education. In Int Conf on ITS. Springer, 260–266.
- Tennenhouse (2000) David Tennenhouse. 2000. Proactive computing. Commun. ACM 43, 5 (2000), 43–50.
- Thies et al. (2017) Indrani Medhi Thies, Nandita Menon, Sneha Magapu, Manisha Subramony, and Jacki O’neill. 2017. How do you want your chatbot? An exploratory Wizard-of-Oz study with young, urban Indians. In IFIP Conf. on Hum. Comput. Interact. Springer, 441–459.
- Thomson (2005) Laura Thomson. 2005. A standard framework for web personalization. In 1st Int. Workshop on Innovations in Web Infrastructure - 14th WWW Conf. Japan.
- Thorne (2017) Camilo Thorne. 2017. Chatbots for troubleshooting: A survey. Language and Linguistics Compass (2017).
- Thorson and Powell (1993) James A Thorson and FC Powell. 1993. Sense of humor and dimensions of personality. Journal of clinical Psychology 49, 6 (1993), 799–809.
- Toxtli et al. (2018) Carlos Toxtli, Justin Cranshaw, et al. 2018. Understanding Chatbot-mediated Task Management. In Proc. of the SIGCHI CHI Conference. ACM, 58.
- Tu and McIsaac (2002) Chih-Hsiung Tu and Marina McIsaac. 2002. The relationship of social presence and interaction in online classes. The American journal of distance education 16, 3 (2002), 131–150.
- Turing (1950) Alan M Turing. 1950. Computing machinery and intelligence. Mind 59, 236 (1950), 433–460.
- Valério et al. (2017) Francisco AM Valério, Tatiane G Guimarães, Raquel O Prates, and Heloisa Candello. 2017. Here’s What I Can Do: Chatbots’ Strategies to Convey Their Features to Users. In Proceedings of the XVI Brazilian Symposium on Human Factors in Computing Systems. ACM, 28.
- Walgama and Hettige (2017) MS Walgama and B Hettige. 2017. Chatbots: The next generation in computer interfacing–A Review. KDU International Research Conference (2017).
- Walker (2009) Marilyn A Walker. 2009. Endowing Virtual Characters with Expressive Conversational Skills. In Int. Workshop on Intelligent Virtual Agents. Springer, 1–2.
- Wallace (2009) Richard S Wallace. 2009. The anatomy of A.L.I.C.E. In Parsing the Turing Test. Springer, 181–210.
- Wallis and Norling (2005) Peter Wallis and Emma Norling. 2005. The Trouble with Chatbots: social skills in a social world. Virtual Social Agents 29 (2005).
- Walther (1992) Joseph B Walther. 1992. Interpersonal effects in computer-mediated interaction: A relational perspective. Commun. Res. 19, 1 (1992), 52–90.
- Walther (1996) Joseph B Walther. 1996. Computer-mediated communication: Impersonal, interpersonal, and hyperpersonal interaction. Commun. Res. 23, 1 (1996), 3–43.
- Walther (2007) Joseph B Walther. 2007. Selective self-presentation in computer-mediated communication: Hyperpersonal dimensions of technology, language, and cognition. Comput. Hum. Behav. 23, 5 (2007), 2538–2557.
- Walther (2011) Joseph B Walther. 2011. Theories of computer-mediated communication and interpersonal relations (4 ed.). Sage, Thousand Oaks, CA, Chapter 4, 443–479.
- Watts (2003) Richard J Watts. 2003. Politeness. Cambridge University Press.
- Weizenbaum (1966) Joseph Weizenbaum. 1966. ELIZA-a computer program for the study of natural language communication between man and machine. Commun. ACM 9, 1 (1966), 36–45.
- Winkler and Söllner (2018) Rainer Winkler and Matthias Söllner. 2018. Unleashing the Potential of Chatbots in Education: A State-Of-The-Art Analysis. (2018).
- Zamora (2017) Jennifer Zamora. 2017. I’m Sorry, Dave, I’m Afraid I Can’t Do That: Chatbot Perception and Expectations. In Proc. of the Int. Conf. on HAI. ACM, 253–260.
- Zhang et al. (2018a) Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, and Jason Weston. 2018a. Personalizing Dialogue Agents: I have a dog, do you have pets too? arXiv preprint arXiv:1801.07243 (2018).
- Zhang et al. (2018b) Wei-Nan Zhang, Qingfu Zhu, Yifa Wang, Yanyan Zhao, and Ting Liu. 2018b. Neural personalized response generation as domain adaptation. World Wide Web (2018), 20.
- Zuckerman et al. (1993) Marvin Zuckerman, D Michael Kuhlman, Jeffrey Joireman, Paul Teta, and Michael Kraft. 1993. A comparison of three structural models for personality: The Big Three, the Big Five, and the Alternative Five. J. Pers. Soc. Psychol. 65, 4 (1993), 757.
Appendix A. Supplemental Material
This supplementary material includes tables that summarize the outcomes presented in the paper. Additionally, we include insights on five constructs that can be used to assess whether social characteristics achieve the intended design goals and deliver the expected benefits.
A.1. Overview of the surveyed literature
| Interaction type | Count (%) | Surveyed studies |
| --- | --- | --- |
| Task-oriented | 34 (59%) | 5, 8, 9, 17, 21, 24, 25, 26, 40, 41, 42, 43, 49, 54, 61, 65, 67, 71, 73, 74, 76, 83, 88, 91, 118, 123, 124, 127, 128, 129, 135, 138, 142, 150 |
| General-purpose chat | 19 (33%) | 15, 28, 29, 32, 33, 34, 62, 64, 69, 77, 79, 80, 87, 89, 102, 106, 117, 122, 131 |
| Both or not defined | 5 (8%) | 11, 16, 66, 86, 95 |
| # of papers | Topics handled by the chatbots |
| --- | --- |
| 16 | Open domain (unrestricted topics) |
| 2 | E-commerce, financial services, games, health, information search, race, task management, virtual assistants |
| 1 | Business, credibility-assessment interviews, decision-making coach, ethnography, human resources, humor, movie recommendation, news, tourism |
A.2. Chatbots' Social Characteristics
In this section, we present tables that summarize the outcomes of the conceptual model of social characteristics. For each category, Tables 4, 8, and 15 provide an overview of the included studies. For each social characteristic, the tables list the studies that address each reported benefit, challenge, and strategy.
A.2.1. Conversational Intelligence
| Study | Main investigation | Interaction | Analyzed data | Methods | Reported social characteristics |
| --- | --- | --- | --- | --- | --- |
| 76 | Social-agent orientation; Proactivity | Real chatbot | Log of conversations; Questionnaires; Interviews | Quantitative; Qualitative | Proactivity |
| 8 | Intervention mode | WoZ | Log of conversations; Questionnaires | Quantitative; Qualitative | Proactivity |
| 118 | Intervention mode; Users' deceptive behavior | Real chatbot | Questionnaires | Quantitative | Proactivity; Conscientiousness |
| 24 | Sequential coherence | WoZ | Log of conversations; Think aloud; Interviews | Quantitative; Qualitative | Proactivity |
| 102 | Emotional engagement | Real chatbot; WoZ | Log of conversations; Questionnaires; Interviews | Quantitative; Qualitative | Proactivity |
| 122 | Emotional engagement | Real chatbot | Log of conversations | Qualitative | Proactivity |
| 66 | First-time users' experience | Real chatbot | Log of conversations; Questionnaires; Interviews | Quantitative; Qualitative | Proactivity; Conscientiousness; Communicability |
| 42 | Privacy and ethics | WoZ | Workshop outcomes; Interviews | Qualitative | Proactivity |
| 131 | Personality traits | WoZ | Log of conversations; Focus group discussion; Interviews | Qualitative | Proactivity |
| 89 | Naturalness | Real chatbot | Log of conversations; Interviews; Questionnaires | Quantitative; Qualitative | Proactivity; Conscientiousness |
| 123 | Iterative prototyping | Real chatbot | Log of conversations | Quantitative; Qualitative | Proactivity |
| 83 | Conversational decision-making | Real chatbot | Log of conversations; Questionnaires | Quantitative; Qualitative | Proactivity |
| 43 | Intervention mode (APT moves) | WoZ | Log of conversations | Quantitative | Proactivity; Conscientiousness |
| 127 | Ethnographic data collection | Real chatbot | Log of conversations; Interviews | Qualitative | Proactivity; Conscientiousness |
| 135 | Task management chatbot design | Real chatbot | Log of conversations; Questionnaires | Quantitative; Qualitative | Proactivity |
| 49 | Conversational mental health care | Real chatbot | Questionnaires | Quantitative | Proactivity |
| 61 | Intervention mode (APT moves) | Real chatbot | Log of conversations | Quantitative | Proactivity |
| 129 | Intervention mode (APT moves) | Real chatbot | Log of conversations; Questionnaires | Quantitative; Qualitative | Proactivity |
| 65 | Context management | Real chatbot | Log of conversations; Questionnaires; Subjective feedback | Quantitative; Qualitative | Conscientiousness |
| 26 | Language capabilities | Real chatbot | Log of conversations | Qualitative | Conscientiousness |
| 9 | Communication strategies and affective backchannels | Real chatbot | Questionnaires | Quantitative | Conscientiousness |
| 54 | Chatbot design principles | None | Literature review | Qualitative | Conscientiousness; Communicability |
| 16 | Users' motivations | None | Questionnaires; Subjective feedback | Quantitative; Qualitative | Conscientiousness |
| 41 | Personalization | Real chatbot | Questionnaires; Think aloud; Interviews | Quantitative; Qualitative | Conscientiousness; Communicability |
| 138 | Communicability | Real chatbot | Semiotic inspection | Qualitative | Communicability |
| 77 | Playfulness | Real chatbot | Log of conversations; Questionnaires | Quantitative; Qualitative | Communicability |
| 73 | Patterns of use of hotel chatbots | Real chatbot | Log of conversations | Quantitative; Qualitative | Communicability |
| Characteristic | Category | Item | Studies |
| --- | --- | --- | --- |
| Proactivity | Benefits | [B1] to provide additional information | 89, 131, 8 |
| | | [B2] to inspire users and to keep the conversation alive | 8, 24, 123, 127, 118 |
| | | [B3] to recover from a failure | 102, 123 |
| | | [B4] to improve conversation productivity | 8, 66 |
| | | [B5] to guide and engage users | 83, 127, 43, 61, 49, 135, 129 |
| | Challenges | [C1] timing and relevance | 102, 24, 76, 123 |
| | | [C3] users' perception of being controlled | 127, 135 |
| | Strategies | [S1] leveraging conversational context | 8, 24, 122, 42 |
| | | [S2] select a topic randomly | 102 |
| Conscientiousness | Benefits | [B1] to keep the conversation on track | 16, 41, 66, 9 |
| | | [B2] to demonstrate understanding | 43, 41, 66, 9, 54, 118 |
| | | [B3] to hold a continuous conversation | 66, 54, 26, 89 |
| | Challenges | [C1] to handle task complexity | 41, 43, 54 |
| | | [C2] to harden the conversation | 41, 66, 127 |
| | | [C3] to keep the user aware of the chatbot's context | 65, 66, 54 |
| | Strategies | [S1] conversational flow | 41, 9, 54 |
| | | [S2] visual elements | 41, 66, 127 |
| | | [S3] confirmation messages | 65, 9, 54, 41 |
| Communicability | Benefits | [B1] to unveil functionalities | 138, 66, 77, 73 |
| | | [B2] to manage the users' expectations | 138, 41, 66, 77 |
| | Challenges | [C1] to provide business integration | 66, 54 |
| | | [C2] to keep visual elements consistent with textual inputs | 138 |
| | Strategies | [S1] to clarify the purpose of the chatbot | 138, 66, 54 |
| | | [S2] to advertise the functionality and suggest the next step | 138, 66 |
| | | [S3] to provide a help functionality | 66, 77, 138 |
A.2.2. Social Intelligence
|Study||Main investigation||Interaction||Analyzed data||Methods||Reported social characteristics|
|66||First-time users experience||Real chatbot||Log of conversations; Questionnaires; Interviews||Quantitative; Qualitative||Damage control; Manners; Personalization|
|34||Anthropomorphism||Real chatbot||Log of conversations||Qualitative||Damage control|
|123||Iterative prototyping||Real chatbot||Log of conversations||Quantitative; Qualitative||Damage control|
|83||Conversational decision-making||Real chatbot||Log of conversations; Questionnaires||Quantitative; Qualitative||Damage control; Manners|
|135||Task management chatbot design||Real chatbot||Log of conversations; Questionnaires||Quantitative; Qualitative||Damage control; Manners; Personalization|
|54||Chatbots design principles||None||Literature review||Qualitative||Damage control; Thoroughness|
|41||Personalization||Real chatbot||Questionnaires; Think aloud; Interviews||Quantitative; Qualitative||Damage control; Thoroughness; Personalization|
|77||Playfulness||Real chatbot||Log of conversations; Questionnaires||Quantitative; Qualitative||Damage control; Manners|
|73||Patterns of use of hotel chatbots||Real chatbot||Log of conversations||Quantitative; Qualitative||Damage control; Personalization|
|142||Social intelligence||WoZ||Log of conversations||Qualitative||Damage control; Manners; Emotional intelligence|
|67||Users’ expectations and experience||Real chatbot; WoZ||Log of conversations; Questionnaires; Subjective feedback||Quantitative; Qualitative||Damage control; Thoroughness; Manners; Emotional intelligence|
|29||Sexual verbal abuse||Real chatbot||Log of conversations;||Quantitative||Damage Control|
|89||Naturalness||Real chatbot||Log of conversations; Interviews; Questionnaires||Quantitative; Qualitative||Thoroughness; Manners|
|62||Communication changes with human or chatbot partners||Real chatbot||Log of conversations;||Quantitative||Thoroughness|
|26||Language capabilities||Real chatbot||Log of conversations||Qualitative||Thoroughness|
|69||Naturalness||Real chatbot||Log of conversations; Interviews; Questionnaires||Quantitative; Qualitative||Thoroughness; Manners|
|131||Personality traits||WoZ||Log of conversations; Focus group discussion; Interviews||Qualitative||Thoroughness; Emotional intelligence; Personalization|
|79||Expressing personality through language||None||Automatically generated utterances; Questionnaires||Quantitative||Thoroughness|
|24||Sequential coherence||WoZ||Log of conversations; Think aloud; Interviews||Quantitative; Qualitative||Thoroughness; Manners|
|88||Believability||None||Not evaluated||Not evaluated||Thoroughness; Emotional intelligence|
|122||Emotional engagement||Real chatbot||Log of conversations||Qualitative||Moral agency; Emotional intelligence; Personalization|
|80||Racial stereotypes||Real chatbot||Log of conversations||Qualitative||Moral agency|
|33||Gender affordances||Real chatbot||Log of conversations||Qualitative||Moral agency|
|11||Perceived moral agency||Video chatbot||Questionnaires||Quantitative||Moral agency|
|15||Gender affordances||Real chatbot||Log of conversations; Questionnaires||Quantitative; Qualitative||Moral agency|
|71||Socially capable chatbot||Real chatbot||Log of conversations; Questionnaires||Quantitative||Emotional intelligence; Manners|
|40||Thought-evoking dialogues||Real chatbot||Log of conversations; Questionnaires||Quantitative||Emotional intelligence|
|49||Conversational mental health care||Real chatbot||Questionnaires||Quantitative||Emotional intelligence|
|9||Communication strategies and affective backchannels||Real chatbot||Questionnaires||Quantitative||Emotional intelligence|
|87||Mental health care||Real chatbot||Log of conversations||Quantitative||Emotional intelligence|
|64||Self-disclosure||WoZ||Log of conversations; Questionnaires||Quantitative; Qualitative||Emotional intelligence|
|102||Emotional engagement||Real chatbot; WoZ||Log of conversations; Questionnaires; Interviews||Quantitative; Qualitative||Emotional intelligence; Personalization|
|127||Ethnographic data collection||Real chatbot||Log of conversations; Interviews||Qualitative||Thoroughness; Personalization|
|76||Social-agent orientation; Proactivity||Real chatbot||Log of conversations; Questionnaires; Interviews||Quantitative; Qualitative||Personalization|
|42||Privacy and ethics||WoZ||Workshop outcomes; Interviews||Qualitative||Personalization|
|150||Users’ expectations and experiences||Real chatbot||Subjective feedback||Qualitative||Thoroughness; Emotional intelligence; Personalization|
|Damage control||Benefits||[B1] to appropriately respond to harassment||73 29|
|[B2] to deal with testing||142 123 77 66|
|[B3] to deal with lack of knowledge||142 66 135 123 54 83|
|Challenges||[C1] to deal with unfriendly users||123 83 34|
|[C2] to identify abusive utterances||29|
|[C3] to balance emotional reactions||142 29|
|Strategies||[S1] emotional reactions||142 29 123 34|
|[S2] authoritative reactions||142 67 135 123|
|[S3] to ignore the user’s utterance and change the topic||142 123|
|[S4] conscientiousness and communicability||123 142 41 66 54|
|[S5] to predict users’ satisfaction||77|
|Thoroughness||Benefits||[B1] to adapt the language dynamically||79 41 131 67 54 62 89|
|[B2] to exhibit believable behavior||67 79 89 26 88 127|
|Challenges||[C1] to decide on how much to talk||67 150 54 24 41|
|[C2] to be consistent||41 69|
|Manners||Benefits||[B1] to increase human-likeness||67 89 69 135|
|Challenges||[C1] to deal with face-threatening acts||142 83|
|[C2] to end a conversation gracefully||66 24|
|Strategies||[S1] to engage in small talk||77 66 71|
|[S2] to adhere to turn-taking protocols||135|
|Moral agency||Benefits||[B1] to avoid stereotyping||80 117 15 33|
|[B2] to enrich interpersonal relationships||11 122|
|Challenges||[C1] to avoid alienation||33 117|
|[C2] to build unbiased training data and algorithms||117 122|
|Emotional Intelligence||Benefits||[B1] to enrich interpersonal relationships||71 142 40 74 64 9 49 150 87|
|[B2] to increase engagement||40 122 102|
|[B3] to increase believability||88|
|Challenges||[C1] to regulate affective reactions||71 67 131 64|
|Strategies||[S1] to use social-emotional utterances||71 9|
|[S2] to manifest conscientiousness||122 102|
|[S3] reciprocity and self-disclosure||40 74|
|Personalization||Benefits||[B1] to enrich interpersonal relationships||42 41 95 122 102|
|[B2] to provide unique services||41 127 135 76 131 66|
|[B3] to reduce interactional breakdowns||73 41 67|
|Challenges||[C1] privacy||42 150 95 131|
|Strategies||[S1] to learn from and about the user||95 131 122 150|
|[S2] to provide customizable agents||76 42 131|
|[S3] visual elements||127|
|Study||Main investigation||Interaction||Analyzed data||Methods||Reported social characteristics|
|66||First-time users experience||Real chatbot||Log of conversations; Questionnaires; Interviews||Quantitative; Qualitative||Personality|
|123||Iterative prototyping||Real chatbot||Log of conversations||Quantitative; Qualitative||Identity|
|135||Task management chatbot design||Real chatbot||Log of conversations; Questionnaires||Quantitative; Qualitative||Identity|
|54||Chatbots design principles||None||Literature review||Qualitative||Identity|
|77||Playfulness||Real chatbot||Log of conversations; Questionnaires||Quantitative; Qualitative||Identity|
|67||Users’ expectations and experience||Real chatbot; WoZ||Log of conversations; Questionnaires; Subjective feedback||Quantitative; Qualitative||Identity|
|131||Personality traits||WoZ||Log of conversations; Focus group discussion; Interviews||Qualitative||Personality|
|79||Expressing personality through language||None||Automatically generated utterances; Questionnaires||Quantitative||Personality|
|88||Believability||None||Not evaluated||Not evaluated||Personality|
|122||Emotional engagement||Real chatbot||Log of conversations||Qualitative||Personality|
|80||Racial stereotypes||Real chatbot||Log of conversations||Qualitative||Identity|
|33||Gender affordances||Real chatbot||Log of conversations||Qualitative||Identity|
|71||Socially capable chatbot||Real chatbot||Log of conversations; Questionnaires||Quantitative||Personality|
|9||Communication strategies and affective backchannels||Real chatbot||Questionnaires||Quantitative||Personality|
|102||Emotional engagement||Real chatbot; WoZ||Log of conversations; Questionnaires; Interviews||Quantitative; Qualitative||Personality|
|25||Uncanny valley||Real chatbot||Psychophysiological measures; Questionnaires||Quantitative||Identity|
|5||Anthropomorphic clues and agency framing||Real chatbot||Questionnaires||Quantitative||Identity|
|32||Social perception||Real chatbot||Log of conversations||Qualitative||Identity|
|34||Anthropomorphism||Real chatbot||Log of conversations||Qualitative||Identity|
|28||Anthropomorphism and Initiation repairs||Real chatbot||Log of conversations||Quantitative; Qualitative||Identity|
|17||User needs and motivations||None||None||Qualitative||Identity|
|21||Humanness and typefaces||None||Questionnaires; Think aloud||Quantitative; Qualitative||Identity|
|106||Sense of humor||Real chatbot||Questionnaires||Quantitative||Personality|
|86||Sense of humor||None||Literature review||Qualitative||Personality|
|16||Users’ motivations||None||Questionnaires; Subjective feedback||Quantitative; Qualitative||Personality|
|124||Personality preferences||WoZ||Log of conversations; Focus group discussion; Questionnaires||Quantitative; Qualitative||Personality|
|Identity||Benefits||[B1] to increase engagement||5 123 77|
|[B2] to increase human-likeness||21 5|
|Challenges||[C1] to avoid negative stereotypes||32 117 15 80 33 34 67|
|[C2] to balance the identity and the technical capabilities||28 25 54 17 32|
|Strategies||[S1] to design and elaborate on a persona||77 95 135 131 123 32 34|
|Personality||Benefits||[B1] to exhibit believable behavior||88 79 106 102|
|[B2] to enrich interpersonal relationships||16 66 131 124 122 71 9|
|Challenges||[C1] to adapt humor to the users’ culture||106|
|[C2] to balance the personality traits||131 79 124|
|Strategies||[S1] to use appropriate language||122 88 79 66|
|[S2] to have sense of humor||86 106 131 16 66|
A.3. Measurements of social characteristics
The surveyed literature revealed a number of constructs used to measure whether the interaction with a chatbot reaches the intended social goals. In general terms, task-oriented interactions focus on completing the task, while general-purpose chatbots aim to engage users in open conversations. In both cases, engagement plays an important role and is therefore a commonly used metric. However, we also found that social characteristics can be measured through additional constructs, including interpersonal relationship, social presence, social influence, and anthropomorphism. In this section, we discuss each of these constructs and the characteristics that can influence the measurements.
Engagement relates to attracting and holding the user’s attention and interest 98. In the chatbot domain, engagement can be measured by the number of exchanges per session 122; 40, although other attributes can manifest users’ engagement, such as emotional connection, attention, the perception of time, and self- and external awareness 98. In this survey, we found social characteristics in all three categories measured in terms of their impact on engagement. Conversational intelligence teaches users how to interact (communicability), demonstrates attention to the users’ intentions and needs (conscientiousness) 118, and encourages users to continue the conversation 129, even after periods of inactivity (proactivity) 49. Personification makes the interaction more fun and enjoyable (personality) 66; 124; 106, while social intelligence provides emotional connection and support (emotional intelligence) 122; 102; 9. In line with the engagement with technology framework 98, usability also emerged as influencing engagement, particularly ease of use 127; 66; 135; 65; 67 and accessibility 128; 41, which can be improved with personalization 66. For example, 128 show how adapting the visual interface (color scheme, input mode, background images, and amount of textual information) changes the user’s experience when the interlocutors are children in early childhood education.
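As a concrete illustration of the exchanges-per-session metric mentioned above, the following sketch counts user-bot exchange pairs in a conversation log. The log format, session identifiers, and utterances are invented for illustration and are not taken from any surveyed study.

```python
from collections import defaultdict

# Hypothetical conversation log: (session_id, speaker, utterance) tuples.
log = [
    ("s1", "user", "Hi"),
    ("s1", "bot", "Hello! How can I help?"),
    ("s1", "user", "Tell me a joke"),
    ("s1", "bot", "Why did the chatbot cross the road?"),
    ("s2", "user", "Hi"),
    ("s2", "bot", "Hello!"),
]

def exchanges_per_session(log):
    """Count user-bot exchange pairs per session: one user turn
    immediately followed by one bot turn counts as a single exchange."""
    counts = defaultdict(int)
    last_speaker = {}
    for session, speaker, _ in log:
        if speaker == "bot" and last_speaker.get(session) == "user":
            counts[session] += 1
        last_speaker[session] = speaker
    return dict(counts)

print(exchanges_per_session(log))  # {'s1': 2, 's2': 1}
```

In practice, studies combine such log-based counts with the qualitative attributes listed above (emotional connection, attention, perception of time) rather than relying on exchange counts alone.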
Interpersonal relationship relates to building a social connection with the chatbot that relies on trust, intimacy, common ground, and reciprocal enjoyment 74. We found that socially intelligent and appropriately personified chatbots are more likely to build an interpersonal relationship with the user. In 74, 64, and 40, the authors showed that emotional intelligence potentially helps with building trust, intimacy, and enjoyment, which influences the willingness to engage. 42 and 41 argue that reducing privacy and security concerns also increases trust; hence, personalizing the information stored by the chatbot and being transparent about it result in a stronger interpersonal relationship and, consequently, greater willingness to engage. In addition, 11 showed that perceived moral agency also correlates with higher trustworthiness and goodwill. Regarding personification, 34 argue that improper personality and identity representations may confuse, disempower, and distract users, ultimately raising interpersonal conflicts.
Interpersonal relationship is a consequence of social presence 121; 59. In computer-mediated communication (CMC), social presence describes the degree of salience of an interlocutor 121, in this case, the chatbot, and how it can project itself as an individual. As a determinant of interpersonal relationship, social presence is also influenced by intimacy and trust; however, it is also assessed by how much the chatbot is perceived to be a “real” person 25, where humanness and believability are influencing factors. In this sense, personification may drive the creation of social presence, since it increases the perception of anthropomorphic clues 32; 25. However, anthropomorphic clues by themselves do not imply social presence. For example, 5 did not find a main effect of anthropomorphic clues, such as having a human name (identity) and language style (thoroughness), on social presence. On the other hand, they found that framing the chatbot as “intelligent” slightly increased social presence, and that higher social presence resulted in a stronger emotional connection with the company represented by the chatbot. Hence, social and conversational intelligence are also required to increase social presence, most likely due to the potential elevation of the chatbot’s social positioning 142. For instance, in 127, participants who complained about the chatbot’s handcrafted responses expressed the desire for spontaneous (thoroughness) and somewhat emotional reactions (emotional intelligence) to their inputs, so the chatbot would be “more like a person.” Participants in both 66 and 102 related human-likeness to the ability to hold meaningful conversations, which includes context preservation (conscientiousness) and timing (proactivity). In addition, 118 showed that increasing the relevance of the chatbot’s utterances (conscientiousness) increases social presence and perceived humanness.
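Social presence is typically assessed with Likert-scale questionnaires, as in the studies above. As a minimal sketch of how such quantitative questionnaire data could be scored, the snippet below averages responses per item and then combines the item means into a composite score; the item names and responses are hypothetical and not drawn from any surveyed instrument.

```python
from statistics import mean

# Hypothetical 5-point Likert responses (1 = strongly disagree,
# 5 = strongly agree) to social presence items, e.g.
# "The chatbot felt like a real person."
responses = {
    "felt_like_real_person": [4, 5, 3, 4],
    "sense_of_human_contact": [3, 4, 4, 2],
    "felt_personal": [5, 4, 3, 4],
}

def social_presence_score(responses):
    """Average each item across participants, then average the
    item means into a single composite social presence score."""
    item_means = {item: mean(vals) for item, vals in responses.items()}
    composite = mean(item_means.values())
    return item_means, composite

item_means, composite = social_presence_score(responses)
print(round(composite, 2))  # 3.75
```

Real instruments also report reliability (e.g., Cronbach’s alpha) before aggregating items; this sketch shows only the scoring step.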
89 list a number of characteristics that increase a chatbot’s believability, including manners, proactivity, damage control, conscientiousness, and personality. These align with the dimensions of social presence theory in CMC 136.
Anthropomorphism, in turn, is the process of attributing human traits to a non-human entity, even when this attribution is known to be inappropriate 92; for example, referring to a chatbot with a personal pronoun (he/she) rather than “it.” Anthropomorphism can be induced by personification 92; 5; 32, since the human traits are explicitly attributed by the designer. Characteristics such as manners 127 and emotional intelligence 102 have also been shown to trigger anthropomorphism 77, although the effect may depend on the user’s tendency to anthropomorphize 76.
Finally, a chatbot’s social influence refers to its capacity to promote changes in the user’s cognition, attitude, or behavior 109, which is sometimes called persuasiveness 91. Although we did not find studies that focus on formally measuring the social influence of chatbots, the surveyed literature revealed a few instances of chatbots changing users’ behaviors in particular domains. For example, in health, 49 showed that a chatbot with proactivity and emotional intelligence can motivate users to engage in a self-help program for students who self-identify as experiencing symptoms of anxiety and depression. In education, tutor chatbots performed proactive interventions (APT moves) that helped students increase their participation in group discussions 61; 43; 129. In the customer service field, 5 evaluated whether anthropomorphic clues and framing change users’ attitudes toward the company represented by the chatbot; however, they did not find a significant effect. Although social influence has been shown to increase with higher social presence levels in CMC fields (e.g., see 103), the impact of enriching chatbots with social characteristics is still under-investigated.
A.4. Related surveys
Table 18 shows the social characteristics covered by each related survey, where the content in each cell represents how that survey refers to the corresponding social characteristic.
|Proactivity||-||social intelligence, users’ control||-|
|Conscientiousness||guiding the users through the topics||chatbot’s conversational flows, chatbot’s memory, making changes on the fly, conversational and situational knowledge||maintain the theme and respond specific questions|
|Communicability||-||chatbot’s help, documentation||-|
|Damage control||-||-||damage control|
|Thoroughness||chatbot’s language; user’s recognition and recall||appropriate linguistic register/accuracy|
|Manners||handling small talk||-||-|
|Moral agency||-||-||respect, inclusion, and preservation of dignity, ethics and cultural knowledge of users|
|Emotional intelligence||-||social intelligence||provide emotional information, be warm, adapt to the human’s mood|
|Personalization||-||social intelligence; ethics regarding privacy (data retention and transparency)||meets neurodiverse needs|
|Identity||-||-||transparent to inspection and discloses its identity|
|Personality||personality||personality||personality, fun, humor|