Impacts of Personal Characteristics on User Trust in Conversational Recommender Systems

03/24/2022
by   Wanling Cai, et al.
Hong Kong Baptist University

Conversational recommender systems (CRSs) imitate human advisors to assist users in finding items through conversations and have recently gained increasing attention in domains such as media and e-commerce. Like in human communication, building trust in human-agent communication is essential given its significant influence on user behavior. However, inspiring user trust in CRSs with a "one-size-fits-all" design is difficult, as individual users may have their own expectations for conversational interactions (e.g., who, user or system, takes the initiative), which are potentially related to their personal characteristics. In this study, we investigated the impacts of three personal characteristics, namely personality traits, trust propensity, and domain knowledge, on user trust in two types of text-based CRSs, i.e., user-initiative and mixed-initiative. Our between-subjects user study (N=148) revealed that users' trust propensity and domain knowledge positively influenced their trust in CRSs, and that users with high conscientiousness tended to trust the mixed-initiative system.


1. Introduction

Conversational recommender systems (CRSs) imitate human advisors to assist users in finding desired items through multi-turn conversations and have been attracting increasing attention in recent years for developing task-oriented chatbots (Christakopoulou et al., 2016; Cai and Chen, 2020; Jannach et al., 2021). Some commercial chatbots have been built on Facebook Messenger or Amazon Alexa for recommending items (e.g., songs, movies, and products) (Sun and Zhang, 2018). Unlike traditional recommender systems that mainly present one-shot recommendations (e.g., a ranked list of items) to users (Ricci et al., 2015), CRSs can support mixed-initiative (combining both user-initiative and system-initiative (Allen et al., 1999)) interactions between users and the system (Cai and Chen, 2020). In such a system, users can not only actively inform the system of their preferences (e.g., “I want relaxing music.”), but also accept proactive suggestions from the system (e.g., “Do you want to try some piano music?”) (Jannach et al., 2021). Recent works have shown that mixed-initiative CRSs can help users better control the recommendation (Jin et al., 2019) and facilitate their exploration (Cai et al., 2021).

However, few studies have investigated the influence of the conversational interaction – particularly the initiative strategy (i.e., who, user or system, takes the initiative in the conversation) – on user trust in CRSs (Jannach et al., 2021). Given that user trust plays a vital role in users’ willingness to accept recommendations (Berkovsky et al., 2017) and adopt a given system (Benbasat and Wang, 2005; Chen and Pu, 2005), which can be inherently affected by users’ personal characteristics (such as personality traits) (Knijnenburg et al., 2011; Millecamp et al., 2018), this work aims to identify whether and how user characteristics and system conversation design factors affect user trust in text-based CRSs. Our findings will be useful for optimizing the design of CRSs to be more trustworthy for individual users, which may potentially maximize the benefit of CRSs.

Our work is theoretically driven by the three-layered trust model proposed by Hoff and Bashir (Hoff and Bashir, 2015), which suggests that user trust in a computer system can be influenced by three types of factors: user-related factors, system-related factors, and context-related factors. Among user-related factors, inspired by previous works (Cho et al., 2016; Zhou et al., 2020; Knijnenburg et al., 2011), we considered three personal characteristics: (1) personality traits, which refer to enduring characteristics related to people’s thinking, feeling, and behaving, and have been shown to influence user trust in both human-human and human-machine relationships (Cho et al., 2016; Zhou et al., 2020); (2) trust propensity, which can be defined as the user’s general tendency to trust others, and has been demonstrated to impact user trust in traditional recommender systems (Chen and Pu, 2005; Knijnenburg et al., 2011); and (3) domain knowledge, which represents the user’s expert knowledge in the choice domain, and has been shown to influence user reliance and trust in intelligent systems (Hoff and Bashir, 2015; Sanchez et al., 2014).

Figure 1. Our research questions.


Among system-related factors, we considered initiative strategies, among which the mixed-initiative strategy is a special characteristic of CRSs. However, it might be difficult to inspire user trust in CRSs with a “one-size-fits-all” design of the initiative strategy, because different users may prefer different conversation interactions given their personal characteristics (Knijnenburg et al., 2011; Freitag and Bauer, 2016). Thus, we investigated whether and how users’ personal characteristics influence their trust in CRSs with different initiative strategies (user-initiative vs. mixed-initiative).

Finally, regarding context-related factors (also known as situational factors such as the user’s performed task (Knijnenburg et al., 2012; Hoff and Bashir, 2015)), we examined whether and how users’ personal characteristics interact with task complexity to affect user trust in CRSs. User tasks with different levels of complexity may influence how users interact with systems (Byström and Järvelin, 1995; Myers et al., 2019), and the influences vary among different types of users (Buckland and Florian, 1991; Lai and Hung, 2012; Rheu et al., 2021); for example, domain novices tend to spend much more time completing a more complex knowledge-seeking task than do domain experts (Lai and Hung, 2012).

To summarize, we aim to answer the following three research questions in this work (see Figure 1):

RQ1: How do personal characteristics (personality, trust propensity, domain knowledge) affect user trust in CRSs?

RQ2: How do personal characteristics and initiative strategy interact to affect user trust in CRSs? (Here, “interaction” is a statistical term: an interaction between A and B in affecting Z means that A’s influence on Z depends on B, and vice versa (Kutner et al., 2005).)

RQ3: How do personal characteristics and task complexity interact to affect user trust in CRSs?

To answer our research questions, we conducted a between-subjects user study (N=148). Two variants of a text-based conversational music recommender were implemented for the experiment: a User-Initiative system that mainly responds to users’ requests or feedback, and a Mixed-Initiative system that not only allows users to freely give feedback on the recommendation, but also proactively offers suggestions. Additionally, to vary the task complexity, we designed two user tasks in the context of seeking recommendations: a Simple Task that asks users to find five songs based on their current preferences, and a Complex Task that asks users to first explore diverse types of songs beyond their current interests and then select five songs.

Our analyses revealed three main findings: (1) user experience with conversational interaction in a CRS can influence user trust in the system; (2) among the three types of personal characteristics considered, users’ trust propensity and domain knowledge significantly affected user trust in the CRS; (3) the personality trait conscientiousness separately interacted with the initiative strategy and the task complexity to inspire user trust in the CRS. Based on these findings, we present practical implications for designing trustworthy CRSs that can be tailored to individual users’ needs based on their personal characteristics (e.g., conscientiousness, trust propensity, and domain knowledge). We believe this work contributes to research on conversational Artificial Intelligence (AI) systems and will facilitate improved CRS design by integrating personalization.

2. Related Work

2.1. Conversational Recommender Systems

Conversational recommender systems (CRSs) aim to mimic a human advisor to assist users in looking for recommendations in a multi-turn dialogue via text or voice (Jannach et al., 2021; Christakopoulou et al., 2016; Zhang et al., 2018; Yang et al., 2018), and have been applied in several domains, such as movies (Cai and Chen, 2020), music (Jin et al., 2019; Cai et al., 2021), and e-commerce (Zhang et al., 2018). Unlike single-shot traditional recommender systems (Ricci et al., 2015), CRSs allow users to interact with the system in a multi-turn conversation, enabling the system to incrementally refine the user preference model to generate more satisfying recommendations (Jugovac and Jannach, 2017; Cai and Chen, 2020). Such systems can support mixed-initiative interaction by combining both user-initiative (i.e., users actively tell the system what they want) and system-initiative (i.e., the system proactively offers suggestions to users during the recommendation process) interactions, which is regarded as a more flexible interaction strategy in human-computer interaction (HCI) (Allen et al., 1999). Several recent studies on CRSs have demonstrated that such systems enable more natural interactions between the user and the system, which can better enhance user experience with recommender systems (Jannach et al., 2021; Jin et al., 2019; Cai et al., 2021; Narducci et al., 2020).

With increased interest in CRSs, recommender system researchers have been focusing on improving CRS efficiency (i.e., reducing the number of dialogue turns) and effectiveness (i.e., improving the recommendation quality) (Jannach et al., 2021; Gao et al., 2021; Zhang et al., 2018). Although conversational system design is a trending topic within the HCI community, few studies have investigated conversational interaction designs for recommender systems (Jin et al., 2019; Cai et al., 2021; Peng et al., 2019; Jannach et al., 2021). For instance, one study compared two critiquing-based conversational music recommenders that employed different initiative strategies (Jin et al., 2019), and found that the user-initiative CRS gives users more control to tune recommendations on their own, whereas the mixed-initiative CRS guides users to discover more diverse recommendations. Another recent study demonstrated the ability of a CRS to promote user exploration activities (Cai et al., 2021), and suggested that the mixed-initiative CRS enhances user exploration by allowing users to control the exploration direction on their own as well as guiding them to explore something different. While existing studies have demonstrated several advantages of CRSs, work on the critical factor of user trust, which strongly determines users’ intention to adopt CRSs in real-world situations, is limited. Thus, this study aims to investigate the factors that may affect user trust in CRSs.

2.2. Trust in Human-Computer Interaction and Recommender Systems

Trust is an important factor in both human-human and human-computer relationships (Freitag and Bauer, 2016; Marsh and Dibben, 2003; Lee and See, 2004), which has been studied for a long period. Trust is defined in various ways in the existing HCI literature (McKnight et al., 1998; Lee and Turban, 2001; Wang and Emurian, 2005), but a common theme is that trust can be regarded as a behavioral intention (e.g., intention to use) or “trusting intention” (McKnight et al., 1998). Studies have suggested three types of factors that can influence trust: user-related, system-related, and context-related factors. These three types of factors respectively correspond to the three layers of trust model proposed by Hoff and Bashir (Hoff and Bashir, 2015): Dispositional trust refers to the user’s general tendency to trust systems, which may arise from individual characteristics such as personality (user-related); learned trust represents the user’s evaluations of a system’s trustworthiness drawn from past interactions (system-related); situational trust is based on the context of the user-system interaction, such as the complexity of the performed task and user workload (context-related). Motivated by the three-layered trust model (Hoff and Bashir, 2015), we are interested in examining these three types of factors (user-related, system-related and context-related) that may influence user trust in CRSs.

Trust-related issues have also gained a lot of attention in recommender systems (RSs), because user trust highly influences users’ willingness to use a system and follow its recommendations in their decision-making process (O’Donovan and Smyth, 2005; Komiak and Benbasat, 2006; Wang and Benbasat, 2007; Knijnenburg et al., 2011). User trust in a technological artifact (e.g., recommender system) is often based on competence (i.e., the system’s ability to assist users in a specific task), benevolence (i.e., the system’s qualities such as security and reputation), and integrity (i.e., the system’s reliability and honesty) (McKnight et al., 1998). Studies on RSs have demonstrated that users’ perceived competence of the system positively influences their trust in the system (Kunkel et al., 2019; Chen and Pu, 2005). For example, the accuracy and diversity of recommendation lists tend to improve user trust and increase customer purchases in the e-commerce domain (Panniello et al., 2016). Moreover, the organization-based recommendation interface was demonstrated to reduce user effort in the decision-making process, sustain user trust, and increase users’ intention to use the system (Chen and Pu, 2005). Recommendations accompanied by explanations that provide information to assist users in making judgments on the recommended item have also been shown to increase user trust and decision confidence (Tintarev and Masthoff, 2015; Chen and Pu, 2005).

Literature on user trust in RSs has mostly focused on the aspect of recommendations (Panniello et al., 2016; Kunkel et al., 2019, 2019), whereas, to the best of our knowledge, user trust in the context of conversational recommendations has rarely been investigated. In CRSs, the conversational interaction between users and the system usually mimics human communication, suggesting that user trust toward the system is similar to trust in interpersonal relationships. Thus, to improve trust, the system should be both reliable in performing the requested tasks and predictable in interactions (i.e., behaving as expected by the user) (Rheu et al., 2021). However, individual users may have their own expectations of interaction strategies (e.g., preference for user-initiative or mixed-initiative) depending on their individual characteristics, which may influence their trust in the system. To facilitate the design of trustworthy CRSs that can serve individual users’ needs, our work focuses on investigating the impact of personal characteristics on user trust in CRSs that employ different initiative strategies.

2.3. Personal Characteristics

Because previous HCI and RS studies have indicated that user trust in the human-system relationship depends on individual characteristics (Cho et al., 2016; Zhou et al., 2020; Hoff and Bashir, 2015), we believe that user trust in CRSs may also be influenced by users’ personal characteristics. The literature suggests that three personal characteristics, namely personality traits, trust propensity and domain knowledge, are likely to affect user trust in conversational recommenders.

Personality Traits. Personality is defined as individual differences in one’s enduring way of thinking, feeling, and behaving (Kazdin, 2000; McCrae and John, 1992). The Big-Five personality model, which comprises five traits – openness to experience (openness), conscientiousness, extroversion, agreeableness, and neuroticism – is widely used to assess user personality (McCrae and John, 1992). Studies have reported the impacts of personality traits on trust in interpersonal relationships (Freitag and Bauer, 2016), demonstrating that openness and conscientiousness affect trust in both friends and strangers, and agreeableness affects trust in strangers. Personality traits also influence user trust in the human-machine collaboration (Cho et al., 2016; Zhou et al., 2020); for example, people who are more agreeable and conscientious are more likely to trust automation in decision-making (Cho et al., 2016). Thus, we speculate that personality traits (such as agreeableness and conscientiousness) can also influence user trust toward system guidance in CRSs.

Trust Propensity. Trust propensity is defined as the general tendency to trust others (Rotter, 1971; Colquitt et al., 2007) and is viewed as a dynamic individual difference that may be affected by personality type as well as situational factors (e.g., cultural background) (Mayer et al., 1995). Trust literature has shown that a user’s trust propensity influences the formation of trust toward specific technological systems (McKnight et al., 1998; Mcknight et al., 2011). When deciding whether to trust a system, users tend to look for cues that signify the system’s trustworthiness; however, the perception of the signals is affected by their trust propensity (Lee and Turban, 2001). Thus, we seek to determine whether this characteristic will impact user trust in CRSs.

Domain Knowledge. Domain knowledge refers to a person’s expert knowledge in a specific field. HCI research has demonstrated that users’ domain knowledge can influence their interaction behaviors and preferred interaction strategies (Nourani et al., 2020). In recommender systems, domain experts prefer more control during the decision-making process (Knijnenburg et al., 2011), whereas domain novices tend to perceive recommendations without too much control to be more helpful. Moreover, users’ reliance on decision support systems is related to their domain knowledge; for example, users with little or no specialized domain knowledge are likely to rely on the system’s suggestions (Bussone et al., 2015). Thus, we believe that domain knowledge may influence the way users prefer to interact with CRSs (e.g., preference for user-initiative or mixed-initiative), hence affecting user trust.

3. User Experiment

Figure 2. Interfaces of two text-based conversational music recommenders employing different initiative strategies (User-Initiative [left] and Mixed-Initiative [right]), and user tasks with low and high complexity (Simple Task and Complex Task [middle]) in our 2 × 2 between-subjects study.


3.1. Experiment Design

Based on Hoff and Bashir’s three-layered trust model (Hoff and Bashir, 2015), we investigated how user-related factors (personal characteristics) interact with both a system-related factor (initiative strategy) and a context-related factor (task complexity) to influence user trust in CRSs. We deployed two text-based prototype conversational music recommenders that employ different initiative strategies (user-initiative and mixed-initiative) (Cai et al., 2021), and designed two user tasks of varying complexity in the recommendation domain. Thus, we designed a 2 (User-Initiative vs. Mixed-Initiative) × 2 (Simple Task vs. Complex Task) online between-subjects user study, in which participants were randomly assigned to one of the four experimental conditions (see Figure 2). Below we present the two experimental manipulations.

3.1.1. Conversational Recommenders

We used two variants of text-based conversational music recommenders that employ different initiative strategies to support users in looking for music recommendations (Cai et al., 2021):

  • User-Initiative System: This system, which exhibits reactive behavior, only responds when users initiate requests during the conversation. In this system, users can post feedback to refine the current recommended item or ask for songs based on music-related attributes (e.g., genres, tempo, and danceability). For example, a user can tune a recommendation by typing “I want a higher tempo.”

  • Mixed-Initiative System: This system supports both user-initiative and system-initiative interactions. Specifically, in addition to reactively responding to users’ requests, the system can proactively provide suggestions (e.g., “Compared with the last played song, do you like the song of lower tempo?”) to facilitate users’ music discovery during the recommendation process. As suggested by a study of chatbot proactivity (Peng et al., 2019), our system offers suggestions to users when they make an explicit request (i.e., by clicking the “Let bot suggest” button; Figure 2) or when the system identifies a good time to offer suggestions. (Based on our pilot-test observations, a good time is when the user has consecutively skipped three songs or listened to five songs.)
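The proactive-suggestion trigger described above (an explicit “Let bot suggest” request, three consecutive skips, or five listened songs) reduces to simple counter logic. The following is an illustrative sketch; the class and event names are ours, not the paper’s actual implementation:

```python
# Illustrative sketch of the mixed-initiative suggestion trigger.
# Thresholds follow the paper's pilot-test footnote; all names are
# hypothetical, not taken from the authors' system.

class SuggestionTrigger:
    SKIP_THRESHOLD = 3    # consecutive "Next" clicks
    LISTEN_THRESHOLD = 5  # songs played since the last suggestion

    def __init__(self):
        self.consecutive_skips = 0
        self.songs_listened = 0

    def record_event(self, event: str) -> bool:
        """Update counters for a user event; return True if the system
        should proactively offer a suggestion now."""
        if event == "request_suggestion":   # "Let bot suggest" button
            self._reset()
            return True
        if event == "skip":                 # "Next" button
            self.consecutive_skips += 1
        elif event == "listen":
            self.consecutive_skips = 0
            self.songs_listened += 1
        if (self.consecutive_skips >= self.SKIP_THRESHOLD
                or self.songs_listened >= self.LISTEN_THRESHOLD):
            self._reset()
            return True
        return False

    def _reset(self):
        self.consecutive_skips = 0
        self.songs_listened = 0
```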

Although conversational systems can employ three types of initiative strategies, namely user-initiative, system-initiative, and mixed-initiative strategies, we did not employ a purely system-initiative strategy in our study because this design relies on a “system asks, user responds” conversation paradigm (Zhang et al., 2018), which can restrict user interaction, reduce flexibility, and make users feel passive (Jannach et al., 2021; Jurafsky and Martin, 2000).

Figure 2 shows the user interfaces of the two conversational music recommenders; the dialogue windows show the conversation between the user and the system. Each recommended song is displayed on a card through which the user can control music playback, along with a set of buttons under the card for giving feedback. Specifically, the user can click the “Like” button to add the current song to their playlist, where they can rate the song, and the “Next” button to skip the current song. In the Mixed-Initiative system, the user can also click the “Let bot suggest” button to trigger a system suggestion based on the currently recommended song. Additionally, the user can send a message in natural language about music genre, audio features, or artists to give feedback on the currently recommended song and accordingly refine the recommendation. We used a popular natural language understanding platform, DialogFlow (https://cloud.google.com/dialogflow/es/docs), and a widely used online music service, the Spotify API (https://developer.spotify.com/documentation/web-api), to develop our conversational music recommenders. To generate the system-initiative suggestions, we employed the progressive system-suggested critiquing technique designed by Cai et al. (Cai et al., 2021), which considers the user’s song preferences as well as incremental feedback captured from past interactions.
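The paper does not specify how a parsed utterance is mapped to recommendation constraints. As an illustration only, a critique such as “I want a higher tempo” (e.g., a DialogFlow intent with an attribute and a direction) might be turned into attribute bounds for the next recommendation query roughly as follows; the function name, field names, and the 10% adjustment step are hypothetical:

```python
# Hypothetical sketch: translating a parsed critique into target-attribute
# constraints relative to the current song's audio features (e.g., tempo
# in BPM, danceability in [0, 1]). The 10% step is an assumption for
# illustration, not the authors' actual refinement rule.

def apply_critique(current_song: dict, attribute: str, direction: str) -> dict:
    """Return attribute bounds for the next recommendation request."""
    value = current_song[attribute]
    step = 0.1 * value  # illustrative 10% adjustment
    if direction == "higher":
        return {f"min_{attribute}": value + step}
    elif direction == "lower":
        return {f"max_{attribute}": value - step}
    raise ValueError(f"unknown direction: {direction}")
```

Constraints of this form could then be passed as tunable-attribute parameters (e.g., min_tempo) to a recommendation backend such as the Spotify Web API.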

3.1.2. User Tasks

To determine whether and how users’ personal characteristics interact with the context-related factor (task complexity) to influence user trust in CRSs, we considered two typical user tasks in the recommendation domain:

  • Simple Task. Users are asked to interact with our conversational music recommender (called “music chatbot” in our study) to find five songs that suit their preferences, and rate each song in terms of its pleasant surprise.

  • Complex Task. Users are asked to complete two steps: (1) use our music chatbot to discover songs from as many different music genres as possible, create a playlist containing 20 songs that fit their tastes, and rate each song in terms of its pleasant surprise; and (2) select their top-5 most preferred songs from the playlist they created. Compared with the simple task, this task requires users to discover more types of music and make comparisons to select their most preferred songs, which is more cognitively demanding.

3.2. Participants

We recruited participants from Prolific (https://www.prolific.co/), a popular platform for academic surveys (Peer et al., 2017). To ensure experiment quality, we pre-screened users on Prolific using the following criteria: (1) participants should be fluent in English; (2) they must have more than 100 previous submissions; (3) their approval rate should be greater than 95%. The experiment took 25 minutes to complete on average. We compensated each participant £2.40 on successful completion of the experiment. The Research Ethics Committee (REC) of the authors’ university approved this study.

In total, 194 users participated in our study. We removed the responses of 23 participants because of their excessively long experiment completion times (outliers), and excluded the responses of another 23 participants who failed the attention-check questions. (To ensure the quality of user responses, we set three attention-check questions, e.g., “Please indicate which of the following items is not a fruit.”) Thus, the remaining responses of 148 participants were included in the analyses [User-Initiative: Simple Task (32), Complex Task (35); Mixed-Initiative: Simple Task (45), Complex Task (36); Gender: female (70), male (75), other (3); Age: 19-25 (69), 26-30 (27), 31-35 (25), 36-40 (10), 41-50 (11), > 50 (6)]. Participants were mainly from the United Kingdom (32), the United States (32), Portugal (18), Poland (12), and Italy (9).

3.3. Experimental Procedure

Participants had to accept a general data protection regulation consent form before they signed into our system using their Spotify accounts. After reading the user study instructions, participants were asked to fill out a pre-study questionnaire, which included demographic questions and questions for measuring their personal characteristics (see Section 3.4). To ensure that participants understood the study task and how to use the conversational recommender, they were given a tutorial of interacting with the assigned conversational music recommender, followed by two minutes to try the system. After completing the tutorial, participants were asked to complete a randomly assigned task (Simple Task or Complex Task as described in Section 3.1.2). After finishing the task, participants were asked to fill out a post-study questionnaire regarding their trust-related perception of the conversational music recommender (see Section 3.5).

3.4. Pre-Study Questionnaire

In the pre-study questionnaire, we used a short personality test, the Ten Item Personality Inventory (TIPI) (Gosling et al., 2003), to assess participants’ Big-Five personality traits: openness to experience, conscientiousness, extroversion, agreeableness, and neuroticism. Each personality trait is assessed by two questions in the TIPI, and the personality value for each trait is the average of the scores on the two questions. To measure participants’ trust propensity, we adopted two statements developed by Lee and Turban (Lee and Turban, 2001): “I tend to trust the recommender, even though having little knowledge of it.” and “Trusting someone or something is difficult.” Because our system was built for the music domain, we used the nine statements from the “Active Musical Engagement” facet of Goldsmiths Musical Sophistication Index (Müllensiefen et al., 2014) to assess participants’ musical sophistication as their domain knowledge. All statements were rated on a 7-point Likert scale from 1 (strongly disagree) to 7 (strongly agree). In Table 1, we briefly introduce each measured personal characteristic.
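For reference, the TIPI’s published scoring key averages each trait’s two items after reverse-coding one of them (on a 7-point scale, reversed = 8 − raw). A minimal sketch, with the extroversion item pairing from the standard TIPI key:

```python
# Minimal sketch of TIPI scoring on a 7-point scale. Each Big-Five trait
# is the average of two items; per the standard TIPI key, one item per
# trait is reverse-scored before averaging.

def reverse(score: int) -> int:
    """Reverse-code a 1-7 rating."""
    return 8 - score

def tipi_trait(item_score: int, reversed_item_score: int) -> float:
    """Average of a trait's standard item and its reverse-scored item."""
    return (item_score + reverse(reversed_item_score)) / 2

# Example: extroversion from item 1 ("extraverted, enthusiastic") and
# reverse-scored item 6 ("reserved, quiet").
extroversion = tipi_trait(item_score=6, reversed_item_score=2)  # -> 6.0
```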

Table 2 shows the descriptive statistics of our participants’ personal characteristics (PCs). The scored values are centered between 3 and 5 for almost all PCs, and the standard deviations are comparable across all PCs. Table 3 shows Pearson’s correlations between these PCs; these correlations (e.g., trust propensity is positively related to extroversion and agreeableness) are generally consistent with previous literature (Greenberg et al., 2015; John et al., 1999; Freitag and Bauer, 2016).

Personal Characteristic (PC) — Description
Big-Five Personality Traits (Gosling et al., 2003; Greenberg et al., 2015)
  Openness to Experience (O): This trait, also called Openness, is related to one’s cognitive style, distinguishing creative, imaginative people (high O) from down-to-earth, conventional people (low O).
  Conscientiousness (C): This trait is associated with one’s way of controlling, regulating, and directing impulses, distinguishing prudent people (high C) from impulsive people (low C).
  Extroversion (E): This trait concerns the level of active engagement with the external world, distinguishing sociable, outgoing people (high E) from reserved, quiet people (low E).
  Agreeableness (A): This trait reflects one’s attitude toward cooperation and social harmony, distinguishing cooperative, sympathetic people (high A) from critical, tough people (low A).
  Neuroticism (N): This trait describes one’s tendency to experience negative feelings, distinguishing sensitive, easily upset people (high N) from calm, unflappable people (low N).
Trust Propensity (TP) (Lee and Turban, 2001): TP reflects one’s general willingness to trust other people or technologies. People with high TP are naturally inclined to trust others, while people with low TP are hesitant.
Musical Sophistication (MS) (Müllensiefen et al., 2014): MS is related to one’s ability to successfully engage with music. People with high MS are more flexible in responding to a great range of musical situations than are people with low MS.
Table 1. Description of Big-Five personality traits, trust propensity, and domain knowledge (musical sophistication)
PC Min Median Mean Max S.D.
O 2.00 5.00 5.01 7.00 1.15
C 2.00 5.25 5.19 7.00 1.19
E 1.00 3.25 3.29 7.00 1.54
A 2.00 5.00 4.94 7.00 1.10
N 1.00 3.50 3.51 6.50 1.53
TP 1.00 4.00 4.05 6.50 0.99
MS 1.44 4.22 4.25 6.89 1.03
Table 2. Descriptive statistics of participants’ personal characteristics (PCs)
PC  O           C           E           A           N          TP        MS
O   -
C   0.2858***   -
E   0.2189**    0.0894      -
A   0.2920***   0.3321***   0.1518      -
N   -0.3112***  -0.3277***  -0.2375**   -0.3954***  -
TP  0.1419      0.1854*     0.2668**    0.2729***   -0.1509    -
MS  0.2875***   0.0702      0.2086*     0.0213      0.0326     0.0916    -
Significance: *** p < .001, ** p < .01, * p < .05.
Table 3. Pearson’s correlations between the Big-Five personality traits, trust propensity, and musical sophistication
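A correlation matrix like Table 3 can be produced with a short script. The sketch below uses simulated participant scores (the study's questionnaire data are not public) and scipy's `pearsonr`, annotating each coefficient with the same significance stars:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 148  # sample size of the study
# Hypothetical scores for three of the personal characteristics.
scores = {
    "O": rng.uniform(2, 7, n),
    "C": rng.uniform(2, 7, n),
    "TP": rng.uniform(1, 6.5, n),
}

names = list(scores)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        r, p = stats.pearsonr(scores[a], scores[b])
        stars = "***" if p < .001 else "**" if p < .01 else "*" if p < .05 else ""
        print(f"{a}-{b}: r = {r:+.4f} {stars}")
```

With real questionnaire data in place of the simulated arrays, this reproduces the lower triangle of Table 3.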

3.5. Trust Measurement

In the post-study questionnaire, we measured users’ trust-related perception of the conversational music recommender in two main dimensions: Competence Perception and User Trust. Competence Perception refers to how users perceive the system’s competence in assisting them in performing tasks, which contains the following three constructs derived from prior works (Chen and Pu, 2006; Knijnenburg et al., 2012; Walker et al., 1997):

  • Perceived Recommendation Quality: This construct measures the system’s ability to provide good recommendations to help users make decisions or support their exploration. Users may judge the quality of recommendations in terms of several aspects, e.g., accuracy, novelty, and serendipity (Chen and Pu, 2006; Knijnenburg et al., 2012). A previous study showed that users’ perceived recommendation quality influences their perceived usefulness of the system in helping them accomplish tasks, which consequently impacts user trust toward the system (Chen and Pu, 2006). Thus, we considered this construct and measured it using questions from ResQue (Chen and Pu, 2006), a widely used user-centric evaluation framework for recommender systems.

  • Perceived Conversational Interaction: This construct measures the system’s ability to effectively communicate with users to perform tasks during the interaction. Several aspects of conversational interaction are deemed crucial to CRSs (Jin et al., 2021), including understandability, perceived control, interaction adequacy (i.e., the ability to elicit and refine preferences (Chen and Pu, 2006)), and naturalness of the dialogue interaction. Because communication is the primary way people develop trust within interpersonal relationships (de Vries et al., 2013), we hypothesize that users’ experience with conversational interaction will also influence the formation of user trust in the system. We measured this construct by adopting questions mainly from an evaluation framework for conversational agents (Walker et al., 1997).

  • Perceived Effort: This construct measures users’ perceived difficulty or ease in using the system for completing their tasks, which can reflect the effectiveness of the system in supporting users to accomplish tasks. When users perceive high effort in using the system to complete tasks, they may feel frustrated and show less trust (Chen and Pu, 2006, 2005). We used questions in ResQue (Chen and Pu, 2006) to measure this construct.

The User Trust dimension directly measures user trust in the CRS based on two constructs, each measured using one question item: Perceived Trust assesses users’ overall feelings of trust toward the conversational recommender, and Intention to Use measures users’ willingness to use the system in the future.

We assessed the validity of our constructs as measured by the question items (19 items in the initial questionnaire) by conducting confirmatory factor analysis (CFA) with the R library lavaan (http://lavaan.ugent.be/). In CFA, the items within the same scale are represented by a latent factor, where the loading of each item denotes how strongly that item is associated with the corresponding factor. We iteratively removed 5 items with low loadings (< 0.50) or high cross-loadings, leaving 14 items in total (Table 4). All items were assessed on a 7-point Likert scale from 1 (strongly disagree) to 7 (strongly agree). Each factor had good internal consistency (Cronbach’s α > 0.80), composite reliability (CR > 0.80), and convergent validity [Average Variance Extracted (AVE) > 0.50] (Ab Hamid et al., 2017), and the loading of each item exceeded the acceptable level of 0.50, with an overall good model fit (Hu and Bentler, 1999): χ²(51) = 86.283, p < .001; Root Mean Square Error of Approximation (RMSEA) = 0.068, Comparative Fit Index (CFI) = 0.967, Tucker-Lewis Index (TLI) = 0.957.
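The reliability and validity metrics above follow standard formulas: Cronbach’s α is computed from raw item responses, while CR and AVE are computed from the standardized loadings. A minimal sketch, using the two Perceived Effort loadings from Table 4 (the α helper assumes access to raw item scores, which are not reported in the paper):

```python
import numpy as np

def cronbach_alpha(items):
    """items: participants-by-items matrix of Likert responses."""
    k = items.shape[1]
    item_var = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the sum score
    return k / (k - 1) * (1 - item_var / total_var)

def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    s = loadings.sum()
    return s ** 2 / (s ** 2 + (1.0 - loadings ** 2).sum())

def average_variance_extracted(loadings):
    """AVE = mean squared standardized loading."""
    return (loadings ** 2).mean()

# Standardized loadings of the two Perceived Effort items (Table 4).
effort = np.array([0.8675, 0.8927])
print(f"CR  = {composite_reliability(effort):.4f}")       # ~0.873, matching Table 4
print(f"AVE = {average_variance_extracted(effort):.4f}")  # ~0.775 (Table 4: 0.7729; loadings are rounded)
```

The small AVE discrepancy comes from the loadings being reported to four decimals.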

Construct Item (each statement is rated on a 7-point Likert scale) Loadings
Competence Perception
Perceived Recommendation Quality (Cronbach alpha: 0.9001; CR: 0.8951; AVE: 0.6647)
The music chatbot helped me discover new songs. 0.7940
The songs recommended to me were novel. 0.5378
The music chatbot provided me with recommendations that I had not considered in the first place but turned out to be a positive and surprising discovery. 0.8457
The music chatbot provided me with surprising recommendations that helped me discover new songs that I wouldn’t have found elsewhere. 0.9226
The music chatbot provided me with recommendations that were a pleasant surprise to me because I would not have discovered them somewhere else. 0.8728
Perceived Conversational Interaction (Cronbach alpha: 0.8668; CR: 0.8692; AVE: 0.5756)
I found the music chatbot easy to understand in this conversation. 0.7590
The music chatbot worked the way I expected it to in this conversation. 0.7950
I found it easy to inform the music chatbot if I dislike/like the recommended song. 0.6967
I felt in control of modifying my taste using this music chatbot. 0.7995
In this conversation, I knew what I could say or do at each point of the dialog. 0.7236
Perceived Effort (Cronbach alpha: 0.8712; CR: 0.8730; AVE: 0.7729)
Looking for a song using this interface required too much effort. 0.8675
I easily found the songs I was looking for. (reversed) 0.8927
User Trust
Perceived Trust This music chatbot can be trusted.
Intention to Use I will use this music chatbot again.
Table 4. Post-study questionnaire for measuring users’ trust-related perception of the conversational recommender

4. Analyses & Results

The three-layered trust model (Hoff and Bashir, 2015) indicates three types of factors that may influence user trust: user-related, system-related, and context-related factors. We conducted a series of analyses to investigate the influences of these factors on users’ trust-related perception of CRSs. First, we examined the relationship between Competence Perception and User Trust, and the impacts of user-related factors (i.e., the three personal characteristics) on these two dimensions (RQ1). For this purpose, we used structural equation modeling (SEM) to build a path model to test and evaluate multivariate causal relationships among the constructs in Table 4 and the effects of personal characteristics in an integrative structure.

Next, we investigated in depth the impacts of personal characteristics to determine whether and how user-related factors interact with the system-related factor (initiative strategy) and the context-related factor (task complexity) to influence Competence Perception and User Trust (RQ2 & RQ3). As it is relatively complicated to perform interaction-effect analyses with multiple factors using SEM (Henseler and Chin, 2010), we conducted an additional set of linear regression analyses to investigate the interaction effects.
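The regression models in Table 5 include interaction terms between each personal characteristic and the dummy-coded condition factors. A minimal numpy sketch with one trait and one factor (simulated data; the actual models include all seven characteristics and both factors) shows how such an interaction coefficient is estimated via ordinary least squares:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 148
consc = rng.uniform(2, 7, n)                 # conscientiousness score (illustrative)
mixed = (rng.random(n) < 0.5).astype(float)  # 1 = Mixed-Initiative, 0 = User-Initiative
# Simulated outcome with a built-in interaction effect, so the fit should recover it.
trust = 4.0 + 0.1 * consc + 0.4 * mixed * consc + rng.normal(0.0, 0.5, n)

# Design matrix: intercept, main effects, and the interaction term.
X = np.column_stack([np.ones(n), consc, mixed, mixed * consc])
beta, *_ = np.linalg.lstsq(X, trust, rcond=None)
for name, b in zip(["const", "C", "mixed", "mixed x C"], beta):
    print(f"{name:10s} {b:+.3f}")
```

The "mixed x C" coefficient corresponds to the "Mixed Initiative x Conscientiousness" rows in Table 5: a significant positive value means conscientiousness matters more under the Mixed-Initiative condition.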

Figure 3. Structural equation modeling (SEM) results. Two personality traits (conscientiousness and extroversion) influenced User Trust via Competence Perception, and trust propensity and musical sophistication directly affected User Trust. The numbers on the arrows represent the coefficient and standard error (in parentheses) of the effect. Significance: *** p < .001, ** p < .01, * p < .05. R² is the proportion of variance explained by the model. Factors are scaled to have a standard deviation of 1.


Perceived Recommendation Quality / Perceived Conversational Interaction / Perceived Effort / Perceived Trust / Intention to Use
Each cell reports Coef. (S.E.) for the corresponding dependent variable.
Mixed Initiative vs. User Initiative 0.408 (0.229) . -0.064 (0.147) -0.183 (0.235) 0.054 (0.190) -0.003 (0.259)
Complex Task vs. Simple Task 0.222 (0.228) -0.290 (0.146) * 0.391 (0.234) . -0.055 (0.189) -0.408 (0.258)
Openness 0.086 (0.199) -0.053 (0.128) -0.073 (0.205) 0.076 (0.165) -0.143 (0.225)
Conscientiousness -0.038 (0.182) 0.133 (0.117) 0.051 (0.187) -0.055 (0.151) 0.054 (0.207)
Extroversion -0.099 (0.149) -0.092 (0.095) 0.050 (0.153) 0.165 (0.123) -0.086 (0.168)
Agreeableness 0.131 (0.199) -0.151 (0.128) 0.078 (0.205) -0.088 (0.165) 0.162 (0.226)
Neuroticism 0.141 (0.153) -0.069 (0.098) -0.220 (0.157) 0.136 (0.127) -0.004 (0.173)
Trust Propensity 0.189 (0.248) 0.076 (0.159) -0.122 (0.255) 0.029 (0.206) 0.212 (0.281)
Musical Sophistication 0.782 (0.221) *** 0.189 (0.142) -0.399 (0.227) . 0.280 (0.183) 0.620 (0.250) *
Mixed Initiative x Openness -0.013 (0.230) 0.241 (0.148) -0.218 (0.237) 0.066 (0.191) 0.128 (0.261)
Mixed Initiative x Conscientiousness 0.652 (0.208) ** 0.284 (0.134) * -0.388 (0.214) . 0.372 (0.173) * 0.405 (0.236) .
Mixed Initiative x Extroversion 0.067 (0.173) -0.064 (0.111) -0.061 (0.178) -0.239 (0.143) . 0.130 (0.196)
Mixed Initiative x Agreeableness -0.327 (0.239) 0.167 (0.154) -0.196 (0.246) -0.016 (0.198) 0.047 (0.271)
Mixed Initiative x Neuroticism -0.049 (0.175) 0.175 (0.113) 0.002 (0.180) -0.087 (0.145) 0.179 (0.199)
Mixed Initiative x Trust Propensity -0.224 (0.261) -0.074 (0.167) 0.199 (0.268) -0.040 (0.216) -0.034 (0.296)
Mixed Initiative x Musical Sophistication -0.496 (0.236) * -0.143 (0.151) 0.377 (0.243) 0.069 (0.196) 0.010 (0.267)
Complex Task x Openness -0.169 (0.221) -0.104 (0.142) 0.324 (0.227) -0.055 (0.183) -0.002 (0.250)
Complex Task x Conscientiousness -0.526 (0.211) * -0.210 (0.136) 0.184 (0.217) 0.005 (0.175) -0.364 (0.239)
Complex Task x Extroversion 0.205 (0.165) 0.041 (0.106) 0.050 (0.170) -0.076 (0.137) 0.047 (0.187)
Complex Task x Agreeableness 0.354 (0.240) 0.151 (0.154) -0.206 (0.247) 0.304 (0.199) 0.009 (0.272)
Complex Task x Neuroticism -0.048 (0.167) -0.008 (0.107) 0.140 (0.172) 0.042 (0.138) -0.020 (0.189)
Complex Task x Trust Propensity 0.337 (0.260) 0.326 (0.167) . -0.545 (0.267) * 0.220 (0.215) 0.596 (0.294) *
Complex Task x Musical Sophistication -0.524 (0.242) * 0.052 (0.155) -0.064 (0.249) -0.190 (0.200) -0.417 (0.274)
Constant 3.948 (0.207) *** 5.920 (0.133) *** 2.840 (0.213) *** 5.421 (0.172) *** 5.054 (0.235) ***
R² 0.314 0.314 0.243 0.249 0.301
Adjusted R² 0.186 0.187 0.102 0.110 0.171
Given that interaction effects are present in our regression models, we only interpret the interaction effects, because the interpretation of the main effects (i.e., the effect of one independent variable on the dependent variable) is incomplete or misleading when interactions are present (Kutner et al., 2005).
Significance: *** p < .001, ** p < .01, * p < .05, . p < .1; Coef. stands for coefficient; S.E. stands for standard error.
Table 5. Regression models for estimating the interaction effects of personal characteristics with initiative strategy and task complexity on users’ trust-related perception constructs (as shown in Table 4) in the conversational recommender

4.1. User Trust in Conversational Recommender Systems

Figure 3 illustrates the results of the structural equation modeling (SEM) analysis, showing all significant paths in our model. The SEM model had overall good fit indices: χ²(123) = 182.312, p < .001; RMSEA = 0.057, CFI = 0.956, TLI = 0.947, which are close to the recommended SEM fit standards (Hu and Bentler (1999) suggest CFI > .96, TLI > .95, and RMSEA < .05).

In the resulting model, the paths between the perception constructs (inside black rectangles) show how users’ perceptions of the system’s competence influenced their trust in the CRS. Specifically, the significant paths (Perceived Recommendation Quality → Perceived Trust and Intention to Use; Perceived Conversational Interaction → Perceived Trust and Intention to Use) confirm the positive effects of users’ competence perception of the CRS on their trust in the CRS. Furthermore, the path coefficients indicate that Perceived Trust was affected more by Perceived Conversational Interaction (coefficient = 0.695) than by Perceived Recommendation Quality (coefficient = 0.235). Our model also verifies the positive effect of Perceived Trust on Intention to Use (Pu et al., 2011). Additionally, we observed an interesting path (Perceived Conversational Interaction → Perceived Effort → Perceived Recommendation Quality), showing that users’ perceptions of the conversational interaction positively influenced their perceptions of the recommendation quality, mediated by their perceived effort. These effects highlight the importance of considering Perceived Conversational Interaction for inspiring user trust in CRSs.
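The mediated path (Perceived Conversational Interaction via Perceived Effort to Perceived Recommendation Quality) can be illustrated with a simple product-of-coefficients sketch on simulated data; this approximates the mediation with two ordinary regressions rather than the full SEM:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 148
conv = rng.normal(0.0, 1.0, n)                     # Perceived Conversational Interaction
effort = -0.5 * conv + rng.normal(0.0, 1.0, n)     # better interaction -> less perceived effort
quality = -0.4 * effort + rng.normal(0.0, 1.0, n)  # less effort -> higher perceived quality

# a-path: effort regressed on conversational interaction.
a = np.linalg.lstsq(np.column_stack([np.ones(n), conv]), effort, rcond=None)[0][1]
# b-path: quality regressed on effort, controlling for conversational interaction.
b = np.linalg.lstsq(np.column_stack([np.ones(n), conv, effort]), quality, rcond=None)[0][2]

indirect = a * b  # product-of-coefficients estimate of the mediated effect
print(f"a = {a:+.3f}, b = {b:+.3f}, indirect = {indirect:+.3f}")
```

Two negative paths multiply into a positive indirect effect, which is the sign pattern the SEM model reports for this chain.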

Moreover, our SEM model shows how personal characteristics influence the constructs of Competence Perception and User Trust. The results indicate that two personality traits (conscientiousness and extroversion) influenced User Trust via Competence Perception, whereas trust propensity and domain knowledge (musical sophistication) directly affected User Trust in the CRS.

  • Conscientiousness. The trait conscientiousness positively influenced users’ perceptions of conversational interaction: users with higher conscientiousness tended to have a better perception of their interaction with the conversational recommender.

  • Extroversion. The trait extroversion was positively related to users’ perceived recommendation quality. Users with higher extroversion tended to perceive higher system competence in recommending satisfying songs. One possible explanation is that compared with introverted users, extroverted users (who are more outgoing and vigorous (John et al., 1999)) are more willing to take risks and try listening to different music during the interaction, hence improving their perceptions of recommendations.

  • Trust Propensity. Trust propensity positively affected users’ perceptions of the conversational interaction and their intention to use. Namely, users who are more willing to trust others tended to enjoy the conversational interaction with the CRS and have a higher intention to use it again. People with a higher trust propensity (who tend to believe others are sincere and have good intentions (Colquitt et al., 2007)) may be more cooperative (Jacquet et al., 2019) with the system during the conversation, resulting in a more positive conversational experience.

  • Musical Sophistication. Regarding the influence of domain knowledge, we found that musical sophistication positively influenced users’ intention to use the CRS, suggesting that users with higher musical sophistication are more likely to use the conversational recommender in the future.

In addition to the user-related factors (personal characteristics), we investigated whether the system-related factor (initiative strategy) and the context-related factor (task complexity) directly influenced user trust in the model. Among these factors, task complexity negatively affected users’ perceived conversational interaction (p < .05), which may be attributed to the increased user effort required to perform a complex task.

4.2. Interaction Effects on User Trust

(a) Interaction effect of conscientiousness and initiative strategy on Perceived Recommendation Quality.
(b) Interaction effect of conscientiousness and initiative strategy on Perceived Conversational Interaction.
(c) Interaction effect of conscientiousness and initiative strategy on Perceived Trust.
(d) Interaction effect of musical sophistication and initiative strategy on Perceived Recommendation Quality.
Figure 4. Interaction effects between personal characteristics and initiative strategy on users’ trust-related perception constructs. (a-c) Conscientiousness (C): Users with higher C tended to have a better perception and showed more trust in the Mixed-Initiative system. (d) Musical Sophistication (MS): Users with higher MS tended to perceive higher recommendation quality from the User-Initiative system.
(a) Interaction effect of conscientiousness and task complexity on Perceived Recommendation Quality.
(b) Interaction effect of trust propensity and task complexity on Perceived Effort.
(c) Interaction effect of trust propensity and task complexity on Intention to Use.
(d) Interaction effect of musical sophistication and task complexity on Perceived Recommendation Quality.
Figure 5. Interaction effects between personal characteristics and task complexity on users’ trust-related perception constructs. (a) Conscientiousness (C): C showed a positive effect on the users’ perceived recommendation quality for the Simple Task. (b-c) Trust Propensity (TP): The effects of TP on users’ trust-related perception were stronger for the Complex Task. (d) Musical Sophistication (MS): Users with higher MS tended to have a better perception of recommendations for the Simple Task.

Inspired by previous studies (Knijnenburg et al., 2011; Myers et al., 2019), individual users may have different perceptions of the two conversational recommenders (User-Initiative and Mixed-Initiative systems) and may show different attitudes when performing the two user tasks (Simple Task and Complex Task), which may influence their formation of trust in the CRS. Therefore, we investigated how the user-related factors (personal characteristics) interact with the system-related factor (initiative strategy) and the context-related factor (task complexity) to influence user trust in the CRS. Specifically, we used linear regression models to handle the mix of numerical and categorical independent variables, with personal characteristics, initiative strategy, and task complexity as the independent variables and the five trust-related perception constructs (Table 4) as the dependent variables. Table 5 presents the results of the regression models, showing how users’ trust-related perception is influenced by personal characteristics, initiative strategy, and task complexity, and revealing their interaction effects (represented by interaction terms in the model). We report coefficients, standard errors, p-values, and adjusted R² values.

4.2.1. Interaction Effects between Personal Characteristics and Initiative Strategy, Task Complexity

We detected a significant three-way interaction effect between the trait agreeableness, initiative strategy, and task complexity on users’ perceived conversational interaction. Specifically, when using the Mixed-Initiative system to accomplish the Complex Task, users’ agreeableness positively affected their perceptions of the conversational interaction (ρ = 0.40, p < .05, 95% confidence interval [CI]: [0.08, 0.64]).

We conducted Spearman’s correlation analyses after detecting interaction effects to clearly show the relationship between a personal characteristic and a user perception construct in a particular condition; we followed this procedure for all the detected interaction effects. In other words, system-initiative suggestions help users explore music, and users with higher agreeableness are likely to have a better experience with such conversational interaction. However, no significant correlations were detected in the other three experimental conditions.
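This follow-up procedure can be sketched as below: split the data by experimental condition and compute Spearman’s ρ between the personal characteristic and the perception construct within each subset (data here are simulated; the condition labels are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 148
agreeableness = rng.uniform(2, 7, n)
perception = rng.uniform(1, 7, n)  # e.g., perceived conversational interaction
condition = rng.choice(
    ["mixed/complex", "mixed/simple", "user/complex", "user/simple"], n
)

# Per-condition Spearman correlation, as in the reported follow-up analyses.
for cond in np.unique(condition):
    mask = condition == cond
    rho, p = stats.spearmanr(agreeableness[mask], perception[mask])
    print(f"{cond:14s} rho = {rho:+.2f}, p = {p:.3f}")
```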

4.2.2. Interaction Effects between Personal Characteristics and Initiative Strategy

Table 5 shows significant interaction effects between initiative strategy and the two personal characteristics, conscientiousness and musical sophistication:

  • Conscientiousness. The models in Table 5 show significant interaction effects between the trait conscientiousness and initiative strategy on several trust-related perception constructs, including perceived recommendation quality, perceived conversational interaction, and perceived trust. Figures 4(a), 4(b) and 4(c) visualize these interaction effects. In the Mixed-Initiative system, users’ conscientiousness levels positively influenced their perceived recommendation quality (ρ = 0.36, p < .001, 95% CI: [0.15, 0.53]), perceived conversational interaction (ρ = 0.41, p < .001, 95% CI: [0.21, 0.57]), and perceived trust (ρ = 0.39, p < .001, 95% CI: [0.18, 0.56]). In contrast, in the User-Initiative system, the trait conscientiousness was not correlated with users’ trust-related perception. Conscientious users may be more cautious and consider more choices when making a decision (John et al., 1999), so they may be more inclined to appreciate the suggestions offered by the Mixed-Initiative system, which can guide them to discover more music when finding songs of interest.

  • Musical Sophistication. As for domain knowledge, we detected an interaction effect between musical sophistication and initiative strategy on users’ perceived recommendation quality. As illustrated in Figure 4(d), users with higher musical sophistication tended to have a better perception of recommendations in the User-Initiative system (ρ = 0.21, p < .1, 95% CI: [-0.03, 0.43]), whereas in the Mixed-Initiative system, the level of musical sophistication did not have a significant influence. We also observed that users with lower musical sophistication tended to perceive higher recommendation quality in the Mixed-Initiative system than in the User-Initiative system, implying that the system’s suggestions are more helpful for domain novices.

Personal Characteristic | Direct Effect | Interaction Effect with Initiative Strategy | Interaction Effect with Task Complexity
Big-Five Personality Traits:
Conscientiousness | (+): Perceived Conversational Interaction | (+) in Mixed-Initiative: Perceived Recommendation Quality; Perceived Conversational Interaction; Perceived Trust | (+) in Simple Task: Perceived Recommendation Quality
Extroversion | (+): Perceived Recommendation Quality | |
Agreeableness | | (+) in Mixed-Initiative & Complex Task: Perceived Conversational Interaction (three-way interaction) |
Trust Propensity | (+): Perceived Conversational Interaction; Intention to Use | | (-) in Complex Task: Perceived Effort; (+) in Complex Task > Simple Task: Intention to Use
Musical Sophistication | (+): Intention to Use | (+) in User-Initiative: Perceived Recommendation Quality | (+) in Simple Task: Perceived Recommendation Quality
Table 6. Summary of the major findings. The positive sign (+) and the negative sign (-) indicate significant positive effects and negative effects, respectively

4.2.3. Interaction Effects between Personal Characteristics and Task Complexity

From Table 5, significant interaction effects were detected between task complexity and three personal characteristics, conscientiousness, trust propensity, and musical sophistication:

  • Conscientiousness. We found a significant interaction effect between the trait conscientiousness and task complexity on users’ perceived recommendation quality. As visualized in Figure 5(a), the positive effect of conscientiousness on perceived recommendation quality was observed when users performed the Simple Task (ρ = 0.32, p < .01, 95% CI: [0.10, 0.51]), but no relationship was found for the Complex Task. Together with the results in Figure 4(a), a crossover interaction effect was observed between conscientiousness and initiative strategy, suggesting that when users perform the Complex Task, their conscientiousness levels may differently influence their perceived recommendation quality, depending on the system’s initiative strategy (user-initiative or mixed-initiative).

  • Trust Propensity. Task complexity influenced the effects of trust propensity on users’ perceived effort and intention to use the conversational recommender. Specifically, users with higher trust propensity tended to feel less effort using the conversational recommender to perform the Complex Task (ρ = -0.34, p < .01, 95% CI: [-0.53, -0.12]), but the correlation was not obvious for the Simple Task [see Figure 5(b)], probably due to the intrinsically lower user effort required for the Simple Task. Moreover, trust propensity positively influenced users’ intention to use the conversational recommender (also shown in Figure 3), and Figure 5(c) shows that the positive effect was stronger when users performed the Complex Task (ρ = 0.32, p < .01, 95% CI: [0.09, 0.52]) than the Simple Task (ρ = 0.27, p < .05, 95% CI: [0.04, 0.46]).

  • Musical Sophistication. A significant interaction effect was detected between musical sophistication and task complexity on users’ perceived recommendation quality. As shown in Figure 5(d), when performing the Simple Task, users with higher musical sophistication tended to have a more positive perception of recommendations than users with lower musical sophistication (ρ = 0.34, p < .01, 95% CI: [0.12, 0.52]), which could be due to musically sophisticated users’ greater skill in tuning recommendations to find songs that suit their tastes.

Table 6 summarizes the effects of the three personal characteristics on user trust toward the conversational music recommenders and their interaction effects with the initiative strategy (User-Initiative and Mixed-Initiative) and with the task complexity (Simple Task and Complex Task). Overall, trust propensity and musical sophistication directly influenced users’ intention to use, and conscientiousness interacted with the initiative strategy to influence users’ perceived trust in the CRS.

5. Discussion and Design Implications

In this research, we have sought to better understand user trust in conversational recommender systems (CRSs). By examining the relationships between users’ perceptions of system competence (especially recommendation quality and conversational interaction) and their trust, we found that users’ experience with conversational interaction was particularly important for inspiring user trust toward the conversational recommender (high coefficients for the significant paths, as shown in Figure 3). Guided by the three-layered trust model (Hoff and Bashir, 2015), we investigated the influences of three types of factors (user-related, system-related, and context-related) on user trust in CRSs, highlighting the impacts of user-related factors (users’ Big-Five personality traits, trust propensity, and domain knowledge). This section discusses the key findings of our study and their implications for designing trustworthy CRSs.

5.1. Key Findings

Key Finding #1: Users with higher conscientiousness have a better perception of system competence and show more trust toward the Mixed-Initiative system. Our results demonstrate that users with a higher level of conscientiousness have more positive perceptions in terms of both recommendations and conversational interaction with the Mixed-Initiative system, engendering higher trust in the CRS [see Figures 4(a), 4(b) and 4(c)]. This finding is in line with previous studies showing that more conscientious people have higher trust in automation when conducting decision-making tasks (Chien et al., 2016; Cho et al., 2016). Highly conscientious users tend to be cautious and responsible (John et al., 1999), and may have maximizing tendencies (i.e., the tendency to explore and compare alternatives and look for the best option) (Miceli et al., 2018), which may result in more appreciation for suggestions from the system that help them become informed enough to make a confident decision. This finding also suggests that individual differences in users’ decision-making style, i.e., maximizing (examining more alternatives to select the best option) and satisficing (settling for a good-enough option) (Schwartz et al., 2002; Jugovac et al., 2018), may influence user trust in CRSs, which can be investigated in future research.

Design Implications: Trustworthy CRS design should consider users’ personality traits, especially conscientiousness. For users with higher conscientiousness who like to carefully consider all facets before making a choice, the Mixed-Initiative system that supports both user-initiative and system-initiative interactions is more desirable. System-initiated guidance may support conscientious users in seeking alternatives and finding the “perfect” items from recommendations, hence fostering user trust toward the system. However, for users with lower conscientiousness, the level of system-initiative can be relatively lower because those users tend to be casual and impulsive and might not appreciate extensive guidance from the system.

Key Finding #2: Users’ trust propensity positively influences user trust in conversational recommenders, but the degree of influence is affected by the task complexity. Our results show positive effects of trust propensity on users’ perceptions of the conversational interaction and their intention to use, which is consistent with previous reports of the positive effect of one’s general tendency to trust others or technology on trust in recommender systems (Chen and Pu, 2005; Wang and Benbasat, 2007). Moreover, the complexity of the performed task tends to strengthen this effect [see Figures 5(b) and 5(c)], suggesting a stronger influence of trust propensity when users perform the Complex Task. We found that users with higher trust propensity perceived much less effort and higher intention to use the system than users with lower trust propensity, but this trend was more pronounced for the Complex Task than the Simple Task. We argue that, when performing a complex task, users with higher trust propensity are more likely to take advantage of an effective conversational interaction to indicate what they like or dislike and obtain system guidance when they get stuck on a task. However, as shown in our model (Figure 3), users with lower trust propensity benefit less from conversational interaction, which has a strong influence on user trust (in terms of both perceived conversational interaction and intention to use).

Design Implications: CRS researchers have attempted to improve recommendation quality and conversation interaction to build user trust in the system. However, previous studies have not adapted the design of trustworthy CRSs to users’ trust propensity. The “one size fits all” approach can be flawed because it assumes all users have the same trust propensity level. Thus, future design of CRSs could also consider users’ general tendency to trust technology. For example, the system may help users with lower trust propensity understand more about the system’s ability and guide them to accomplish simple tasks in the initial period, which would improve their initial trust in the system’s competence.

Key Finding #3: Users with stronger domain knowledge have a higher intention to use conversational recommenders and prefer to explore recommendations by themselves. Our results indicate that users with more domain knowledge (i.e., higher musical sophistication in our case) have a higher intention to use the CRS. Furthermore, users with a higher level of domain knowledge benefit more from the conversational interaction with recommendations, because they possess a greater ability to articulate their preferences than do domain novices (Jin et al., 2018). In addition, system-initiative suggestions are more helpful for users with less domain knowledge when looking for recommendations. In contrast, domain-knowledgeable users tended to have a better perception in finding recommendations by themselves, probably because this type of user desires more control over their decisions (Knijnenburg et al., 2011).

Design Implications: This finding suggests that users’ domain knowledge level should be taken into account in the design of CRSs, because it influences their intention to use the system as well as their preferred initiative strategies. For example, the Mixed-Initiative system is more beneficial for novice users, as they may need more suggestions from the system to find recommendations that fit their interests. In contrast, the User-Initiative system might be sufficient for domain experts, because they often expect higher control over the interaction with the system and prefer to be interrupted less by system-initiated guidance.

5.2. Limitations

Before concluding this paper, we highlight some limitations of our research. First, the factors that influence user trust in conversational systems are not limited to Competence Perception, which was the only dimension investigated in our study. Anthropomorphism (Seeger and Heinzl, 2018) as well as security and privacy (Følstad et al., 2018) are additional relevant dimensions of user trust. However, these dimensions are frequently discussed in the context of user trust in customer service chatbots and are influenced by additional personal characteristics, such as affective states (Airenti, 2018) and privacy concerns (Saglam et al., 2021). To avoid added complexity, our trust model mainly considers the Competence Perception dimension of CRSs, namely perceived recommendation quality, perceived effort, and perceived conversational interaction. Second, recommender systems are applied in various domains, including media, e-commerce, and healthcare, but we conducted our study with a CRS designed only for music recommendations, which may limit the generalizability of our findings to other domains. In light of differences in user involvement levels (Chen and Pu, 2012), user trust is more crucial in certain domains, such as e-commerce and healthcare. Future work will validate our findings in different CRS application domains. Third, we only considered a text-based CRS in this investigation, and the results may differ when users interact with a voice-based CRS. In future work, we plan to investigate whether our results also apply to voice-based CRSs.

6. Conclusions

This study investigated the effects of three types of factors (user-related, system-related, and context-related) on user trust, grounded in Hoff and Bashir’s three-layered trust model (Hoff and Bashir, 2015). Our study demonstrated the main effects of user-related factors (personal characteristics) and their interaction effects with the system-related factor (initiative strategy) and the context-related factor (task complexity) on user trust in conversational recommender systems (CRSs). Our findings indicate that trust propensity and domain knowledge directly influence user trust. Moreover, personal characteristics such as conscientiousness and domain knowledge can influence user trust in CRSs with different initiative strategies (user-initiative and mixed-initiative).

Prior work on user trust toward traditional recommender systems (Chen and Pu, 2005; Berkovsky et al., 2017) has highlighted the significance of measuring competence perception based on recommendation quality, whereas we emphasize the importance of gauging perceived conversational interaction because it has a stronger influence on user trust in CRSs. As the initiative strategy shapes the way users interact with the CRS, we also highlight the interaction effects of personal characteristics and initiative strategy on user trust. Our findings contribute to the Human-AI interaction research community (Amershi et al., 2019) and will be of interest to researchers who investigate the role of personalization in building user trust in conversational AI systems and the impacts of personal characteristics on the development of trustworthy AI systems such as CRSs.

Acknowledgements.
The work was supported by the Hong Kong Research Grants Council (RGC/HKBU12201620) and the Hong Kong Baptist University IRCMS Project (IRCMS/19-20/D05). We also thank all participants for taking part in our experiment and the reviewers for their constructive comments on our paper.

References

  • M. Ab Hamid, W. Sami, and M. M. Sidek (2017) Discriminant validity assessment: use of Fornell & Larcker criterion versus HTMT criterion. In Journal of Physics: Conference Series, Vol. 890, pp. 012163. Cited by: §3.5.
  • G. Airenti (2018) The development of anthropomorphism in interaction: intersubjectivity, imagination, and theory of mind. Frontiers in psychology 9, pp. 2136. Cited by: §5.2.
  • J. E. Allen, C. I. Guinn, and E. Horvitz (1999) Mixed-initiative interaction. IEEE Intelligent Systems and their Applications 14 (5), pp. 14–23. Cited by: §1, §2.1.
  • S. Amershi, D. Weld, M. Vorvoreanu, A. Fourney, B. Nushi, P. Collisson, J. Suh, S. Iqbal, P. N. Bennett, K. Inkpen, J. Teevan, R. Kikin-Gil, and E. Horvitz (2019) Guidelines for human-ai interaction. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI ’19, pp. 1–13. Cited by: §6.
  • I. Benbasat and W. Wang (2005) Trust in and adoption of online recommendation agents. Journal of the association for information systems 6 (3), pp. 4. Cited by: §1.
  • S. Berkovsky, R. Taib, and D. Conway (2017) How to recommend? user trust factors in movie recommender systems. In Proceedings of the 22nd International Conference on Intelligent User Interfaces, IUI ’17, pp. 287–300. Cited by: §1, §6.
  • M. K. Buckland and D. Florian (1991) Expertise, task complexity, and the role of intelligent information systems. Journal of the American Society for Information Science 42 (9), pp. 635–643. Cited by: §1.
  • A. Bussone, S. Stumpf, and D. O’Sullivan (2015) The role of explanations on trust and reliance in clinical decision support systems. In 2015 International Conference on Healthcare Informatics, pp. 160–169. Cited by: §2.3.
  • K. Byström and K. Järvelin (1995) Task complexity affects information seeking and use. Information Processing & Management 31 (2), pp. 191–213. Cited by: §1.
  • W. Cai and L. Chen (2020) Predicting user intents and satisfaction with dialogue-based conversational recommendations. In Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization, UMAP ’20, pp. 33–42. Cited by: §1, §2.1.
  • W. Cai, Y. Jin, and L. Chen (2021) Critiquing for music exploration in conversational recommender systems. In Proceedings of the 26th ACM Conference on Intelligent User Interfaces, IUI ’21, pp. 480–490. Cited by: §1, §2.1, §2.1, §3.1.1, §3.1.1, §3.1.
  • L. Chen and P. Pu (2005) Trust building in recommender agents. In Proceedings of the Workshop on Web Personalization, Recommender Systems and Intelligent User Interfaces at the 2nd International Conference on E-Business and Telecommunication Networks, pp. 135–145. Cited by: §1, §1, §2.2, 3rd item, §5.1, §6.
  • L. Chen and P. Pu (2006) Evaluating critiquing-based recommender agents. In Proceedings of the 21st National Conference on Artificial Intelligence - Volume 1, AAAI ’06, pp. 157–162. Cited by: 1st item, 2nd item, 3rd item, §3.5.
  • L. Chen and P. Pu (2012) Critiquing-based recommenders: survey and emerging trends. User Modeling and User-Adapted Interaction 22 (1-2), pp. 125–150. Cited by: §5.2.
  • S. Chien, K. Sycara, J. Liu, and A. Kumru (2016) Relation between trust attitudes toward automation, hofstede’s cultural dimensions, and big five personality traits. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Vol. 60, pp. 841–845. Cited by: §5.1.
  • J. Cho, H. Cam, and A. Oltramari (2016) Effect of personality traits on trust and risk to phishing vulnerability: modeling and analysis. In 2016 IEEE International Multi-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support (CogSIMA), pp. 7–13. Cited by: §1, §2.3, §2.3, §5.1.
  • K. Christakopoulou, F. Radlinski, and K. Hofmann (2016) Towards conversational recommender systems. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, pp. 815–824. Cited by: §1, §2.1.
  • J. A. Colquitt, B. A. Scott, and J. A. LePine (2007) Trust, trustworthiness, and trust propensity: a meta-analytic test of their unique relationships with risk taking and job performance.. Journal of applied psychology 92 (4), pp. 909. Cited by: §2.3, 3rd item.
  • R. E. de Vries, A. Bakker-Pieper, F. E. Konings, and B. Schouten (2013) The communication styles inventory (CSI): a six-dimensional behavioral model of communication styles and its relation with personality. Communication Research 40 (4), pp. 506–532. Cited by: 2nd item.
  • A. Følstad, C. B. Nordheim, and C. A. Bjørkli (2018) What makes users trust a chatbot for customer service? an exploratory interview study. In International Conference on Internet Science, pp. 194–208. Cited by: §5.2.
  • M. Freitag and P. C. Bauer (2016) Personality traits and the propensity to trust friends and strangers. The Social Science Journal 53 (4), pp. 467–476. Cited by: §1, §2.2, §2.3, §3.4.
  • C. Gao, W. Lei, X. He, M. de Rijke, and T. Chua (2021) Advances and challenges in conversational recommender systems: a survey. AI Open 2, pp. 100–126. Cited by: §2.1.
  • S. D. Gosling, P. J. Rentfrow, and W. B. Swann Jr (2003) A very brief measure of the big-five personality domains. Journal of Research in Personality 37 (6), pp. 504–528. Cited by: §3.4, Table 1.
  • D. M. Greenberg, D. Müllensiefen, M. E. Lamb, and P. J. Rentfrow (2015) Personality predicts musical sophistication. Journal of Research in Personality 58, pp. 154–158. Cited by: §3.4, Table 1.
  • J. Henseler and W. W. Chin (2010) A comparison of approaches for the analysis of interaction effects between latent variables using partial least squares path modeling. Structural Equation Modeling 17 (1), pp. 82–109. Cited by: §4.
  • K. A. Hoff and M. Bashir (2015) Trust in automation: integrating empirical evidence on factors that influence trust. Human factors 57 (3), pp. 407–434. Cited by: §1, §1, §2.2, §2.3, §3.1, §4, §5, §6.
  • L. Hu and P. M. Bentler (1999) Cutoff criteria for fit indexes in covariance structure analysis: conventional criteria versus new alternatives. Structural Equation Modeling: A Multidisciplinary Journal 6 (1), pp. 1–55. Cited by: §3.5, footnote 8.
  • B. Jacquet, A. Hullin, J. Baratgin, and F. Jamet (2019) The impact of the gricean maxims of quality, quantity and manner in chatbots. In 2019 International Conference on Information and Digital Technologies (IDT), pp. 180–189. Cited by: 3rd item.
  • D. Jannach, A. Manzoor, W. Cai, and L. Chen (2021) A survey on conversational recommender systems. ACM Computing Surveys (CSUR) 54 (5), pp. 1–36. Cited by: §1, §1, §2.1, §2.1, §3.1.1.
  • Y. Jin, W. Cai, L. Chen, N. N. Htun, and K. Verbert (2019) MusicBot: evaluating critiquing-based music recommenders with conversational interaction. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, CIKM ’19, pp. 951–960. Cited by: §1, §2.1, §2.1.
  • Y. Jin, L. Chen, W. Cai, and P. Pu (2021) Key qualities of conversational recommender systems: from users’ perspective. In Proceedings of the 9th International Conference on Human-Agent Interaction, HAI ’21, pp. 93–102. Cited by: 2nd item.
  • Y. Jin, N. Tintarev, and K. Verbert (2018) Effects of individual traits on diversity-aware music recommender user interfaces. In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization, pp. 291–299. Cited by: §5.1.
  • O. P. John, S. Srivastava, et al. (1999) The big five trait taxonomy: history, measurement, and theoretical perspectives. Handbook of Personality: Theory and Research 2 (1999), pp. 102–138. Cited by: §3.4, 2nd item, 1st item, §5.1.
  • M. Jugovac and D. Jannach (2017) Interacting with recommenders—overview and research directions. ACM Transactions on Interactive Intelligent Systems (TiiS) 7 (3), pp. 1–46. Cited by: §2.1.
  • M. Jugovac, I. Nunes, and D. Jannach (2018) Investigating the decision-making behavior of maximizers and satisficers in the presence of recommendations. In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization, UMAP ’18, pp. 279–283. Cited by: §5.1.
  • D. Jurafsky and J. H. Martin (2000) Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition. Upper Saddle River, N.J: Prentice Hall. Cited by: §3.1.1.
  • A. E. Kazdin (2000) Encyclopedia of psychology. Vol. 8, American Psychological Association Washington, DC. Cited by: §2.3.
  • B. P. Knijnenburg, N. J. Reijmer, and M. C. Willemsen (2011) Each to his own: how different users call for different interaction methods in recommender systems. In Proceedings of the fifth ACM conference on Recommender systems, RecSys ’11, pp. 141–148. Cited by: §1, §1, §1, §2.2, §2.3, §4.2, §5.1.
  • B. P. Knijnenburg, M. C. Willemsen, Z. Gantner, H. Soncu, and C. Newell (2012) Explaining the user experience of recommender systems. User Modeling and User-Adapted Interaction 22 (4-5), pp. 441–504. Cited by: §1, 1st item, §3.5.
  • S. Y. Komiak and I. Benbasat (2006) The effects of personalization and familiarity on trust and adoption of recommendation agents. MIS quarterly 30 (4), pp. 941–960. Cited by: §2.2.
  • J. Kunkel, T. Donkers, L. Michael, C. Barbu, and J. Ziegler (2019) Let me explain: impact of personal and impersonal explanations on trust in recommender systems. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI ’19, pp. 1–12. Cited by: §2.2, §2.2.
  • M. H. Kutner, C. J. Nachtsheim, J. Neter, W. Li, et al. (2005) Applied linear statistical models. McGraw-Hill New York. Cited by: Table 5, footnote 1.
  • H. Lai and S. Hung (2012) Influence of user expertise, task complexity and knowledge management support on knowledge seeking strategy and task performance. In Pacific Asia Conference on Information Systems (PACIS), pp. 37. Cited by: §1.
  • J. D. Lee and K. A. See (2004) Trust in automation: designing for appropriate reliance. Human factors 46 (1), pp. 50–80. Cited by: §2.2.
  • M. K. Lee and E. Turban (2001) A trust model for consumer internet shopping. International Journal of Electronic Commerce 6 (1), pp. 75–91. Cited by: §2.2, §2.3, §3.4, Table 1.
  • S. Marsh and M. R. Dibben (2003) The role of trust in information science and technology. Annual Review of Information Science and Technology (ARIST) 37, pp. 465–98. Cited by: §2.2.
  • R. C. Mayer, J. H. Davis, and F. D. Schoorman (1995) An integrative model of organizational trust. Academy of management review 20 (3), pp. 709–734. Cited by: §2.3.
  • R. R. McCrae and O. P. John (1992) An introduction to the five-factor model and its applications. Journal of Personality 60 (2), pp. 175 – 215. Cited by: §2.3.
  • D. H. McKnight, M. Carter, J. B. Thatcher, and P. F. Clay (2011) Trust in a specific technology: an investigation of its components and measures. ACM Transactions on Management Information Systems (TMIS) 2 (2), pp. 1–25. Cited by: §2.3.
  • D. H. McKnight, L. L. Cummings, and N. L. Chervany (1998) Initial trust formation in new organizational relationships. Academy of Management review 23 (3), pp. 473–490. Cited by: §2.2, §2.2, §2.3.
  • S. Miceli, V. de Palo, L. Monacis, S. Di Nuovo, and M. Sinatra (2018) Do personality traits and self-regulatory processes affect decision-making tendencies?. Australian Journal of Psychology 70 (3), pp. 284–293. Cited by: §5.1.
  • M. Millecamp, N. N. Htun, Y. Jin, and K. Verbert (2018) Controlling spotify recommendations: effects of personal characteristics on music recommender user interfaces. In Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization, UMAP ’18, pp. 101–109. Cited by: §1.
  • D. Müllensiefen, B. Gingras, J. Musil, and L. Stewart (2014) The musicality of non-musicians: an index for assessing musical sophistication in the general population. PLoS ONE 9 (2), pp. e89642. Cited by: §3.4, Table 1.
  • C. M. Myers, A. Furqan, and J. Zhu (2019) The impact of user characteristics and preferences on performance with an unfamiliar voice user interface. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI ’19, pp. 47:1–47:9. Cited by: §1, §4.2.
  • F. Narducci, P. Basile, M. de Gemmis, P. Lops, and G. Semeraro (2020) An investigation on the user interaction modes of conversational recommender systems for the music domain. User Modeling and User-Adapted Interaction 30 (2), pp. 251–284. Cited by: §2.1.
  • M. Nourani, J. King, and E. Ragan (2020) The role of domain expertise in user trust and the impact of first impressions with intelligent systems. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, Vol. 8, pp. 112–121. Cited by: §2.3.
  • J. O’Donovan and B. Smyth (2005) Trust in recommender systems. In Proceedings of the 10th International Conference on Intelligent User Interfaces, IUI ’05, pp. 167–174. Cited by: §2.2.
  • U. Panniello, M. Gorgoglione, and A. Tuzhilin (2016) Research note—in CARSs we trust: how context-aware recommendations affect customers’ trust and other business performance measures of recommender systems. Information Systems Research 27 (1), pp. 182–196. Cited by: §2.2, §2.2.
  • E. Peer, L. Brandimarte, S. Samat, and A. Acquisti (2017) Beyond the turk: alternative platforms for crowdsourcing behavioral research. Journal of Experimental Social Psychology 70, pp. 153–163. Cited by: §3.2.
  • Z. Peng, Y. Kwon, J. Lu, Z. Wu, and X. Ma (2019) Design and evaluation of service robot’s proactivity in decision-making support process. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI ’19, pp. 1–13. Cited by: §2.1, 2nd item.
  • P. Pu, L. Chen, and R. Hu (2011) A user-centric evaluation framework for recommender systems. In Proceedings of the fifth ACM conference on Recommender systems, RecSys ’11, pp. 157–164. Cited by: §4.1.
  • M. Rheu, J. Y. Shin, W. Peng, and J. Huh-Yoo (2021) Systematic review: trust-building factors and implications for conversational agent design. International Journal of Human–Computer Interaction 37 (1), pp. 81–96. Cited by: §1, §2.2.
  • F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor (2015) Recommender systems handbook. 2nd edition, Springer-Verlag. Cited by: §1, §2.1.
  • J. B. Rotter (1971) Generalized expectancies for interpersonal trust.. American Psychologist 26 (5), pp. 443. Cited by: §2.3.
  • R. B. Saglam, J. R. Nurse, and D. Hodges (2021) Privacy concerns in chatbot interactions: when to trust and when to worry. In International Conference on Human-Computer Interaction, pp. 391–399. Cited by: §5.2.
  • J. Sanchez, W. A. Rogers, A. D. Fisk, and E. Rovira (2014) Understanding reliance on automation: effects of error type, error distribution, age and experience. Theoretical Issues in Ergonomics Science 15 (2), pp. 134–160. Cited by: §1.
  • B. Schwartz, A. Ward, J. Monterosso, S. Lyubomirsky, K. White, and D. R. Lehman (2002) Maximizing versus satisficing: happiness is a matter of choice.. Journal of Personality and Social Psychology 83 (5), pp. 1178. Cited by: §5.1.
  • A. Seeger and A. Heinzl (2018) Human versus machine: contingency factors of anthropomorphism as a trust-inducing design strategy for conversational agents. In Information Systems and Neuroscience, pp. 129–139. Cited by: §5.2.
  • Y. Sun and Y. Zhang (2018) Conversational recommender system. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, SIGIR ’18, pp. 235–244. Cited by: §1.
  • N. Tintarev and J. Masthoff (2015) Explaining recommendations: design and evaluation. In Recommender systems handbook, pp. 353–382. Cited by: §2.2.
  • M. A. Walker, D. J. Litman, C. A. Kamm, and A. Abella (1997) PARADISE: a framework for evaluating spoken dialogue agents. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, ACL ’98/EACL ’98, pp. 271–280. Cited by: 2nd item, §3.5.
  • W. Wang and I. Benbasat (2007) Recommendation agents for electronic commerce: effects of explanation facilities on trusting beliefs. Journal of Management Information Systems 23 (4), pp. 217–246. Cited by: §2.2, §5.1.
  • Y. D. Wang and H. H. Emurian (2005) An overview of online trust: concepts, elements, and implications. Computers in human behavior 21 (1), pp. 105–125. Cited by: §2.2.
  • L. Yang, M. Sobolev, C. Tsangouri, and D. Estrin (2018) Understanding user interactions with podcast recommendations delivered via voice. In Proceedings of the 12th ACM Conference on Recommender Systems, RecSys ’18, pp. 190–194. Cited by: §2.1.
  • Y. Zhang, X. Chen, Q. Ai, L. Yang, and W. B. Croft (2018) Towards conversational search and recommendation: system ask, user respond. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, CIKM ’18, pp. 177–186. Cited by: §2.1, §2.1, §3.1.1.
  • J. Zhou, S. Luo, and F. Chen (2020) Effects of personality traits on user trust in human–machine collaborations. Journal on Multimodal User Interfaces 14, pp. 387–400. Cited by: §1, §2.3, §2.3.