With children asking Google to buy them more toys (Bologna, 2021), cheating on homework with Alexa (Sparks, 2019), and playing voice-based pranks on parents (Bologna, 2021), conversational agents (CAs) have potential to not only influence children’s play—but also how they grow and develop (Spitale et al., 2020). For instance, researchers theorize that interacting with an agent can change people’s understanding of agency concepts and their Theory of Mind (ToM) (Jaeger et al., 2019; Levin et al., 2013; Spektor-Precel and Mioduser, 2015). Other research has shown engaging with CAs can change people’s behavior (Adler et al., 2016; Corti and Gillespie, 2016; Schuetzler et al., 2018; Hu et al., 2021) and have positive effects on information retention (Beun et al., 2003).
Considering the impact agents have on human understanding and behavior, how prevalent these systems are becoming (Spitale et al., 2020; Research and Markets, 2019), and how opaque their operations can be to humans (Register and Ko, 2020; Druga et al., 2017; Lovato et al., 2019), a growing body of research suggests it is important for people of all ages to understand AI (Long and Magerko, 2020; Touretzky et al., 2019; Register and Ko, 2020). Furthermore, researchers are investigating how to best teach AI literacy concepts to students, including those as young as preschoolers (Williams et al., 2019). For instance, one study leverages 3-5th grade students’ familiarity with CAs to teach AI literacy concepts (Lin et al., 2020). Other works utilize AI ethics discussions (DiPaola et al., 2020), interactive, collaborative learning environments (Wan et al., 2020), and gesture recognition tools (Zimmermann-Niefield et al., 2020) to engage students in learning AI. In this work, we use a constructionist approach, in which students program their own CAs, to teach AI concepts to 6-12th grade students (Van Brummelen et al., 2021; Papert and Harel, 1991).
Another aspect of AI education research includes students’ perception of AI systems themselves, including personification of such systems, emotions the systems evoke, and students’ conceptions of how the systems work. For example, one study examines preschool- and kindergarten-aged students’ perceptions of “thinking machines” during an AI learning activity, emphasizing the importance of early childhood AI literacy and ToM development (Williams et al., 2019). Other studies investigate children’s and families’ perceptions of CAs (Lovato et al., 2019), how interaction modalities influence children’s perceptions of CAs (Druga et al., 2017), children’s perceptions of maze-solving agents’ intelligence (Druga et al., 2018), and whether children categorize CAs as animate objects or artifacts (Xu and Warschauer, 2020). Yet other studies emphasize the importance of adults’ perceptions and conceptions of AI, especially in decision-making and policy (Jaeger and Levin, 2016; Large and Burnett, 2019; Shank, 2014). To our knowledge, few studies investigate middle and high school students’ perceptions of AI (Rodríguez-García et al., 2021; Michaelis and Mutlu, 2019), despite teenage years being critical in ethical perspective development (Damon, 2004), a key component of AI literacy (Touretzky et al., 2019). Furthermore, to our knowledge, no studies investigate how middle and high school students’ perceptions of CAs change through programming CAs.
We posit that understanding students’ perceptions and feelings towards such agents can help researchers better facilitate student learning. For instance, feelings of closeness with teachers have been shown to affect students’ academic performance (Birch and Ladd, 1997; Wolter et al., 2014; Al-Yagon and Mikulincer, 2004), which may also be the case when agents take on the teacher role. Another study indicates that the avatar used for pedagogical feedback-giving agents affects students’ emotional attachment and satisfaction with the learning process (Schöbel et al., 2019), alluding to the potential for students’ perception of agents to affect learning. Furthermore, research suggests understanding students’ preconceptions and mental models can improve teaching (Sherin, 2013; Duit, 2009). By understanding students’ feelings and conceptions about agents, we expect we can create better digital learning environments.
This study investigates 6-12th grade students’ perceptions and conceptions of Amazon Alexa in a learning environment described in (Van Brummelen et al., 2021). In contrast to (Van Brummelen et al., 2021), which investigates students’ AI literacy, this study investigates how a programming and learning intervention, in which students develop their own CAs, affects student perspectives of AI. Our main research question is as follows:
RQ: How does building Alexa skills and learning about conversational AI in a remote workshop affect students’ perceptions and conceptions of AI, conversational AI, and Alexa?
By better understanding students’ perspectives on agents and how these perspectives can be changed, we contribute to ongoing research to develop more human-centered, socially useful agents—especially for K-12 education. To this end, we present four design considerations for K-12 education agents and development tools based on our findings. Specifically, we look at students’ conceptions of how AI and conversational AI work, and perceptions of Alexa in terms of friendliness, human-likeness, aliveness, safeness, trustworthiness, intelligence (generally and relative to themselves), and how close they feel to Alexa.
2. Related Work
2.1. Conceptions of artificial intelligence and theory of mind
The working definition of AI in research has changed over the years, from having a sharp focus on logical, symbolic representations of concepts and actions to a marked concentration on modelling extensive interconnected computation machines called “neural networks” (Wooldridge, 2021). In the media, AI has been depicted in many different ways: as killer robots, android caretakers, and superintelligent, disembodied voices (Guadamuz, 2017). Despite the somewhat frivolous portrayals, people’s understanding of AI and how it works has serious implications, from policy-making to day-to-day assessments of whether a self-driving vehicle is safe to trust one’s life with (Jaeger and Levin, 2016; Meschtscherjakov et al., 2018).
ToM research in AI investigates how to develop AI systems with human-like cognition, as well as how people understand AI as agents with mental states (Erb, 2016). In this research, we focus on the latter, or “Theory of Artificial Mind” (ToAM) (Spektor-Precel and Mioduser, 2015). Understanding people’s conceptions of AI, including anthropomorphization of AI technology, conceptions of specific technologies, like CAs, and emotional reactions to AI systems, is important for teaching AI literacy. Through better understanding students’ perceptions and ToAM, we can likely better teach students about AI (Jaeger et al., 2019) and therefore better reach our research community’s goal of equipping people to live in an AI-filled world (Jaeger and Levin, 2016; Touretzky et al., 2019; Long and Magerko, 2020).
Children have been observed to anthropomorphize AI systems (Druga et al., 2018, 2017; Xu and Warschauer, 2020); however, their understanding of the actual “aliveness” of such systems is inconsistent across populations and seems to vary with age (Williams et al., 2019; Scaife and van Duuren, 1995; Kahn Jr et al., 2012; Xu and Warschauer, 2020). Other anthropomorphic aspects of AI systems have also been investigated for different purposes. For instance, a number of studies examine how children (3-10 years old) perceive agents’ intelligence—generally and relative to their intelligence—with the purpose of inspiring critical thinking (Druga et al., 2018, 2017). Another study investigates 5-6 year old children’s perception of CAs’ friendliness, aliveness, trustworthiness, safeness, and funniness, in addition to intelligence to develop CA design recommendations (Lovato et al., 2019). Researchers investigated similar anthropomorphic aspects, including how sociable, mutual-liking, attractive, human, close, and intelligent children (10-12 years old) perceive agents to be, in order to improve learning interventions (Michaelis and Mutlu, 2019). We investigate related anthropomorphic aspects in middle to high school students’ ToAM.
Research also shows that interaction with AI artifacts can influence people’s ToM and perceptions of AI. For instance, observing and constructing robot behavior influenced students’ ToAM, enabling them to better explain the AI systems’ behavior (Spektor-Precel and Mioduser, 2015). Another study showed that interacting with a pedagogical agent influenced students’ understanding of the key ToM concept of agency, allowing them to better predict behavior. The same study linked students’ prior understanding of agency to better learning (Jaeger et al., 2019). In this paper, we investigate how AI literacy workshops involving programming a CA influence students’ ToAM, including perceptions of anthropomorphic qualities and understanding of AI behavior.
2.2. Conversational agents and education
Many studies investigate how CAs can best embody the teaching role (Dincer and Doganay, 2017; Morales-Urrutia et al., 2020; Pérez-Marín and Pascual-Nieto, 2013; Leelawong and Biswas, 2008). Some such studies show that interacting with agents can positively affect learning and students’ ToM (Dincer and Doganay, 2017; Jaeger et al., 2019). In this study, however, we take a constructionist approach, and instead of placing agents in the teaching role, we empower students to learn about AI through developing their own CAs (Papert and Harel, 1991; Van Brummelen et al., 2021).
Constructionism has been shown to be effective in teaching K-12 students AI concepts. For example, researchers have taught students AI ethics through constructing paper prototypes (Ali et al., 2019), machine learning (ML) concepts through developing gesture-based models (Zimmermann-Niefield et al., 2020), and AI programming concepts through creating projects with AI cloud services (Kahn et al., 2018). Our study teaches students Long and Magerko’s AI literacy competencies through developing CAs (Van Brummelen et al., 2021; Long and Magerko, 2020).
Certain studies specifically investigate whether constructionist activities change student conceptions and perceptions of AI agents. For example, a series of studies showed constructing a robot’s behavior enabled kindergarten students to conceptualize an agent’s rule-based behavior (Mioduser and Levy, 2010), shifted students’ perspectives from technological to psychological (Levy and Mioduser, 2008), and shifted students’ language from anthropomorphic to technological (Kuperman and Mioduser, 2012). Through an activity with the same constructionist programming environment, it was shown 5- and 7-year-old students’ conceptions of ToAM developed, and the students were able to better understand robots’ behavior (Spektor-Precel and Mioduser, 2015). A study with programming and ML training activities showed 4-6-year-old students’ understanding of ToM and perceptions of robots changed throughout the experiment (Williams et al., 2019). In this work, we investigate whether students’ ToAM and perceptions of AI in middle and high school change through a constructionist CA programming activity and workshop.
We conducted our workshops with 47 students separated into two groups of 12 and 35. For each group, the students’ teachers observed the workshops and provided feedback to the three teaching researchers (Van Brummelen et al., 2021). The teachers were recruited through an Amazon Future Engineers call to Title I schools. Each teacher chosen for the workshops was asked to recruit 5 or 6 of their students. We targeted Title I schools because they have high concentrations of children from low-income families (Skinner, 2019), and we wanted to provide opportunities for enrichment that they may not normally receive. We developed middle and high school level AI curriculum and thus targeted middle and high school students. The students’ mean age was 14.78 (range 11-18, SD=1.91), with 19 self-identifying as male, 27 self-identifying as female, and 1 student who did not complete the questionnaire.
3.2.1. Programming agents
3.2.2. Workshop outline
This section provides a brief overview of the learning intervention, which is described in-depth in (Van Brummelen et al., 2021). The intervention occurred over two sessions, which both involved five consecutive days of 2.5 hour long Zoom sessions. The first day began with an introduction to the MIT App Inventor interface (Wolber et al., 2015) to accustom students to block-based coding. Then the students were given a chance to interact freely with Alexa, writing down the questions they asked during the interaction. In the first week, students were each provided with a complimentary Echo Dot. This was not feasible for the second week of workshops due to an increased number of students, so students either used the Alexa app on their mobile devices, an online Alexa simulator (within MIT App Inventor or otherwise), or Alexa devices they previously owned. Overall, 19 students used an Alexa device, 17 used the Alexa app, 10 used an online simulator, and one did not specify.
The second day involved introducing students to key AI and conversational AI concepts, discussing AI ethics, and completing a tutorial walk-through to create an Alexa skill that would respond to basic greetings. On the third day, students completed a tutorial to develop a calculator skill, in which Alexa could be asked, “What’s number A multiplied by number B”, or something similar. Next, we taught students about ML in more depth, including discussing the difference between a rule-based CA developed on the first day and the ML-based CAs developed on the second and third days. Finally, students engaged in an AI text generation activity.
On the fourth day, students developed a skill that enabled Alexa to read out text entered into MIT-App-Inventor-developed mobile apps. Students then brainstormed ideas for skills for their personal projects. Students spent the final day developing their projects and presenting them to the rest of the class.
Various questionnaires inspired by the perception of AI questions in (Druga et al., 2018) and (Lovato et al., 2019) were given to students during the learning intervention. On the first day, students recorded their interactions with Alexa, impressions of the CA, and demographics information. At the start of the second day, students completed a questionnaire assessing their initial feeling towards and understanding of Alexa, AI and conversational AI. The questions were divided into two sets, which we refer to as the Persona and Conception questions.
The Persona questions assessed students’ sentiments about Alexa on a 7-point Likert scale. The questions stated, “Alexa is…” followed by “intelligent”, “friendly”, “alive”, “safe”, “trustworthy”, “human-like”, and “smarter than me”. The final Persona question asked how close students felt to Alexa using the Inclusion of the Other in the Self scale (Gächter et al., 2015). The Conception questions assessed students’ understanding of AI and conversational AI through asking, “Describe in your own words what AI is” and “Describe in your own words what conversational AI is (e.g., chatbots, like Alexa or Google home, use conversational AI)”. At the end of the final day, students completed the Persona and Conception questions again. Additional questionnaires were given at the end of the second, third, and fourth days to assess specific AI literacy competencies, as discussed and analyzed in (Van Brummelen et al., 2021).
3.4. Data Analysis
This study builds on the study presented in (Van Brummelen et al., 2021). Thus, certain data analyzed in this study (e.g., demographics) is necessarily the same; however, this study focuses on data not analyzed in (Van Brummelen et al., 2021), including the questionnaire responses to the Persona questions and students’ reported interactions with Alexa. The responses to the Conception questions were analyzed in both studies, however using different methods and through different lenses. This study investigates students’ conceptions of AI through a word frequency analysis as well as analyses of changes in number of tags (as described below). The study in (Van Brummelen et al., 2021) assessed students’ AI literacy before and after the learning intervention.
To investigate the responses to qualitative questions, a reflexive, open-coding approach to thematic analysis (Braun et al., 2019) was performed by three researchers. The three researchers independently completed familiarization and code-generation stages. After several discussions, the three researchers came to a consensus on codes for the questionnaire responses. Codes and respective representative quotations can be found in (Van Brummelen et al., 2020). Researchers generally constructed codes inductively or with respect to ideas from literature, including the Big AI Ideas (Touretzky et al., 2019). It is important to note that responses often involved multiple ideas and were thus tagged with more than one code.
For the quantitative questions (e.g., Likert scale Persona questions) asked on both pre- and post-questionnaires, the Wilcoxon Signed-Rank Test was employed to measure changes. Additionally, we used the Kendall Tau method to create pairwise correlation matrices. We analyzed the correlation coefficients using Cohen’s (2013) definition of correlation effect strength for behavioral and educational psychology. To test the validity of the strength of the coefficients, we compared Kendall Tau p-values to an alpha of 0.05.
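As a minimal sketch of the pairwise correlation step, the following dependency-free Python computes Kendall's tau-b (the tie-corrected variant, which suits Likert data with many tied ratings) for every pair of questionnaire items. The function names `kendall_tau_b` and `correlation_matrix` and the example ratings are hypothetical illustrations, not the study's code or data; p-values are omitted and would in practice come from a statistics library.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences of ratings."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = only_x_tied = only_y_tied = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: contributes to neither term
        if dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + only_x_tied) * (untied + only_y_tied))
    # The coefficient is undefined when either variable is constant.
    return (concordant - discordant) / denom if denom else float("nan")


def correlation_matrix(columns):
    """Pairwise tau-b for a dict mapping item name -> list of ratings."""
    names = list(columns)
    return {a: {b: kendall_tau_b(columns[a], columns[b]) for b in names}
            for a in names}


# Illustrative 7-point Likert ratings (made up for this sketch).
ratings = {
    "safe": [7, 6, 5, 7, 4, 6],
    "trustworthy": [6, 6, 4, 7, 3, 5],
    "alive": [1, 2, 1, 3, 2, 1],
}
matrix = correlation_matrix(ratings)
print(round(matrix["safe"]["trustworthy"], 3))
```

Each coefficient would then be compared against effect-size thresholds and its p-value against the chosen alpha, as described above.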
For the word frequency analysis, we used the NLTK library (Loper and Bird, 2002) to remove stop-words, tokenize and lemmatize qualitative responses. Additionally, to better visualize non-obvious concepts, we filtered out words directly from the questions, including ‘AI’, ‘artificial’, ‘intelligence’, and ‘conversational’. Word clouds were generated using (Mueller, 2020).
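The preprocessing described above can be sketched without external dependencies as follows. This is a simplified stand-in for the NLTK pipeline, under stated assumptions: the stop-word list is a tiny hypothetical subset, tokenization is a plain regular expression, and lemmatization is left out, so "learns" and "learning" are counted separately here. The name `word_frequencies` and the example responses are illustrative only.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (NLTK's is far larger).
STOP_WORDS = {"a", "an", "the", "is", "are", "it", "that", "and", "or",
              "to", "of", "for", "in", "on", "with", "can", "what"}

# Words taken directly from the question prompts, filtered out so they
# do not dominate the counts.
QUESTION_WORDS = {"ai", "artificial", "intelligence", "conversational"}


def word_frequencies(responses):
    """Count the remaining words across a list of free-text responses."""
    counts = Counter()
    for response in responses:
        tokens = re.findall(r"[a-z']+", response.lower())
        counts.update(t for t in tokens
                      if t not in STOP_WORDS and t not in QUESTION_WORDS)
    return counts


# Made-up responses for illustration.
example = ["AI is a program", "A program that can learn"]
print(word_frequencies(example).most_common(2))
```

The resulting counts are the kind of input a word cloud generator consumes, with more frequent words drawn larger.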
4.1. Student interactions with Alexa
To understand the types of interactions students had with Alexa prior to the intervention, we coded the phrases they reported saying to Alexa during the interaction activity. We found most of the phrases fell into one of five categories listed in Tab. 1. The Information Updates category involved real-time events; the Action Commands category involved built-in Alexa applications; the Personal Questions category involved questions about Alexa; the Jokes category involved asking Alexa to say a joke; and the Other category involved questions and phrases that were often humorous (e.g., “Are dragons real?”) or impossible to fully answer (e.g., “What are all the numbers of pi?”), or generally fell outside of the other categories (e.g., “Hello”). Note that prior to the activity, we asked Alexa to tell us a joke, which may have contributed to a large number of students also asking Alexa for jokes.
|Category|Example phrases|Number of phrases (%)|
|---|---|---|
|Information updates|What time is it?, How is the weather for Wednesday?, How is the traffic?|31 (26%)|
|Action commands|Set a 15-minute timer, Play my Custom Spotify Playlist, Remind me that I have a meeting at 1:00 pm, What’s 0 times 0?|30 (25%)|
|Other|Hello, Learn my voice, Are dragons real?, What are all the numbers of pi?|24 (20%)|
|Jokes|Tell me a joke, Can you tell me a joke?|17 (14%)|
|Personal questions|What’s your favorite color?, When were you made?, What’s your favorite video game?, How was your day?|16 (14%)|
4.2. Perceptions of Alexa pre- and post-workshop
By comparing pre- and post-survey answers to the Persona questions (see Fig. 2), we found significant differences in how students felt about Alexa’s intelligence and how close they felt they were to Alexa. After the intervention, students felt Alexa was more intelligent (, , , ) and felt closer to Alexa (, , , ). We did not find any evidence of significant differences in how students felt about Alexa being friendly, alive, safe, trustworthy, human-like or smarter than themselves before and after the intervention.
Prior to the intervention, students generally reported Alexa as being highly intelligent (, ), highly friendly (, ), not very alive (, ), highly safe (, ), moderately to highly trustworthy (, ), and moderately human-like (, ). They also reported feeling Alexa was much smarter than themselves (, ), and feeling not particularly close to Alexa (, ). The results were similar after the intervention (other than the changes in intelligence and closeness described above).
4.3. Correlations between perceptions of Alexa
We found strong ( (Cohen, 2013)) correlations between student reports of Alexa’s safeness and trustworthiness on both the pre- and post-test, as well as between Alexa’s friendliness and trustworthiness on the post-test. There was also a strong correlation between trustworthiness reported on the pre-test and safeness reported on the post-test. Student reports of Alexa’s friendliness and trustworthiness on the pre-test and between the pre- and post-tests were moderately ( (Cohen, 2013)) correlated.
Other moderate correlations included student reports of Alexa’s intelligence and trustworthiness, friendliness and safeness, trustworthiness and feelings of closeness, human-likeness and aliveness, human-likeness and feelings of closeness, as well as aliveness and feelings of closeness. In the post-test, student reports of Alexa’s intelligence and feeling Alexa was smarter than them, as well as Alexa’s trustworthiness and feeling Alexa was smarter than them were moderately correlated. Additionally, there was a moderate correlation between students with more experience programming prior to the intervention and reports of Alexa’s human-likeness on both the pre- and post-test. Our full correlation analysis is shown in Fig. 3.
4.4. Student diction when describing AI
To visualize students’ understanding of AI and conversational AI, we analyzed word frequency and created word clouds based on answers to two questions. Fig. 4 shows the word frequency analyses of students’ answers to, “Describe in your own words what AI is”, prior to and after the intervention. Fig. 5 shows the analyses of answers to, “Describe in your own words what conversational AI is (e.g., chatbots, like Alexa or Google Home, use conversational AI)”.
4.5. Conceptions of AI and conversational AI
To better understand students’ qualitative answers when conceptualizing AI and conversational AI, we performed a graphical exploration of tag frequency from our thematic analysis. The graph in Fig. 6 shows the change in tag frequency from pre- to post-test. Since the number of participants who completed the pre-test differed from the post-test, the number of tags ( and ) were normalized over the number of responses ( and ), and reported as normalized percent change, (Eq. 1). This is presented as an exploratory, graphical analysis for high-level insights rather than statistical analysis.
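To make the normalization concrete, here is a small sketch of one plausible reading of the computation: tag counts are divided by the number of responses on each test, and the change is expressed relative to the pre-test rate. The exact form of Eq. 1 is not reproduced in this text, so the formula below is an assumption, and the function name, parameter names, and example numbers are hypothetical.

```python
def normalized_percent_change(tags_pre, n_pre, tags_post, n_post):
    """Percent change in tags per response from pre-test to post-test.

    ASSUMPTION: relative change of per-response tag rates; the paper's
    Eq. 1 may be defined differently.
    """
    if n_pre <= 0 or n_post <= 0:
        raise ValueError("response counts must be positive")
    rate_pre = tags_pre / n_pre      # tags per pre-test response
    rate_post = tags_post / n_post   # tags per post-test response
    if rate_pre == 0:
        raise ValueError("undefined when the tag never appears in the pre-test")
    return 100.0 * (rate_post - rate_pre) / rate_pre


# Made-up counts: 10 tags over 40 pre-test responses versus
# 12 tags over 30 post-test responses.
print(normalized_percent_change(10, 40, 12, 30))
```

Dividing by the response counts first keeps a smaller post-test sample from appearing as a drop in tag frequency on its own.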
5.1. Perceptions of Alexa’s persona
Prior to the study, we hypothesized students would feel Alexa was less intelligent after learning how to program it, as they would better understand how it works; however, students felt Alexa was more intelligent after the intervention (, ). This could have been for multiple reasons. Perhaps by successfully learning fundamental AI literacy concepts (Van Brummelen et al., 2021), students realized Alexa was more complex than they initially thought and thus perceived it to be more “intelligent” (as in the Dunning-Kruger effect (Dunning, 2011)). This is supported by the relative increase in AI literacy concepts (which are comparatively complex) in the post-test responses to the Conception questions (Fig. 6), and the relative decrease in pre-programming concepts (which are comparatively simplistic). Students also generally felt Alexa was smarter than themselves (both before and after the intervention). This is consistent with previous studies of students aged 3-10 (Druga et al., 2017; Lovato et al., 2019).
The Dunning-Kruger concept may also explain why there were relatively fewer tags identified—likely indicating fewer ideas presented by students—in the post-test than in the pre-test for many of the qualitative answers to the Conception questions. For example, as shown in Fig. 6, there were relatively fewer tags for the majority of the tag categories in the post-test responses about conversational AI. Perhaps students became “less ignorant of their ignorance” (Dunning, 2011) about Alexa through the intervention, and therefore felt less qualified to answer the qualitative questions and thus presented fewer ideas in the post-test. Nevertheless, one limitation of this study was that students responded to the post-test at the end of the workshops, so they may have had less energy than when they responded to the pre-test, alternatively explaining the relatively fewer ideas presented.
We also hypothesized that students would personify Alexa less after understanding the logic behind how it works, and therefore rate its “aliveness”, “human-likeness”, “friendliness”, and their feelings of closeness to it as less than prior to the intervention. However, there was no significant evidence for any change, except that they felt closer to Alexa (, ) after the intervention. Students’ increased feelings of closeness could be due to “boundary dissolution”, which is a type of closeness where two agents (usually human) no longer function completely autonomously, but rather function dependently (Kreilkamp, 1984). In this case, an apparent “boundary dissolution” due to Alexa initially seeming to function independently, but seeming to function dependently on students’ programming efforts after the intervention, could have caused students’ increased feelings of closeness.
Alternatively, perhaps having programming experience fundamentally increases feelings of closeness to Alexa, seeing as students’ prior programming experience was moderately correlated with closeness. Furthermore, prior programming experience and human-likeness, as well as closeness and human-likeness were moderately correlated. One explanation could be that as students learned to program, they felt Alexa had human-like, logical reasoning, and thus felt closer to it (because of its human-like traits).
Students’ perceptions of Alexa’s friendliness and trustworthiness were strongly correlated, as well as trustworthiness and safeness, and to a lesser extent, intelligence and trustworthiness, friendliness and safeness, and closeness and trustworthiness. Although these correlations do not necessitate causation, it is important to consider the implications of potential causation when designing CAs. For instance, if a CA was purposefully designed to seem friendly and intelligent, users may associate this with trustworthiness and safeness, despite the potential for the CA to provide incorrect information (intentionally or not). Nevertheless, this could also provide positive opportunities, including how students may learn better if they feel a pedagogical agent is friendly and intelligent, and thus also trustworthy and safe. This is discussed in more depth below.
5.2. Conceptions of AI and conversational AI
From the pre-/post-test comparison of word frequency in responses describing AI (Fig. 4) and conversational AI (Fig. 5), as well as the change in tag frequency analysis (Fig. 6), students’ conceptions seemed to shift towards more accurate understandings. For instance, the diction for describing AI seemed to shift towards computer-science-related terminology, including program, learn, and information. This trend is consistent with other literature, in which students describe AI with more computer science vocabulary after developing AI projects (Rodríguez-García et al., 2021). Furthermore, the emergence of the word learn in post-test responses suggests a better understanding of AI systems’ ability to adapt and update with training. For instance, one student’s response described AI as “a program that learns and uses the learning for other problems”. Furthermore, as shown in Fig. 6, there was a relative increase in references to concepts from the Big AI Ideas (Touretzky et al., 2019), including Learning and Representation and reasoning, and a relative decrease in simplistic explanations of the AI acronym (e.g., “AI is artificial intelligence”) and of how AI is “like a human”, indicating better understanding.
In the conversational AI responses (Fig. 5), human remains the most frequent word in the pre- and post-test, suggesting student understanding of how human interaction is central to conversational AI’s purpose. For the conversational AI descriptions shown in Fig. 6, Learning and the concept of natural language (NL) responses and understanding increased, indicating better understanding. Furthermore, there was a relative decrease in simplistic explanations of conversational AI being “like a robot”, mimicking humans, having pre-programmed responses, and being something that “help[s] humans”. Despite these indications of better understanding, there was a slight relative increase in vague or shallow answers for both the descriptions of AI and conversational AI, and a relative decrease in the Big AI idea of representation and reasoning for conversational AI. Overall, however, it seemed as if students’ conceptions improved through the workshops, especially considering the evidence for increased understanding of AI literacy concepts presented in (Van Brummelen et al., 2021).
5.3. Design Considerations
Based on the results, we present design considerations for engaging students in learning experiences with CAs.
As shown in Tab. 1, students asked Alexa many personal questions (e.g., “Alexa, do you like Siri?” and “What’s your favorite color?”), which would typically be asked of humans rather than computer systems. Alexa’s often humorous responses (e.g., “I like ultraviolet. It glows with everything”) could have contributed to students’ perception of personified traits, like friendliness, intelligence and trustworthiness, which were all rated highly. As discussed, personified traits in CAs could play a role in effective teaching interventions (Schöbel et al., 2019), especially since feelings of closeness and trust can enhance human teaching and learning experiences (Birch and Ladd, 1997; Wolter et al., 2014; Al-Yagon and Mikulincer, 2004).
We recommend pedagogical CA developers cautiously consider personification in their designs. Although personification could engage students in effective learning experiences, it could also increase their feelings of trust disproportionately with the actual trustworthiness of the device. For example, students could perceive the device as always providing unbiased, correct answers, despite AI systems often being biased (Roselli et al., 2019). Thus, we further recommend considering transparency in CA design.
Students also seemed to test the limits of Alexa, asking impossible or difficult questions as encapsulated by the Other category in Tab. 1. For example, students asked Alexa to turn itself off, to tell them all the (infinite) digits of π, and to provide the answer to . These behaviors could be linked to trying to understand the system’s inner workings. Thus, we recommend developing CAs with the ability to explain themselves, and furthermore, provide transparency in terms of their abilities (e.g., being able to explain AI bias). This is especially important when considering the correlations between CAs’ friendliness and perceived trustworthiness, and students’ potential increase in awareness of ignorance in how CAs work, as discussed above. This recommendation also aligns with other child-CA interaction research, which suggests designing transparent AI systems with respect to children’s level of understanding (Williams et al., 2019).
Similar to the behavior of “testing” Alexa described above, students asked Alexa playful questions like, “How much wood would a wood chuck chuck if a wood chuck would chuck wood?” and “Are dragons real?”. These questions illustrate students’—even middle and high school students’—innate desire to play. Play can be hugely beneficial in learning environments, especially from a constructionist perspective (Papert and Harel, 1991; Rice, 2009); thus, we recommend considering playful learning experiences when developing CAs. For example, in our study students had the opportunity to develop their own CA projects. Students came up with many different playful (as well as serious) ideas (Van Brummelen et al., 2021). One very playful idea included a CA “Meme Maker”, which according to the developer, “help[ed] everyone get a quick laugh because as the old saying goes laughter is the best medicine”. This same student cited their favorite part of the workshop as “improving [their] coding ability and learning more about [CAs]”.
Many student projects aimed to provide utility: 34% were mental and physical health-related, 29% educational, 21% productivity-related and 8% accessibility-related CAs (Van Brummelen et al., 2021). Utility was also reflected in students’ interactions with Alexa, as Information updates and Action commands were the most common interactions reported. With students evidently being interested in CAs’ utility, we recommend designing CAs with useful features to provide entry points to CA engagement and potential learning moments. For example, students might naturally engage with a CA to find out what the weather will be like tomorrow, which would provide an opportunity to teach students about APIs and databases, and how CAs provide such answers.
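The weather example could be turned into a small classroom sketch such as the one below. It is only an illustration under stated assumptions, not the workshop's implementation: the function names, the keyword matching, the default city and the in-memory forecast table (which stands in for a real weather API or database) are all hypothetical.

```python
from datetime import date, timedelta


def build_forecast_db(today):
    """Hypothetical in-memory 'database' keyed by (city, ISO date).

    A deployed CA would request this data from a weather service instead.
    """
    tomorrow = today + timedelta(days=1)
    return {
        ("boston", today.isoformat()): {"summary": "cloudy", "high_c": 12},
        ("boston", tomorrow.isoformat()): {"summary": "sunny", "high_c": 17},
    }


def fetch_forecast(db, city, day):
    """Stand-in for an API call: return a forecast record, or None if missing."""
    return db.get((city.lower(), day.isoformat()))


def answer_weather_question(utterance, db, today, city="Boston"):
    """Map a spoken question to a data lookup, then to a spoken answer."""
    text = utterance.lower()
    if "weather" not in text:
        return "Sorry, I can only answer weather questions."
    # Step 1: work out which day the question refers to.
    offset = 1 if "tomorrow" in text else 0
    day = today + timedelta(days=offset)
    # Step 2: look the data up (the "API and database" part of the lesson).
    record = fetch_forecast(db, city, day)
    if record is None:
        return f"I have no forecast for {city} on {day.isoformat()}."
    # Step 3: turn the structured record into a sentence the CA can speak.
    label = "Tomorrow" if offset else "Today"
    return (
        f"{label} in {city} it will be {record['summary']} "
        f"with a high of {record['high_c']} degrees Celsius."
    )
```

Walking through the three steps with students makes the point that the CA's answer comes from a data lookup rather than from the device "knowing" the weather, which also connects to the transparency recommendation.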
6. Limitations and Future Work
One limitation of this study is its generalizability. We engaged middle and high school students in remote workshops in which they used MIT App Inventor to program Amazon Alexa; however, the results may not generalize to other environments or grade bands. Furthermore, we held the workshops in two different weeks with slight differences between them, which could have affected the results. Thus, future work may include larger follow-up studies with students in different grade bands and different environments.
There are also limitations associated with thematic analyses. For instance, we may have missed certain themes within the data, despite following the approach to analysis described by Braun et al. (2019). Furthermore, the number of ideas presented by students in the pre- versus post-tests could have been influenced by the time of day at which each test was administered. Nevertheless, we believe the thematic data (as well as the word frequency data) are useful for exploratory, graphical analysis. Further research should statistically analyze students’ conceptions of CAs and investigate how these conceptions affect the effectiveness of learning interventions.
7. Conclusion
Through the programming and learning intervention, students’ perceptions of Alexa changed in how they viewed its intelligence and how close they felt to it, and students’ conceptions tended towards describing AI systems using more computer science terminology and AI literacy concepts. Based on these results, we presented four design recommendations, including considering personification, transparency, playfulness and utility when designing CAs for engaging students in learning experiences. This study contributes to AI literacy research aiming to develop students’ understanding of AI to be more accurate and healthy, ToAM research aiming to understand students’ perception of AI, and CA research aiming to develop more useful, effective interactions.
8. Selection and Participation of Children
Children aged 11-18 (Mean=14.78, SD=1.91) were selected by their teachers to participate in the middle/high school workshops. Teachers were selected from those who responded to an Amazon Future Engineer call to Title I schools and signed a consent form. Selected students who were 18 years old were given similar student consent forms to sign, and those under the age of 18 were given assent forms, along with consent forms to be signed by their legal guardians, before participating. The university’s IRB approved the study protocol and consent/assent forms, which communicated how the data would be aggregated and anonymized. Given the wide age range, teachers assigned some of their older students to be mentors to younger students in case the younger students fell behind.
Acknowledgements. We thank the teachers and students, volunteer facilitators, MIT App Inventor team, Personal Robots Group, and Amazon Future Engineer (AFE) members who made the workshops possible. Special thanks to Hal Abelson and Hilah Barbot. This work was funded by the AFE program and Hong Kong Jockey Club Charities Trust.
References
- Are you convinced? a wizard of oz study to test emotional vs. rational persuasion strategies in dialogues. Computers in Human Behavior 57, pp. 75–81.
- Socioemotional and academic adjustment among children with learning disorders: the mediational role of attachment-based factors. The Journal of Special Education 38 (2), pp. 111–123.
- Constructionism, ethics, and creativity: developing primary and middle school artificial intelligence education. In International Workshop on Education in Artificial Intelligence K-12 (EDUAI’19).
- Embodied conversational agents: effects on memory performance and anthropomorphisation. In Intelligent Virtual Agents, T. Rist, R. S. Aylett, D. Ballin, and J. Rickel (Eds.), Berlin, Heidelberg, pp. 315–319.
- The teacher-child relationship and children’s early school adjustment. Journal of school psychology 35 (1), pp. 61–79.
- 40 tweets about parenting with today’s technology. Verizon Media. Note: https://www.huffingtonpost.ca/entry/parenting-alexa-siri-tweets_l_6008e784c5b62c0057c35100 Accessed: 2021-01-27
- Thematic analysis. In Handbook of research methods in health social sciences, P. Liamputtong (Ed.).
- Statistical power analysis for the behavioral sciences. Elsevier Science.
- Co-constructing intersubjectivity with artificial conversational agents: people are more likely to initiate repairs of misunderstandings with agents represented as human. Computers in Human Behavior 58, pp. 431–442.
- What is positive youth development? The ANNALS of the American Academy of Political and Social Science 591 (1), pp. 13–24.
- The effects of multiple-pedagogical agents on learners’ academic success, motivation, and cognitive load. Computers & Education 111, pp. 74–100.
- Decoding design agendas: an ethical design activity for middle school students. In Proceedings of the Interaction Design and Children Conference, IDC ’20, New York, NY, USA, pp. 1–10.
- “Hey google is it ok if i eat you?”: initial explorations in child-agent interaction. In Proceedings of the 2017 Conference on Interaction Design and Children, IDC ’17, New York, NY, USA, pp. 595–600.
- How smart are the smart toys? children and parents’ agent interaction and intelligence attribution. In Proceedings of the 17th ACM Conference on Interaction Design and Children, IDC ’18, New York, NY, USA, pp. 231–240.
- Bibliography: students’ and teachers’ conceptions and science education. Accessed on date 26.
- Chapter five - the dunning–kruger effect: on being ignorant of one’s own ignorance. J. M. Olson and M. P. Zanna (Eds.), Advances in Experimental Social Psychology, Vol. 44, pp. 247–296.
- Artificial intelligence & theory of mind. Ulm University, pp. 1–11.
- Measuring the closeness of relationships: a comprehensive evaluation of the ’inclusion of the other in the self’ scale. PloS one 10 (6), pp. e0129478.
- Do androids dream of electric copyright? comparative analysis of originality in artificial intelligence generated works. Intellectual property quarterly 2, pp. 20.
- Can ai artifacts influence human cognition? the effects of artificial autonomy in intelligent personal assistants. International Journal of Information Management 56, pp. 102250.
- The interrelationship between concepts about agency and students’ use of teachable-agent learning technology. Cognitive research: principles and implications 4 (1), pp. 1–20.
- If asimo thinks, does roomba feel? the legal implications of attributing agency to technology. Journal of Human-Robot Interaction (Symposium on Robotics Law and Policy) 5 (3), pp. 23.
- “Robovie, you’ll have to go into the closet now”: children’s social and moral relationships with a humanoid robot. Developmental psychology 48 (2), pp. 303.
- AI programming by children using snap! block programming in a developing country. Vol. 11082.
- Psychological closeness: examples of closeness conceptualization references. The American Behavioral Scientist (pre-1986) 27 (6), pp. 771.
- [Chais] kindergarten children’s perceptions of “anthropomorphic artifacts” with adaptive behavior. Interdisciplinary Journal of E-Learning and Learning Objects 8 (1), pp. 137–147.
- Life on the road: exposing drivers’ tendency to anthropomorphise in-vehicle technology. In Proceedings of the 20th Congress of the International Ergonomics Association (IEA 2018), S. Bagnara, R. Tartaglia, S. Albolino, T. Alexander, and Y. Fujita (Eds.), Cham, pp. 3–12.
- Designing learning by teaching agents: the betty’s brain system. International Journal of Artificial Intelligence in Education 18 (3), pp. 181–208.
- A transition model for cognitions about agency. In 2013 8th ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 373–380.
- Does it “want” or “was it programmed to…”? kindergarten children’s explanations of an autonomous robot’s adaptive functioning. International Journal of Technology and Design Education 18 (4), pp. 337–359.
- Zhorai: designing a conversational agent for children to explore machine learning concepts. Proceedings of the AAAI Conference on Artificial Intelligence 34 (09), pp. 13381–13388.
- What is ai literacy? competencies and design considerations. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, CHI ’20, New York, NY, USA, pp. 1–16.
- Nltk: the natural language toolkit. arXiv preprint cs/0205028.
- Hey google, do unicorns exist? conversational agents as a path to answers to children’s questions. In Proceedings of the 18th ACM International Conference on Interaction Design and Children, IDC ’19, New York, NY, USA, pp. 301–313.
- Interacting with autonomous vehicles: learning from other domains. In Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, CHI EA ’18, New York, NY, USA, pp. 1–8.
- Supporting interest in science learning with a social robot. In Proceedings of the 18th ACM International Conference on Interaction Design and Children, IDC ’19, New York, NY, USA, pp. 71–82.
- Making sense by building sense: kindergarten children’s construction and understanding of adaptive robot behaviors. International Journal of Computers for Mathematical Learning 15 (2), pp. 99–127.
- How to integrate emotions in dialogues with pedagogic conversational agents to teach programming to children. Innovative Perspectives on Interactive Communication Systems and Technologies, pp. 66.
- WordCloud for python documentation. Andreas Mueller. Note: http://amueller.github.io/word_cloud/ Accessed: 2021-01-30
- Situating constructionism. Constructionism 36 (2), pp. 1–11.
- An exploratory study on how children interact with pedagogic conversational agents. Behaviour & Information Technology 32 (9), pp. 955–964.
- Learning machine learning with personal data helps stakeholders ground advocacy arguments in model mechanics. In Proceedings of the 2020 ACM Conference on International Computing Education Research, ICER ’20, New York, NY, USA, pp. 67–78.
- Global conversational ai market forecast to 2024: integration of advanced ai capabilities adding value to the conversational ai offering. Research and Markets. Note: https://www.researchandmarkets.com/
- Playful learning. Journal for Education in the Built Environment 4 (2), pp. 94–108.
- Evaluation of an online intervention to teach artificial intelligence with learningml to 10-16-year-old students.
- Managing bias in ai. In Companion Proceedings of The 2019 World Wide Web Conference, WWW ’19, New York, NY, USA, pp. 539–544.
- Do computers have brains? what children believe about intelligent artifacts. British Journal of Developmental Psychology 13 (4), pp. 367–377.
- A configurational view on avatar design–the role of emotional attachment, satisfaction, and cognitive load in digital learning. In Fortieth International Conference on Information Systems, Munich.
- The influence of conversational agents on socially desirable responding. In Proceedings of the 51st Hawaii International Conference on System Sciences, pp. 283.
- Impressions of computer and human agents after interaction: computer identity weakens power but not goodness impressions. International Journal of Human-Computer Studies 72 (10), pp. 747–756.
- A computational study of commonsense science: an exploration in the automated analysis of clinical interview data. Journal of the Learning Sciences 22 (4), pp. 600–638.
- The elementary and secondary education act (esea), as amended by the every student succeeds act (essa): a primer. crs report r45977, version 2. Congressional Research Service.
- Mom busts 9-year-old son using alexa to cheat on homework. NYP Holdings. Note: https://nypost.com/2019/10/15/mom-busts-9-year-old-son-using-alexa-to-cheat-on-homework/ Accessed: 2021-01-27
- The influence of constructing robot’s behavior on the development of theory of mind (tom) and theory of artificial mind (toam) in young children. In Proceedings of the 14th International Conference on Interaction Design and Children, IDC ’15, New York, NY, USA, pp. 311–314.
- “Whom would you like to talk with?”: exploring conversational agents for children’s linguistic assessment. In Proceedings of the Interaction Design and Children Conference, IDC ’20, New York, NY, USA, pp. 262–272.
- Envisioning ai for k-12: what should every child know about ai?. Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), pp. 9795–9799.
- Appendix. GitHub. Note: https://gist.github.com/jessvb/1cd959e32415a6ad4389761c49b54bbf Accessed: 2020-09-09
- Teaching tech to talk: k-12 conversational artificial intelligence literacy curriculum and development tools. In 2021 AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI), Online.
- Tools to create and democratize conversational artificial intelligence. Master’s Thesis, Massachusetts Institute of Technology, Cambridge, MA.
- SmileyCluster: supporting accessible machine learning in k-12 scientific discovery. In Proceedings of the Interaction Design and Children Conference, IDC ’20, New York, NY, USA, pp. 23–35.
- A is for artificial intelligence: the impact of artificial intelligence activities on young children’s perceptions of robots. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI ’19, New York, NY, USA, pp. 1–11.
- Democratizing computing with App Inventor. GetMobile: Mobile Computing and Communications 18 (4), pp. 53–58.
- Gender-typicality of activity offerings and child–teacher relationship closeness in german “kindergarten”. influences on the development of spelling competence as an indicator of early basic literacy in boys and girls. Learning and Individual Differences 31, pp. 59–65.
- Artificial intelligence is a house divided: A decades-old rivalry has riven the field. it’s time to move on. The Chronicle.
- What are you talking to?: understanding children’s perceptions of conversational agents. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, CHI ’20, New York, NY, USA, pp. 1–13.
- Youth making machine learning models for gesture-controlled interactive media. In Proceedings of the Interaction Design and Children Conference, IDC ’20, New York, NY, USA, pp. 63–74.