We consider the general scenario in which a human and an artificially intelligent agent (AI) are collaborating to jointly complete a task. Depending on the domain, the capabilities of the human and the AI may differ greatly: the AI agent often possesses superior reasoning abilities, whereas the human agent has superior perceptual and control abilities. An emerging area of research develops frameworks that enable effective combined performance by leveraging the strengths of both parties while ensuring human comfort. Our insight is that to achieve this goal, the AI needs to reason about the decision making of its human counterpart, and even intervene to guide their behavior towards efficiency. In many cases, this intervention cannot be realized physically due to hardware or software limitations. In such cases, it is critical that the AI conveys its intentions implicitly, through its actions.
In this work, we consider a scenario of collaborative packing, in which a human places a set of objects in a container with the assistance of a recommender system. This application is of particular relevance today, given the increasing presence of AI systems in logistics. Achieving adequate spatial efficiency in packing is an anecdotally hard problem for non-expert humans, whereas manipulation remains a major challenge for robots. On the other hand, AI systems feature superior planning capabilities, whereas humans are equipped with unparalleled manipulation capabilities. A collaborative effort that efficiently combines the strengths of both could therefore achieve increased overall performance.
As a first step towards the outlined vision, we seek to understand the domain of packing, focusing on the strategies that humans employ when completing a packing task. To do so, we conducted an online user study in which we asked human subjects to complete a series of packing tasks in a virtual environment. Each task involved the placement of a different set of 2-dimensional objects inside a packing container. Our findings suggest that human packing strategies in this domain can largely be classified into a set of distinct categories corresponding to different spatiotemporal patterns of placement. We discuss our findings and the ongoing development of a planning-under-uncertainty framework targeted towards ensuring improved efficiency and low cognitive load for humans in collaborative packing scenarios.
The concept of human-AI teaming is gaining popularity, as combining the strengths of humans and AI systems opens promising avenues for a variety of fields and applications [16, 21, 3, 17]. Naturally, the problem of enabling seamless, natural, and efficient collaboration in human-AI teams has received considerable attention in recent years, with researchers focusing on different aspects of the interaction, such as the powerful communicative impact of actions performed in a shared context or the tradeoffs between performance gains and compatibility with existing human mental models.
For a range of applications, transferring the benefits of human-AI teaming to the physical environment implies embodiment in robot platforms. Human-robot teaming has unique potential, given that robots can be both intelligent and physically capable; combining their capabilities with those of humans may result in performance standards that neither party could otherwise achieve in isolation. Examples include fast task completion in sequential manipulation tasks and improved performance through intelligent resource allocation to human participants.
A common complication in such applications is that explicit communication between the human and the robot is often not feasible, effective, or desired. Therefore, in order to be of assistance, the robot needs to infer the intentions of the user implicitly, through observation of their actions, and to clearly communicate its own intentions through its own actions. A paradigm of particular relevance in this domain is shared autonomy, in which a robot assists a human user in completing their task. In a variety of applications, inferring and adapting to human intentions has been shown to be effective and positively perceived by users [5, 20, 9, 13]. Furthermore, understanding the mechanisms underlying human decision making in a particular domain has been shown to yield performance improvements and positive impressions in such joint tasks [26, 25]. Finally, explicitly collaborative tasks such as collaborative manipulation [6, 24] and assembly, as well as implicitly collaborative tasks such as social navigation, benefit significantly from the incorporation of models of human inference.
In this work, we consider a joint task (packing), performed in collaboration between a human and an AI agent. We also consider a setting of implicit communication, in the sense that human intentions are not directly observed and need to be inferred. Our first step towards approaching this scenario is to understand the domain by collecting and analyzing human data.
Our study was conducted online, using an interactive web application. Participants were recruited through the Amazon Mechanical Turk platform. Each participant was assigned the same set of 65 packing tasks, presented in random order. These tasks involved the placement of sets of 4-8 rectangular objects of different sizes inside a rectangular container of fixed size.
The web interface depicts a set of rectangular objects of various dimensions, alongside a rectangular container, from a top view (see Fig. 1). Participants were instructed to sequentially place all of these objects at locations of their choice inside the container. Once an object is placed inside the container, it cannot be moved; thus participants are forced to judiciously decide on the placement of each object. The interface comprises three buttons: (a) a button for proceeding to the next task (shown at the bottom left); (b) a reset button (middle), useful in cases where participants' decisions did not allow them to fit all objects in the container; (c) a button which allowed participants to proceed to the next task without completing the current one (right). Since at this stage we were interested in understanding the domain of packing rather than participants' performance, we gave users the option of resetting a task to its initial state by hitting the "Reset" button. We also gave participants the option to skip a task if they decided to, but we disincentivized this option by placing a sad smiley face on the corresponding button. Similarly, we incentivized completion by placing a happy face on the "Done" button.
Generation of Packing Tasks
The complexity of a packing task depends on the relationships among the sizes of the objects, and the size of the container. In practice, these relationships yield different tolerance requirements in the object placements. The smaller the tolerances, the higher the amount of precision required to ensure a collision-free placement, and thus the more complex the packing task.
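To make the notion of placement tolerance concrete, the short Python sketch below (our own illustration, not part of the study software, using hypothetical dimensions) computes the slack left over when a set of items must share one row of the container; the smaller this slack, the less room for error a placement allows.

```python
# Minimal sketch (not the study software): the slack left after lining up
# rectangular items along the container width bounds how precisely a
# participant must place them. All dimensions below are hypothetical.
container_width = 10.0
item_widths = [4.0, 3.5, 2.3]  # items intended to share one row

slack = container_width - sum(item_widths)            # total free width
per_gap_tolerance = slack / (len(item_widths) + 1)    # if gaps were spread evenly

print(f"total slack: {slack:.2f}, tolerance per gap: {per_gap_tolerance:.2f}")
# The smaller the slack, the closer to exact the placements must be,
# and the more complex the task.
```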
While we can generate arbitrarily easy or complex packing tasks (e.g., a large container and many small items that could be packed in many different ways), such problem instances would not help us observe any discernible patterns that humans may naturally exhibit. In order to identify spatiotemporal patterns in packing, we designed our packing tasks such that each task satisfies the following conditions:
At least 70% of the container is filled with the items.
There are finitely many clusters of spatially feasible solutions.
By committing to these conditions, we constrain the tasks to be doable with a finite number of qualitatively equivalent object placements. Our expectation was that even under the constrained setting of finite spatially feasible solutions, innate human packing styles and preferences would still manifest themselves. In particular, we expected that human packing strategies would show strong inclinations towards distinct classes of spatiotemporal placements.
In order to generate packing tasks that satisfy the above conditions, we fix the size of the container, randomly generate items of various sizes, and then attempt to place them. For any resulting placement, if more than 70% of the container is filled with 4-8 items, then we test whether the second condition is met. To do so, we empty the container and attempt to place the same items in different ways. If we can generate more than 50 different collision-free configurations, then we run Principal Component Analysis (PCA) on the configurations. Each configuration is represented as a vector stacking the Cartesian coordinates of all items. We take the first two dimensions of the PCA projection, visually check for discernible clusters as in Fig. 1(b), and keep the task only if such clusters are found. In total, we generated 65 tasks of varying complexity, corresponding to placements of 4-8 objects.
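A minimal sketch of this screening step is given below. The function name, argument layout, and thresholds are placeholders that mirror the procedure described above; the collision-free placement routine itself is assumed to exist externally and is not shown.

```python
import numpy as np
from sklearn.decomposition import PCA

def screen_task(item_sizes, placements, container_area,
                fill_threshold=0.7, min_configs=50):
    """Return a 2-D PCA projection of the candidate configurations if the task
    satisfies both conditions, otherwise None.

    item_sizes:  list of (w, h) rectangle dimensions
    placements:  list of collision-free configurations found by an external
                 placement routine (not shown); each configuration is a list
                 of (x, y) positions, one per item, in a fixed item order
    """
    # Condition 1: at least 70% of the container is covered by 4-8 items.
    fill_ratio = sum(w * h for w, h in item_sizes) / container_area
    if fill_ratio < fill_threshold or not (4 <= len(item_sizes) <= 8):
        return None

    # Condition 2: enough distinct collision-free configurations to reveal clusters.
    if len(placements) < min_configs:
        return None

    # Each configuration becomes one vector by stacking all item coordinates.
    X = np.array([np.array(config).ravel() for config in placements])

    # Project onto the first two principal components; the resulting scatter is
    # then inspected visually for discernible clusters (as in Fig. 1(b)).
    return PCA(n_components=2).fit_transform(X)
```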
Dataset & Analysis
In total, we had 100 participants, recruited through the Amazon Mechanical Turk platform. The participants were between 18 and 65 years old. Each participant was given the 65 tasks in a randomly generated order and was asked to complete them within an hour. On average, the participants took about 40 minutes to complete the tasks. For each task, we recorded the ordering and object placement locations inside the container. The collected dataset is grouped per task. For each task: (a) we cluster the provided solutions with respect to their spatial patterns using PCA (see Fig. 1(b)); (b) we classify the provided solutions into a set of classes by looking at the first two item placements and comparing the frequency of each ordered pair (see Fig. 1(c)).
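The per-task analysis can be sketched as follows, under the assumption that each recorded solution stores item placements in the order the participant made them; variable and function names are illustrative, not the actual analysis code.

```python
from collections import Counter

import numpy as np
from sklearn.decomposition import PCA

def analyze_task(solutions):
    """solutions: one task's participant solutions; each solution is a list of
    (item_id, x, y) tuples in the order the items were placed."""
    # (a) Spatial clustering: stack each solution's coordinates (sorted by item
    # id so the vector layout is consistent) and project with PCA.
    vectors = [np.array([[x, y] for _, x, y in sorted(sol)]).ravel()
               for sol in solutions]
    projection = PCA(n_components=2).fit_transform(np.array(vectors))

    # (b) Temporal classification: count how often each ordered pair of items
    # appears as the first two placements.
    first_pair_counts = Counter((sol[0][0], sol[1][0]) for sol in solutions)
    return projection, first_pair_counts
```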
The findings of this study capture our extracted knowledge about the domain under consideration, i.e., 2-dimensional packing. We found that human packing strategies in this domain tend to follow specific spatiotemporal patterns. Fig. 2 depicts an example packing task completed by a participant. Fig. 1(b) illustrates four distinct spatial clusters that emerged in subjects' placements. For the same task, Fig. 1(c) shows the frequency of the different temporal patterns that emerged.
Overall, qualitative examination of the discovered patterns revealed interesting trends, such as placing larger objects at the beginning of the task and placing larger objects at the corners (see Fig. 2). Despite the existence of such trends, subjects exhibited a variety of different strategies. Identifying and adapting to observed packing strategies online could enable an artificial agent to assist a human agent effectively.
Ongoing & Planned Work
Our key insight is that understanding the mechanisms underlying human decision making could enable an artificial agent to provide effective assistance, yielding improved task performance and reduced cognitive load for human users. Some domains can be particularly challenging for humans, for reasons related to the limits of human computational abilities. For example, in the packing domain, humans' limited planning horizon and spatial reasoning can greatly affect task performance and impose an undesired mental load. In fact, packing can be cast as the knapsack optimization problem, which is known to be NP-hard (a standard dynamic-programming sketch for the basic 0/1 variant follows below). We expect that an AI agent that understands both the structure of the underlying optimization problem and the mechanisms of human decision making could provide exactly this kind of assistance. Ongoing work involves the development of a planning framework that will allow us to test this insight through a follow-up user study.
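For reference, even the basic one-dimensional 0/1 knapsack problem requires the kind of combinatorial bookkeeping shown in the textbook dynamic-programming sketch below (our illustration, not part of the paper's framework); the 2-dimensional packing setting considered here is harder still, which is precisely why humans struggle to solve it mentally.

```python
def knapsack(values, weights, capacity):
    """Textbook 0/1 knapsack DP: best total value within the weight capacity.
    Runs in O(n * capacity) time, i.e., pseudo-polynomial."""
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # Iterate capacities backwards so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

# Hypothetical instance: three items, capacity 5 -> optimal value 22.
print(knapsack(values=[6, 10, 12], weights=[1, 2, 3], capacity=5))
```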
A Framework for Packing Assistance
We are currently incorporating the findings of the presented study to develop a framework for planning under uncertainty in collaborative packing tasks. In particular, we are working on adapting the Bayesian Reinforcement Learning (BRL) framework of Lee et al. (2018) to reason about uncertainty over human packing strategies. BRL is a reinforcement learning framework that incorporates a mechanism for reasoning about model uncertainty. It models the problem as a Bayes-Adaptive Markov Decision Process (BAMDP), explicitly representing uncertainty as a belief over a latent variable that enters the transition function and the reward function. Overall, BRL maximizes the expected discounted reward under this uncertainty. We believe that this mechanism is of particular relevance and value in problems involving human interaction, where uncertainty is typically over the human mental models underlying decision making.
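In rough terms (our paraphrase of the standard BAMDP formulation, with symbols introduced here for illustration: a belief $b$ over the latent variable $\phi$, e.g., the human's packing strategy, and a discount factor $\gamma$), the belief is updated after each observed transition and the policy maximizes expected discounted reward under it:

```latex
b'(\phi) \;\propto\; p(s' \mid s, a, \phi)\, b(\phi),
\qquad
\pi^{*} = \arg\max_{\pi} \;
\mathbb{E}_{\phi \sim b}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t, \phi) \,\middle|\, \pi \right].
```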
For our task domain, we are incorporating a belief distribution over the human user's spatiotemporal placement strategy, given the container configuration and the object's shape. We plan to use the collected human dataset to learn this predictive model. During execution, our framework will act as a recommender system, providing online recommendations to the human user; a sketch of this mechanism is given below.
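The following is a minimal sketch of how such a belief could drive online recommendations. The strategy classes, their likelihood and placement functions, and the class name are placeholders standing in for the learned predictive model, not the actual implementation.

```python
class PackingRecommender:
    """Toy sketch of the intended recommender loop: maintain a belief over a
    small set of candidate packing strategies and recommend the placement that
    the most probable strategy would make. The strategy models are
    placeholders for the predictive model to be learned from the dataset."""

    def __init__(self, strategies):
        # strategies: dict mapping name -> model exposing
        #   likelihood(container_state, item, placement) -> float
        #   best_placement(container_state, remaining_items) -> (item, x, y)
        self.strategies = strategies
        self.belief = {name: 1.0 / len(strategies) for name in strategies}

    def observe(self, container_state, item, placement):
        """Bayesian update of the belief after watching one human placement."""
        for name, model in self.strategies.items():
            self.belief[name] *= model.likelihood(container_state, item, placement)
        total = sum(self.belief.values())
        for name in self.belief:
            self.belief[name] /= total

    def recommend(self, container_state, remaining_items):
        """Recommend the next placement preferred by the most probable strategy."""
        most_likely = max(self.belief, key=self.belief.get)
        return self.strategies[most_likely].best_placement(container_state, remaining_items)
```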
Planned User Study
To formally investigate our outlined insight, we design an online user study, in which human subjects are exposed to a set of conditions (within-subjects), corresponding to different modes of AI assistance. More specifically, we consider the following set of conditions:
No recommendation – the user completes the task without receiving any assistance.
The system provides object recommendations, i.e., assists by manipulating the order of object placements.
The system provides both order and placement recommendations.
The system provides random object recommendations.
The system provides random order and random placement recommendations.
We hypothesize that the assistive conditions will yield improved task performance compared to the condition of no assistance, as well as more positive human ratings and reduced reported cognitive load. As performance metrics, we consider time-to-completion and spatial efficiency. After each condition, we will collect ratings of perceived system intelligence, likeability, and predictability, based on the Godspeed questionnaire, to understand how participants perceive the considered conditions. We will also measure the cognitive load associated with each condition using a questionnaire based on the NASA-TLX. Finally, participants will be provided with an open-form question asking them to share qualitative feedback regarding their interaction with the system.
Gilwoo Lee is partially supported by the Kwanjeong Educational Foundation. This work was partially funded by the Honda Research Institute USA, the National Science Foundation NRI (award IIS-1748582), and the Robotics Collaborative Technology Alliance (RCTA) of the United States Army Research Laboratory.
- (2019) Updates in human-AI teams: understanding and addressing the performance/compatibility tradeoff. In Proceedings of the AAAI Conference on Artificial Intelligence.
- (2009) Measurement instruments for the anthropomorphism, animacy, likeability, perceived intelligence, and perceived safety of robots. International Journal of Social Robotics 1 (1), pp. 71–81.
- (2014) Data-driven decisions for reducing readmissions for heart failure: general methodology and case study. PLOS ONE 9 (10), pp. 1–9.
- (2011) Amazon's Mechanical Turk: a new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science 6 (1), pp. 3–5.
- (2013) A policy-blending formalism for shared control. The International Journal of Robotics Research 32 (7), pp. 790–805.
- (2014) Integrating human observer inferences into robot motion planning. Autonomous Robots 37 (4), pp. 351–368.
- (2002) Optimal learning: computational procedures for Bayes-adaptive Markov decision processes. Ph.D. Thesis, University of Massachusetts Amherst.
- (2002) Computers and Intractability. Vol. 29, W. H. Freeman, New York.
- (2017) Human-in-the-loop optimization of shared autonomy in assistive robotics. IEEE Robotics and Automation Letters 2 (1), pp. 247–254.
- (1988) Development of NASA-TLX (Task Load Index): results of empirical and theoretical research. In Human Mental Workload, P. A. Hancock and N. Meshkati (Eds.), Advances in Psychology, Vol. 52, pp. 139–183.
- (2015) Effective robot teammate behaviors for supporting sequential manipulation tasks. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 6374–6380.
- (2004) Collaboration in human-robot teams. In Proceedings of the AIAA Intelligent Systems Technical Conference, pp. 6434.
- (2018) Shared autonomy via hindsight optimization for teleoperation and teaming. The International Journal of Robotics Research 37 (7), pp. 717–742.
- (1986) Principal Component Analysis. Springer Verlag.
- (2018) Robot assisted tower construction: a resource distribution task to study human-robot collaboration and interaction with groups of people. arXiv preprint arXiv:1812.09548.
- (2012) Combining human and machine intelligence in large-scale crowdsourcing. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), pp. 467–474.
- (2016) Directions in hybrid intelligence: complementing AI systems with human intelligence. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), pp. 4070–4073.
- (2017) Implicit communication in a joint action. In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 283–292.
- (2015) Recovering from failure by asking for help. Autonomous Robots 39 (3), pp. 347–362.
- (2014) Online generation of homotopically distinct navigation paths. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pp. 6462–6467.
- (2012) Real-time collaborative planning with the crowd. In Proceedings of the AAAI Conference on Artificial Intelligence, pp. 2435–2436.
- (2019) Implicit communication of actionable information in human-AI teams. In Proceedings of the CHI Conference on Human Factors in Computing Systems, pp. 95:1–95:13.
- (2019) Effects of distinct robot navigation strategies on human behavior in a crowded environment. In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 421–430.
- (2017) Human-robot mutual adaptation in collaborative tasks: models and experiments. The International Journal of Robotics Research 36 (5-7), pp. 618–634.
- (2015) Efficient model learning from joint-action demonstrations for human-robot collaborative tasks. In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 189–196.
- (2013) Human-robot cross-training: computational formulation, modeling and evaluation of a human team training strategy. In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI), pp. 33–40.