DiOS – An Extended Reality Operating System for the Metaverse

Driven by recent improvements in device and network capabilities, Extended Reality (XR) is becoming more pervasive; industry and academia alike envision ambitious projects such as the metaverse. However, XR is still limited by the current architecture of mobile systems. This paper makes the case for an XR-specific operating system (XROS). Such an XROS integrates hardware support, computer vision algorithms, and XR-specific networking as the primitives supporting XR technology. These primitives represent the physical-digital world as a single resource shared among applications. Such an XROS allows for the development of coherent, system-wide interaction and display methods, systematic privacy preservation on sensor data, and performance improvements, while simplifying application development.


1. Introduction

Extended Reality (XR) and its applications are becoming increasingly pervasive. With continually improving device hardware and the upgraded network capabilities brought by 5G, companies envision the development of large-scale XR environments. From the Niantic planet-scale AR alliance (https://nianticlabs.com/blog/niantic-planet-scale-ar-alliance-5g) to Facebook's metaverse, major industry actors envision the future of XR as pervasive, shared, and permanent environments where the digital and the physical are closely interleaved (Lee et al., 2021).

However, XR applications are currently far from such a vision. XR presents tight performance constraints, with a minimum of 60 FPS at high (4K) resolution and a motion-to-photon latency below 20 ms. Current systems cannot achieve such performance, primarily owing to their architecture. Most platforms only allocate resources (e.g., the camera) to a single application at a time, and handle XR processing in a compartmentalised fashion. As such, the digital reconstruction of the physical world, the content placement strategies, and the user interactions are unique to each application instance. Such a strategy results in redundant development effort, limits performance, and prevents the integration of diverse applications within a single blended digital-physical world.

(a) Potential architecture of an XROS.
(b) Data flows between XROS’s main modules.
Figure 1. XROS Architecture, major building blocks, and data flows. Data coming from sensors is preprocessed by specialised chips and filtered by the privacy protection module before reaching the main building blocks of XR. These blocks provide the primitives for the XR application to execute in a standardised fashion.

In this paper, we argue that the current model of application- or middleware-level XR significantly impedes the development of XR applications, not to mention a metaverse. We propose to integrate the enablers of XR at the Operating System (OS) level. DiOS, our proposed architecture for an XROS, would integrate and expose XR fundamentals to provide (1) better performance by enabling fine-grained control of XR operation executions, (2) better integration between XR applications and the OS, and between XR applications themselves, by considering the digital-physical XR world as a shared resource, (3) easier development of XR applications through standardised software primitives, (4) increased privacy and security by applying systematic privacy protection mechanisms to sensor data, and (5) more intuitive and coherent user interaction through the usage of OS-level UIs.

2. Architecture of an XROS

XR applications have stringent requirements that are difficult to achieve with current system architectures. By considering portions of the XR pipeline as resources to be shared between applications, the XROS can meet such requirements. In this section, we first list XR application requirements before detailing the architecture of the XROS.

1. Requirements. XR applications superimpose digital content over the users' perception of the physical world. As such, their requirements are on par with the resolution of the users' perceptual system. Common guidelines cite a guaranteed minimum of 60 FPS, at 2K to 4K resolution, with a motion-to-photon latency below 20 ms for seamless XR experiences (Lai et al., 2020). Such requirements are critical to prevent discrepancies (e.g., misalignment) between the user's perception of the physical world and the digital content, and to preserve immersion. Besides purely performance-related requirements, XR as a ubiquitous technology requires sharing the physical-digital world model between applications, and even between users. As such, it makes sense for the model to be handled by the OS as a resource shared between applications, and to devise user interaction methods coherent with such a model.
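As a rough illustration of how tight these constraints are, the sketch below walks through a hypothetical per-frame budget; the per-stage costs are assumed values for illustration, not measurements.

```python
# Illustrative motion-to-photon budget at 60 FPS (all per-stage costs are assumptions).
FRAME_RATE_HZ = 60
frame_period_ms = 1000 / FRAME_RATE_HZ          # ~16.7 ms between displayed frames

budget_ms = {
    "sensor capture and fusion": 4.0,
    "tracking and scene update": 6.0,
    "rendering": 5.0,
    "display scan-out": 3.0,
}

total_ms = sum(budget_ms.values())
print(f"frame period: {frame_period_ms:.1f} ms, pipeline total: {total_ms:.1f} ms")
print(f"slack within a 20 ms motion-to-photon target: {20 - total_ms:.1f} ms")
```

Any redundant per-application processing or network round trip quickly consumes the remaining slack, which motivates handling these stages once, at the OS level.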

2. Architecture. An XROS meets the above requirements by (1) bringing computation closer to the hardware, (2) finely managing the execution timing of core XR operations, and (3) sharing the physical-digital world model among users and applications. As such, the XROS moves from the current interruption-driven task execution model to a steady-state system where sensor data are processed on a periodic basis. We propose an architecture revolving around six primary components, as presented in Figure 1(a):

Environment Understanding: XR applications build a representation of the users’ surroundings through pervasive sensing. It is necessary to integrate such techniques at OS-level to improve performance, enable XR in other OS modules, share the physical world’s model among services and applications, and provide standardised services to applications.

Specialised Chip Drivers: Scene understanding relies on computation-heavy algorithms. A multitude of chips enables faster execution of such operations, from highly specialised chips to more general-purpose elements such as tensor processing units. An XROS should leverage these chips to accelerate the execution of core elements of the pipeline.

Network: Despite specialised hardware, some operations remain too intensive for mobile devices. It is thus necessary to offload these operations to more powerful remote servers. Shared and persistent experiences also require the transmission and recombination of multiple individual perceptions of the environment towards building a global view of the perpetually evolving physical world. Ultra-low-latency transmission will be a primary challenge of the XROS.

User Interaction: A tight integration between Environment Understanding and User Interaction would enable novel interaction methods using the physical-digital duality of XR. Such synergy would allow users to leverage the physical world for interacting with digital content, whether embodied (e.g., gestures), or external (e.g., tangible interfaces).

Display: Together with User Interaction, Display can benefit from the OS’s Environment Understanding to blend digital content with the physical world while avoiding obstruction of critical physical objects (e.g., moving vehicles) or information overload in cluttered areas (e.g., busy streets).

Privacy: To address the pervasive sensing of potentially private user and bystander information, a layer of privacy protection (whether software or hardware) should perform data minimisation between the sensors' output and the other components' input. This layer would also further filter out data for network transmission and display.

Similar to mobile OSes based on UNIX, the remaining parts of the XROS (file system, process and memory management, networking) can be provided by open source components.

These components intercommunicate to reinforce the integration of content with the physical-digital world, as shown in Figure 1(b). The Environment Understanding module offloads computation-heavy operations through the Network module, which compensates latency together with the User Interaction module. The results from the Environment Understanding module are transmitted to the User Interaction module to provide interaction methods that leverage the physical-digital duality of XR. Finally, the User Interaction module prepares the content for visualisation, which is combined with the world model from the Environment Understanding module to position content in the Display module.
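A minimal sketch of how these data flows could be wired as a steady-state, periodic pipeline is shown below; the module and method names are illustrative assumptions, not a defined XROS interface.

```python
# Illustrative wiring of the main XROS modules of Figure 1(b) (all names are assumptions).
class Sensors:
    def read(self):
        return {"frame": b"...", "imu": (0.0, 0.0, 9.8)}

class PrivacyLayer:
    def minimise(self, raw):
        return raw                      # e.g., blur bystanders, strip precise metadata

class EnvironmentUnderstanding:
    def update(self, data):
        # feature extraction and tracking; heavy steps may go through the Network module
        return {"features": [], "world_model": {}}

class UserInteraction:
    def interpret(self, world_state):
        return []                       # content prepared for visualisation

class Display:
    def compose(self, world_state, content):
        pass                            # position content against the shared world model

def xros_tick(sensors, privacy, env, ui, display):
    """One iteration of the periodic (rather than interrupt-driven) XR pipeline."""
    data = privacy.minimise(sensors.read())
    world_state = env.update(data)
    content = ui.interpret(world_state)
    display.compose(world_state, content)

xros_tick(Sensors(), PrivacyLayer(), EnvironmentUnderstanding(), UserInteraction(), Display())
```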

3. Environment Understanding

XR relies on pervasive environment sensing to blend digital content with the physical world. In this section, we focus on the visual understanding of the physical world, reinforced by information from other sensors (depth cameras, LiDAR) (Campos et al., 2021). To power XR applications and other OS modules, the environment understanding module should provide the base XR primitives and share access to the world model.

1. XR Primitives. An XROS should provide XR primitives to applications and services. Current OS-specific AR frameworks such as ARCore (https://developers.google.com/ar) or ARKit (https://developer.apple.com/augmented-reality/) combine sensor data to calculate and expose the 3D world model. They also provide functions such as plane identification, anchor recognition, and the possibility to save or share the experience. Although these frameworks provide a starting point for the XROS, none of them disclose how the feature points are identified, making the development of recognition and segmentation algorithms more difficult and redundant with the frameworks' own algorithms. An XROS should explicitly expose the feature points extracted from the camera frames, as well as how these points are refined through other sensors. The XROS would perform feature extraction and tracking and expose the results to every layer of the pipeline, while world building and object recognition could be implemented as standardised services to prevent redundancy and improve performance.
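A sketch of what such OS-level primitives could look like to an application is given below; the interface is purely hypothetical and is not how ARCore or ARKit expose their data.

```python
# Hypothetical XROS environment-understanding primitives (illustrative only).
from dataclasses import dataclass, field

@dataclass
class FeaturePoint:
    point_id: int
    position: tuple        # 3D position in the shared world frame
    confidence: float      # refined over time through IMU/depth fusion

@dataclass
class WorldModel:
    features: list = field(default_factory=list)  # raw tracked feature points, exposed explicitly
    planes: list = field(default_factory=list)    # derived by a standardised OS service
    objects: list = field(default_factory=list)   # labels from a shared recognition service

class XRPrimitives:
    """OS-level facade shared by all applications instead of per-application SLAM pipelines."""
    def __init__(self):
        self._model = WorldModel()

    def feature_points(self) -> list:
        return self._model.features

    def world_model(self) -> WorldModel:
        return self._model                 # one shared model, not one copy per application
```

Exposing the feature points directly would let applications and OS services build their own recognition or segmentation on top of a single shared tracking pass.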

2. Shared Access to the World Model. One of the key functionalities of an operating system is to ensure shared access to software and hardware resources for applications. The feature points, world model, semantic areas, and recognised objects are one such resource. With a single physical world to share between applications, the OS needs to decide which application may display content over each area of the physical world (Schmalstieg et al., 2002). Such a task involves not only sharing the physical-digital world between applications, but also deciding which areas of the physical world may not be overlaid with content. The OS may define contextual rules based on user safety, information density and visibility, or other cues driven by the user's habits.
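A minimal sketch of such arbitration, treating physical-world regions as a leasable display resource, is given below; the region identifiers, semantics, and rules are assumptions for illustration.

```python
# Hypothetical OS-level arbitration of the physical world as a shared display resource.
SAFETY_CRITICAL = {"moving_vehicle", "staircase", "traffic_light"}   # never overlaid

class WorldRealEstateManager:
    def __init__(self):
        self.leases = {}   # region_id -> application that currently owns the overlay

    def request_overlay(self, app_id: str, region_id: str, semantics: set) -> bool:
        """Grant or deny an application the right to draw over a physical-world region."""
        if semantics & SAFETY_CRITICAL:
            return False                                   # contextual rule: user safety first
        if self.leases.get(region_id, app_id) != app_id:
            return False                                   # region leased to another application
        self.leases[region_id] = app_id
        return True

manager = WorldRealEstateManager()
print(manager.request_overlay("nav_app", "crosswalk_12", {"moving_vehicle"}))   # False
print(manager.request_overlay("nav_app", "wall_3", {"building_facade"}))        # True
```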

3. Specialised Hardware. Integrating complex image and sensor processing operations in the OS may significantly impact the system's operation. Some operations are computation-heavy, while others rely on manipulating large data structures, and the pipeline should be processed on a steady-state basis. Specialised hardware may handle such repetitive operations more efficiently. We consider three types of hardware: (1) highly specific single-operation chips for data preprocessing (Nikolic et al., 2014), (2) specialised hardware tapping into the memory of the processor to perform more complex operations (Nguyen et al., 2019), and (3) less specialised hardware for executing general functions (rendering, machine learning). Nowadays, most mobile devices embed chips belonging to the third category. The XROS should leverage such chips to relieve the CPU load and minimise the impact of errors.

4. Networking

The previous section raised the need for on-device scene understanding to support XR applications. Remote scene understanding is also critical, both to execute XR operations that are too computationally heavy for the device and to support persistent and shared experiences among users. The XROS would therefore be distributed in nature, supplementing standalone operation with data from nearby devices and the computation power of remote servers.

1. Computation Offloading.

The XROS may offload computation-heavy tasks to remote machines. Using background microservices, the XROS can estimate the computational cost of XR operations, maintain connections to remote servers, provide handover upon changes in the user's coverage (Braud et al., 2021), and decide when to offload tasks. It should also offer developers the possibility to specify whether a given task should be executed on-device or in the cloud (Kosta et al., 2012). Figure 2 illustrates how computation offloading and handover management could take place in 5G with a two-tier server architecture. Background microservices detect when the user moves out of the eNB's range to the gNB, and perform handover and migration.

Figure 2. Handover and multi-tier offloading in the XROS.
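Such a decision could be sketched as a background service weighing local execution against the network transfer and remote execution cost; the cost model, numbers, and names below are illustrative assumptions.

```python
# Hypothetical offloading decision made by an XROS background microservice.
from dataclasses import dataclass

@dataclass
class TaskProfile:
    name: str
    local_ms: float                # estimated on-device execution time
    remote_ms: float               # estimated server execution time
    payload_kb: float              # data to transmit if offloaded
    developer_hint: str = "auto"   # "local", "remote", or "auto"

def should_offload(task: TaskProfile, rtt_ms: float, uplink_kbps: float) -> bool:
    """Offload when the remote path (transfer + execution) beats local execution."""
    if task.developer_hint in ("local", "remote"):
        return task.developer_hint == "remote"
    transfer_ms = task.payload_kb * 8 / uplink_kbps * 1000
    return rtt_ms + transfer_ms + task.remote_ms < task.local_ms

# Example: object recognition on a constrained device over a 5G edge link.
recognition = TaskProfile("object_recognition", local_ms=120, remote_ms=15, payload_kb=200)
print(should_offload(recognition, rtt_ms=10, uplink_kbps=50_000))   # True
```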

2. Shared and Persistent Experiences. One of the main goals of large-scale XR environments is the persistence of the XR world, allowing multiple users to share the same experience at the same location. Such applications aggregate the environment models of multiple devices to build a global model while enabling simultaneous interaction of multiple users with the digital content. The location awareness of edge servers may be leveraged to perform this aggregation at a computationally acceptable scale (Zhou et al., 2020). Device-to-device communication may also be used for immediate communication between users (Ansari et al., 2017). The XROS will thus face the challenge of relying on external computation providers both for aggregating data and for computation offloading.

3. Content Transmission. The XROS will rely on distant servers, at the edge or in the cloud, to offload computations and provide shared experiences. XR presents significant bandwidth, latency, and reliability constraints that need to be addressed. Latency and bandwidth can be optimised through efficient usage of the available resources (Braud et al., 2020). The transmission may experience packet loss owing to either congestion in the core network or channel errors in the access network. An XROS should thus monitor the network conditions and dynamically adapt the recovery techniques (retransmission or forward error correction). Similar to WebRTC, the XROS would obtain a better trade-off between temporal quality (smoothness of rendering), spatial video quality, and end-to-end delay by utilising an adaptive hybrid NACK/FEC method (Holmer et al., 2013). The XROS should also adapt the transmission rate according to the available bandwidth. To this end, the XROS would either re-implement or redesign WebRTC (an application-level protocol) to function as a transport protocol, or introduce a novel protocol that integrates an adaptive video rate and efficient redundancy to mitigate packet losses.
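In simplified form, the adaptive recovery choice could look like the following, selecting between retransmission and forward error correction from the measured loss rate and round-trip time; the thresholds and selection rule are assumptions, not the WebRTC algorithm.

```python
# Illustrative selection of a loss-recovery strategy from measured network conditions.
def select_recovery(rtt_ms: float, loss_rate: float, frame_deadline_ms: float) -> dict:
    """Pick NACK (retransmission), FEC, or both for the next media window."""
    can_retransmit = rtt_ms * 1.5 < frame_deadline_ms     # a retransmission must arrive in time
    if loss_rate < 0.01 and can_retransmit:
        return {"nack": True, "fec_overhead": 0.0}          # cheap: rare losses, fast link
    if not can_retransmit:
        return {"nack": False, "fec_overhead": min(0.5, 2 * loss_rate)}  # FEC only
    return {"nack": True, "fec_overhead": min(0.3, loss_rate)}           # hybrid NACK/FEC

print(select_recovery(rtt_ms=8, loss_rate=0.02, frame_deadline_ms=20))
```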

4. Latency Compensation. Although computation offloading makes it possible to execute sophisticated operations on constrained hardware, it comes at the cost of network latency. Such latency adds to the rest of the pipeline, increasing the total motion-to-photon latency (the latency between a user motion and its impact on the display) and thus reducing immersion (Braud et al., 2017). Shared experiences suffer from similar delays. Latency compensation techniques should be designed to hide such latency from the user. Services may track the features of captured images until object recognition results are received, so as to adjust the results to the current device orientation (Zhang et al., 2022). Geometric latency compensation may also be used to provide wider interaction targets when the motion-to-photon latency increases.
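At its simplest, such compensation re-anchors a stale recognition result using the feature displacement observed on-device since the frame was offloaded; the 2D sketch below assumes a box-plus-tracked-features representation purely for illustration.

```python
# Simplified latency compensation: shift a stale recognition box by the tracked feature motion.
def compensate(box, tracked_features):
    """box: (x, y, w, h) detected on the offloaded frame.
    tracked_features: list of ((x_then, y_then), (x_now, y_now)) pairs tracked on-device."""
    if not tracked_features:
        return box
    dx = sum(now[0] - then[0] for then, now in tracked_features) / len(tracked_features)
    dy = sum(now[1] - then[1] for then, now in tracked_features) / len(tracked_features)
    x, y, w, h = box
    return (x + dx, y + dy, w, h)   # re-anchored to the user's current viewpoint

# Example: the recognition result arrives ~100 ms late; features moved ~12 px to the right.
print(compensate((320, 180, 60, 60), [((100, 50), (112, 51)), ((200, 90), (212, 88))]))
```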

5. Novel Interactivity with XROS

The XROS goes beyond the 2D user interfaces dominated by desktops, windows, menus, and icons, and emphasises interactivity with users' physical bodies (e.g., gestures and bio-signals), surroundings (e.g., physical objects and bystanders), and other users (Jacob et al., 2008). Accordingly, we categorise the requirements of user interactivity as follows:

1. Blended UIs in the physical world. The definition of augmented reality has been evolving from overlaying simple information to blended environments combining the physical and the digital seamlessly (Speicher et al., 2019). Novel UIs across the digital and physical realities can allow numerous users to experience high levels of immersion and virtuality in heterogeneous environments (Figure 3). The XROS should be able to facilitate collaboration between multiple users through such virtual interfaces superimposed on the physical world. Such blended UIs should be ubiquitous, potentially up to city scale (Lee et al., 2021). Meeting such city-wide requirements will need interdisciplinary efforts, primarily driven by a distributed XROS with advanced mobile networks to synchronise multitudinous objects and their corresponding blended UIs through XR technology.

Figure 3. Virtual-Physical User Interaction in XROS (images from the Ego4D dataset samples, https://ego4d-data.org/fig1.html).

2. Enhanced Awareness of Users and Their Environments. To achieve blended UIs, the digital overlays should match the objects, locations, and people in the physical world. The XROS takes advantage of context awareness to understand the user's situated environment (Naqvi et al., 2015). The XROS can achieve context awareness by recognising visual markers (Zhang et al., 2022) or 3D features on physical objects and users' body gestures (Hariharan et al., 2015). However, such recognition requires significant computational resources. Similar to the X Window System (https://x.org/wiki/), the XROS may consider thin clients to display the overlay while UI operations are performed on a separate machine.

3. Mobile User Interactivity. The XROS provides users with the fundamental input and output (I/O) functions. Users receive enriched AR information, whether visual, audio, or haptic (Bermejo et al., 2021). The key bottleneck of XR displays is the user's cognitive load: displaying content without proper selection and management would cause information overload and poor usability (Lindlbauer et al., 2019). Thus, the XROS has to maintain highly relevant content (Lam et al., 2021) inside the size-limited lenses of XR headsets (Lee et al., 2020). The XROS should also adapt to vastly diversified input interfaces, whether embodied or physical.
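One way to frame this is as an OS-level content scheduler that only admits the most relevant items within a limited display budget; the scoring function, field names, and budget below are illustrative assumptions.

```python
# Illustrative OS-level content admission under a limited display budget.
def admit_content(candidates, max_items=5):
    """candidates: list of (item, relevance, angular_distance_deg) tuples.
    Keep the most relevant items, favouring those close to the user's gaze."""
    scored = sorted(candidates, key=lambda c: c[1] / (1 + c[2] / 30), reverse=True)
    return [item for item, _, _ in scored[:max_items]]

notifications = [("bus_eta", 0.9, 5), ("promo_ad", 0.2, 10), ("nav_arrow", 0.95, 2),
                 ("friend_msg", 0.7, 40), ("weather", 0.4, 60)]
print(admit_content(notifications, max_items=3))   # -> ['nav_arrow', 'bus_eta', 'friend_msg']
```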

4. Towards Tangible Interaction. As a final note, the blended UIs of the XROS are not limited to physical-digital mixed overlays. Tangible interaction will redefine our spatial environments. Remarkable examples of tangible interaction include public displays, interactive tabletops, mechanised infrastructures, at-home IoT devices, drones, and augmented surfaces of electronics (Shaer and Hornecker, 2010). How to manage countless distinct tangible items thus remains a critical challenge.

Figure 3 presents several examples of user interaction between the digital and the physical enabled by the XROS.

6. Security and Privacy

Figure 4. Privacy protection architecture filtering the application’s input (sensor data) and output (display, haptic) flows with a privacy enforcement layer.

XR operates through pervasively and continually sensing the users’ physical surroundings. Therefore, it faces multiple security and privacy challenges depending on the technologies employed (Roesner et al., 2014). In this section, we focus primarily on spatial data collection, whether on users or bystanders, and consider both the input and output flows as shown in Figure 4.

1. Spatial Data Collection. Spatial data has become more ubiquitous with the rising adoption of XR technologies in mobile environments. This spatial information allows XR applications to acquire a better and more precise understanding of their surroundings, but also opens new threats to users' and bystanders' privacy. Such data can be used to infer spaces from captured 3D point clouds, allowing an attacker to recognise spaces or objects belonging to a specific user despite the lack of visual images (Guzman et al., 2021). An XROS should take particular care to protect such data against potential threats.

2. Bystander Privacy. XR's pervasive sensing may also affect the privacy of non-users. Many legal frameworks require data collection systems to give data subjects control over their personal data (1). As non-users, bystanders often do not have such control, which complicates both the legal and ethical deployment of XR. An XROS should automatically detect and minimise data from bystanders (Dimiccoli et al., 2018) to respect their privacy and adhere to legal frameworks.

3. Input Data. The XROS should include an input protection layer as an intermediary framework between raw sensor data and the XR applications. We build on top of (Hu et al., 2021), where the protection layer splits the access control and processing of the sensed data. This approach relies on three modules: sensor collection, sensor data storage, and network management. Each sensor includes a sensor collection module, through which developers access the sensed data on-device. The sensed data is stored in the data storage module, where applications can access and process the collected data. The network module allows XR applications to share the sensed data with external parties (e.g., cloud computing, multiplayer games). Access to data by external parties (applications, remote servers) is regulated by a privacy manager.
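A minimal sketch of such a split protection layer is shown below; the module boundaries follow the description above, but the class and method names are assumptions rather than the LensCap API.

```python
# Illustrative split of sensed-data access control, following the description above.
class SensorCollection:
    """Per-sensor on-device access point; raw data never leaves this module unfiltered."""
    def capture(self):
        return {"frame": b"...", "bystander_regions": [(10, 10, 40, 40)]}

class SensorDataStore:
    """Holds minimised data that applications are allowed to process locally."""
    def __init__(self):
        self.records = []
    def put(self, record):
        record = dict(record)
        record.pop("bystander_regions", None)    # data minimisation before application access
        self.records.append(record)

class PrivacyManager:
    """Regulates which external parties may receive which data fields."""
    policy = {"cloud_recognition": {"frame"}, "multiplayer_session": set()}
    def allowed(self, party: str, fields: set) -> bool:
        return fields <= self.policy.get(party, set())

class NetworkManager:
    def __init__(self, privacy: PrivacyManager):
        self.privacy = privacy
    def share(self, party: str, record: dict) -> bool:
        return self.privacy.allowed(party, set(record))

store, net = SensorDataStore(), NetworkManager(PrivacyManager())
store.put(SensorCollection().capture())
print(net.share("multiplayer_session", {"frame": b"..."}))   # False: policy denies raw frames
```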

4. Output Data and Safety. Malicious XR output can raise new privacy and safety threats (Lebeck et al., 2017; Roesner et al., 2014), for instance, by displaying content over physical objects that present a risk for the user. Therefore, the XROS includes a policy manager with a set of well-defined, parameterised conditions and actions to perform on the XR output. In (Lebeck et al., 2017), the authors propose an XR platform that controls the output applications can display according to a set of context-aware policies. The output policies constrain the virtual content that users can see in XR applications. The XROS policy manager should also handle non-visual outputs such as audio and haptics, as well as display priority when several applications share the physical world's real estate. Beyond safety issues, output data should undergo another layer of minimisation before being displayed or transmitted to remote machines.
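An output policy check of this kind could be sketched as follows; the policy format and conditions are illustrative and do not reproduce the mechanism of (Lebeck et al., 2017).

```python
# Illustrative context-aware output policy check before composing a frame.
POLICIES = [
    {"condition": lambda ctx, item: "moving_vehicle" in ctx["occluded_objects"](item),
     "action": "reject"},                                   # never hide safety-critical objects
    {"condition": lambda ctx, item: ctx["driving"] and item["kind"] == "advertisement",
     "action": "defer"},                                    # postpone distracting content
]

def enforce_output(context, item):
    for policy in POLICIES:
        if policy["condition"](context, item):
            return policy["action"]
    return "render"

context = {"driving": True,
           "occluded_objects": lambda item: {"building"} if item["kind"] == "label" else set()}
print(enforce_output(context, {"kind": "advertisement", "app": "shopping"}))   # -> "defer"
```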

7. Conclusion

This paper detailed the architecture of DiOS, an XROS that addresses the stringent requirements of pervasive multi-user XR, enabling exciting developments such as the metaverse. The XROS integrates the main XR primitives at its core to provide other OS modules and applications with the base functions of XR. The XROS considers the physical world as a resource to be shared between applications, both in its operation and in terms of user interaction. Besides the vision algorithms driving XR functions, the XROS is distributed in nature, providing improved computation capabilities and sharing the experience across multiple users. Finally, we detailed the privacy and security aspects of the XROS to address the growing ethical and legal concerns around XR technology.

References

  • [1] European Commission (2018-05-25) (Website). Cited by: §6.
  • R. I. Ansari, C. Chrysostomou, S. A. Hassan, M. Guizani, S. Mumtaz, J. Rodriguez, and J. J. Rodrigues (2017) 5G d2d networks: techniques, challenges, and future prospects. IEEE Systems Journal 12 (4), pp. 3970–3984. Cited by: §4.
  • C. Bermejo, L. H. Lee, P. Chojecki, D. Przewozny, and P. Hui (2021) Exploring button designs for mid-air interaction in virtual reality: a hexa-metric evaluation of key representations and multi-modal cues. Proc. ACM Hum.-Comput. Interact. 5 (EICS). Cited by: §5.
  • T. Braud, A. Alhilal, and P. Hui (2021) Talaria: in-engine synchronisation for seamless migration of mobile edge gaming instances. In International Conference on emerging Networking EXperiments and Technologies (CoNEXT), Cited by: §4.
  • T. Braud, F. H. Bijarbooneh, D. Chatzopoulos, and P. Hui (2017) Future networking challenges: the case of mobile augmented reality. In 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), pp. 1796–1807. Cited by: §4.
  • T. Braud, P. Zhou, J. Kangasharju, and P. Hui (2020) Multipath computation offloading for mobile augmented reality. In 2020 IEEE International Conference on Pervasive Computing and Communications (PerCom), pp. 1–10. Cited by: §4.
  • C. Campos, R. Elvira, J. J. G. Rodríguez, J. M. Montiel, and J. D. Tardós (2021) ORB-slam3: an accurate open-source library for visual, visual–inertial, and multimap slam. IEEE Transactions on Robotics. Cited by: §3.
  • M. Dimiccoli, J. Marín, and E. Thomaz (2018) Mitigating bystander privacy concerns in egocentric activity recognition with deep learning and intentional image degradation. Proc. of the ACM on IMWUT 1 (4), pp. 1–18. Cited by: §6.
  • J. A. d. Guzman, A. Seneviratne, and K. Thilakarathna (2021) Unravelling spatial privacy risks of mobile mixed reality data. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT) 5 (1), pp. 1–26. Cited by: §6.
  • B. Hariharan, P. Arbeláez, R. B. Girshick, and J. Malik (2015) Hypercolumns for object segmentation and fine-grained localization. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 447–456. Cited by: §5.
  • S. Holmer, M. Shemer, and M. Paniconi (2013) Handling packet loss in webrtc. In 2013 IEEE International Conference on Image Processing, pp. 1860–1864. Cited by: §4.
  • J. Hu, A. Iosifescu, and R. LiKamWa (2021) LensCap: split-process framework for fine-grained visual privacy control for augmented reality apps. In Proceedings of the 19th Annual International Conference on Mobile Systems, Applications, and Services, pp. 14–27. Cited by: §6.
  • R. J.K. Jacob, A. Girouard, L. M. Hirshfield, M. S. Horn, O. Shaer, E. T. Solovey, and J. Zigelbaum (2008) Reality-based interaction: a framework for post-wimp interfaces. In Proc. of the SIGCHI Conference on Human Factors in Computing Systems, pp. 201–210. External Links: ISBN 9781605580111 Cited by: §5.
  • S. Kosta, A. Aucinas, P. Hui, R. Mortier, and X. Zhang (2012) Thinkair: dynamic resource allocation and parallel execution in the cloud for mobile code offloading. In 2012 Proc. IEEE Infocom, pp. 945–953. Cited by: §4.
  • Z. Lai, Y. C. Hu, Y. Cui, L. Sun, N. Dai, and H. Lee (2020) Furion: engineering high-quality immersive virtual reality on today’s mobile devices. IEEE Transactions on Mobile Computing 19 (7), pp. 1586–1602. Cited by: §2.
  • K. Y. Lam, L. H. Lee, and P. Hui (2021) A2W: context-aware recommendation system for mobile augmented reality web browser. In Proc. of the 29th ACM International Conference on Multimedia, MM ’21, pp. 2447–2455. External Links: ISBN 9781450386517 Cited by: §5.
  • K. Lebeck, K. Ruth, T. Kohno, and F. Roesner (2017) Securing augmented reality output. In 2017 IEEE symposium on security and privacy (SP), pp. 320–337. Cited by: §6.
  • L. Lee, T. Braud, S. Hosio, and P. Hui (2021) Towards augmented reality driven human-city interaction: current research on mobile headsets and future challenges. ACM Comput. Surv. 54 (8). External Links: ISSN 0360-0300, Link, Document Cited by: §5.
  • L. Lee, T. Braud, K. Lam, Y. Yau, and P. Hui (2020) From seen to unseen: designing keyboard-less interfaces for text entry on the constrained screen real estate of augmented reality headsets. Pervasive Mob. Comput. 64, pp. 101148. Cited by: §5.
  • L. Lee, T. Braud, P. Zhou, L. Wang, D. Xu, Z. Lin, A. Kumar, C. Bermejo, and P. Hui (2021) All one needs to know about metaverse: a complete survey on technological singularity, virtual ecosystem, and research agenda. External Links: 2110.05352 Cited by: §1.
  • D. Lindlbauer, A. M. Feit, and O. Hilliges (2019) Context-aware online adaptation of mixed reality interfaces. Proc. of the 32nd Annual ACM Symp. on UIST. Cited by: §5.
  • N. Z. Naqvi, K. Moens, A. Ramakrishnan, D. Preuveneers, D. Hughes, and Y. Berbers (2015) To cloud or not to cloud: a context-aware deployment perspective of augmented reality mobile applications. In Proc. of the 30th Annual ACM Symp. on Applied Computing, SAC ’15, pp. 555–562. External Links: ISBN 9781450331968 Cited by: §5.
  • D. T. Nguyen, T. N. Nguyen, H. Kim, and H. Lee (2019) A high-throughput and power-efficient fpga implementation of yolo cnn for object detection. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 27 (8), pp. 1861–1873. Cited by: §3.
  • J. Nikolic, J. Rehder, M. Burri, P. Gohl, S. Leutenegger, P. T. Furgale, and R. Siegwart (2014) A synchronized visual-inertial sensor system with fpga pre-processing for accurate real-time slam. In 2014 IEEE international conference on robotics and automation (ICRA), pp. 431–437. Cited by: §3.
  • F. Roesner, T. Kohno, and D. Molnar (2014) Security and privacy for augmented reality systems. Comm. of the ACM 57 (4), pp. 88–96. Cited by: §6, §6.
  • D. Schmalstieg, A. Fuhrmann, G. Hesina, Z. Szalavári, L. M. Encarnaçao, M. Gervautz, and W. Purgathofer (2002) The studierstube augmented reality project. Presence: Teleoperators & Virtual Environments 11 (1), pp. 33–54. Cited by: §3.
  • O. Shaer and E. Hornecker (2010) Tangible user interfaces: past, present, and future directions. Now Publishers Inc. Cited by: §5.
  • M. Speicher, B. D. Hall, and M. Nebeling (2019) What is mixed reality?. In Proc. of the 2019 CHI Conference on Human Factors in Computing Systems, pp. 1–15. External Links: ISBN 9781450359702 Cited by: §5.
  • W. Zhang, S. Lin, F. Bijarbooneh, H. Cheng, T. Braud, P. Zhou, L. Lee, and P. Hui (2022) EdgeXAR: a 6-dof camera multi-target interaction framework for mar with user-friendly latency compensation using edge computing. In Proc. of the ACM on HCI (EICS), Cited by: §4, §5.
  • P. Zhou, T. Braud, A. Zavodovski, Z. Liu, X. Chen, P. Hui, and J. Kangasharju (2020) Edge-facilitated augmented vision in vehicle-to-everything networks. IEEE Transactions on Vehicular Technology 69 (10), pp. 12187–12201. Cited by: §4.