
Security and Privacy in Virtual Reality – A Literature Survey

by   Alberto Giaretta, et al.

Virtual Reality (VR) is a multibillion-dollar market that keeps growing, year after year. As VR is becoming prevalent in households and small businesses, it is critical to address the effects that this technology might have on the privacy and security of its users. In this paper, we explore the state of the art in VR privacy and security, categorise potential issues and threats, and analyse the causes and effects of the identified threats. In addition, we focus on the research previously conducted in the field of authentication in VR, as it stands as the most investigated area in the topic. We also provide an overview of other interesting uses of VR in the field of cybersecurity, such as using VR to teach cybersecurity or to evaluate the usability of security solutions.



I Introduction

Extended Reality (XR) is a paradigm that refers to different types of human-machine interaction within real and virtual environments, enabled by computer technology and wearable devices. Under the umbrella of XR, Virtual Reality (VR) is a computer-generated simulation of a 3D environment that enables realistic physical interactions by means of technology and wearable devices. The popularity of VR has grown steadily over the years; according to Oberlo [oberlo], the share of US consumers that used VR went from 16% in 2019 to 19% in 2020. According to a report from eMarketer [emarketer], 52.1 million of those users use VR technology at least once per month. Furthermore, the growth is expected to continue in the near future. By 2028, the compound annual growth rate of the VR market is expected to reach 18% [grandview], according to statistics by Grand View Research. In terms of market value, Valuates Reports [valuatesVR] projects that the global VR market size will reach US$ 26,860 million by 2027, up from US$ 7,719.6 million in 2020.

By design, VR is capable of collecting a large amount of non-verbal information, such as users’ movements, biometrics, and usage patterns. As shown in Figure 1, VR devices are equipped with different sensors that collect both explicit input and non-verbal information, which is used by the VR engine to model the virtual world according to the user’s actions. As noted by Bailenson [bailenson2018], non-verbal information is uniquely telling and can be used for different goals, from tailoring advertisements to determining whether users are low or high performers. Hours of personal use, paired with unclear policies regarding data handling and learning algorithms, could allow companies to estimate users’ preferences and infer their characteristic behaviour. Taking into account the consistent and considerable growth of VR, alongside its pervasive nature, it is of paramount importance to evaluate any privacy and security concerns that could arise. In this paper, we provide a literature survey on VR threats and issues, in terms of privacy and security, with the aim of highlighting what has been done and what is still open for research.

Fig. 1: The Virtual Reality (VR) paradigm. Users perceive the VR environment using head-mounted displays (HMDs) and interact with it via controllers or more advanced techniques, such as finger tracking sensors. Every device is equipped with various sensors to provide useful information to the VR engine, which creates the virtual world and modifies it according to the input received from the user.

Besides VR, the umbrella term XR covers other types of extended reality, namely Augmented Reality (AR) and Mixed Reality (MR). In AR, virtual objects are added to the real environment to enrich it; a notable example is pointing a smartphone at an art piece and getting information about the piece itself. MR is a combination of VR and AR, where interactions happen neither exclusively in the virtual space nor in the real one, but in a hybrid fashion. AR and MR exhibit their own sets of security and privacy challenges and issues, which are not necessarily similar to those found in VR. In 2014, Roesner et al. [roesner2014] conducted a survey on AR security and privacy concerns, while King et al. [king2020] focused specifically on the privacy issues of AR applications. Regarding MR, De Guzman et al. [deguzman2019] conducted an exhaustive survey on the security and privacy issues that affect it. However, to our knowledge, a comprehensive survey on the security and privacy aspects of VR is lacking. In this paper, we aim to fill this gap, providing four main contributions:

  • An overview of privacy issues that affect VR, organised by typology and causes;

  • An analysis of threats to VR security, categorised into traditional attacks that also affect VR, such as denial of service (DoS), and attacks specific to VR, such as dizziness-inducing ones;

  • An in-depth exploration of the research on authentication in VR;

  • An overview of other uses of VR in the field of cybersecurity and physical security, such as teaching cybersecurity concepts or evaluating the usability of security solutions.

I-A Outline

This paper is organised as follows. In Section II, we discuss issues and threats to privacy within the VR domain. Similarly, in Section III we discuss the threats to the security of VR, and we dedicate Section IV to the topic of authentication for VR. In Section V we discuss different uses of VR within the cybersecurity domain, such as teaching security concepts, training physical security, and evaluating the usability of security processes. Finally, in Section VI we provide some final remarks and our conclusions.

II Privacy Issues in VR

Fig. 2: Privacy issues in VR. The three main categories (in purple) are further divided into subcategories and specific concerns.

VR technology raises a number of concerns about the privacy of its users. In this section, we provide an overview of the different threats to privacy in the context of VR, following the structure shown in Figure 2. In 2018, Adams et al. [adams2018] investigated the extent to which users and developers are concerned about privacy in VR. They conducted interviews with 20 participants, 10 users and 10 developers, showing that 6 users and 6 developers raised concerns about data collection practices, while 7 participants out of 20 expressed general mistrust in headset producers. The authors also highlighted that developers show consideration for users’ privacy, but that there is a marked disconnect between their general concerns and concerns about their own products. One of the interviewees expressed concern about the perception capabilities of VR sensors and cameras. As shown by Kreylos [kreylos2017], Oculus Rift sensors are regular webcams, equipped with firmware and a physical filter that let infrared light pass while blocking visible light. However, Kreylos showed that it is possible to manipulate the software and extract realistic pictures from the Oculus camera, even though the camera is meant to perceive only the infrared spectrum.

O’Brolcháin et al. [obrolchain2016] discuss the privacy concerns that derive from the combination of VR and social networks. They divide threats to privacy and privacy issues into three categories:

  • Informational privacy, in particular the increased vulnerability of data or its misuse;

  • Physical privacy, in particular the prevalence of recording devices or the unintended revelation of physical information (such as physical reactions to ads and the loss of anonymity);

  • Associational privacy, in particular the ability to include and exclude people from certain events or the global village, in which everyone could potentially learn important and trivial matters about others.

Although this categorisation focuses on the threats arising from the combination of VR and social networks, it applies well to privacy threats in VR in general. In addition to privacy issues, the authors note that when privacy cannot be ensured, the autonomy of individuals (i.e., their ability to form independent opinions, plans, and goals) is also at stake, because autonomy is intertwined with privacy. Human beings need privacy to think through their problems, make mistakes, and explore different possibilities. Without the guarantee of privacy, users might experience a decrease in autonomy or even lose it.

II-A Threats to Associational Privacy

In 2016, Nwaneri [nwaneri2016] noted that Facebook has a track record of experimenting with its users (e.g., showing that removing negative or positive news from feeds affected their mood). They also noted that the Oculus policies were phrased in such a way as to allow Facebook to conduct similar experiments. The risk to users’ privacy would be even higher than that found in social networks, given the kind of data gathered by VR systems. In particular, Nwaneri argues that the policy statement regarding collecting ”information about [the users’] physical movements and dimensions when [they] use a virtual reality headset” could facilitate discrimination and mass surveillance. According to the author, failing to address these points could lead companies to face serious consequences, such as accusations of violating state privacy and wiretapping laws. On a side note, Nwaneri highlights that the long-term effects of the immersive sensory experience granted by VR are still unknown. VR environments might pose safety risks to their users, such as seizures, post-traumatic stress disorder, or interference with childhood mental development. Therefore, it is in the best interest of VR companies to push for clear policies and legislation that regulate their business.

Tromp et al. [tromp2018] elaborated further on the issues of combining social networks and VR to produce multi-user social VR environments. Among various psychological risks, which fall outside the scope of our work, the authors list privacy. Head-mounted displays (HMDs), due to the sensors they are equipped with, record users’ detailed physical and psychological responses to stimuli. These responses could be analysed automatically and used to target users with tailored advertising, a method known as neuromarketing. In their work, they provided recommendations to the different actors involved with VR applications, such as users, parents, software and hardware companies, regulatory institutions, and researchers. For example, it is still unclear whether VR addiction exhibits characteristics similar to internet addiction. Therefore, psychologists should establish whether the same treatments can be used, and VR users should be aware that doctors may not yet know how to treat such an addiction.

II-B Threats to Informational Privacy

Regarding legislative issues, Lake [lake2020] highlights that the current US legal framework for online identity misappropriation does not adequately protect VR users. In particular, the author notes that the combination of different laws leaves plaintiffs without any party to sue: anonymity laws make it impossible to sue the identity appropriator, and ISP immunity prevents the victim from suing the VR provider. Until these issues are addressed (for example, by following some of the suggestions given by Lake), victims of identity theft in VR environments cannot even bring their claims to US courts.

Kumarapeli [kumarapeli2021] highlights that the major players in VR are now focused on producing realistic avatars that track facial expressions and user body movement. On the one hand, this can help to enrich communications in VR. On the other hand, the capability of tracking such information - without allowing users to opt out of undesired tracking - raises several privacy issues. For example, if the system detects hints of anger during a VR-mediated collaboration and shows them explicitly on the avatar’s face, the collaboration could be negatively affected. Therefore, it is important to adopt strategies that enrich avatars’ behaviour while filtering undesired emotions and informing users about the emotions currently detected.

In 2020, Miller et al. [miller2020personal] wrote that a considerable number of works on user authentication in VR assume a positive connotation. In other words, many papers explore the identification of users as a way of providing seamless and non-invasive authentication. However, the potential shown by VR data to provide authentication implies the potential to identify and de-anonymise users against their will. In their work, they showed that VR motion data can be used to identify users without requiring them to perform any specific action, contrary to what previous VR authentication studies had done. The authors recruited 511 participants and showed them five 360-degree videos (randomly picked from a pool of 80 videos), each 20 s long. The participants did not have to perform any particular task and were free to look around as they preferred. The results of the study show that users can be identified on the basis of their motion data with an accuracy of 87.5%, using a simple Random Forest algorithm. The authors also noted that the most important features included height, posture, distance from VR content, focal length, and placement of the hand controllers at rest. In conclusion, merely collecting data from VR interactions carries a high risk of identification, even when those interactions were not designed with identification in mind. Therefore, one of the future challenges for VR will be to design non-identifying interactions. For example, manipulating positional data (e.g., altering the height) could obfuscate some physical characteristics and help to protect users’ privacy.
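To make the identification risk concrete, the following sketch trains a Random Forest to re-identify users from per-session motion summaries. The data are synthetic stand-ins: the number of users, the feature distributions, and the "motion summary" interpretation are our assumptions, not the features or dataset of Miller et al.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical per-session feature vectors: stand-ins for motion summaries
# such as mean headset height, head pitch at rest, and controller resting
# positions. Each user gets a slightly different characteristic mean.
n_users, sessions_per_user, n_features = 20, 10, 12
X = np.vstack([
    rng.normal(loc=u, scale=0.3, size=(sessions_per_user, n_features))
    for u in range(n_users)
])
y = np.repeat(np.arange(n_users), sessions_per_user)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print(f"identification accuracy: {clf.score(X_test, y_test):.2f}")
```

Even this naive setup re-identifies users reliably, which illustrates why "authentication-grade" motion data is simultaneously de-anonymisation-grade data.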

Finally, Maloney et al. [maloney2020] show that most social VR platforms (such as AltspaceVR, RecRoom, and VRChat) do not clearly inform users about which collected information is private and which is public. After conducting a survey among VR users regarding their privacy-related behaviour on social VR platforms, the authors drew some interesting conclusions. First, users showed a tendency to share their experiences and emotions in VR similar to that on traditional social networks. Second, social VR platforms aim to provide their users with an immersive experience by using lifelike avatars; however, this leads to the leakage of some personal information, such as height, race, and appearance in general.

II-C Threats to Physical Privacy

Buck and Bodenheimer [buck2021] point out that the way users interact with the surrounding environment can itself be critical information. The authors highlight that research has neglected the relationship between users and personal space in VR environments. In particular, personal space is plastic and varies depending on the interaction at hand: for example, it contracts when users approach objects they enjoy and expands when the interaction is not appreciated. This knowledge could potentially be used by companies to gather information about personal preferences, for example by positioning an item in the environment and verifying whether users approach it or walk away from it. In turn, this could make it possible to identify and profile people without their knowledge or consent. Moreover, the psychology literature shows that stress increases when interactions happen within personal space; hence, personal space should be calculated and taken into account when designing pleasant virtual interactions. In conclusion, personal space is easy to detect and use to infer information about VR users, which poses the nontrivial problem of how to collect and treat this type of data.

In general, any physical movement in VR can lead to privacy issues. Falk et al. [falk2021] prove the feasibility of a de-anonymisation attack on VR that correlates VR avatars with their human users by leveraging unique human movement patterns. Named ReAvatar, this attack uses only the movement information provided by the VR setup, available without any special privilege or malicious code injection. In their experiments, both the attacker and the victim join a VR game (i.e., VRChat), where people are embodied as avatars and can interact with other people’s avatars. The goal of the attacker is to discover who is hiding behind a certain avatar by analysing the movement patterns of the observed victim. To do so, the attacker asks the victim to perform common and apparently innocuous movements, such as dancing or picking up objects. The attacker observes the victim’s avatar and extracts the features of their movement. Then, the authors implement pose estimation using part affinity fields (PAF), detect pairs of pose keypoints, and extract the local (x, y, z) coordinates from the victim’s avatar. Finally, each joint pair is evaluated against a threshold that determines whether the avatar corresponds to any known person. In a limited study of 6 avatars and 5 users, ReAvatar achieved 89.60% accuracy, proving that the attack is at least feasible.

Although beyond the scope of this review, it is worth noting that eye-tracking technologies are increasingly integrated into modern VR HMDs. Eye-tracking data have the potential to reveal more information than intended about users. For example, Partala et al. [partala2000] showed that pupil size varies according to emotionally provocative sound stimuli. Their experiments suggested that pupil size discriminates during and after different variations of emotional stimuli. This means that eye-tracking technologies embedded in VR systems might allow emotional information to be extracted, without explicit knowledge or consent from the users.

II-D Solutions for Strengthening Privacy in VR

Bozkir et al. [bozkir2019] proposed a method to assess the cognitive load of users during a driving simulation experience, while preserving their privacy. To do so, they used critical and non-critical time frames, trained multiple classifiers (SVM, DTs, RFs, and kNN with k=1,5,10) and validated their proposal with a leave-one-person-out cross-validation (LOOCV) approach. The authors used a small number of short time frames, which allows them to predict cognitive load in real time using minimal data, thus preserving users’ privacy. Their experimental hardware setup included an HTC Vive, a Logitech G27 Steering Wheel and Pedals, Philips headphones, and a Pupil-Labs HTC Vive binocular add-on for eye tracking.
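The leave-one-person-out validation they describe can be sketched with scikit-learn’s LeaveOneGroupOut, where each "group" is one participant. The synthetic features and labels below are placeholders, not Bozkir et al.’s data; the point is the evaluation protocol, which guarantees the classifier is always tested on a person it never saw.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical data: short feature windows per participant, labelled
# critical (high cognitive load) vs non-critical (low load).
n_subjects, frames_per_subject, n_features = 8, 30, 6
X = rng.normal(size=(n_subjects * frames_per_subject, n_features))
y = rng.integers(0, 2, size=len(X))
X[y == 1] += 1.5                        # separate the two classes
groups = np.repeat(np.arange(n_subjects), frames_per_subject)

logo = LeaveOneGroupOut()               # leave-one-person-out CV
for clf in (SVC(), KNeighborsClassifier(n_neighbors=5)):
    scores = cross_val_score(clf, X, y, groups=groups, cv=logo)
    print(type(clf).__name__, f"mean accuracy: {scores.mean():.2f}")
```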

Regarding the use of eye-tracking technology, John et al. [john2019, john2020] note that the trend of equipping VR headsets with eye-tracking sensors puts users’ privacy in danger. Eye-trackers collect images of the users’ iris patterns, and iris scans, often used as biometrics for a variety of services, can identify people with high accuracy. To counteract this issue, John et al. [john2019] showed that Gaussian filters can be applied to iris images to blur iris patterns, making them unrecognisable while preserving gaze detection capabilities. Later, in 2020, John et al. [john2020] investigated a different, hardware-based approach. Their idea is to attach the eye tracker to a small telescopic arm, allowing users to manually defocus images of the iris at will. In addition to being effective, their approach has the valuable feature of allowing users to directly control when they want to show their iris (e.g., if needed for authentication purposes) and when they want to hide it.
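The Gaussian-filter approach can be illustrated in a few lines: the blur suppresses the fine, identifying iris texture while leaving the coarse intensity structure (which gaze estimation can still use) almost untouched. The synthetic "eye image" below is an assumption for illustration, not real iris data or the authors’ exact filter parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

# Hypothetical 64x64 grayscale eye image: fine high-frequency texture
# (standing in for the iris pattern) on top of a smooth intensity
# gradient (standing in for coarse structure such as pupil position).
coarse = np.outer(np.linspace(0, 1, 64), np.linspace(0, 1, 64))
texture = rng.normal(scale=0.2, size=(64, 64))
eye = coarse + texture

blurred = gaussian_filter(eye, sigma=3)

# The blur removes most of the identifying fine texture; the residual
# texture energy after filtering is a fraction of the original.
print("texture energy before:", np.std(eye - coarse))
print("texture energy after: ", np.std(blurred - gaussian_filter(coarse, sigma=3)))
```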

III Threats to VR Security

Fig. 3: Security issues in VR. The three main categories (in purple) are further divided into subcategories and specific concerns.

Adams et al. [adams2018] interviewed 20 participants, 10 users and 10 developers, investigating their perception of security in VR. Only 2 users raised concerns about malicious applications, 4 participants in total worried about data sniffing, and 3 users said that security ”is not there yet”, which suggests that few end-users have cybersecurity concerns. Given this attitude, the authors are of the opinion that many early VR applications will exhibit security vulnerabilities, as cybersecurity does not hold a high perceived value for end-users. Therefore, it is critical to assess which attacks could be expected and mounted on VR. In this section, we provide an overview of the different attacks that can affect VR, categorized according to the tree shown in Figure 3. First, we discuss VR-specific attacks that are made possible by the very nature of VR and the sensors involved. Then, we cover the generic security issues that can affect VR, as they affect various information systems.

III-A VR-Specific Security Threats

Casey et al. [casey2019] noted that VR, due to its immersive nature, can have a considerable impact on people; therefore, security is of paramount importance. In their paper, the authors formulate four VR-specific attacks, classified as Immersive attacks, as they leverage the unique immersive characteristics delivered by VR systems. These attacks are:

  • Chaperone attacks manipulate the Virtual Environment (VE) boundaries (i.e., tamper with the walls drawn in the VR scene) to make collision avoidance fail and put users’ safety at risk;

  • Disorientation attacks aim to cause dizziness and confusion in VR users, causing a condition referred to as cybersickness. Dennison et al. [dennison2016] define cybersickness as a form of motion and simulation sickness due to physiological factors of the users, in correlation with their immersion and presence in VR environments;

  • Human Joystick attacks aim to control the physical movement of a user, such that they move to a predefined spot without realizing it;

  • Overlay attacks, where an attacker inserts unwanted objects, such as images and videos, into the VE.

In addition to the four Immersive attacks, Casey et al. [casey2019] list two further possible attacks. The first is the Camera Stream and Tracking Exfiltration attack, where the attacker gains access to the HMD live stream and the HMD front-facing camera stream. The second is the Man-In-The-Room attack, which consists of an attacker joining the target’s VR environment while remaining invisible, allowing them to extract information about the user or maliciously manipulate the environment.

The authors analyzed OpenVR, an SDK developed by Valve, to identify how an attacker could perform the attacks they identified. Using an HTC Vive headset and an Oculus Rift, the authors proved that these attacks could be carried out. They found that Chaperone, Disorientation, and Human Joystick attacks could be struck by modifying a JSON configuration file. For the Chaperone attacks, the authors modified the virtual room setup stored in the JSON file. To strike a Disorientation attack, they modified two parameters that control the player’s orientation, stored in the same JSON file. The same parameters were manipulated to carry out a Human Joystick attack, although the sequence and rate of modifications differ. They also proved the Overlay attack possible, due to the fact that any OpenVR application is allowed to call an overlay and that no mechanism for force-closing overlays existed. The authors suggested that this could be exploited to show undesired advertisements or even perform ransomware attacks that prevent users from using their VR system. Last, the authors were able to extract information from the video buffer of the HTC Vive; the Oculus Rift was not used as it was not equipped with a front camera. In conclusion, every attack was successful, and the reason was OpenVR’s lack of robustness: critical information was stored in plain text, permissions were undefined or not restrictive enough, and applications were insufficiently isolated and lacked integrity checks. Casey et al. suggest that Arya [lebeck2017, lebeck2018], a policy enforcer created for AR applications, could be adapted to the VR domain.
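A toy illustration of why plain-text configuration is dangerous: with write access to such a file, an attacker needs only a few lines to rescale the safety boundary. The file layout and key names below are invented for illustration and do not reflect the actual OpenVR schema.

```python
import json

# Hypothetical plain-text configuration resembling the kind of file Casey
# et al. tampered with. Keys and values are illustrative only.
config = {
    "collision_bounds": {
        # Corners of the safe play area, in metres.
        "play_area": [[-1.5, -1.5], [1.5, -1.5], [1.5, 1.5], [-1.5, 1.5]]
    },
    "seated": {"yaw": 0.0, "position": [0.0, 0.0, 0.0]},
}

def tamper_chaperone(cfg, scale):
    """Silently enlarge the virtual boundary so that collision warnings
    fire too late -- the essence of a Chaperone attack."""
    cfg = json.loads(json.dumps(cfg))          # deep copy via round-trip
    cfg["collision_bounds"]["play_area"] = [
        [x * scale, y * scale] for x, y in cfg["collision_bounds"]["play_area"]
    ]
    return cfg

attacked = tamper_chaperone(config, scale=3.0)
print(attacked["collision_bounds"]["play_area"][0])
```

Integrity checks (e.g., a signature over the configuration) or restrictive file permissions would block exactly this class of tampering.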

Regarding Overlay attacks, Ahn et al. [ahn2018] noted that most of the research conducted on AR and VR security has focused more on input privacy (i.e., minimising the sensor data available to third parties) than on visual output security. A notable exception is the aforementioned Arya [lebeck2017, lebeck2018], proposed by Lebeck et al. Ahn et al. argue that defining rule-based policies for every conceivable AR/VR scenario is an intractable task, using as a notable example an application required to create various virtual objects that should neither obstruct the users’ view nor overlap with other important objects (e.g., a virtual avatar). In other words, in such an application a new dynamic object should be positioned in the environment in such a way that it is visible but not obstructive. To overcome the complexity of managing this task, the authors proposed an adaptive approach that uses Fog computing and Reinforcement Learning to automatically synthesise security policies. As suggested by Casey et al. [casey2019], this kind of approach can be used to prevent attackers from manipulating a VR application, either to distract and disrupt users or to prevent expected interactions by obstructing important environmental objects. An example of such an attack would be ransomware that obstructs the user’s view with an object until the victim pays a ransom.

Last, VR platforms might be vulnerable to various platform-specific side-channel attacks. For example, Al Arafat et al. [arafat2021] investigated the security of keystrokes on virtual keyboards. The authors proposed VR-Spy, a side-channel attack that extracts gesture patterns from channel state information (CSI) waveforms of Wi-Fi signals and then, by applying machine learning, recognises keystrokes from such patterns. The experimental setup consists of a person equipped with a commercially available headset, positioned between two commercial off-the-shelf (COTS) wireless devices. Their experiments show that VR-Spy achieves an accuracy of 69.75% in virtual keystroke recognition.
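The classification stage of such an attack can be sketched as follows. The synthetic "CSI windows" below stand in for real Wi-Fi channel measurements (one characteristic waveform per key, perturbed by noise); the signal-processing and feature-extraction steps that VR-Spy actually performs are not reproduced here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in for CSI amplitude waveforms: each virtual
# keystroke yields a short window of channel measurements whose shape
# depends on the hand gesture performed for that key.
n_keys, samples_per_key, window = 10, 40, 64
templates = rng.normal(size=(n_keys, window))      # one gesture shape per key
X = np.vstack([
    templates[k] + rng.normal(scale=0.5, size=(samples_per_key, window))
    for k in range(n_keys)
])
y = np.repeat(np.arange(n_keys), samples_per_key)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
clf = SVC().fit(X_tr, y_tr)
print(f"keystroke recognition accuracy: {clf.score(X_te, y_te):.2f}")
```

Real CSI data are far noisier than this toy setup, which is one reason the reported real-world accuracy sits below 70%.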

III-B Generic Security Threats that Affect VR

Beyond security threats that specifically affect VR due to its characteristics, VR systems can also be vulnerable to generic security threats. Valluripally et al. [valluripally2021rule, valluripally2020, valluripally2021modeling] focused on modeling and controlling generic attacks that could cause cybersickness in VR users. In their first paper [valluripally2021rule], they introduce a rule-based framework that controls cybersickness by means of 3Qs adaptation. In other words, they propose a system that regulates three aspects:

  • Quality of Application (QoA), a measure of application performance;

  • Quality of Service (QoS), a measure of network resources such as bandwidth and jitter;

  • Quality of Experience (QoE), a measure of user perceived satisfaction.

The framework is useful both when sickness is induced by incidental performance issues (e.g., temporary network bottlenecks) and when it is caused by malicious attacks such as DoS. Through continuous monitoring of anomaly events, the framework classifies events into predefined categories and selects the appropriate adaptation strategy. For example, assuming that the system is deployed on Amazon Web Services (AWS), if the framework detects an abnormally high CPU utilisation that affects QoA, an adaptation strategy would be to upgrade the server to a more powerful instance (e.g., from AWS t2.micro to AWS c4.large). Their experiments show that adaptations for QoA anomalies can reduce cybersickness by up to 26.43%, while, in the case of QoS anomalies, the proposed adaptation reduces cybersickness by 30.28%. The framework can also take into account the hourly cost associated with each adaptation strategy, so that the most cost-effective adaptation can be adopted.
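The classify-then-adapt loop can be sketched as a small rule table. The thresholds, anomaly names, and adaptation actions below are illustrative assumptions in the spirit of the 3Qs framework, not values from the paper.

```python
# Minimal rule-based 3Q adaptation sketch: map monitored metrics to an
# anomaly category, then to a predefined adaptation. All names and
# thresholds here are invented for illustration.

ADAPTATIONS = {
    "qoa_cpu_overload": "upgrade server instance (e.g. t2.micro -> c4.large)",
    "qos_high_jitter": "reroute traffic / increase buffering",
    "qoe_low_satisfaction": "lower rendering fidelity to stabilise frame rate",
}

def classify_anomaly(metrics):
    """Map raw monitoring metrics onto a predefined anomaly category."""
    if metrics["cpu_util"] > 0.9:            # QoA: application performance
        return "qoa_cpu_overload"
    if metrics["jitter_ms"] > 30:            # QoS: network resources
        return "qos_high_jitter"
    if metrics["user_qoe"] < 2.5:            # QoE: 1-5 satisfaction scale
        return "qoe_low_satisfaction"
    return None

def adapt(metrics):
    anomaly = classify_anomaly(metrics)
    return ADAPTATIONS.get(anomaly, "no action")

print(adapt({"cpu_util": 0.95, "jitter_ms": 10, "user_qoe": 4.0}))
```

A cost-aware variant would attach an hourly price to each adaptation and pick the cheapest one that resolves the detected anomaly.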

In their subsequent work, Valluripally et al. [valluripally2020, valluripally2021modeling] expand on their analysis of sickness-inducing attacks. First, the authors conducted an experiment in which they simulated the effects of ongoing security and/or privacy attacks while users performed tasks in VR. The experiment was performed by artificially manipulating the traffic rate, and it shows that these attacks cause considerable disruption to different factors that contribute to cybersickness, such as nausea and discomfort. The authors then showed that cybersickness can be correlated with security and privacy issues, regardless of whether these are due to attacks, faults, or a combination of both. In particular, they proposed a framework based on the attack-fault tree (AFT) formalism for producing threat models that describe the possible security and privacy attacks. The resulting AFTs are used to perform a quantitative assessment of the probability of disruption caused by each threat. Evidence shows that the threats that induce the highest levels of cybersickness in VR-based Learning Environments (VRLEs) are DoS attacks, data leakage, man-in-the-room attacks, and generic unauthorised accesses. Based on the quantitative results, the authors also evaluated the effect on cybersickness of applying different design principles, such as hardening and redundancy. For example, applying hardening and the principle of least privilege reduces the probability of cybersickness by 28.96%, whereas applying redundancy and the principle of least privilege achieves only a small reduction of 3.05%. When all prescribed principles are applied together, the probability of experiencing cybersickness is reduced by 35.18% in total.

Gulhane et al. [gulhane2019] focused on a sub-area of VR, namely VRLEs. VRLEs are VR spaces designed to deliver virtual learning experiences to their users. To do so, VRLEs often combine data from emotion-tracking sensors with other sensitive information, such as users’ learning progress and personal data. Therefore, VRLEs pose a number of challenges with respect to the security, privacy, and safety of their users. Gulhane et al. proposed a risk assessment framework that uses attack trees to formalise internal and external vulnerabilities of VRLEs, such as DoS, SQL injections, and unauthorised logins. For example, they use attack trees to formalise the observation that delayed packets (e.g., due to DoS attacks) would compromise the availability of the server, and hence its security. The output is a risk score for each analysed property: security, privacy, and safety. Using vSocial [vsocial] (a cloud-based VRLE designed for young people with learning disabilities) as a testbed, the authors evaluated the attack trees produced by their methodology. Each leaf of every attack tree represents a threat and, for each leaf, the authors simulated the corresponding attack. Then, they assigned an impact score to every threat and used the scores to compute complete attack trees.
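The bottom-up scoring of an attack tree can be sketched as follows. The gate semantics are the standard OR/AND probability combination for independent events; the tree structure and the leaf probabilities are invented for illustration and are not taken from the vSocial evaluation.

```python
# Minimal attack-tree evaluation sketch: leaves carry estimated success
# probabilities; an OR-gate succeeds if any child succeeds, an AND-gate
# only if all children do (independence assumed).

def evaluate(node):
    if "prob" in node:                       # leaf threat
        return node["prob"]
    children = [evaluate(c) for c in node["children"]]
    if node["gate"] == "OR":
        p_none = 1.0
        for c in children:
            p_none *= (1.0 - c)
        return 1.0 - p_none
    if node["gate"] == "AND":
        p_all = 1.0
        for c in children:
            p_all *= c
        return p_all
    raise ValueError(f"unknown gate: {node['gate']}")

# Illustrative tree: server availability is compromised if a DoS attack
# succeeds OR if an unauthorised login AND a privilege escalation both do.
availability_attack = {
    "gate": "OR",
    "children": [
        {"prob": 0.10},                      # DoS delays packets
        {"gate": "AND", "children": [
            {"prob": 0.20},                  # unauthorised login
            {"prob": 0.30},                  # privilege escalation
        ]},
    ],
}
print(f"risk score: {evaluate(availability_attack):.3f}")
```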

Finally, it should be noted that VR equipment could facilitate security threats such as shoulder-surfing attacks, where a malicious party spies on users’ actions to steal their passwords. Chen et al. [chen2018] described a computer vision-based attack, carried out using a VR headset augmented with additional sensors. In their experiments, the authors showed that it is possible to use an HMD, equipped with an additional ZED stereo camera, to record the surroundings and steal the passwords input by users on their devices. The paper shows that a 3D video can be reconstructed from the two side-by-side videos recorded by the ZED camera. The 3D video can then be analysed to extract the 3D trajectories of the fingertip movements and recover the passwords tapped by the victim. This entails the risk of malicious users pretending to play video games in VR while recording unaware people moving in the surrounding environment. In particular, the experiments show that the success rate of the attack correlates strongly with the distance from the victim, as well as with the number of input attempts. In addition, some authentication systems require users to confirm the input password, and this requirement increases the success rate of the proposed attack.
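The final key-recovery step can be sketched as a nearest-key lookup, assuming the fingertip position at each tap has already been reconstructed from the stereo video. The keypad geometry, coordinates, and tap data below are invented for illustration; they are not Chen et al.’s reconstruction pipeline.

```python
# Toy key-recovery step: snap each reconstructed 3D tap position to the
# nearest key centre of a known keypad layout.

KEY_CENTRES = {                      # (x, y) centres of a 3x3 keypad, in cm
    str(d): ((d - 1) % 3 * 2.0, (d - 1) // 3 * -2.0) for d in range(1, 10)
}

def recover_pin(tap_positions):
    pin = ""
    for x, y, _z in tap_positions:   # depth already used to detect the tap
        pin += min(
            KEY_CENTRES,
            key=lambda k: (KEY_CENTRES[k][0] - x) ** 2
                        + (KEY_CENTRES[k][1] - y) ** 2,
        )
    return pin

# Noisy fingertip estimates near keys 1, 4, 7, and 2.
taps = [(0.1, 0.2, 0.0), (0.2, -2.1, 0.1), (-0.1, -3.8, 0.0), (2.1, 0.1, 0.1)]
print(recover_pin(taps))
```

Distance from the victim degrades the fingertip estimates, which is consistent with the reported drop in success rate as distance grows.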

IV Authentication in VR

Among the areas of research within VR security, the topic of authentication deserves a section of its own. In fact, most of the work related to VR security focuses on identifying or authenticating its users. In this section, we provide an in-depth exploration of the literature and we summarise the works in Table I.

Authentication Approach Works in the Literature
Traditional Authentication George et al. [george2017seamless], Yu et al. [yu2016], Olade et al. [olade2020swipe], Funk et al. [funk2019], Khamis et al. [khamis2018]
Authentication via VR interactions Mathis et al. [mathis2020, mathis2021], George et al. [george2019], Funk et al. [funk2019]
Head gaze and movements Mathis et al. [mathis2020, mathis2021], Funk et al. [funk2019], Rogers et al. [rogers2015], Sivasamy et al. [sivasamy2020], Mustafa et al. [mustafa2018]
Eye-Tracking Olade et al. [olade2020swipe], Mathis et al. [mathis2020, mathis2021], Liebers et al. [liebers2021gaze], Luo et al. [luo2020], Khamis et al. [khamis2018], Lohr et al. [lohr2018, lohr2020], Zhu et al. [zhu2020]
Bio-mechanics Mustafa et al. [mustafa2018], Pfeuffer et al. [pfeuffer2019], Kupin et al. [kupin2019], Ajit et al. [ajit2019], Liebers et al. [liebers2021], Miller et al. [miller2020], Miller et al. [miller2021], Olade et al. [olade2020biomove], Shen et al. [shen2018]
TABLE I: Works on Authentication for VR

In their seminal work, George et al. [george2017seamless] proposed a first step toward integrating traditional authentication approaches, in particular PINs and swipe patterns, into VR. They conducted a study with a total of 30 people and showed that both techniques are suitable for transposition into VR. In particular, they showed that the usability in VR matches the usability in the physical world. Furthermore, they showed that the private visual channel granted by VR makes such authentication processes harder to observe, and hence the credentials harder to steal.

Similarly, Yu et al. [yu2016] investigated the implementation of two traditional authentication methods, pattern swiping and PINs, while also making a first attempt at creating a 3D password for VR. Although the authors did not introduce sufficient details about the 3D password, they provided some intuitions about the relative resilience of the three methods against shoulder-surfing attacks. In the experiments, the participants were recorded while using the three authentication methods and then asked to guess the passwords of other participants by watching the recordings. The results showed that 3D passwords were considerably harder to guess than 2D swipes and PINs, most probably due to the additional complexity added by introducing a third dimension.

Olade et al. [olade2020swipe] further investigated the portability of swipe patterns from mobile devices to VR, noting that George et al. [george2017seamless] did not provide a direct performance comparison between mobile devices and VR. In their experimental setup, the authors implemented the swiping authentication system using 4 different interaction methods: the hand-held controller (HHC), the HMD, the LeapMotion (a hand-tracking device), and the aGlass (for eye tracking). While the mobile swiping authentication was the easiest and fastest to use, the HHC and the LeapMotion versions were comparable both in terms of speed and usability. Regarding shoulder-surfing attacks, given the same amount of information, attackers have a considerably lower success rate when swiping is done in VR than when it is done on a mobile phone screen.

Mathis et al. [mathis2020, mathis2021] proposed RubikAuth, a three-dimensional authentication method consisting of a five-faced cube (one face is omitted for improved reachability) that exhibits 1 colour and 9 digits per face. To authenticate, the user rotates the cube with the left HHC and selects n digits as their chosen password. With 9 digits on each of 5 faces, each selection is drawn from 45 symbols, so the entropy of a RubikAuth password is n × log2(45) ≈ 5.49n bits, considerably higher than the entropy granted by traditional numerical PINs. Additionally, the authors evaluated the impact of different input techniques on performance, concluding that controller tapping is faster than head pose and eye gaze, while the study participants did not express any clear preference regarding the usability of each input approach. As expected, the complexity of the password directly affects memorability: the higher the number of face switches, the lower the memorability of the password.
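As a quick sanity check, the entropy comparison can be computed directly. This is a generic back-of-the-envelope calculation, assuming passwords are drawn uniformly from the 45 selectable symbols:

```python
import math

# Entropy sketch: RubikAuth exposes 5 faces x 9 digits = 45 selectable
# symbols, so an n-symbol password drawn uniformly has n * log2(45) bits.
def entropy_bits(symbols: int, length: int) -> float:
    return length * math.log2(symbols)

rubikauth = entropy_bits(45, 4)   # 4 selections on the cube (~21.97 bits)
pin       = entropy_bits(10, 4)   # traditional 4-digit numeric PIN (~13.29 bits)
print(f"RubikAuth: {rubikauth:.2f} bits, PIN: {pin:.2f} bits")
```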

The authors also evaluated the security of their proposal, considering three realistic attackers with different tools: pen and paper to annotate observations, a 3D replica of the cube, and recordings of users performing authentication. The experiments show that having access to such resources increases the chances of guessing a RubikAuth password (and, reasonably, the same applies to other VR authentication modes). That being said, only 8 attacks out of 540 attempts succeeded. No attacker succeeded in inferring any password inserted by eye gazing, and the 8 guessed passwords were simple patterns that involved only a single cube face. It is worth noting that RubikAuth offers some interesting security properties, which contribute to its resilience against shoulder-surfing attacks. For example, RubikAuth allows users to apply fake face switches to deceive bystanders and, with correct cube positioning, users can pick numbers from different faces without applying any rotation.

George et al. [george2019] investigated whether spatial memory and awareness can be leveraged to improve usability and security in 3D authentication routines. The result is RoomLock, an authentication framework (based on the loci technique) that breaks down a password into 4 different objects, scattered in a 3D virtual scene. In the VR scene, which reproduces the physical environment where the experiment is conducted, users compose their password by pointing and clicking on the 4 objects in the correct order. The authors tested their implementation with a group of 27 participants, demonstrating the memorability of RoomLock passwords. During the study, the participants exhibited a tendency to adopt a link technique, which consists of selecting objects so that their combination forms meaningful stories (e.g., I eat chips, so I drink water). For the security evaluation, the authors assumed two different attackers. The first attacker is a traditional bystander who observes the authentication phase and tries to infer the password. The second attacker records the entire room during the authentication phase of their target, analyses the recordings, and attempts to reconstruct the password. The experiments show that the first attacker has an easier time guessing a password than the second, with a success rate of 12.5% versus 0%. This evidence goes against what was previously reported by other researchers, such as Mathis et al. [mathis2020, mathis2021], suggesting that for some authentication approaches immediate threats might be more effective than shoulder-surfing attacks based on recorded material. Therefore, more in-depth studies are warranted, and researchers might want to investigate their proposed authentication schemes on a case-by-case basis.
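The size of the password space of such a loci-style scheme can be estimated with a short calculation. The scene object count below is an illustrative assumption, not a figure from the RoomLock study:

```python
from math import perm, log2

# Password-space sketch for a loci-style scheme like RoomLock: a password
# is an *ordered* selection of k distinct objects from the n objects in
# the room, so there are P(n, k) = n!/(n-k)! candidate passwords.
def loci_entropy_bits(n_objects: int, k_selected: int) -> float:
    return log2(perm(n_objects, k_selected))

# e.g., picking 4 objects in order out of 20 scene objects (numbers
# assumed for illustration):
print(f"{loci_entropy_bits(20, 4):.1f} bits")  # vs ~13.3 bits for a 4-digit PIN
```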

IV-A Authentication based on head gaze and movements

Funk et al. [funk2019] proposed LookUnlock, a head-gaze-based solution to enter passwords on HMDs. LookUnlock can use different kinds of spatially-aware passwords, namely, spatial passwords, virtual passwords, and hybrid passwords. To set a spatial password, users move the HMD cursor to different positions in the environment and perform enter actions. In the unlocking phase, users insert their password by placing the head-gaze cursor over the spatial targets previously defined during the setup phase. After the first target, users have up to 3000 ms to detect the next one before incurring a time-out. LookUnlock adds an additional layer of security to the authentication phase by binding spatial passwords to the environment: in practical terms, the same actions performed in a different environment would not authenticate the user, as the framework would detect that the physical environment differs from the one used at enrolment time. Virtual passwords, the second type of spatially-aware passwords, are virtual 3D objects drawn in the virtual domain which spawn at random locations. For locking and unlocking, users fix their gaze on each object that composes the password, in the correct order. To mitigate brute-force attacks, users must fix their gaze on each object for at least 1500 ms. The last type, hybrid passwords, combines the spatial approach with the virtual one, allowing users to define their preferred combination of 3D objects and positions. The authors evaluated the three categories in terms of resilience against shoulder-surfing attacks, showing that spatial passwords are the most vulnerable, followed by hybrid passwords. In contrast, 135 shoulder-surfing attempts did not allow the attackers to infer any virtual password.
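The dwell-and-timeout logic described above can be sketched as a simple check over gaze fixations. The event format and function names are assumptions for illustration; only the 1500 ms dwell and 3000 ms timeout come from the paper:

```python
# Sketch of LookUnlock-style gaze password entry: a target counts as
# selected after a sufficiently long dwell, and the user must reach the
# next target before a timeout expires.
DWELL_MS, TIMEOUT_MS = 1500, 3000

def enter_password(fixations, password):
    """fixations: list of (target_id, dwell_ms, gap_ms_since_previous)."""
    entered = []
    for i, (target, dwell, gap) in enumerate(fixations):
        if i > 0 and gap > TIMEOUT_MS:
            return False                 # took too long to find next target
        if dwell >= DWELL_MS:
            entered.append(target)
    return entered == password

ok = enter_password(
    [("door", 1600, 0), ("lamp", 1700, 1200), ("desk", 1550, 900)],
    ["door", "lamp", "desk"],
)
print(ok)  # True
```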

Jain et al. [jain2004] noted that body movement can be used for authentication purposes when it is universal, distinctive, repeatable, and collectible. Based on these observations, several works in the literature explored how VR sensors can be used to capture movement patterns. Rogers et al. [rogers2015] showed the possibility of identifying users by their unconscious head movements and blinking while they perform a simple passive task. The authors set up an experiment in which they showed 20 participants a rapid series of numbers and letters. Using the collected data, they achieved an identification accuracy of 94%, showing the suitability of unconscious movements for unobtrusive user authentication. Similarly, Li et al. [li2016] noted that adults exposed to fast-beat audio tracks exhibit unique patterns of head movement. Their observation resulted in Headbanger, a framework that captures the unique head movements that arise as a natural response to fast-tempo audio stimuli and authenticates users accordingly. Implemented using a Google Glass device, Headbanger shows an EER of 4.43% when users are exposed to a 10-second music cue. When tested against imitation attacks, less than 3% of attempted attacks achieved a false positive.

With VRCAuth, Sivasamy et al. [sivasamy2020] also investigated the use of head movements to provide continuous authentication in VR settings. VRCAuth performs a binary classification to determine whether the current user is authorised or not. The authors evaluated five different binary classifiers on two different datasets: the Kaggle VR driving simulator [jafarnejad2017] and the user behaviours in spherical video streaming [wu2017]. Both datasets provide three-dimensional information; additionally, the latter provides quaternion information. The results of the experiments show that the PART and LMT classifiers achieved an accuracy of more than 99% on the first dataset. On the second dataset, every classifier achieved an accuracy of more than 99%, suggesting that quaternion data can significantly simplify binary classification tasks.

Similar to Miller et al. [miller2020personal], Mustafa et al. [mustafa2018] found that body movement patterns, recorded while users freely interact within a VR environment, exhibit enough information to identify users. In their work, the authors argued that head movement data can be used for authentication purposes. To prove their claim, they designed an experiment with 23 subjects who interacted with a VR application developed by the authors, designed specifically to evaluate head and body biometrics. In the experiment, users start from a point A and must move towards a random point B where a ball appears, then to a third random point C, and so on, until the user has reached 25 balls. To move towards a ball, users must move a VR pointer towards the target ball by nudging the VR headset in the desired direction. Each subject was evaluated in two sessions that took place at least one day apart, and the experiments showed an EER of around 7% in the best case. Although these results would not suffice for a real-life deployment, head and body movements hold promise for authentication purposes.

IV-B Authentication based on eye-tracking

Authentication using eye-tracking sensors has been studied since long before VR technology took hold. For example, in 1978 Hill [hill1978] patented an apparatus for recognising individuals by the retinal vasculature in their eyes and, in 2004, Maeder et al. [maeder2004] investigated the use of eye gaze to authenticate users. More recently, in 2016 Sluganovic et al. [sluganovic2016] observed that gaze-based authentication systems suffer from high error rates or require long authentication times. Based on this premise, they proposed using reflexive saccades, rapid eye movements that reorient the eye to the next focus object. When the gaze follows slow-moving stimuli, the eye produces a smooth pursuit movement. However, when the speed exceeds a certain threshold, the human eye can no longer follow stimuli with smooth pursuits and must switch to saccadic movement. The authors’ experiments show that saccades exhibit enough differences between individuals to be used as biometric information. Other works, such as Rigas et al. [rigas2016] and Holland and Komogortsev [corey2013], show the potential of using saccades for identification purposes.
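The velocity-threshold idea separating smooth pursuit from saccadic movement can be illustrated with a minimal classifier. The 30 deg/s threshold and the one-axis gaze trace are assumptions for illustration; the exact cut-off varies across studies:

```python
# Velocity-threshold sketch: below a speed threshold the eye tracks a
# stimulus smoothly (pursuit), above it the movement is saccadic.
SACCADE_THRESHOLD_DEG_S = 30.0  # illustrative assumption

def classify(gaze_deg, sample_rate_hz):
    """Label each inter-sample movement as 'pursuit' or 'saccade'."""
    labels = []
    for a, b in zip(gaze_deg, gaze_deg[1:]):
        velocity = abs(b - a) * sample_rate_hz   # deg/s along one axis
        labels.append("saccade" if velocity > SACCADE_THRESHOLD_DEG_S
                      else "pursuit")
    return labels

# 120 Hz trace: slow drift, then an abrupt 5-degree jump (a saccade)
trace = [10.0, 10.1, 10.2, 15.2, 15.3]
print(classify(trace, 120))  # ['pursuit', 'pursuit', 'saccade', 'pursuit']
```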

In this section, we focus on works that discuss the use of eye-tracking sensors in the scope of VR, deliberately leaving aside the vast field of general eye-based authentication and recognition. Although early VR sets were not equipped with eye tracking by default, these sensors are becoming more and more common in commercial headsets. One of the main benefits that VR HMDs bring to eye tracking is that HMDs fully cover users’ eyes, making it possible to control lighting conditions and to reduce problematic reflections.

Liebers et al. [liebers2021gaze] proposed an implicit identification method based on users’ saccadic movements and head orientation. Just as the eyes switch from smooth pursuit to saccadic movement, as previously mentioned, when the stimulus speed exceeds the saccadic speed threshold, human beings must recruit head movements to keep track of the stimulus. The authors show that the characteristics of fixations and saccades, together with head orientation, are complex and unique enough to be used as biometric features. In particular, with a small pool of 12 participants, their approach achieves an accuracy of 82% using a kNN algorithm on every feature they envisioned, whereas with deep learning the accuracy reaches the 100% mark.

Luo et al. [luo2020] noted that authentication methods based on head and body movements, such as those mentioned above, expose the authentication actions to bystanders. They also noted that eye gazing alone can exhibit an EER too high to trust it as an authentication method. Therefore, with OcuLock the authors proposed involving the entire human visual system (HVS), as it is composed of different entities, such as eyelids and extraocular muscles, which exhibit features that could be used for authentication purposes. Tested with 70 participants, OcuLock achieves an EER of 3.55% against impersonation attacks and 4.97% against statistical attacks. However, it should be noted that standard HMDs equipped with eye-tracking cameras cannot measure such HVS signals; for example, they cannot be used to implement a framework based on electrooculography (EOG) such as OcuLock. In fact, to capture EOG signals, the authors had to enrich the foam face cover of a Lenovo Mirage Solo VR headset with tiny electrodes.

With VRPursuits, Khamis et al. [khamis2018] showed that VR equipped with eye tracking is suitable for implementing smooth-pursuit interaction that addresses the Midas touch problem (i.e., the problem of distinguishing deliberate gaze from basic eye perception functions). Although their work addresses neither user authentication nor security properties, their implementation of a virtual ATM shows the potential feasibility of an authentication method based on smooth pursuit detection. In their proof of concept, participants were asked to enter a 4-digit PIN by intentionally looking at the desired numbers. The numbers, from 0 to 9, were visualised as two groups of cubes rotating clockwise and counterclockwise, respectively.

In their preliminary paper, Lohr et al. [lohr2018, lohr2020] explicitly address the goal of using eye-movement biometrics to authenticate VR users. During the enrolment phase, the prototype extracts four fixation features and eight saccade features, for a total of twelve eye-movement features. To perform authentication, the authors propose comparing the features of two different templates (the user currently examined and a stored template), creating a vector of match scores, and finally combining the scores into a single fused score using a weighted mean fusion with equal weights.
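The equal-weight score fusion step can be sketched as follows; the example scores and the acceptance threshold are illustrative assumptions, not values from Lohr et al.:

```python
# Sketch of weighted mean score fusion: per-feature match scores between
# a probe and a stored template are combined into one fused score.
def fuse(match_scores, weights=None):
    if weights is None:                       # equal weights, as in the paper
        weights = [1.0 / len(match_scores)] * len(match_scores)
    return sum(s * w for s, w in zip(match_scores, weights))

# e.g., 12 eye-movement features (4 fixation + 8 saccade) scored in [0, 1]
scores = [0.9, 0.8, 0.95, 0.7, 0.6, 0.85, 0.9, 0.75, 0.8, 0.88, 0.92, 0.7]
fused = fuse(scores)
print(round(fused, 3))
authenticated = fused >= 0.75                 # threshold is an assumption
```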

Zhu et al. [zhu2020] noted that, by design, VR systems prevent their users from seeing what happens around them in the real world. This means that users have a hard time concealing their actions from bystanders, which could lead them to leak sensitive information. BlinKey, an authentication scheme designed for VR headsets equipped with eye-tracking technology, aims to mitigate the problem by using only the users’ eyes, naturally covered when wearing a VR HMD. The core idea of BlinKey is to encode a password as a set of blinks, performed with a rhythm known only to the user (e.g., following a tune that they like). As an additional authentication layer, BlinKey analyses the pupil size variation between blinks, a biological marker that uniquely characterises every person. BlinKey was implemented and tested using a commercially available HTC Vive Pro equipped with a Pupil Labs eye tracker. To evaluate the security resilience of their proposal, the authors tested four different types of attack: zero-effort, statistical, shoulder-surfing, and credential-aware. In the first category, participants acted as attackers who tried to guess the target blinking pattern, and they were not able to compromise passwords composed of 7 or more blinks. In statistical attacks, attackers have access to a dataset of their victim’s blinks, which they can analyse statistically. This gives the attacker a slight advantage over the zero-effort attack, but passwords composed of at least 8 blinks seem to prevent any compromise attempt. Credential-aware attacks assume that the attacker knows the password exactly and tries to replicate it with their own blinks (i.e., exhibiting their own pupillary biometrics). This attack proved to be more successful than the previous ones: even when the password was composed of at least 10 blinks, the success rate against BlinKey reached up to 4.4%, which shows that pupil variation alone cannot be relied upon as an authentication factor. Lastly, shoulder-surfing attacks on BlinKey are considerably harder than guessing PINs or observing swipe patterns, as the target’s eyes are covered by the VR headset itself, making it challenging for the attacker to observe the blinks.
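The rhythm-matching core of a BlinKey-style scheme can be sketched by comparing inter-blink intervals. The tolerance and timestamps are assumptions for illustration, and the pupil-size check is omitted:

```python
# Rhythm-matching sketch in the spirit of BlinKey: a password is the
# sequence of intervals between blinks; a login attempt matches if every
# interval is within a tolerance of the enrolled rhythm.
TOLERANCE_MS = 150  # illustrative assumption

def rhythm(blink_times_ms):
    """Inter-blink intervals from absolute blink timestamps."""
    return [b - a for a, b in zip(blink_times_ms, blink_times_ms[1:])]

def matches(enrolled_ms, attempt_ms):
    r1, r2 = rhythm(enrolled_ms), rhythm(attempt_ms)
    return len(r1) == len(r2) and all(abs(a - b) <= TOLERANCE_MS
                                      for a, b in zip(r1, r2))

enrolled = [0, 400, 1200, 1500, 2300]                 # 5 blinks
print(matches(enrolled, [0, 450, 1300, 1580, 2400]))  # True: close rhythm
print(matches(enrolled, [0, 900, 1300, 1580, 2400]))  # False: first gap off
```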

IV-C Biomechanics-based Authentication

As pointed out by Miller et al. [miller2021], traditional authentication methods, such as PINs and passwords, not only can be stolen by malicious attackers, but they also allow a legitimate user to hand over control to a third party. This can be particularly problematic for VR applications that must ensure that the user performing the tasks is the same user who authenticated themselves in the first place. For example, a VR scene used to undertake university exams should prevent a student from cheating by logging in and handing over control to a fellow student. One solution for countering handover threats is to implement implicit and continuous authentication via biometrics, which automatically detects when the user’s characteristics differ from the expected ones. Another downside of explicit authentication, as observed by Liebers et al. [liebers2021gaze], is that it negatively affects user immersion in VR. Biometrics-based authentication techniques also have an advantage over explicit authentication methods in this regard, as they are implicit and provide a smoother experience for users.

Beyond the information collected from headsets, VR equipment can provide further data of interest for authentication purposes. In 2019, Pfeuffer et al. [pfeuffer2019] investigated the use of body motion and behavioural biometrics to identify users in a VR setting. They conducted their experiments using an HTC Vive (composed of an HMD, two controllers, and an optical tracking system), enriched with an additional eye tracker from Pupil Labs. First, they identified four basic tasks that can be found in most VR applications: pointing, grasping, walking, and typing. Then, they designed specific tasks that revolve around these four activities and invited 22 participants to perform them. On the basis of the collected data, the authors evaluated different motions and distance relations (e.g., the distance between the headset and the joysticks). However, their best result, obtained using a combination of head motion and the distance between every device, achieved only an average accuracy of 40%. Therefore, further investigations were warranted.

Other works investigated combining data from controllers and HMDs to achieve authentication in VR, several of them using a virtual ball-throwing task as a case study. In the first instance, Kupin et al. [kupin2019] used as a feature the position of the dominant hand holding a VR controller, following the intuition that users exhibit identifiable trajectories due to their unique biomechanics. In their experiment, the authors instructed users to pick up a virtual ball and throw it at a target, and they compared the trajectory of each user’s dominant hand against a dataset containing the movements of the other users. Using bounding-box centering and a nearest-neighbour (NN) matching algorithm on 95 different 3D trajectories, the authentication accuracy reached 92.86%, proving that trajectories exhibit enough peculiarities to identify a user.
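Nearest-neighbour matching over 3D trajectories, in the spirit of Kupin et al., can be sketched as follows. The trajectories are toy data and are assumed pre-aligned and resampled to equal length; this is not the authors’ exact pipeline:

```python
import math

def distance(traj_a, traj_b):
    """Summed point-wise Euclidean distance between two 3D trajectories."""
    return sum(math.dist(p, q) for p, q in zip(traj_a, traj_b))

def identify(probe, enrolled):
    """enrolled: dict mapping user id -> template trajectory.
    Return the user whose template is closest to the probe (1-NN)."""
    return min(enrolled, key=lambda user: distance(probe, enrolled[user]))

templates = {
    "alice": [(0, 0, 0), (0.1, 0.5, 0.2), (0.2, 1.0, 0.6)],
    "bob":   [(0, 0, 0), (0.4, 0.3, 0.1), (0.9, 0.6, 0.3)],
}
probe = [(0, 0, 0), (0.12, 0.48, 0.22), (0.22, 0.95, 0.58)]
print(identify(probe, templates))  # -> alice
```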

Ajit et al. [ajit2019] improved the approach proposed by Kupin et al. [kupin2019], extending the data analysis to the non-dominant hand and the head, and using the device orientation in addition to its position. In an experiment involving 33 users, the authors improved on the results of Kupin et al., obtaining an authentication accuracy of 93.03%. Their approach is based on a perceptron classifier which learns weights on the matches between position and orientation features of the hand controllers’ trajectories, as well as the headset trajectory. In practical terms, trajectories belonging to the same user result in a low classifier score and allow the user to authenticate.

Liebers et al. [liebers2021] further investigated the effect of body normalisation on user identification in VR. In particular, they analysed the impact of body height normalisation, arm length normalisation, and the combination of both against scenarios where no normalisation is applied. They evaluated a small pool of 16 participants during the execution of two tasks that can be correlated to actions executed in a typical VR game: a bowling throw and an arrow shooting task. The research highlights that, while physiological factors play a clear role in behavioural biometrics, they also increase the noise injected into deep learning algorithms. Removing or reducing these features had a positive impact on the identification accuracy of the deep learning models implemented by the authors, i.e., an RNN and an MLP. Not only does this work suggest that it is possible to achieve user identification without processing some sensitive information, but it also shows that it is feasible to perform continuous authentication during regular gaming sessions.

Miller et al. [miller2020] noted that authentication approaches based on behavioural biometrics assume that users enrol in the same VR system that they will later use. However, in the future users might interact with many different VR headsets in their daily tasks. This exposes a problem, because different VR systems exhibit system-specific biases, meaning that every time users want to use a new headset, they have to engage in a new enrolment phase. Therefore, Miller et al. investigated the differences between 3 different VR systems (Oculus Quest, HTC Vive, and HTC Vive Cosmos), collecting the biometric data of 46 users performing a simple task on each system. They instructed the participants to perform a simple (virtual) ball-throwing task on each headset, repeated 10 times, over several consecutive days. Expanding on what had previously been done by Ajit et al. [ajit2019], they introduced the trigger control information from the dominant controller, as well as the linear and angular velocity to account for the differences in speed between users. The results in terms of within-system and cross-system accuracy showed that some features play a prominent role for authentication purposes. In particular, head orientation, left controller position, right controller orientation, and right controller position proved to be the most important features. Their results confirm the intuition of Ajit et al. [ajit2019] about the relative importance of different features when performing authentication.

Improving on their previous work, Miller et al. [miller2021] proposed using Siamese neural networks to learn a baseline distance function across HMDs. Knowing this function, when a user wants to use a new VR headset, it suffices to apply the distance function to offset the inherent differences with the original headset used for enrolment. This spares users from going through an additional enrolment phase every time they use new VR equipment. In this paper, the authors used the same experimental setting as in their previous study [miller2020]. Regarding user authentication, their results show that the Siamese network achieves, at its best, a 1.39% EER, when the Oculus Quest trajectories are used at enrolment time and the HTC Vive at use time. For user identification, the highest accuracy is 98.53%, obtained with the same setup as the best EER result.

Olade et al. [olade2020biomove] proposed BioMove, a method for recognising biomechanical behaviour patterns and using them to authenticate users. In particular, the authors studied the movements of the head, hands, and eyes of 15 users while performing controlled tasks, such as grabbing, rotating, and dropping. The authors investigated the use of BioMove for the purposes of both user identification and authentication. In the best-case scenario, for identification tasks BioMove achieved a classification accuracy of 98.6%. Regarding authentication, the authors’ experiments show a 0.0032% FPR and a 1.3% FNR, highlighting that BioMove (and behavioural biometrics in general) could serve as an efficient second authentication factor.
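The FPR and FNR reported for a biometric verifier like BioMove come from thresholding a similarity score and counting errors against ground truth. A minimal sketch with illustrative scores and threshold (not data from the study):

```python
# FPR = fraction of impostor attempts wrongly accepted;
# FNR = fraction of genuine attempts wrongly rejected.
def error_rates(genuine_scores, impostor_scores, threshold):
    false_rejects = sum(s < threshold for s in genuine_scores)
    false_accepts = sum(s >= threshold for s in impostor_scores)
    fnr = false_rejects / len(genuine_scores)
    fpr = false_accepts / len(impostor_scores)
    return fpr, fnr

genuine  = [0.91, 0.88, 0.95, 0.79, 0.92]   # legitimate-user attempts
impostor = [0.31, 0.45, 0.52, 0.28, 0.81]   # other-user attempts
fpr, fnr = error_rates(genuine, impostor, threshold=0.80)
print(f"FPR={fpr:.2f}, FNR={fnr:.2f}")  # FPR=0.20, FNR=0.20
```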

Shen et al. [shen2018] noted that VR systems often store users’ private data, such as social network accounts and credit card information. Although convenient, this way of accessing information exposes significant security and privacy issues. To address this issue, the authors proposed an authentication system based on the users’ gait signatures, named GaitLock, which implements a new gait recognition model, Dynamic-SRC. Dynamic-SRC fuses data from both the HMD accelerometer and the HMD gyroscope to address the problem of noisy signals coming from head motions. The validation experiments consisted of different walking tasks; among these, the most interesting is the uncontrolled outdoor walking task. The results show that Dynamic-SRC outperforms the gait recognition accuracy of the best alternative algorithm (SRC with Sparse Fusion) by 20%. Furthermore, in the uncontrolled walking task, GaitLock achieves an accuracy of 95% using just a small dataset of 5 steps. Although the authors tested their system only on an AR device (i.e., Google Glass), they noted that, given enough computational power, it can be transposed to VR.

IV-D Other works in VR Authentication

Most papers in the field of VR authentication, at the time of writing, leverage either something users know, such as a password, or some personal characteristics, such as their behavioural biometrics. Combining these different approaches for achieving multi-layered authentication has the potential to produce stronger authentication schemes across the board.

For example, with RubikBiom, Mathis et al. [mathis2020biometrics] investigated the use of knowledge-driven behavioural biometrics for authentication in the VR domain. The authors developed a proof of concept that leverages two different authentication features: a password known only to the user, and the movement patterns that arise from the act of inputting such a password in VR. As with previous proposals for behavioural biometrics, the intuition is that every user exhibits peculiar movement patterns due to their specific biomechanics. In particular, an attacker attempting to infiltrate a victim’s account via shoulder surfing must not only steal the password, but also replicate the same movement patterns when inserting it. In RubikBiom, the password is a 4-digit PIN selected on a 5-faced cube; each face has a different colour and exhibits 9 digits (1-9), similarly to the setup used by the same authors in their works previously discussed [mathis2020, mathis2021]. The best accuracy score was 98.91%, obtained with a fully convolutional network (FCN) over a multivariate dataset composed of the dominant hand’s position and rotation, together with the non-dominant hand’s position and rotation. This result demonstrates that multilayered approaches in VR hold promise.

In a preliminary study, Wang and Gao [wang2021] showed that multi-attribute authentication methods could also be effective in countering Man-in-the-Room (MITR) attacks. The basic intuition behind their authentication scheme is that objects in the real world are defined by a rich set of features (e.g., colour and shape). In their proof of concept, the user chooses a number of secret features at enrolment time and, at authentication time, the framework creates random objects which might contain none, some, or all of those features. To log into the system, the user must stare at the object that exhibits all the desired features. This approach is particularly interesting because it effectively decouples the input process from the password, increasing the resilience to MITR attacks.

Von Willich et al. [vonwillich2019] discussed the danger that passersby pose to VR users, who are unaware of their surroundings. HMDs impede visual perception of the surrounding environment, allowing bystanders to physically collide with VR users or observe their actions to infer their authentication credentials. As a mitigation, the authors propose detecting passersby with environmental sensors and depicting them in the VR environment, so that the VR user is aware of them, effectively implementing augmented virtuality (AV), a paradigm that enriches the virtual world with information from the real world. This depiction can take different forms, such as avatars, 2D pictures, or 3D renderings. While AV was originally meant for avoiding unwanted collisions with dynamic agents in the real world, the authors argue that it can be used to prevent some bystander attacks. Although AV cannot stop attackers from observing at a distance, it could be used to detect attackers that get close enough to be picked up by the sensors and prevent them from having a proper close-up view of VR users’ motion, thus reducing their chances of inferring useful information.

V VR for Teaching, Training, and Evaluating Security

VR environments have been proposed as a means of improving the cyber and physical security of sensitive locations, as well as for training employees on routines and policies. In addition, VR has been proposed as a way to assess the usability of authentication processes. In this section, we provide a brief overview of these different applications.

V-a Teaching Cyber-Security

In 2017, Puttawong et al. [puttawong2017] noted that the abstract nature of network security can make it hard for students to grasp when taught in traditional lecture-based classes. Therefore, the authors proposed VRFiWall, a VR edutainment game developed with Unity for teaching firewall security. In this game, users interact with the environment using two basic actions, namely pointing and head gestures. This simple game, which instructs its users about stateless firewalls, revolves around a hero on a secret mission from his kingdom (IP source) to another kingdom (IP destination). To complete his mission, the hero has to match the requirements of Statelessa, a non-player character that embodies the firewall.

Following the same philosophy, Visoottiviseth et al. [visoottiviseth2018] proposed Lord of Secure, a serious game for VR that challenges users with different cybersecurity concepts. The game is split into three chapters covering various topics: IP spoofing, flooding, TCP covert channels, firewalls, intrusion detection and prevention systems (IDSes and IPSes), and honeypots. The goal of Lord of Secure is to make abstract security topics easier to digest for cybersecurity students and to evaluate the users’ understanding of core concepts by means of pre-test and post-test quizzes. Among a group of 28 participants, 90% claimed that they understood the concepts covered in the game well, and 82% felt that this approach helped them understand the material better than traditional lectures. Indeed, the pre-test and post-test quizzes clearly showed that users improved their knowledge by playing the game: on average, considering Chapters 1 and 3, 70% of the users improved their scores after playing. Taking into account the users’ enjoyment and their improvements, the authors concluded that edutainment games might be an alternative and more effective way to teach cybersecurity.

Recently, Ulsamer et al. [ulsamer2021] discussed the use of storytelling in VR to improve users’ information security awareness (ISA), in particular with respect to social engineering. In their experiments, they split the participants into two groups, providing one group with a VR environment and the other (named no-VR) with a traditional e-learning platform. The learning objectives set for the two groups were the same. The no-VR group received theoretical material and interesting examples (but no videos) related to social engineering. The VR group, instead, watched a storytelling 3D video with a coherent and immersive plot that revolves around a hacker who performs social engineering; the hacker is played by an actor, and the user follows his steps throughout the story. The analysis showed that the VR group achieved better scores in the ISA test, supporting the hypothesis that VR-based learning can improve cybersecurity awareness and material retention. Since the no-VR group was provided with static material but no traditional videos, it remains unknown whether regular videos could achieve results comparable to VR-based videos.

V-B Training Physical Security

Henrique da Silva et al. [henriquedasilva2015] proposed a VR-based tool to train staff on a nuclear facility’s layout and to plan its defence strategies. The authors produced (and used) a high-fidelity 3D reproduction of a Brazilian nuclear research centre, showing that their approach can be useful for evaluating the physical security of different facilities. In particular, they showed that inside the VR simulation it is possible to identify strategic points of view and alter the environment to evaluate the impact on visibility. For example, their VR implementation allowed them to change weather conditions, such as rain, wind, and snow, as well as natural light conditions (day and night alternation, sun position, presence of stars) and artificial illumination. In conclusion, VR environments can be useful tools, both for evaluating the physical security of critical facilities and for training employees in risk management.

V-C Evaluating Cyber-Security Usability

Mathis et al. [mathis2021repli, mathis2021virsec] noted that evaluating the usability of new authentication schemes can be expensive, in particular when special hardware and infrastructure are required. VR can help address this problem. First, it saves researchers time and money by sparing them from building physical prototypes (e.g., a replica of an ATM). Moreover, it enables them to recruit more participants and increase the diversity of their pools, by allowing them to enrol people from different parts of the world.

With RepliCueAuth, Mathis et al. [mathis2021repli] paved the way for usability studies in VR, discerning which security assessments can be transferred from VR to the real world and which cannot. In their work, the authors replicated in VR the experiments of a paper that explored three cue-based authentication methods on situated displays. The results showed that some usability measures transfer from VR to the real world, such as authentication entry accuracy, perceived workload, and perceived security. However, there are notable differences between the VR experience and the physical study in terms of results: users took longer to authenticate in VR with the touch-based method, whereas in the real world the gaze-based approach was the slowest. Similarly, the security studies conducted in VR and in the real world led to comparable results for some evaluation variables but differed in others. For example, the attack rates for shoulder-surfing attacks were similar when observing a real human being and a VR avatar, but the accuracy of the attackers’ guesses differed.

VI Conclusion

Once a technology used only in exhibitions and research laboratories, VR is becoming more and more common among local businesses and families. Therefore, it is of paramount importance to take into consideration the issues that this groundbreaking technology can create. In this work, we have provided a thorough analysis of the privacy and security threats that affect VR. First, we split the privacy issues into three main categories, explaining what could cause these issues and what their consequences might be. Then, we focused on the security aspects of VR, categorising both the generic threats that apply to VR and the threats specific to it. We have also dedicated a specific section to the topic of authentication in VR, as it appears to be the most active area of research in VR cybersecurity at the time of writing. Finally, we have covered other interesting areas of research that use VR for cybersecurity goals, such as teaching cybersecurity, training physical security, and evaluating the usability of cybersecurity solutions.