Minimizing Latency to Support VR Social Interactions over Wireless Cellular Systems via Bandwidth Allocation

02/09/2018 ∙ by Jihong Park, et al. ∙ University of Oulu, King's College London, Aalborg University

Immersive social interactions of mobile users are soon to be enabled within a virtual space, by means of virtual reality (VR) technologies and wireless cellular systems. In a VR mobile social network, the states of all interacting users should be updated synchronously and with low latency via two-way communications with edge computing servers. The resulting end-to-end latency depends on the relationship between the virtual and physical locations of the wireless VR users and of the edge servers. In this work, the problem of analyzing and optimizing the end-to-end latency is investigated for a simple network topology, yielding important insights into the interplay between physical and virtual geometries.




I Introduction

Virtual reality (VR) is a key use case for 5G [ABIQualcommVR:17, EjderVR:17, MohammedWCNC:18, NokiaVR:18]. Its emergence is powered by the recent advances in computing, which enable immersive real-time interactions with virtual objects. As announced by Facebook [FBspaces:17] and Microsoft [MicrosoftMR], users will soon be able to interact with each other within virtual communities using VR technologies. In this paper, we consider the problem of supporting VR-based mobile social networks over cellular systems by means of edge computing [ABIQualcommVR:17, EjderVR:17, MohammedWCNC:18, NokiaVR:18].

A key new element of this challenging problem is the discrepancy between the virtual and physical locations of the participating users. Traffic is generated by VR communities in a virtual space, but the supporting network resources for communication and computation are located within the physical network infrastructure. Therefore, the users in the same VR community may not always be co-located in the physical space. For example, in Fig. 1, user C belonging to VR community 1 is close in the physical space to user D affiliated with VR community 2, but far from users A and B in VR community 1. This spatial difference between virtual and physical topologies affects the operation of resource allocation and transmission techniques over both the Radio Access Network (RAN) and the backhaul.

To elaborate, consider VR mobile users that interact in a virtual space. In order for these interactions to be perceived as natural, the network needs to guarantee low latency of e.g.  ms for tactile interactions [ABIQualcommVR:17]. At the same time, all user states should be properly synchronized in the shared virtual environment. Each user’s end-to-end latency is thus dominated by the user from the VR community that experiences the worst latency, accrued due to communication and processing. This, in turn, depends on the physical distribution of users belonging to the same VR community and on the spatial availability of communication and computation resources within the physical network infrastructure.

In this work, we study the problem of supporting a VR mobile social network over a multi-cell wireless cellular system with the goal of minimizing the end-to-end latency. Specifically, we focus on the problem of minimizing the end-to-end latency via the bandwidth allocation of the uplink and downlink channels used for communication between users and computing servers. To this end, we formulate a simple model based on a linear cellular topology that captures the interplay between the social interactions within the VR mobile social network and the location of the computation and communication resources within the physical network. The average end-to-end latency is evaluated by accounting for the contributions of uplink, downlink, and backhaul transmissions, as well as for processing times at the servers. The resulting latency is minimized through a stochastic optimization technique.

Fig. 1: Illustration of a VR mobile social network, where the traffic is generated by virtual-space user interactions and supported by a physical cellular network.

Related Works – Current VR headsets provide wireless connections via WiFi and/or WiGig (60 GHz) technologies using unlicensed frequency bands [Displaylink]. The resulting short-range barrier can be overcome by enabling 5G wireless connections. For such 5G-enabled VR headsets, computing tasks will conceivably be offloaded to edge-cloud servers, in order to overcome the restrictions imposed by the limited computing capability and battery capacity of mobile devices.

The wireless capacity required to support immersive VR experiences has recently been investigated in [EjderVR:17]. To minimize the VR traffic volume, a caching approach has been proposed in [MohammedWCNC:18]. In a VR theater scenario, a multicast design has been studied in [NokiaVR:18]. These works [EjderVR:17, MohammedWCNC:18, NokiaVR:18] focus only on optimizing the downlink operations. For augmented reality (AR) applications, the optimization of both uplink and downlink transmissions in terms of end-to-end latency has been studied in [OsvaldoAR:17] for a single-cell scenario. Due to its focus on AR, the end-to-end latency model of [OsvaldoAR:17] does not take virtual social interactions into account. Finally, virtual social interactions underlie Massively Multiplayer Online (MMO) game applications such as Second Life [SecondLife]. Within more restricted virtual spaces, immersive VR social interactions have recently been provisioned by Facebook and Microsoft [FBspaces:17, MicrosoftMR].

Fig. 2: An illustration of virtual-space traffic flows in a one-dimensional physical network model. In the virtual space, VR users A, B, and C interact with each other within VR community . Their uplink updates , , and are sent to the cloud computing server at via unicast transmissions. The resulting downlink update needs to be sent to all three users for synchronous interactions via multicast transmissions.


II System Model

This section describes the physical network infrastructure, including RAN, backhaul, and computation resources, as well as the VR data traffic model. To focus on the key ideas, we consider a one-dimensional physical network model with two VR communities, as well as two base stations (BSs) in the physical space, as illustrated in Fig. 2.

We use the subscript to indicate VR communities 1 or 2. The subscript identifies the two BSs. The superscript describes uplink (up) or downlink (dn) operations at a BS.

II-A Physical Network and Channel Model

The network under study comprises a set of users and two BSs and . Each BS is equipped with a computing server that supports a single VR community. The computing server for VR community is located at BS , unless otherwise specified. In the virtual space, each VR community includes a subset of users , representing a fraction of the set of users, with . Furthermore, a subset of users associates with BS for both uplink and downlink transmissions in the physical space, representing a fraction , with . There are four types of possible user assignments in the virtual and physical spaces, partitioning the set of users into four subsets that are defined as with and . Each type includes users, representing a fraction . Note that we have the equalities , and .
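The partition of users into four types can be illustrated with a small sketch. The function names and the toy user list are hypothetical and not taken from the paper. The placement of users A and B at the first BS and of users C and D at the second BS is an assumption loosely based on Fig. 1. The sketch only checks that the four type fractions sum to one and that summing them over BSs (or over communities) recovers the per-community (or per-BS) fractions.

```python
from collections import Counter


def type_fractions(assignments):
    """Return the fraction of users of each (community, BS) type.

    `assignments` is a list of (community, bs) pairs with indices in {1, 2}.
    """
    n = len(assignments)
    counts = Counter(assignments)
    return {(i, j): counts.get((i, j), 0) / n for i in (1, 2) for j in (1, 2)}


def marginals(p):
    """Sum type fractions over BSs (per community) and over communities (per BS)."""
    p_v = {i: p[(i, 1)] + p[(i, 2)] for i in (1, 2)}
    p_b = {j: p[(1, j)] + p[(2, j)] for j in (1, 2)}
    return p_v, p_b


# Hypothetical toy configuration: A and B are in community 1 at BS 1,
# C is in community 1 at BS 2 (cross-type), D is in community 2 at BS 2.
users = [(1, 1), (1, 1), (1, 2), (2, 2)]
p = type_fractions(users)
p_v, p_b = marginals(p)
```

In this toy case, the single cross-type user is the one whose uplink data would have to traverse the backhaul.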

The two BSs are located at the edges of a one-dimensional physical space with length , and are connected by a wired backhaul. The BSs use disjoint spectrum bands, hence not interfering with each other. BS assigns orthogonal bands to uplink and to downlink following Frequency Division Duplex (FDD). In the uplink, BS serves each of the associated users via Frequency Division Multiple Access (FDMA) using unicast transmissions. In the downlink, instead, each BS uses orthogonal multicast transmissions in order to update the users in the two VR communities. For a given user configuration , we denote by the bandwidth allocated in the uplink to each user from the subset and by the bandwidth allocated for multicasting to all users in . The users associated with any BS are located at a distance as illustrated in Fig. 2. For the purpose of obtaining worst-case performance results, all users are assumed to lie at the maximum distance . Extensions of the analysis to the more general scenario of arbitrary user-BS distances that are upper bounded by are possible, but call for more cumbersome notation.

Users in the subsets with , denoted as cross-type users, are in VR community , but are associated with BS . The uplink data of these cross-type users must be forwarded through a wired backhaul to BS in order to be processed by its attached server. As defined below, each backhaul transmission entails a random delay with the average value proportional to the distance and the data size.

For the given physical distance between a BS and the assigned user, the signal-to-noise ratio (SNR) is determined by path loss attenuation for and by independent Rayleigh fading. Therefore, the SNR in uplink or downlink for a given user is

where denotes transmit power and indicates the noise variance. The term represents a small-scale fading coefficient that follows an exponential distribution with unit mean. These coefficients are independent and identically distributed (i.i.d.) across users in uplink and downlink. We assume the use of type-I Hybrid Automatic Repeat reQuest (HARQ), while the instantaneous SNR information is unknown at the BSs.
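A minimal sketch of this channel model follows, assuming the common form in which the SNR equals transmit power times distance raised to the negative path loss exponent, times a unit-mean exponential fading gain, divided by the noise variance. The exact expression and symbols of the paper are not recoverable from this text, so the parameter names and the numerical values below are illustrative assumptions.

```python
import random


def mean_snr(tx_power, noise_var, distance, path_loss_exp):
    """Average SNR under distance-based path loss (assumed form, see lead-in)."""
    return tx_power * distance ** (-path_loss_exp) / noise_var


def sample_snr(tx_power, noise_var, distance, path_loss_exp, rng):
    """Draw one instantaneous SNR; Rayleigh fading gives an Exp(1) power gain."""
    fading_gain = rng.expovariate(1.0)  # unit-mean exponential coefficient
    return mean_snr(tx_power, noise_var, distance, path_loss_exp) * fading_gain


# Illustrative values only: power 1, noise variance 1e-3, distance 10, exponent 3.
rng = random.Random(0)
avg = mean_snr(1.0, 1e-3, 10.0, 3.0)
samples = [sample_snr(1.0, 1e-3, 10.0, 3.0, rng) for _ in range(100_000)]
empirical = sum(samples) / len(samples)
```

Because the fading coefficient has unit mean, the empirical average of the samples should approach the path-loss-only SNR.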

II-B Virtual Space Traffic

In the virtual space, all the VR users in community are assumed to interact with each other, directly or indirectly, as seen in Fig. 2. In order to enable these virtual interactions, each user uploads its uplink state update message with size at regular time intervals, and all the users download the common downlink update message with size . We hereafter consider a fixed user allocation in physical and virtual spaces given by . With this given user configuration , we focus on a reference user . This user is selected uniformly at random from the set of users, and thus has a type with probability . For a single state update, the VR traffic of user is characterized by the following phases.

  • Step 1 (Upload) – The user uploads its update message with size bits to the associated BS . If user is of cross-type, i.e. , the uplink data is forwarded to the desired computing server at through the inter-BS wired backhaul;

  • Step 2 (Compute) – The computing server at BS collects all input data from user as well as from its interacting users, and then produces their synchronous output states;

  • Step 3 (Download) – The output states are updated with a common message of size bits to all users through wireless and, for cross-type users, backhaul links.

Note that, in order to carry out Step 2, the computing server needs to collect data from all users in VR community . For this reason, the delay prior to computing is limited by the user with the worst uploading delay in VR community , as described next.
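The three phases can be sketched for a single realization of the delays. The class and function names are hypothetical, and the delay values are made up for illustration. As a simplifying assumption, the backhaul delay is a fixed number that is added to both the upload and the download of cross-type users, and the download delay is the reference user's own.

```python
from dataclasses import dataclass


@dataclass
class UserDelays:
    uplink: float     # wireless upload delay in seconds
    downlink: float   # wireless download delay in seconds
    cross_type: bool  # True if the serving BS does not host the community's server


def end_to_end_latency(ref, community, compute_delay, backhaul_delay):
    """Latency of `ref` for one state update of its VR community."""
    def with_backhaul(wireless, user):
        # Cross-type users incur an extra inter-BS backhaul hop.
        return wireless + (backhaul_delay if user.cross_type else 0.0)

    # Step 1: the server waits for the slowest uploader in the community.
    upload = max(with_backhaul(u.uplink, u) for u in community)
    # Step 3: the common output message is delivered to the reference user.
    download = with_backhaul(ref.downlink, ref)
    # Step 2 sits between the two phases.
    return upload + compute_delay + download


# Made-up delays: users a and b are served by the server's BS, c is cross-type.
a = UserDelays(0.002, 0.001, False)
b = UserDelays(0.003, 0.001, False)
c = UserDelays(0.002, 0.001, True)
latency_a = end_to_end_latency(a, [a, b, c], compute_delay=0.001, backhaul_delay=0.004)
latency_c = end_to_end_latency(c, [a, b, c], compute_delay=0.001, backhaul_delay=0.004)
```

In this example, user a has a fast uplink of its own, yet its latency is set by the cross-type user c, whose upload includes the backhaul hop.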

II-C Physical Space Delay

In this section, we fix the user configuration and the spectrum allocation with and for , and analyze the latency of a reference user with a fixed type . According to the described VR input/output data flow, conditioned on , , and the reference user’s type, the average end-to-end latency of user consists of the average uploading delay , computing delay , and downloading delay as in


We now discuss the three terms in (2). First, the average uploading delay is, as discussed, the worst user’s uploading delay for the users in VR community . Denoting by and the instantaneous uplink wireless and backhaul delays for any user , the average uploading delay is given as


In (3), the expectation is taken over the random number of transmission time slots required by the HARQ process as well as the random backhaul delay, as detailed next.
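The expectation of a worst-case delay can be approximated by simulation. The following Monte Carlo sketch is illustrative only. It assumes that every user needs a geometric number of equal-length transmission slots, and that the backhaul delay of cross-type users is exponentially distributed. The source states only that the backhaul delay is random with a given mean, so the exponential law, the function names, and all parameter values are assumptions.

```python
import random


def geometric(success_prob, rng):
    """Number of attempts until the first success (support 1, 2, ...)."""
    attempts = 1
    while rng.random() >= success_prob:  # failure: retransmit
        attempts += 1
    return attempts


def avg_upload_delay(n_direct, n_cross, slot, success_prob, mean_backhaul, trials, rng):
    """Estimate the mean of the worst uploading delay in one VR community."""
    total = 0.0
    for _ in range(trials):
        delays = [slot * geometric(success_prob, rng) for _ in range(n_direct)]
        delays += [
            slot * geometric(success_prob, rng) + rng.expovariate(1.0 / mean_backhaul)
            for _ in range(n_cross)
        ]
        total += max(delays)  # the server waits for the slowest user
    return total / trials


rng = random.Random(1)
single = avg_upload_delay(1, 0, 1.0, 0.9, 0.5, 20_000, rng)
three = avg_upload_delay(3, 0, 1.0, 0.9, 0.5, 20_000, rng)
with_cross = avg_upload_delay(2, 1, 1.0, 0.9, 0.5, 20_000, rng)
```

With one user, the estimate should be close to the slot length divided by the success probability. With more users, the average of the maximum exceeds that value, which is why the expectation cannot simply be moved inside the maximum.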

The uplink wireless delay in (3) depends on the instantaneous uplink SNRs, which are random due to the small-scale fading coefficients in (1). Specifically, if the instantaneous SNR is no smaller than a target threshold , the received signal is successfully decoded; otherwise, a retransmission occurs. For a target success probability , such that , the threshold is given as
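Under the stated exponential fading model, the relation between the threshold and the target success probability can be sketched as follows. The probability that a unit-mean exponential gain scaled by the average SNR exceeds a threshold is the exponential of minus the threshold over the average SNR. Setting this equal to the target and solving gives the threshold as the average SNR times the natural logarithm of the inverse target. The function names and numbers are hypothetical, and the formula is derived here from the stated assumptions rather than copied from the paper.

```python
import math


def snr_threshold(avg_snr, target_success_prob):
    """Threshold such that P(SNR >= threshold) equals the target.

    Assumes SNR = avg_snr * g, with g exponential with unit mean.
    """
    return avg_snr * math.log(1.0 / target_success_prob)


def success_prob(avg_snr, threshold):
    """P(SNR >= threshold) for exponentially distributed SNR with mean avg_snr."""
    return math.exp(-threshold / avg_snr)


# Illustrative values: average SNR of 10 (linear scale), 90% target success.
theta = snr_threshold(10.0, 0.9)
check = success_prob(10.0, theta)
```

A higher target success probability lowers the threshold, which reduces retransmissions at the cost of a lower rate per attempt.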


As a result, the number of transmission attempts by the -th user follows a geometric distribution with mean . Measuring the achievable rate via Shannon capacity, each transmission lasts for seconds, where the uplink spectrum allocation equals if user . The total uplink wireless transmission delay of the user is hence given as