I Introduction
With the rapid advancement of embedded systems, sensors, and artificial intelligence in recent decades, autonomous driving has evolved from an impossible dream into a foreseeable reality, and will eventually reshape future transport systems and the driving experience [10]. As a key enabling technology, cooperative perception [9] exploits vehicle-to-everything (V2X) communications to exchange sensory messages among neighboring vehicles and roadside units, and offers a bird's-eye view of the broader road environment. In [8], a remarkable perception performance gain over traditional separate perception is validated through theoretical analysis, regardless of the additional communication and computation overhead. Nevertheless, when the number of involved vehicles is , the required communication capacity reaches Mbps [6, 2], which overwhelms the limited communication bandwidth and computation power in vehicular networks (VNETs) and restricts the implementation of cooperative perception.
To balance the volume of exchanged messages in cooperative perception against the limited resources in VNETs, some efforts have been made toward the joint allocation of sensing, communication, and computation resources. Higuchi et al. [4] attempt to reduce the redundancy of cooperative perception messages by selecting the more valuable ones for delivery. Specifically, the value of each message is first quantified based on message history and current mobility, and the more valuable messages are then transmitted via V2X communications. In [1], a quadtree-based compression mechanism is utilized to partition sensory information into different block messages. Then, the vehicular association, resource allocation, and block message selection are jointly optimized with deep reinforcement learning (DRL) to maximize the satisfaction of cooperative perception. Du et al. [3] go a step further by proposing a platoon-based cooperative perception framework, in which the perception scheduling, computation strategies, and communication resource allocation are jointly optimized to minimize the delay of sensing tasks.
The aforementioned works [4, 1, 3] commonly concentrate on direct sharing among vehicles, leaving the more general problem of VNET-assisted cooperative perception less investigated. By taking advantage of hierarchical computation resources and flexible resource management, fog-based vehicular networks (FVNETs) [11] provide a promising edge-cloud collaboration framework. In this paper, we examine how to realize cooperative perception with constrained resources in FVNETs, where multiple vehicles offload individual sensory messages to fog access points (FAPs) or the cloud server for cooperative computation. The main contributions are threefold.

Taking the spatial-temporal values of sensory messages and the latency performance into consideration, we formulate a joint sensing block message selection, communication, and computation resource allocation problem to maximize the sum satisfaction of cooperative perception, while satisfying the maximum tolerant latency and sojourn time constraints.

The non-convex sum satisfaction maximization problem is decoupled into a joint block message and offloading mode selection problem plus a joint communication and computation resource allocation problem. The two problems are solved with multi-agent DRL and swap matching, respectively, to maximize the sum satisfaction.

Through simulation results, we show that our proposed scheme can significantly improve the sum satisfaction of cooperative perception in FVNETs.
II System Model and Problem Formulation
We consider a cooperative perception scenario in FVNETs, which consists of vehicular user equipments (VUEs), FAPs, and remote radio heads (RRHs) connected to a centralized cloud server. As shown in Fig. 1, each VUE adopts the region quadtree technique [1] to compress its sensory data into independent block messages . Here, and denote the block sets of VUE ’s sensory region and the whole region, respectively. With efficient machine learning methods, the cloud server can implicitly estimate VUE ’s trajectory and sojourn time with FAP for time , where are coordinates.
For cooperative perception, each VUE requests a map construction task that covers its front region with a length of . To this end, the cloud server chooses different VUEs to constitute independent cooperative clusters, and selects different block messages from cooperative VUEs for uploading. In this article, we assume that the VUEs associated with the same FAP or the cloud server form a cooperative cluster, and that partial messages can be migrated among FAPs and the cloud server via fiber backhaul links.
Once these uploaded messages are received, the cloud server and FAPs build customized maps for individual VUEs by sequentially realigning, integrating, inferring, and projecting the collected messages. We define a tuple to refer to VUE ’s task, whose elements denote the data size of a block message, the number of CPU cycles required to process 1 bit of input, and ’s maximum tolerant latency, respectively.
II-A Spatial-Temporal Model in Cooperative Perception
To quantify the values of block messages in cooperative perception, the spatial-temporal model is detailed as follows.
From the perspective of timeliness, the value of VUE ’s block message decreases from its time of generation at until it hits its deadline at . A linearly decreasing function is used to define the temporal value of VUE ’s block message at time as
(1) 
where is the message’s initial value. However, since the sensing time and the sensing period are unknown at the cloud server, we model the temporal value as a Markov process to characterize its randomness. For VUE , the set of potential temporal values is defined as . is the probability that the temporal value of VUE ’s block message at time is if its value in time is .
As for the spatial dimension, each VUE has a rectangular region of interest in which closer sensory blocks are more valuable to the VUE. With VUE ’s and block ’s positions , , the Euclidean distance between VUE and block and the angle between VUE ’s moving direction and block can be easily calculated, as illustrated in Fig. 1. Thus, we define the spatial value of block for VUE as
(2) 
Finally, the spatial-temporal value of VUE ’s block message for VUE can be calculated by .
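The temporal part of this model can be illustrated with a minimal Python sketch. The exact form of (1) and the transition probabilities are not recoverable here, so the linear decay below (full value at generation, zero at the deadline), the value levels, and the transition matrix are all assumptions chosen for illustration; the function names are hypothetical.

```python
import random

def linear_temporal_value(v0, t, t_gen, t_deadline):
    """Assumed linear decay: v0 at generation time, 0 at the deadline."""
    if t <= t_gen:
        return v0
    if t >= t_deadline:
        return 0.0
    return v0 * (t_deadline - t) / (t_deadline - t_gen)

def step_markov_value(current, levels, transition, rng):
    """Sample the next temporal-value level from a row-stochastic matrix."""
    row = transition[levels.index(current)]
    return rng.choices(levels, weights=row, k=1)[0]

# Hypothetical value levels and transition probabilities (illustrative only).
levels = [1.0, 0.5, 0.0]
transition = [
    [0.6, 0.4, 0.0],  # a fresh message may stay fresh or age
    [0.0, 0.7, 0.3],  # an aged message may stay aged or expire
    [1.0, 0.0, 0.0],  # an expired message is replaced by a fresh one
]

rng = random.Random(0)
value = levels[0]
trajectory = [value]
for _ in range(5):
    value = step_markov_value(value, levels, transition, rng)
    trajectory.append(value)
```

Since the cloud server does not know the sensing time, it would only observe a trajectory such as `trajectory`, which is why the temporal value is treated as a random state rather than computed from the decay function directly.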
II-B Communication and Computation Models
As illustrated above, includes the block message uploading, joint processing, and output downloading processes. As in [3], we ignore the download latency. Denoting and as the uploading and computation latencies, we have ’s sum latency . Moreover, each task can be handled in one of two possible modes: the cloud mode and the FAP mode. Let denote the offloading mode selection of VUE : if , VUE chooses the cloud mode; if , VUE offloads the task to FAP . Next, we describe the communication and computation models considered in both modes in sequence.
II-B1 Communication Model
In FVNETs, the total bandwidth is divided into resource blocks (RBs) with bandwidth each and full frequency reuse is considered for all FAPs and RRHs. Define as the RB allocation indicator. if RB is allocated to VUE and , otherwise. Without loss of generality, each VUE is allocated a single RB for uploading. For VUE associating with FAP , its uploading rate can be expressed as
(3) 
where , , and are the transmit power, noise power, and channel gain from VUE to FAP on RB at time , respectively.
When VUE selects the cloud mode, it associates with its nearby RRHs, whose set is denoted as . Meanwhile, optimal linear detection, i.e., minimum mean square error (MMSE) detection, is performed at the cloud server to mitigate inter-RRH interference. Thus, the uplink rate of VUE in the cloud mode at time is
(4) 
where is the interference and is the MMSE detection vector. is the channel gain from VUE to its associated RRHs on RB at time , while is the channel from VUE to the associated RRHs of VUE .
Note that the offloaded task starts computing only when all cooperative VUEs finish uploading. Therefore, the communication latency of task is determined by the maximum one among cooperative VUEs, i.e.,
(5) 
where is the fronthaul delay. denotes the block message selection of VUE : if , then block message of VUE is selected for uploading.
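As a rough illustration of the FAP-mode rate and the cluster upload latency described above, the following Python sketch assumes a Shannon-type rate with co-channel interference treated as noise and an upload latency equal to the slowest cooperative VUE plus an optional extra (fronthaul) delay. The function names, argument names, and numbers are hypothetical, and the MMSE-based cloud-mode rate in (4) is not modeled.

```python
import math

def uplink_rate(bandwidth_hz, tx_power, gain, interference, noise_power):
    """Assumed Shannon rate (bit/s) on one RB, interference treated as noise."""
    sinr = tx_power * gain / (interference + noise_power)
    return bandwidth_hz * math.log2(1.0 + sinr)

def cluster_upload_latency(num_blocks, block_bits, rates, extra_delay=0.0):
    """Upload latency of a cooperative cluster: the slowest VUE dominates.

    num_blocks[i] is the number of block messages selected from VUE i,
    rates[i] is its uplink rate in bit/s, and extra_delay stands in for
    the fronthaul delay (assumed to apply in the cloud mode only).
    """
    per_vue = [n * block_bits / r for n, r in zip(num_blocks, rates) if n > 0]
    return (max(per_vue) if per_vue else 0.0) + extra_delay

# Illustrative numbers only.
rate = uplink_rate(1e6, tx_power=1.0, gain=3.0, interference=0.0, noise_power=1.0)
latency = cluster_upload_latency([2, 1], block_bits=1000, rates=[2000.0, 500.0],
                                 extra_delay=0.5)
```

In this toy case the second VUE uploads fewer blocks but over a slower link, so it determines the cluster latency, which is the effect that motivates minimizing the maximum rather than the average upload time.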
II-B2 Computation Model
For cooperative perception, the offloaded task has to process all uploaded block messages within its region of interest. We assume that the computation capabilities of the cloud server and each FAP are characterized by the maximum CPU-cycle frequencies and . With CPU-cycle frequency allocated to VUE , the computation latency of task can be obtained by
(6) 
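A minimal Python sketch of this latency, assuming it equals the total number of uploaded bits times the CPU cycles needed per bit, divided by the allocated CPU-cycle frequency. The function and argument names are hypothetical and the numbers are illustrative only.

```python
def computation_latency(num_blocks, block_bits, cycles_per_bit, cpu_hz):
    """Assumed computation latency in seconds for one offloaded task."""
    total_cycles = num_blocks * block_bits * cycles_per_bit
    return total_cycles / cpu_hz

# 10 blocks of 8000 bits, 100 cycles per bit, 1 GHz allocated.
t_comp = computation_latency(10, 8000, 100, 1e9)
```

With these numbers the task needs 8e6 cycles, so `t_comp` is 8 ms; doubling the allocated frequency would halve it.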
II-C Problem Formulation
In FVNETs, the cloud server and FAPs aim to finish VUEs’ offloaded tasks with more valuable block messages as quickly as possible. Therefore, we define the satisfaction of VUE at time as
(7) 
where and are weight parameters. The first part captures the overall spatial-temporal values of block messages from cooperative VUEs, and the second part captures the effect of task latency.
The goal of this work is to maximize the long-term sum satisfaction of cooperative perception in FVNETs by optimizing the offloading mode selection scheme , the sensory block message selection scheme , the uplink RB allocation scheme , and the CPU-cycle frequency allocation scheme . Mathematically, it can be formulated as
(8)  
where constraint (8a) requires that the overall latency of each VUE be less than both the maximum tolerant latency of and its sojourn time , constraint (8b) ensures that the uploaded block messages of different VUEs do not overlap, constraints (8c) and (8d) mean that each VUE is allocated a single mode and a single RB, and constraint (8e) ensures that the allocated CPU-cycle frequencies do not exceed the frequency budgets .
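The constraints (8a) to (8e) can be read as a feasibility test on a candidate decision. The Python sketch below is one possible encoding under stated assumptions: modes and RBs are given as binary indicator matrices, column `m` of the mode matrix indexes a server (the cloud or an FAP), and block overlap is checked across all VUEs. The function name and data layout are hypothetical.

```python
def check_constraints(latency, max_latency, sojourn, blocks,
                      mode, rb, cpu, cpu_budget):
    """Return True if a candidate decision satisfies (8a) to (8e)."""
    n = len(latency)
    # (8a) latency within both the tolerant latency and the sojourn time
    if any(latency[i] > min(max_latency[i], sojourn[i]) for i in range(n)):
        return False
    # (8b) uploaded block messages of different VUEs must not overlap
    seen = set()
    for selected in blocks:
        if seen & selected:
            return False
        seen |= selected
    # (8c), (8d) exactly one mode and one RB per VUE
    if any(sum(row) != 1 for row in mode) or any(sum(row) != 1 for row in rb):
        return False
    # (8e) CPU-cycle frequencies within each server's budget
    for m, budget in enumerate(cpu_budget):
        used = sum(cpu[i] for i in range(n) if mode[i][m] == 1)
        if used > budget:
            return False
    return True

# Two VUEs: VUE 0 on server 0, VUE 1 on server 1 (illustrative numbers).
ok = check_constraints(
    latency=[0.04, 0.03], max_latency=[0.05, 0.05], sojourn=[1.0, 1.0],
    blocks=[{1, 2}, {3}], mode=[[1, 0], [0, 1]], rb=[[1, 0], [0, 1]],
    cpu=[2e9, 1e9], cpu_budget=[4e9, 1e9],
)
```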
III Joint Sensing, Communication, and Computation Resource Allocation
Note that the formulated problem (8) is a mixed-integer nonlinear programming problem, which is in general intractable. In this section, we decouple the original problem into a joint mode selection and block selection subproblem plus a joint communication and computation resource allocation subproblem. Afterwards, multi-agent DRL and swap matching-based algorithms are developed.
III-A Optimization of Offloading Mode and Block Selection
Under the circumstance that communication and computation resource allocation scheme has been determined, the original problem can be reformulated as
(9)  
Since the block message’s value is dynamic and unknown, problem (9) cannot be solved with traditional optimization methods. Inspired by [12], we resort to multi-agent DRL to handle this uncertainty.
III-A1 Multi-Agent Markov Game
To be concrete, we treat every VUE as an intelligent agent and convert problem (9) into a Markov game with agents. At time , each agent observes its local state and selects its own action . Given the joint actions taken by the agents, the FVNET feeds back a new state and an immediate reward . The formal definitions of these three elements are given in the following.
State: Each agent ’s local state includes the following parts: , the maximum tolerant latency of VUE ; , its own coordinates; , the temporal value of VUE ’s sensory messages at time ; , the number of VUEs served by the cloud server and FAPs at time ; and , the latency satisfaction ratio.
Action: Consistent with (9), each agent selects the offloading mode and uploading messages .
Reward: We consider this game as a fully cooperative one and define the immediate reward function of each VUE as the sum satisfaction, i.e., .
III-A2 Attention Multi-Agent Deep Deterministic Policy Gradient (DDPG)-Based Algorithm
Due to non-stationary local states and large action spaces, canonical DRL algorithms, such as the deep Q-network, are often ineffective. Inspired by [5, 7], the multi-agent DDPG algorithm is augmented with an attention mechanism in this article to tackle this problem, and is referred to as attention multi-agent DDPG.
In attention multi-agent DDPG, each agent has its own actor network and critic network , acting as the policy and the policy evaluator, respectively. Note that the critic network considers the global states and actions instead of the local ones, in order to address the non-stationarity and cooperation issues. Furthermore, two target networks ( and ) and experience replay are applied to stabilize training and to remove data correlation. We denote as the replay buffer with capacity and express the stored experience of all agents as the tuple . Here, we omit the time index and use the mark to denote time for simplicity.
Specifically, the actor network is responsible for finding a deterministic policy that maximizes the cumulative discounted reward , where is the discount factor. Then, can be obtained by
(10) 
where is the stochastic noise to encourage exploration. For its updating, the parameter is directly adjusted in the direction of , which is given by
(11) 
Here, is the action-value function established by the critic network . It takes the states and actions of all agents as input and outputs the Q-value for agent . With , the weights of agent ’s critic network can be updated by minimizing the MSE-based loss function , i.e.,
(12) 
where . Here, and are agent ’s target critic and actor networks with weights and , respectively.
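The critic update can be sketched in plain Python under the standard multi-agent DDPG form, which is assumed here because (12) is not recoverable: the target is the reward plus the discounted target-critic value of the next joint state and the joint action produced by the target actors, and the loss is the mean squared error against that target. The networks are replaced by toy callables, and all names are hypothetical.

```python
def td_targets(batch, target_actors, target_critic, gamma):
    """Assumed target: y = r + gamma * Q'(s', mu'_1(o'_1), ..., mu'_N(o'_N))."""
    targets = []
    for _states, _actions, reward, next_states in batch:
        next_actions = [mu(o) for mu, o in zip(target_actors, next_states)]
        targets.append(reward + gamma * target_critic(next_states, next_actions))
    return targets

def critic_loss(batch, critic, targets):
    """Mean squared error between critic outputs and the TD targets."""
    errors = [(critic(s, a) - y) ** 2
              for (s, a, _r, _ns), y in zip(batch, targets)]
    return sum(errors) / len(errors)

# Toy stand-ins for the networks of a two-agent system (illustrative only).
target_actors = [lambda o: 2.0 * o, lambda o: 2.0 * o]
target_critic = lambda states, actions: sum(states) + sum(actions)
critic = lambda states, actions: 0.0

# One stored experience: (states, actions, reward, next_states).
batch = [([0.0, 0.0], [0.0, 0.0], 1.0, [1.0, 2.0])]
targets = td_targets(batch, target_actors, target_critic, gamma=0.5)
loss = critic_loss(batch, critic, targets)
```

In an actual implementation the callables would be neural networks and the loss would be minimized by gradient descent on the critic weights; the sketch only shows how the centralized critic consumes the joint states and actions.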
Furthermore, we augment the critic network with an attention mechanism, which facilitates fine-grained and discriminative treatment of different VUEs. To this end, we first customize a multilayer perceptron (MLP) network to reduce the input dimension and extract the high-level features . Then, agent ’s attention value for the other agents can be calculated by . Here, and denote agent ’s attention weight and value for agent , which are given by
(13) 
(14) 
Here, the parameter matrices and constitute a bilinear mapping for and . In addition, is a ReLU function and is a transform matrix. With the derived high-level feature and the attention value , we can establish a new MLP network to replace the original critic network .
III-B Optimization of RB and CPU-Frequency Allocation
Once the offloading modes and selected blocks have been determined, the original problem is reduced into
(15)  
Hereafter, the RBs and frequencies are successively allocated.
III-B1 Swap Matching-Based RB Allocation
With fixed CPU frequencies, we first model the RB allocation problem as a one-to-one matching game and solve it in a distributed and low-complexity manner.
Formally, we define as the matching function for the associated VUE set and the bandwidth set , which has the following properties
(16)  
where condition 1) implies that if RB matches with VUE , VUE also matches with RB ; condition 2) gives that each RB can be matched with one VUE, and each VUE can only be matched with one RB in turn.
The utility of VUE is defined as the uploading rate on RB , i.e.,
(17) 
which indicates that every VUE selfishly favors the RBs with higher uploading rates. As for RB , it aims to minimize the overall communication latency, which equals the maximum one among cooperative VUEs. Thus, we define the utility of RB as
(18) 
Based on the above analysis, a swap matching-based RB allocation algorithm is developed in Algorithm 2. To start the matching process, each VUE and RB establish their own preference lists and in descending order. A deferred acceptance algorithm is then adopted for the initial matching. To cope with the dynamics introduced by inter-cell interference, two matched VUEs can exchange their matched RBs whenever doing so reduces the maximum communication latency.
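The two stages described above can be sketched in Python. This is not Algorithm 2 itself but a simplified illustration under several assumptions: the upload latency of each VUE on each RB is given as a fixed table `latency[i][k]` (so the interference coupling between swaps is ignored), each RB prefers the VUE with the lowest latency on it, and there are at least as many RBs as VUEs. All names are hypothetical.

```python
from itertools import combinations

def deferred_acceptance(latency):
    """Initial one-to-one matching: VUEs propose to RBs in preference order."""
    n_vue, n_rb = len(latency), len(latency[0])
    assert n_rb >= n_vue, "sketch assumes at least one RB per VUE"
    prefs = [sorted(range(n_rb), key=lambda k: latency[i][k])
             for i in range(n_vue)]
    next_choice = [0] * n_vue
    holder = {}                      # RB index -> VUE currently accepted
    free = list(range(n_vue))
    while free:
        i = free.pop()
        k = prefs[i][next_choice[i]]
        next_choice[i] += 1
        j = holder.get(k)
        if j is None:
            holder[k] = i            # RB was unmatched: accept
        elif latency[i][k] < latency[j][k]:
            holder[k] = i            # RB prefers the new proposer
            free.append(j)
        else:
            free.append(i)           # proposal rejected
    match = [None] * n_vue
    for k, i in holder.items():
        match[i] = k
    return match

def swap_matching(latency, match):
    """Swap the RBs of two VUEs whenever the maximum latency decreases."""
    match = list(match)
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(len(match)), 2):
            current = max(latency[v][match[v]] for v in range(len(match)))
            match[i], match[j] = match[j], match[i]
            swapped = max(latency[v][match[v]] for v in range(len(match)))
            if swapped < current:
                improved = True
            else:
                match[i], match[j] = match[j], match[i]  # undo the swap
    return match

# Two VUEs and two RBs (illustrative latencies in arbitrary units).
latency = [[1.0, 4.0], [2.0, 9.0]]
initial = deferred_acceptance(latency)
final = swap_matching(latency, initial)
```

In this toy case the initial matching gives each VUE the outcome of rate-driven proposals, which leaves the second VUE on a very slow RB; the swap stage then trades RBs so that the maximum latency drops from 9 to 4. Because every accepted swap strictly reduces the maximum latency and the number of matchings is finite, the loop terminates.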
III-B2 CPU-Cycle Frequency Allocation
Given , problem (15) is a convex optimization problem with respect to and can be readily solved by the interior-point method.
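To give some intuition for this step, consider a simplified instance that is an assumption rather than problem (15) itself: a single server with a total frequency budget minimizes the sum of computation latencies, each equal to a workload in CPU cycles divided by the allocated frequency, with the latency constraints ignored. For this instance the KKT conditions give a closed form in which each share is proportional to the square root of the workload, so no interior-point solver is needed. The function name is hypothetical.

```python
import math

def allocate_cpu(workloads, budget):
    """Closed-form minimizer of sum(c_i / f_i) subject to sum(f_i) <= budget.

    Stationarity gives c_i / f_i**2 = lambda for all i, hence f_i is
    proportional to sqrt(c_i); the budget constraint is tight at optimum.
    """
    roots = [math.sqrt(c) for c in workloads]
    total = sum(roots)
    return [budget * r / total for r in roots]

def sum_latency(workloads, freqs):
    return sum(c / f for c, f in zip(workloads, freqs))

# Illustrative workloads (cycles) and budget (cycles per second).
workloads = [1.0, 4.0]
freqs = allocate_cpu(workloads, budget=3.0)
best = sum_latency(workloads, freqs)
equal_split = sum_latency(workloads, [1.5, 1.5])
```

Here the heavier task receives twice the frequency of the lighter one, and the resulting sum latency is lower than under an equal split. With the latency and sojourn constraints of the full problem added, a general-purpose convex solver would be the safer choice.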
IV Simulation Results and Analysis
In this section, simulation results are provided to verify the sum satisfaction of cooperative perception in FVNETs. We deploy RRHs, FAP, and VUEs on a meter-long road. The whole road is equally divided into blocks and the size of each block message is kbits with a level region quadtree. For cooperative perception, each VUE requests a task to expand its sensory range by m with and ms. In addition, the total bandwidth is MHz, while the cloud server and the FAP have maximum computation resources of GHz and GHz, respectively.
Fig. 2 shows the convergence performance of the proposed attention multi-agent DDPG-based algorithm when the number of VUEs is . It is seen that our proposed algorithm converges to a stable sum satisfaction.
In Fig. 3, the sum satisfaction performance is shown for different desired expanding distances. For comparison, a distance-based mode selection and full message uploading scheme is adopted as the baseline. It can be observed that our proposed attention multi-agent DDPG algorithm outperforms the baseline, because it selects more valuable block messages and avoids undesired message uploading, while balancing the load among FAPs and the cloud server. Especially when the network load is large (), our proposed algorithm can effectively balance the task load against the constrained resources by controlling the selection of block messages.
Fig. 4 verifies the convergence of Algorithm 2 with different numbers of VUEs, where the optimal latency derived by exhaustive search is adopted for comparison. It can be seen that our proposed swap matching algorithm converges to a stationary point within iterations, while its latency performance is close to that of the exhaustive search algorithm.
Fig. 5 evaluates the average latency performance of different communication RB allocation schemes when the numbers of VUEs are set as . For comparison, the exhaustive search and the matching-based sum rate maximization algorithms are selected as two baselines, where the former offers the optimal average latency. It can be observed that our proposed algorithm outperforms the max sum rate algorithm, and their latency gap becomes larger as the number of involved VUEs increases. This is because, on the one hand, the uploading latency of cooperative perception in (5) depends on the maximum one among all cooperative VUEs rather than the overall latencies of all VUEs; on the other hand, the inter-cell interference has a larger effect on the VUEs with lower channel gains. In addition, it is also observed that our proposed algorithm achieves latency performance close to that of the exhaustive search algorithm.
V Conclusion
In this article, we focus on a vehicular cooperative perception scenario with ultra-low latency requirements, and propose a joint sensing, communication, and computation resource allocation scheme with multi-agent DRL and swap matching for FVNETs, in which multiple VUEs constitute cooperative clusters to offload their computation tasks to either the cloud server or FAPs. Simulation results have verified the effectiveness and superiority of our proposed algorithms in terms of sum satisfaction and latency performance. In the future, it would be interesting to incorporate radio sensing and investigate the corresponding resource allocation for cooperative perception.
Acknowledgment
This work was supported in part by the National Natural Science Foundation of China under No. 61921003 and 62001053, and the Fundamental Research Funds for the Central Universities under No. 24820202020RC11 and 2020RC03.
References
[1] (2020-12) Vehicular cooperative perception through action branching and federated reinforcement learning. arXiv:2012.03414 [cs].
[2] (2018-09) Connected roads of the future: use cases, requirements, and design considerations for vehicle-to-everything communications. IEEE Veh. Technol. Mag. 13 (3), pp. 110–123.
[3] (2020-12) Cooperative sensing and task offloading for autonomous platoons. In Proc. IEEE GLOBECOM, Taipei, Taiwan, pp. 1–6.
[4] (2019-06) Value-anticipating V2V communications for cooperative perception. In Proc. IEEE IV, Paris, France, pp. 1947–1952.
[5] (2019-05) Actor-attention-critic for multi-agent reinforcement learning. In Proc. ICML, California, USA, pp. 2961–2970.
[6] (2020-02) Cooperative LIDAR object detection via feature sharing in deep networks. arXiv:2002.08440 [cs].
[7] (2019) Application of machine learning in wireless networks: key techniques and open issues. IEEE Commun. Surv. Tutor. 21 (4), pp. 3072–3108.
[8] (2018-06) Deployment and performance of infrastructure to assist vehicular collaborative sensing. In Proc. IEEE VTC Spring, Porto, Portugal, pp. 1–5.
[9] (2021-05) Machine-learning-enabled cooperative perception for connected autonomous vehicles: challenges and opportunities. IEEE Netw. 35 (3), pp. 96–101.
[10] (2020-02) Mobile edge intelligence and computing for the internet of vehicles. Proc. IEEE 108 (2), pp. 246–261.
[11] (2021-02) Delay-optimized resource allocation in fog-based vehicular networks. IEEE Internet Things J. 8 (3), pp. 1347–1357.
[12] (2020-07) Deep-reinforcement-learning-based mode selection and resource allocation for cellular V2X communications. IEEE Internet Things J. 7 (7), pp. 6380–6391.