Multi-Agent Reinforcement Learning Trajectory Design and Two-Stage Resource Management in CoMP UAV VLC Networks

Citation Author(s):
Mohammad Reza Maleki
Mohammad Robat Mili
Mohammad Reza Javan, Senior Member, IEEE
Nader Mokari, Senior Member, IEEE
Eduard A. Jorswieck, Fellow, IEEE
Submitted by: Mohammad Reza Maleki
Last updated: Fri, 12/03/2021 - 03:12
DOI: 10.21227/4tg2-f112

Abstract 

In this paper, we consider unmanned aerial vehicles (UAVs) equipped with a visible light communication (VLC) access point and coordinated multipoint (CoMP) capability, which allows users to connect to more than one UAV. The UAVs can move in three-dimensional (3D) space with constant acceleration, and a central server is responsible for synchronization and cooperation among them. The effect of this accelerated motion must therefore be taken into account. We define the data rate for each user type, CoMP and non-CoMP. Unlike most existing works, we examine the effects of variable speed on kinetics and radio resource allocation. For the proposed system model, we define two time scales: frames and slots. In each frame, the acceleration of each UAV is specified, and in each slot, radio resources are allocated. The initial velocity in each slot is obtained from the velocity at the end of the previous slot. Our goal is to formulate a multiobjective optimization problem in which the total data rate is maximized and the total communication power consumption is minimized simultaneously. To handle this multiobjective optimization, we first apply the scalarization method and then apply multi-agent deep deterministic policy gradient (MADDPG), a multi-agent method based on deep deterministic policy gradient (DDPG) that ensures stable and fast convergence. We improve this solution method by adding two critic networks together with two-stage resource allocation. Simulation results indicate that constant-acceleration motion of the UAVs performs about $8 \%$ better than conventional motion systems. Furthermore, CoMP enables the system to achieve on average $12 \%$ higher rates compared to a non-CoMP system.
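The frame and slot structure described in the abstract can be illustrated with a minimal sketch of constant-acceleration motion. This is not the authors' code: the function name `propagate_uav`, its parameters, and all numeric values are assumptions chosen for illustration. It only shows how one acceleration per frame and a per-slot velocity hand-over would determine a UAV's 3D trajectory.

```python
import numpy as np

def propagate_uav(p0, v0, accels, slots_per_frame, slot_len):
    """Propagate one UAV under piecewise-constant acceleration.

    Hypothetical helper (not from the dataset):
    p0, v0          -- initial 3D position and velocity
    accels          -- one 3D acceleration vector per frame
    slots_per_frame -- number of slots in each frame
    slot_len        -- slot duration in seconds
    """
    p = np.asarray(p0, dtype=float)
    v = np.asarray(v0, dtype=float)
    positions = [p.copy()]
    velocities = [v.copy()]
    for a in np.asarray(accels, dtype=float):
        # The acceleration is fixed for the whole frame.
        for _ in range(slots_per_frame):
            # Standard constant-acceleration kinematics over one slot.
            p = p + v * slot_len + 0.5 * a * slot_len**2
            # The end-of-slot velocity is the next slot's initial velocity.
            v = v + a * slot_len
            positions.append(p.copy())
            velocities.append(v.copy())
    return np.array(positions), np.array(velocities)

# Illustrative values only: two frames of four 0.5 s slots each.
positions, velocities = propagate_uav(
    p0=[0.0, 0.0, 10.0],
    v0=[1.0, 0.0, 0.0],
    accels=[[0.5, 0.0, 0.0], [0.0, 0.2, -0.1]],
    slots_per_frame=4,
    slot_len=0.5,
)
print(positions[-1], velocities[-1])
```

In a learning setup along the lines of the abstract, the per-frame accelerations would be chosen by the agents, while the radio resource allocation would be decided separately in each slot using the positions produced here.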

Instructions: 

Code was developed by torch data
