Date on Master's Thesis/Doctoral Dissertation
12-2025
Document Type
Master's Thesis
Degree Name
M.S.
Department
Computer Engineering and Computer Science
Degree Program
Computer Science, MS
Committee Chair
Li, Hongxiang
Committee Member
Yu, Rui
Committee Member
Baidya, Sabur
Author's Keywords
Multi-agent reinforcement learning; 3D trajectory planning; Spectrum allocation; Urban Air Mobility (UAM); VD3QN; Spectrum-constrained path optimization
Abstract
Advanced Air Mobility (AAM) and Urban Air Mobility (UAM) are accelerating a transformation of air transportation but face acute spectrum congestion in dense urban environments. Reliable Control and Non-Payload Communications (CNPC) must be maintained at all times to ensure safe operations, even as fleets of aerial vehicles (AVs) transport passengers and cargo between distributed vertiports. We first develop a 2D formulation that jointly optimizes discrete headings, velocities, and spectrum allocation to minimize total mission time while satisfying quality of service (QoS) and collision-avoidance constraints, and we demonstrate significant gains over non-learning and learning baselines. Building on this 2D framework, we then extend it to a full 3D setting by incorporating altitude maneuvers via vertical-angle decisions, which enlarges the feasible maneuver space, enhances deconfliction, and improves energy efficiency while remaining compatible with existing traffic management systems. Both problems are modeled as multi-stage Markov games with tightly coupled decisions and are solved using a cooperative multi-agent deep reinforcement learning algorithm, Value Decomposition Dueling Double Deep Q-Network (VD3QN), which integrates value decomposition with dueling double Deep Q-Networks (DQNs) to jointly plan paths and allocate spectrum. As a non-learning baseline, we implement Orthogonal Multiple Access (OMA) with Space-Time A* (A-star) for sequential path planning under exclusive channel usage. Extensive simulations in both 2D and 3D show that VD3QN consistently outperforms OMA+Space-Time A* and learning-based alternatives such as QMIX across a range of signal-to-interference-plus-noise ratio (SINR) thresholds, channel counts, safety radii, and turning-angle constraints.
The 3D extension further boosts operational efficiency and better captures real-world aviation dynamics, confirming that advancing from 2D to 3D yields tangible benefits for spectrum-aware AAM/UAM operations.
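The abstract names three ingredients of VD3QN: a dueling value/advantage split, a double-DQN target, and value decomposition across agents. The sketch below illustrates how these pieces are commonly combined; it is not taken from the thesis. It assumes an additive (VDN-style) decomposition in which the team value is the sum of per-agent values, and all function names and inputs (dueling_q, double_dqn_next_value, team_td_target, the plain lists of Q-values) are hypothetical stand-ins for network outputs.

```python
from statistics import fmean


def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = fmean(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_next_value(online_q_next, target_q_next):
    """Double DQN: pick the action with the online net, score it with the target net."""
    best_action = max(range(len(online_q_next)), key=online_q_next.__getitem__)
    return target_q_next[best_action]


def team_td_target(reward, gamma, online_q_next_per_agent,
                   target_q_next_per_agent, done=False):
    """Team TD target under an ASSUMED additive decomposition:
    y = r + gamma * sum_i Q_i^target(s', argmax_a Q_i^online(s', a)).
    """
    if done:
        return reward  # no bootstrapping from a terminal state
    next_total = sum(
        double_dqn_next_value(online_q, target_q)
        for online_q, target_q in zip(online_q_next_per_agent,
                                      target_q_next_per_agent)
    )
    return reward + gamma * next_total


if __name__ == "__main__":
    # Two hypothetical agents, two actions each (illustrative numbers only).
    online = [[0.1, 0.9], [0.7, 0.2]]
    target = [[5.0, 2.0], [3.0, 4.0]]
    print(dueling_q(1.0, [1.0, 2.0, 3.0]))
    print(team_td_target(1.0, 0.5, online, target))
```

In this sketch each agent's action would encode a joint choice such as heading, velocity, and channel; how the thesis actually encodes actions and mixes per-agent values is not stated in the abstract.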
Recommended Citation
Li, Qingyang, "From 2D to 3D: Multi-agent reinforcement learning for spectrum-constrained urban air mobility." (2025). Electronic Theses and Dissertations. Paper 4694.
Retrieved from https://ir.library.louisville.edu/etd/4694