Paper Title

Multi-Agent Deep Reinforcement Learning for Policy Optimization in Sequential Data Environments with Partial Observability

Authors

Keywords

  • Multi-Agent Reinforcement Learning
  • Deep RL
  • Partial Observability
  • Policy Optimization
  • Sequential Decision-Making
  • Decentralized Control
  • CTDE

Publication Info

Volume: 6 | Issue: 2 | Pages: 54-62

Published On

March 2025

Abstract

In environments characterized by high temporal complexity and incomplete information, effective policy optimization becomes a core challenge for multi-agent systems. This paper investigates Multi-Agent Deep Reinforcement Learning (MADRL) under partial observability, where agents must learn to act based only on local, noisy observations. We propose a policy learning framework that incorporates recurrent neural networks (RNNs) for memory-based representation and adopts centralized training with decentralized execution (CTDE). The system is evaluated on benchmark decentralized partially observable environments, demonstrating improved stability and policy convergence compared with baseline algorithms. Our findings highlight the potential of causally aware memory policies and attention-driven coordination for solving complex sequential tasks with minimal information.
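
To make the CTDE setup concrete, the sketch below shows one common way such a framework can be structured: a GRU-based actor per agent that conditions only on its local observation history (decentralized execution), and a centralized critic that sees the joint observation during training. This is an illustrative outline in PyTorch, not the authors' implementation; class names, network sizes, and the toy usage at the bottom are assumptions.

# Minimal sketch of a recurrent-actor / centralized-critic (CTDE) layout.
# Not the paper's implementation: names and dimensions are illustrative.
import torch
import torch.nn as nn


class RecurrentActor(nn.Module):
    """Per-agent policy: local observation + hidden state -> action logits."""

    def __init__(self, obs_dim: int, action_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.gru = nn.GRUCell(hidden_dim, hidden_dim)
        self.policy_head = nn.Linear(hidden_dim, action_dim)

    def forward(self, obs: torch.Tensor, hidden: torch.Tensor):
        x = torch.relu(self.encoder(obs))
        hidden = self.gru(x, hidden)        # memory over the observation history
        logits = self.policy_head(hidden)
        return logits, hidden


class CentralizedCritic(nn.Module):
    """Training-time critic: scores the joint observation of all agents."""

    def __init__(self, joint_obs_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(joint_obs_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, joint_obs: torch.Tensor) -> torch.Tensor:
        return self.net(joint_obs)


if __name__ == "__main__":
    n_agents, obs_dim, action_dim = 3, 10, 5
    actors = [RecurrentActor(obs_dim, action_dim) for _ in range(n_agents)]
    critic = CentralizedCritic(joint_obs_dim=n_agents * obs_dim)

    joint_obs = torch.randn(1, n_agents * obs_dim)      # one timestep, all agents
    hiddens = [torch.zeros(1, 64) for _ in range(n_agents)]

    # Decentralized execution: each actor sees only its own observation slice.
    for i, actor in enumerate(actors):
        local_obs = joint_obs[:, i * obs_dim:(i + 1) * obs_dim]
        logits, hiddens[i] = actor(local_obs, hiddens[i])
        action = torch.distributions.Categorical(logits=logits).sample()

    # Centralized training: the critic evaluates the full joint observation.
    value = critic(joint_obs)

The key design point this illustrates is the information asymmetry: the recurrent hidden state carries each agent's memory of its own noisy observation stream, while the critic's access to the joint observation exists only at training time and is discarded at execution.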
