Feat/happo implementation #1151
base: develop
Hi @ch33nchan, thanks for the contribution. Just a heads-up: the team is on holiday until early January, so we won't be able to review this until then. One note, though: if you would like to contribute this, please stay in line with Mava's style of doing things. HAPPO should not look too different from our current MAPPO implementation, e.g. keep the same code structure and place things in the relevant existing folders 🙏
Hi @ch33nchan, are you able to update this to be more in line with our current implementations?
Hi again @ch33nchan, I see you're creating new folders (
HAPPO Algorithm Implementation in Mava
Overview
This implementation introduces the HAPPO (Heterogeneous-Agent Proximal Policy Optimization) algorithm into the Mava repository. HAPPO is an extension of the PPO algorithm designed for multi-agent reinforcement learning, featuring a sequential update scheme and a centralized critic. This implementation ensures compatibility with the existing structure and components of the Mava repository.
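As a rough illustration of the sequential update scheme mentioned above, the sketch below follows the general HAPPO recipe: agents are visited in a random order, and each agent's advantage is weighted by the product of the probability ratios of the agents updated before it. It is plain Python rather than Mava's JAX code; the function name `sequential_advantage_weights` is hypothetical and not part of this PR, and the post-update ratios are passed in as plain numbers purely for illustration.

```python
import random


def sequential_advantage_weights(ratios, seed=0):
    """Return (order, weights) for one sequential update pass.

    ratios[i] is the new/old policy probability ratio of agent i for a
    single sample (assumed here to be known once agent i has been updated).
    weights[i] is the factor that multiplies the shared advantage when
    agent i is updated.
    """
    rng = random.Random(seed)
    order = list(range(len(ratios)))
    rng.shuffle(order)  # draw a random agent permutation

    weights = [None] * len(ratios)
    running = 1.0  # product of the ratios of agents updated so far
    for agent in order:
        weights[agent] = running  # only predecessors' ratios are included
        running *= ratios[agent]  # fold this agent's ratio in for successors
    return order, weights


# Hypothetical usage with three agents.
order, weights = sequential_advantage_weights([1.1, 0.9, 1.2])
print(order, weights)
```

The first agent in the permutation always gets a weight of 1.0, so its update reduces to an ordinary PPO step; later agents see an advantage rescaled by what their predecessors have already changed.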
Changes and Implementation
1. HAPPO Algorithm Implementation
File: mava/algorithms/happo.py
Description:
HAPPO class inheriting from the base Algorithm class.
update method to perform sequential updates for each agent's policy using the clipped surrogate objective.
Key Points:
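The clipped surrogate objective used in each agent's update can be sketched as follows. This is a minimal plain-Python version of the standard PPO clipping rule, not the code from this PR; the function names and the default `clip_param=0.2` are assumptions for illustration. In HAPPO the advantage would additionally be scaled by the ratios of agents updated earlier in the sequence, which is left to the caller here.

```python
def clipped_surrogate(ratio, advantage, clip_param=0.2):
    """PPO-style clipped surrogate for one sample (a quantity to maximise)."""
    # Restrict the probability ratio to [1 - clip_param, 1 + clip_param].
    clipped_ratio = max(1.0 - clip_param, min(ratio, 1.0 + clip_param))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


def agent_objective(ratios, advantages, clip_param=0.2):
    """Mean clipped surrogate over a batch of samples for a single agent."""
    values = [
        clipped_surrogate(r, a, clip_param)
        for r, a in zip(ratios, advantages)
    ]
    return sum(values) / len(values)


# Hypothetical usage: two samples for one agent.
print(agent_objective([1.5, 0.5], [1.0, -1.0]))
```

Taking the minimum means a large ratio cannot inflate the objective when the advantage is positive, while a harmful move (negative advantage, large ratio) is still penalised in full.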
2. Configuration
File: mava/configs/happo_config.py
Description:
HAPPOConfig class inheriting from the FFIPPOConfig class found in mava.configs.system.ppo.ff_ippo.
Key Points:
clip_param, num_agents, and lr.
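A config carrying these three parameters might look roughly like the sketch below. It uses a stand-in base dataclass instead of importing anything from Mava, so `BasePPOConfig`, `HAPPOConfigSketch`, and all default values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BasePPOConfig:
    """Hypothetical stand-in for the parent PPO config (not Mava's class)."""

    clip_param: float = 0.2  # illustrative default for the clipping range
    lr: float = 3e-4  # illustrative default learning rate


@dataclass
class HAPPOConfigSketch(BasePPOConfig):
    """Adds the agent count needed for per-agent sequential updates."""

    num_agents: int = 2  # illustrative default

    def __post_init__(self):
        # Basic sanity checks on the three parameters named above.
        if self.num_agents < 1:
            raise ValueError("num_agents must be at least 1")
        if not 0.0 < self.clip_param < 1.0:
            raise ValueError("clip_param must lie in (0, 1)")
        if self.lr <= 0.0:
            raise ValueError("lr must be positive")


# Hypothetical usage: override only the agent count.
config = HAPPOConfigSketch(num_agents=3)
print(config)
```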
3. Training Script
File: scripts/train_happo.py
Description:
Trainer class.
Key Points:
Trainer class.
4. Integration with Existing Components
Files:
mava/utils/make_env.py
mava/networks/__init__.py
mava/utils/logger.py
Description:
Key Points: