CleanRL Tutorial

This tutorial shows how to use CleanRL to implement a model and train it on a PettingZoo environment.

Implementing PPO: Implement and train a PPO model
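For orientation, the clipped surrogate policy loss at the core of PPO can be sketched in plain Python. This is a minimal illustration, not CleanRL's code: the function name `ppo_clip_loss`, its parameters, and the default `clip_coef=0.2` are assumptions chosen for this example, and a real implementation would operate on batched tensors and add value and entropy terms.

```python
import math
from typing import Sequence


def ppo_clip_loss(
    new_logprobs: Sequence[float],
    old_logprobs: Sequence[float],
    advantages: Sequence[float],
    clip_coef: float = 0.2,  # assumed default for this sketch
) -> float:
    """Mean clipped-surrogate policy loss over a batch of samples (to be minimized)."""
    if not (len(new_logprobs) == len(old_logprobs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if len(advantages) == 0:
        raise ValueError("inputs must not be empty")

    losses = []
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the updated policy and the policy that
        # collected the data: pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_coef, 1 + clip_coef].
        clipped_ratio = min(max(ratio, 1.0 - clip_coef), 1.0 + clip_coef)
        # Pessimistic choice: the larger of the two negated objectives.
        losses.append(max(-adv * ratio, -adv * clipped_ratio))
    return sum(losses) / len(losses)
```

With a positive advantage, the loss stops improving once the ratio exceeds `1 + clip_coef`, which is what discourages overly large policy updates.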

CleanRL Overview

CleanRL is a lightweight reinforcement learning library that provides high-quality single-file implementations with research-friendly features. Each algorithm variant lives in one self-contained script rather than being spread across shared modules.

See the CleanRL documentation for more information.

Official examples using PettingZoo:

PPO PettingZoo Atari example

WandB Integration

A key feature of CleanRL is its tight integration with Weights & Biases (WandB) for experiment tracking, hyperparameter tuning, and benchmarking. The Open RL Benchmark lets users view public leaderboards for many tasks, including videos of agents’ performance across training timesteps.

CleanRL integration with Weights & Biases
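As a rough illustration of the experiment tracking described above, the sketch below sends one dictionary of scalars per training step to a logging callable. The helper names `log_scalars` and `demo_with_wandb`, the project name, the metric key, and the config values are all hypothetical and chosen for this example. Only `wandb.init`, `wandb.log`, and `run.finish` are real WandB calls. CleanRL's own scripts may wire up tracking differently.

```python
from typing import Callable, Dict, Sequence


def log_scalars(
    log_fn: Callable[..., None],
    scalars_per_step: Sequence[Dict[str, float]],
) -> int:
    """Call log_fn(scalars, step=i) for each step; return the number of steps logged."""
    for step, scalars in enumerate(scalars_per_step):
        log_fn(scalars, step=step)
    return len(scalars_per_step)


def demo_with_wandb() -> None:
    """Hypothetical usage with Weights & Biases (requires `pip install wandb`)."""
    import wandb  # imported lazily so the helper above has no hard dependency

    # mode="offline" writes the run to local files instead of uploading it.
    run = wandb.init(
        project="pettingzoo-ppo-demo",  # hypothetical project name
        config={"learning_rate": 2.5e-4, "clip_coef": 0.2},  # illustrative values
        mode="offline",
    )
    log_scalars(
        wandb.log,
        [{"episodic_return": 1.0}, {"episodic_return": 3.5}],  # made-up metrics
    )
    run.finish()
```

Passing the logging function in as an argument keeps the training loop independent of any particular tracking backend, so the same helper can be exercised with a simple stand-in during testing.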