ajlangley/trpo-pytorch

TRPO + GAE

A PyTorch implementation of Trust Region Policy Optimization (Schulman et al., 2015) with Generalized Advantage Estimation (Schulman et al., 2016). It supports environments with either discrete or continuous action spaces.
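For readers unfamiliar with Generalized Advantage Estimation, the recursion it relies on can be sketched in a few lines of plain Python. This is an illustrative sketch only, not code taken from this repository: the function name `compute_gae`, its parameters, and the default values of `gamma` and `lam` are assumptions, and it handles a single trajectory with no mid-trajectory episode boundaries.

```python
def compute_gae(rewards, values, last_value=0.0, gamma=0.99, lam=0.95):
    """Hypothetical helper: GAE advantages for one trajectory.

    rewards[t] and values[t] are the reward and value estimate at step t;
    last_value is the value estimate of the state after the final step
    (0.0 if the trajectory ended in a terminal state).
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    # Walk backwards so each step can reuse the advantage of the next one.
    for t in reversed(range(len(rewards))):
        # One-step TD residual: r_t + gamma * V(s_{t+1}) - V(s_t)
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future residuals.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages
```

Setting `lam=0` reduces the estimate to the one-step TD residual, while `lam=1` gives the discounted return minus the value baseline; values in between trade bias against variance.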

Results

Below are this implementation's results on three different simulated locomotion tasks, each averaged over five runs.

[Figures: three result plots, one per task; images not included in this text version]
