
🦎 New strategies, API flexibility, small fixes

Released by @RobertTLange on 08 Dec 11:51 · commit d1c38ef
Added
  • Adds a total_env_steps counter to both GymFitness and BraxFitness for easier sample-efficiency comparisons with RL algorithms.
  • Support for new strategies/genetic algorithms (a minimal usage sketch follows this list)
    • SAMR-GA (Clune et al., 2008)
    • GESMR-GA (Kumar et al., 2022)
    • SNES (Wierstra et al., 2014)
    • DES (Lange et al., 2022)
    • Guided ES (Maheswaranathan et al., 2018)
    • ASEBO (Choromanski et al., 2019)
    • CR-FM-NES (Nomura & Ono, 2022)
    • MR15-GA (Rechenberg, 1978)
  • Adds full set of BBOB low-dimensional functions (BBOBFitness)
  • Adds 2D visualizer animating sampled points (BBOBVisualizer)
  • Adds Evosax2JAXWrapper to wrap all evosax strategies
  • Adds Adan optimizer (Xie et al., 2022)
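For orientation, here is a minimal sketch of running one of the newly added strategies with the standard evosax ask/tell loop. The toy sphere objective, population size, and dimensionality are made up for illustration, and constructor arguments may differ per strategy.

```python
import jax
import jax.numpy as jnp
from evosax import SNES

rng = jax.random.PRNGKey(0)
strategy = SNES(popsize=32, num_dims=2)  # illustrative settings
es_params = strategy.default_params
state = strategy.initialize(rng, es_params)

for _ in range(50):
    rng, rng_ask = jax.random.split(rng)
    # Sample a population of candidate solutions.
    x, state = strategy.ask(rng_ask, state, es_params)
    # Toy sphere objective standing in for e.g. a BBOBFitness evaluation.
    fitness = jnp.sum(x ** 2, axis=-1)
    # Update the search distribution with the evaluated fitness.
    state = strategy.tell(x, fitness, state, es_params)
```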
Changed
  • ParameterReshaper can now be applied directly from within the strategy. Simply provide a pholder_params pytree at strategy instantiation (instead of num_dims).
  • FitnessShaper can likewise be applied directly from within the strategy. This makes it easier to track the best-performing member across generations and addresses issue #32. Simply pass the fitness shaping settings (maximize, centered_rank, ...) as arguments to the strategy; see the sketch after this list.
  • Removes Brax fitness (use EvoJAX version instead)
  • Adds lrate and sigma schedules to strategy instantiation
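A minimal sketch of the new instantiation path, assuming the keyword names mentioned above (pholder_params, maximize, centered_rank); the placeholder pytree and population size are hypothetical.

```python
import jax
import jax.numpy as jnp
from evosax import SNES

# Hypothetical parameter pytree (stands in for e.g. a flax model's params).
pholder_params = {"w": jnp.zeros((4, 2)), "b": jnp.zeros((2,))}

# ParameterReshaper and FitnessShaper are handled inside the strategy:
# pass the placeholder pytree (instead of num_dims) plus the fitness
# shaping settings directly at instantiation.
strategy = SNES(
    popsize=32,
    pholder_params=pholder_params,
    maximize=True,
    centered_rank=True,
)

rng = jax.random.PRNGKey(0)
es_params = strategy.default_params
state = strategy.initialize(rng, es_params)
x, state = strategy.ask(rng, state, es_params)  # candidates, reshaped internally
```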
Fixed
  • Fixed reward masking in GymFitness. Using jnp.sum(dones) >= 1 in the cumulative return computation also zeroed out the reward at the final (terminal) timestep, which caused problems with sparse-reward gym environments (e.g. Mountain Car). An illustrative sketch follows this list.
  • Fixed PGPE sample indexing.
  • Fixed weight decay. It was incorrectly multiplied by -1 when maximizing.
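For context on the GymFitness masking fix, here is an illustrative return computation (not the library's actual code): a step should only be masked once the episode has already terminated at an earlier step, so that the terminal step's reward, which carries the whole signal in sparse-reward tasks, is kept.

```python
import jax.numpy as jnp

def masked_return(rewards, dones):
    """Sum rewards up to and including the first terminal step.

    Masking each step with jnp.sum(dones) >= 1 would also zero out the
    terminal step itself, dropping sparse terminal rewards (the old bug).
    """
    dones = dones.astype(jnp.float32)
    # 1.0 at steps that come strictly after the first done signal.
    already_done = jnp.concatenate([jnp.zeros(1), jnp.cumsum(dones)[:-1]]) > 0
    mask = 1.0 - already_done.astype(jnp.float32)
    return jnp.sum(rewards * mask)

# Sparse-reward example: only the terminal step pays out.
rewards = jnp.array([0.0, 0.0, 0.0, 1.0])
dones = jnp.array([0.0, 0.0, 0.0, 1.0])
print(masked_return(rewards, dones))  # 1.0 (the old mask would return 0.0)
```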