Gymnasium Integration #789

Merged 29 commits on Feb 3, 2023
Changes from 21 commits

Commits
10cc606
Replaced Gym with Gymnasium
Markus28 Dec 29, 2022
4e827c1
Updated docs
Markus28 Dec 29, 2022
eaeb4cc
Added type ignores for gym wrappers
Markus28 Jan 3, 2023
125069a
Changed envpool from gym to gymnasium
Markus28 Jan 3, 2023
201f0cd
Require envpool>=0.7.0
Markus28 Jan 3, 2023
e1d0c0e
Fixed MyTestEnv, removed __init__ that is unnecessary due to Gymnasium
Markus28 Jan 3, 2023
5f0833a
Removed dead code that was necessary for old step API
Markus28 Jan 3, 2023
bea1559
Removed type hints about old step API
Markus28 Jan 3, 2023
3fc8c78
Removed check for whether environments return info upon reset
Markus28 Jan 3, 2023
9f2f104
Increase required version of PZ, fix some CI issues
Markus28 Jan 4, 2023
1ea1b90
Added dummy info to reset of FiniteVectorEnv
Markus28 Jan 4, 2023
9c9a70b
Fix log method to take terminated and truncated, update PettingZooEnv…
Markus28 Jan 4, 2023
28e5cb1
Made some code more explicit (removed hack for compatibility with old…
Markus28 Jan 4, 2023
6be6f41
Fix FiniteVectorEnv
Markus28 Jan 4, 2023
803f543
Fixed some type hints
Markus28 Jan 4, 2023
54e922f
Try to fix FiniteVectorEnv
Markus28 Jan 5, 2023
d9a1feb
Skip NNI test, remove commented out code
Markus28 Jan 5, 2023
54515f7
Fixed type errors
Markus28 Jan 5, 2023
bfca5a9
Disclaimer in README
Markus28 Jan 5, 2023
bc379b0
Put type ignore in the right places
Markus28 Jan 5, 2023
266c64a
Also allow OpenAI gym environments, fixed documentation
Jan 20, 2023
565c2e9
Also allow PettingZooEnv in vector environment, fixed type hint
Markus28 Jan 20, 2023
1402f1e
Fixed import of PettingZooEnv
Markus28 Jan 21, 2023
2a5ff31
Fixed type hinting, updated README
Markus28 Jan 21, 2023
040d2af
Updated documentation about ReplayBuffer
Markus28 Jan 21, 2023
28475f4
Fixed gymnasium version, added shimmy to dev requirements
Jan 23, 2023
2be8591
Added test for conversion of OpenAI Gym environments
Jan 23, 2023
654f9d8
fix spelling
Trinkle23897 Jan 26, 2023
363250d
Merge branch 'master' into gymnasium_integration
Trinkle23897 Feb 3, 2023
39 changes: 23 additions & 16 deletions README.md
@@ -6,6 +6,12 @@

[![PyPI](https://img.shields.io/pypi/v/tianshou)](https://pypi.org/project/tianshou/) [![Conda](https://img.shields.io/conda/vn/conda-forge/tianshou)](https://github.com/conda-forge/tianshou-feedstock) [![Read the Docs](https://img.shields.io/readthedocs/tianshou)](https://tianshou.readthedocs.io/en/master) [![Read the Docs](https://img.shields.io/readthedocs/tianshou-docs-zh-cn?label=%E4%B8%AD%E6%96%87%E6%96%87%E6%A1%A3)](https://tianshou.readthedocs.io/zh/master/) [![Unittest](https://github.com/thu-ml/tianshou/workflows/Unittest/badge.svg?branch=master)](https://github.com/thu-ml/tianshou/actions) [![codecov](https://img.shields.io/codecov/c/gh/thu-ml/tianshou)](https://codecov.io/gh/thu-ml/tianshou) [![GitHub issues](https://img.shields.io/github/issues/thu-ml/tianshou)](https://github.com/thu-ml/tianshou/issues) [![GitHub stars](https://img.shields.io/github/stars/thu-ml/tianshou)](https://github.com/thu-ml/tianshou/stargazers) [![GitHub forks](https://img.shields.io/github/forks/thu-ml/tianshou)](https://github.com/thu-ml/tianshou/network) [![GitHub license](https://img.shields.io/github/license/thu-ml/tianshou)](https://github.com/thu-ml/tianshou/blob/master/LICENSE)

> ⚠️ **Transition to Gymnasium**: The maintainers of OpenAI Gym have recently released [Gymnasium](http://github.com/Farama-Foundation/Gymnasium),
> which is where future maintenance of OpenAI Gym will be taking place.
> Tianshou no longer supports OpenAI Gym environments and has transitioned to Gymnasium.
> If you would like to run Tianshou with legacy environments, you may use [Shimmy](https://github.com/Farama-Foundation/Shimmy)
> as a compatibility layer.
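
For legacy environments, the Shimmy route can be as small as wrapping the old environment in a compatibility class. A minimal sketch (assuming Shimmy exports `GymV21CompatibilityV0` for pre-0.26 Gym environments — check the Shimmy documentation for the class matching your Gym version):

```python
import gym  # legacy OpenAI Gym
from shimmy import GymV21CompatibilityV0  # assumed export; see Shimmy docs

legacy_env = gym.make("CartPole-v0")
env = GymV21CompatibilityV0(env=legacy_env)  # now follows the Gymnasium API

obs, info = env.reset(seed=0)  # Gymnasium reset returns (obs, info)
obs, rew, terminated, truncated, info = env.step(env.action_space.sample())
```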

**Tianshou** ([天授](https://baike.baidu.com/item/%E5%A4%A9%E6%8E%88)) is a reinforcement learning platform based on pure PyTorch. Unlike existing reinforcement learning libraries, which are mainly based on TensorFlow, have many nested classes, unfriendly APIs, or run slowly, Tianshou provides a fast, modularized framework and pythonic API for building deep reinforcement learning agents with the fewest lines of code. The supported interface algorithms currently include:

- [Deep Q-Network (DQN)](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf)
@@ -105,21 +111,21 @@ The example scripts are under [test/](https://github.com/thu-ml/tianshou/blob/ma

### Comprehensive Functionality

| RL Platform | GitHub Stars | # of Alg. <sup>(1)</sup> | Custom Env | Batch Training | RNN Support | Nested Observation | Backend |
| ------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------ | --------------------------- | --------------------------------- | ------------------ | ------------------ | ---------- |
| [Baselines](https://github.com/openai/baselines) | [![GitHub stars](https://img.shields.io/github/stars/openai/baselines)](https://github.com/openai/baselines/stargazers) | 9 | :heavy_check_mark: (gym) | :heavy_minus_sign: <sup>(2)</sup> | :heavy_check_mark: | :x: | TF1 |
| [Stable-Baselines](https://github.com/hill-a/stable-baselines) | [![GitHub stars](https://img.shields.io/github/stars/hill-a/stable-baselines)](https://github.com/hill-a/stable-baselines/stargazers) | 11 | :heavy_check_mark: (gym) | :heavy_minus_sign: <sup>(2)</sup> | :heavy_check_mark: | :x: | TF1 |
| [Stable-Baselines3](https://github.com/DLR-RM/stable-baselines3) | [![GitHub stars](https://img.shields.io/github/stars/DLR-RM/stable-baselines3)](https://github.com/DLR-RM/stable-baselines3/stargazers) | 7<sup> (3)</sup> | :heavy_check_mark: (gym) | :heavy_minus_sign: <sup>(2)</sup> | :x: | :heavy_check_mark: | PyTorch |
| [Ray/RLlib](https://github.com/ray-project/ray/tree/master/rllib/) | [![GitHub stars](https://img.shields.io/github/stars/ray-project/ray)](https://github.com/ray-project/ray/stargazers) | 16 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | TF/PyTorch |
| [SpinningUp](https://github.com/openai/spinningup) | [![GitHub stars](https://img.shields.io/github/stars/openai/spinningup)](https://github.com/openai/spinningup/stargazers) | 6 | :heavy_check_mark: (gym) | :heavy_minus_sign: <sup>(2)</sup> | :x: | :x: | PyTorch |
| [Dopamine](https://github.com/google/dopamine) | [![GitHub stars](https://img.shields.io/github/stars/google/dopamine)](https://github.com/google/dopamine/stargazers) | 7 | :x: | :x: | :x: | :x: | TF/JAX |
| [ACME](https://github.com/deepmind/acme) | [![GitHub stars](https://img.shields.io/github/stars/deepmind/acme)](https://github.com/deepmind/acme/stargazers) | 14 | :heavy_check_mark: (dm_env) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | TF/JAX |
| [keras-rl](https://github.com/keras-rl/keras-rl) | [![GitHub stars](https://img.shields.io/github/stars/keras-rl/keras-rl)](https://github.com/keras-rl/keras-rl/stargazers) | 7 | :heavy_check_mark: (gym) | :x: | :x: | :x: | Keras |
| [rlpyt](https://github.com/astooke/rlpyt) | [![GitHub stars](https://img.shields.io/github/stars/astooke/rlpyt)](https://github.com/astooke/rlpyt/stargazers) | 11 | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |
| [ChainerRL](https://github.com/chainer/chainerrl) | [![GitHub stars](https://img.shields.io/github/stars/chainer/chainerrl)](https://github.com/chainer/chainerrl/stargazers) | 18 | :heavy_check_mark: (gym) | :heavy_check_mark: | :heavy_check_mark: | :x: | Chainer |
| [Sample Factory](https://github.com/alex-petrenko/sample-factory) | [![GitHub stars](https://img.shields.io/github/stars/alex-petrenko/sample-factory)](https://github.com/alex-petrenko/sample-factory/stargazers) | 1<sup> (4)</sup> | :heavy_check_mark: (gym) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |
| | | | | | | | |
| [Tianshou](https://github.com/thu-ml/tianshou) | [![GitHub stars](https://img.shields.io/github/stars/thu-ml/tianshou)](https://github.com/thu-ml/tianshou/stargazers) | 20 | :heavy_check_mark: (gym) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |
| RL Platform | GitHub Stars | # of Alg. <sup>(1)</sup> | Custom Env | Batch Training | RNN Support | Nested Observation | Backend |
| ------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------ |--------------------------------| --------------------------------- | ------------------ | ------------------ | ---------- |
| [Baselines](https://github.com/openai/baselines) | [![GitHub stars](https://img.shields.io/github/stars/openai/baselines)](https://github.com/openai/baselines/stargazers) | 9 | :heavy_check_mark: (gym) | :heavy_minus_sign: <sup>(2)</sup> | :heavy_check_mark: | :x: | TF1 |
| [Stable-Baselines](https://github.com/hill-a/stable-baselines) | [![GitHub stars](https://img.shields.io/github/stars/hill-a/stable-baselines)](https://github.com/hill-a/stable-baselines/stargazers) | 11 | :heavy_check_mark: (gym) | :heavy_minus_sign: <sup>(2)</sup> | :heavy_check_mark: | :x: | TF1 |
| [Stable-Baselines3](https://github.com/DLR-RM/stable-baselines3) | [![GitHub stars](https://img.shields.io/github/stars/DLR-RM/stable-baselines3)](https://github.com/DLR-RM/stable-baselines3/stargazers) | 7<sup> (3)</sup> | :heavy_check_mark: (gym) | :heavy_minus_sign: <sup>(2)</sup> | :x: | :heavy_check_mark: | PyTorch |
| [Ray/RLlib](https://github.com/ray-project/ray/tree/master/rllib/) | [![GitHub stars](https://img.shields.io/github/stars/ray-project/ray)](https://github.com/ray-project/ray/stargazers) | 16 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | TF/PyTorch |
| [SpinningUp](https://github.com/openai/spinningup) | [![GitHub stars](https://img.shields.io/github/stars/openai/spinningup)](https://github.com/openai/spinningup/stargazers) | 6 | :heavy_check_mark: (gym) | :heavy_minus_sign: <sup>(2)</sup> | :x: | :x: | PyTorch |
| [Dopamine](https://github.com/google/dopamine) | [![GitHub stars](https://img.shields.io/github/stars/google/dopamine)](https://github.com/google/dopamine/stargazers) | 7 | :x: | :x: | :x: | :x: | TF/JAX |
| [ACME](https://github.com/deepmind/acme) | [![GitHub stars](https://img.shields.io/github/stars/deepmind/acme)](https://github.com/deepmind/acme/stargazers) | 14 | :heavy_check_mark: (dm_env) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | TF/JAX |
| [keras-rl](https://github.com/keras-rl/keras-rl) | [![GitHub stars](https://img.shields.io/github/stars/keras-rl/keras-rl)](https://github.com/keras-rl/keras-rl/stargazers) | 7 | :heavy_check_mark: (gym) | :x: | :x: | :x: | Keras |
| [rlpyt](https://github.com/astooke/rlpyt) | [![GitHub stars](https://img.shields.io/github/stars/astooke/rlpyt)](https://github.com/astooke/rlpyt/stargazers) | 11 | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |
| [ChainerRL](https://github.com/chainer/chainerrl) | [![GitHub stars](https://img.shields.io/github/stars/chainer/chainerrl)](https://github.com/chainer/chainerrl/stargazers) | 18 | :heavy_check_mark: (gym) | :heavy_check_mark: | :heavy_check_mark: | :x: | Chainer |
| [Sample Factory](https://github.com/alex-petrenko/sample-factory) | [![GitHub stars](https://img.shields.io/github/stars/alex-petrenko/sample-factory)](https://github.com/alex-petrenko/sample-factory/stargazers) | 1<sup> (4)</sup> | :heavy_check_mark: (gym) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |
| | | | | | | | |
| [Tianshou](https://github.com/thu-ml/tianshou) | [![GitHub stars](https://img.shields.io/github/stars/thu-ml/tianshou)](https://github.com/thu-ml/tianshou/stargazers) | 20 | :heavy_check_mark: (Gymnasium) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | PyTorch |

<sup>(1): access date: 2021-08-08</sup>

@@ -175,7 +181,8 @@ This is an example of Deep Q Network. You can also run the full script at [test/
First, import some relevant packages:

```python
import gym, torch, numpy as np, torch.nn as nn
import gymnasium as gym
import torch, numpy as np, torch.nn as nn
from torch.utils.tensorboard import SummaryWriter
import tianshou as ts
```
4 changes: 2 additions & 2 deletions docs/tutorials/cheatsheet.rst
@@ -4,7 +4,7 @@ Cheat Sheet
This page shows some code snippets of how to use Tianshou to develop new
algorithms / apply algorithms to new scenarios.

By the way, some of these issues can be resolved by using a ``gym.Wrapper``.
By the way, some of these issues can be resolved by using a ``gymnasium.Wrapper``.
It can serve as a universal solution for policy-environment interaction. But
you can also use the batch processor :ref:`preprocess_fn` or the vectorized
environment wrapper :class:`~tianshou.env.VectorEnvWrapper`.
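
For illustration, a minimal ``gymnasium.Wrapper`` that rescales rewards could
look like the sketch below (``ScaleReward`` is a hypothetical example, not part
of Tianshou):
::

    import gymnasium as gym

    class ScaleReward(gym.Wrapper):
        """Multiply every reward by a constant factor."""

        def __init__(self, env: gym.Env, scale: float = 0.1):
            super().__init__(env)
            self.scale = scale

        def step(self, action):
            obs, rew, terminated, truncated, info = self.env.step(action)
            return obs, rew * self.scale, terminated, truncated, info

    env = ScaleReward(gym.make("CartPole-v1"), scale=0.1)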
@@ -159,7 +159,7 @@ toy_text and classic_control environments. For more information, please refer to
# install envpool: pip3 install envpool

import envpool
envs = envpool.make_gym("CartPole-v0", num_envs=10)
envs = envpool.make_gymnasium("CartPole-v0", num_envs=10)
collector = Collector(policy, envs, buffer)

Here are some other `examples <https://github.com/sail-sg/envpool/tree/master/examples/tianshou_examples>`_.
8 changes: 4 additions & 4 deletions docs/tutorials/dqn.rst
@@ -35,10 +35,10 @@ Here is the overall system:
Make an Environment
-------------------

First of all, you have to make an environment for your agent to interact with. You can use ``gym.make(environment_name)`` to make an environment for your agent. For environment interfaces, we follow the convention of `OpenAI Gym <https://github.com/openai/gym>`_. In your Python code, simply import Tianshou and make the environment:
First of all, you have to make an environment for your agent to interact with. You can use ``gym.make(environment_name)`` to make an environment for your agent. For environment interfaces, we follow the convention of `Gymnasium <https://github.com/Farama-Foundation/Gymnasium>`_. In your Python code, simply import Tianshou and make the environment:
::

import gym
import gymnasium as gym
import tianshou as ts

env = gym.make('CartPole-v0')
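
Gymnasium's ``reset`` returns ``(obs, info)`` and ``step`` returns a five-tuple
``(obs, reward, terminated, truncated, info)`` instead of the old four-tuple; a
quick sanity check of the new API:
::

    obs, info = env.reset(seed=42)
    obs, rew, terminated, truncated, info = env.step(env.action_space.sample())
    done = terminated or truncated  # recover the old "done" flag if needed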
@@ -84,8 +84,8 @@ You can also try the super-fast vectorized environment `EnvPool <https://github.
::

import envpool
train_envs = envpool.make_gym("CartPole-v0", num_envs=10)
test_envs = envpool.make_gym("CartPole-v0", num_envs=100)
train_envs = envpool.make_gymnasium("CartPole-v0", num_envs=10)
test_envs = envpool.make_gymnasium("CartPole-v0", num_envs=100)

For the demonstration, here we use the second code-block.

6 changes: 3 additions & 3 deletions docs/tutorials/tictactoe.rst
@@ -62,15 +62,15 @@ The observation variable ``obs`` returned from the environment is a ``dict``, with

.. note::

There is no special formulation of ``mask`` either in discrete action space or in continuous action space. You can also use some action spaces like ``gym.spaces.Discrete`` or ``gym.spaces.Box`` to represent the available action space. Currently, we use a boolean array.
There is no special formulation of ``mask`` either in discrete action space or in continuous action space. You can also use some action spaces like ``gymnasium.spaces.Discrete`` or ``gymnasium.spaces.Box`` to represent the available action space. Currently, we use a boolean array.
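
As a sketch of how such a mask can be represented (hypothetical numbers; the
real environment fills the mask from the board state):
::

    import numpy as np
    from gymnasium import spaces

    action_space = spaces.Discrete(9)  # one action per Tic-Tac-Toe cell
    mask = np.zeros(9, dtype=bool)
    mask[[0, 4, 8]] = True             # e.g. only cells 0, 4 and 8 are still empty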

Let's play two steps to have an intuitive understanding of the environment.

::

>>> import numpy as np
>>> action = 0 # action is either an integer, or an np.ndarray with one element
>>> obs, reward, done, info = env.step(action) # the env.step follows the api of OpenAI Gym
>>> obs, reward, done, info = env.step(action) # the env.step follows the api of Gymnasium
>>> print(obs) # notice the change in the observation
{'agent_id': 'player_2', 'obs': array([[[0, 1],
[0, 0],
@@ -185,7 +185,7 @@ So let's start to train our Tic-Tac-Toe agent! First, import some required modules
from copy import deepcopy
from typing import Optional, Tuple

import gym
import gymnasium as gym
import numpy as np
import torch
from pettingzoo.classic import tictactoe_v3
6 changes: 3 additions & 3 deletions examples/atari/atari_wrapper.py
@@ -5,7 +5,7 @@
from collections import deque

import cv2
import gym
import gymnasium as gym
import numpy as np

from tianshou.env import ShmemVectorEnv
@@ -324,15 +324,15 @@ def make_atari_env(task, seed, training_num, test_num, **kwargs):
"please set `x = x / 255.0` inside CNN network's forward function."
)
# parameter conversion
train_envs = env = envpool.make_gym(
train_envs = env = envpool.make_gymnasium(
task.replace("NoFrameskip-v4", "-v5"),
num_envs=training_num,
seed=seed,
episodic_life=True,
reward_clip=True,
stack_num=kwargs.get("frame_stack", 4),
)
test_envs = envpool.make_gym(
test_envs = envpool.make_gymnasium(
task.replace("NoFrameskip-v4", "-v5"),
num_envs=test_num,
seed=seed,
2 changes: 1 addition & 1 deletion examples/box2d/acrobot_dualdqn.py
@@ -2,7 +2,7 @@
import os
import pprint

import gym
import gymnasium as gym
import numpy as np
import torch
from torch.utils.tensorboard import SummaryWriter
2 changes: 1 addition & 1 deletion examples/box2d/bipedal_bdq.py
@@ -3,7 +3,7 @@
import os
import pprint

import gym
import gymnasium as gym
import numpy as np
import torch
from torch.utils.tensorboard import SummaryWriter
2 changes: 1 addition & 1 deletion examples/box2d/bipedal_hardcore_sac.py
@@ -2,7 +2,7 @@
import os
import pprint

import gym
import gymnasium as gym
import numpy as np
import torch
from torch.utils.tensorboard import SummaryWriter
2 changes: 1 addition & 1 deletion examples/box2d/lunarlander_dqn.py
@@ -2,7 +2,7 @@
import os
import pprint

import gym
import gymnasium as gym
import numpy as np
import torch
from torch.utils.tensorboard import SummaryWriter
2 changes: 1 addition & 1 deletion examples/box2d/mcc_sac.py
@@ -2,7 +2,7 @@
import os
import pprint

import gym
import gymnasium as gym
import numpy as np
import torch
from torch.utils.tensorboard import SummaryWriter
2 changes: 1 addition & 1 deletion examples/inverse/irl_gail.py
@@ -6,7 +6,7 @@
import pprint

import d4rl
import gym
import gymnasium as gym
import numpy as np
import torch
from torch import nn
2 changes: 1 addition & 1 deletion examples/mujoco/fetch_her_ddpg.py
@@ -6,7 +6,7 @@
import os
import pprint

import gym
import gymnasium as gym
import numpy as np
import torch
import wandb