Merge branch 'NeuralNetworkOutput' into neglogp+entropy
ChengYen-Tang committed Apr 9, 2020
2 parents 1e3b7a9 + a93db61 commit 352224f
Showing 74 changed files with 2,823 additions and 1,262 deletions.
2 changes: 1 addition & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -24,6 +24,6 @@
- [ ] My change requires a change to the documentation.
- [ ] I have updated the tests accordingly (*required for a bug fix or a new feature*).
- [ ] I have updated the documentation accordingly.
- [ ] I have ensured `pytest` and `pytype` both pass.
- [ ] I have ensured `pytest` and `pytype` both pass (by running `make pytest` and `make type`).

<!--- This Template is an edited version of the one from https://github.com/evilsocket/pwnagotchi/ -->
4 changes: 2 additions & 2 deletions .travis.yml
@@ -4,7 +4,7 @@ python:

env:
global:
- DOCKER_IMAGE=stablebaselines/stable-baselines-cpu:v2.9.0
- DOCKER_IMAGE=stablebaselines/stable-baselines-cpu:v2.10.0

notifications:
email: false
@@ -42,7 +42,7 @@ jobs:

- name: "Sphinx Documentation"
script:
- 'docker run -it --rm --mount src=$(pwd),target=/root/code/stable-baselines,type=bind ${DOCKER_IMAGE} bash -c "cd /root/code/stable-baselines/ && pip install .[docs] && pushd docs/ && make clean && make html"'
- 'docker run -it --rm --mount src=$(pwd),target=/root/code/stable-baselines,type=bind ${DOCKER_IMAGE} bash -c "cd /root/code/stable-baselines/ && pushd docs/ && make clean && make html"'

- name: "Type Checking"
script:
37 changes: 28 additions & 9 deletions CONTRIBUTING.md
@@ -57,17 +57,17 @@ from stable_baselines import PPO2

In general, we recommend using pycharm to format everything in an efficient way.

Please documentation each function/method using the following template:
Please document each function/method and [type](https://google.github.io/pytype/user_guide.html) them using the following template:

```python

def my_function(arg1, arg2):
def my_function(arg1: type1, arg2: type2) -> returntype:
"""
Short description of the function.
:param arg1: (arg1 type) describe what is arg1
:param arg2: (arg2 type) describe what is arg2
:return: (return type) describe what is returned
:param arg1: (type1) describe what is arg1
:param arg2: (type2) describe what is arg2
:return: (returntype) describe what is returned
"""
...
return my_variable
@@ -77,7 +77,7 @@ def my_function(arg1, arg2):

Before proposing a PR, please open an issue, where the feature will be discussed. This prevent from duplicated PR to be proposed and also ease the code review process.

Each PR need to be reviewed and accepted by at least one of the maintainers (@hill-a , @araffin or @erniejunior ).
Each PR need to be reviewed and accepted by at least one of the maintainers (@hill-a, @araffin, @erniejunior, @AdamGleave or @Miffyli).
A PR must pass the Continuous Integration tests (travis + codacy) to be merged with the master branch.

Note: in rare cases, we can create exception for codacy failure.
@@ -88,15 +88,34 @@ All new features must add tests in the `tests/` folder ensuring that everything
We use [pytest](https://pytest.org/).
Also, when a bug fix is proposed, tests should be added to avoid regression.

To run tests with `pytest` and type checking with `pytype`:
To run tests with `pytest`:

```
./scripts/run_tests.sh
make pytest
```

Type checking with `pytype`:

```
make type
```

Build the documentation:

```
make doc
```

Check documentation spelling (you need to install `sphinxcontrib.spelling` package for that):

```
make spelling
```


## Changelog and Documentation

Please do not forget to update the changelog and add documentation if needed.
Please do not forget to update the changelog (`docs/misc/changelog.rst`) and add documentation if needed.
A README is present in the `docs/` folder for instructions on how to build the documentation.
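
As an illustration of the typed docstring template that this commit adds to CONTRIBUTING.md, here is a minimal sketch of a function written in that style. The function name `discounted_return` and its parameters are hypothetical and are not part of stable-baselines; they only show how the annotations and the `:param:` / `:return:` fields line up.

```python
from typing import List


def discounted_return(rewards: List[float], gamma: float) -> float:
    """
    Compute the discounted sum of a sequence of rewards.

    :param rewards: (List[float]) rewards collected during one episode, in order
    :param gamma: (float) discount factor, between 0 and 1
    :return: (float) the discounted return of the episode
    """
    total = 0.0
    # Walk the rewards backwards so each step needs one multiply and one add
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total
```

Because the signature carries the same types as the docstring, a checker such as `pytype` (run through `make type` in this commit) can flag a call that passes, say, a string for `gamma`.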


3 changes: 2 additions & 1 deletion Dockerfile
@@ -27,12 +27,13 @@ ENV VENV /root/venv

COPY ./setup.py ${CODE_DIR}/stable-baselines/setup.py
RUN \
pip install pip --upgrade && \
pip install virtualenv && \
virtualenv $VENV --python=python3 && \
. $VENV/bin/activate && \
pip install --upgrade pip && \
cd ${CODE_DIR}/stable-baselines && \
pip install -e .[mpi,tests] && \
pip install -e .[mpi,tests,docs] && \
rm -rf $HOME/.cache/pip

ENV PATH=$VENV/bin:$PATH
41 changes: 41 additions & 0 deletions Makefile
@@ -0,0 +1,41 @@
# Run pytest and coverage report
pytest:
./scripts/run_tests.sh

# Type check
type:
pytype

# Build the doc
doc:
cd docs && make html

# Check the spelling in the doc
spelling:
cd docs && make spelling

# Clean the doc build folder
clean:
cd docs && make clean

# Build docker images
# If you do export RELEASE=True, it will also push them
docker: docker-cpu docker-gpu

docker-cpu:
./scripts/build_docker.sh

docker-gpu:
USE_GPU=True ./scripts/build_docker.sh

# PyPi package release
release:
python setup.py sdist
python setup.py bdist_wheel
twine upload dist/*

# Test PyPi package release
test-release:
python setup.py sdist
python setup.py bdist_wheel
twine upload --repository-url https://test.pypi.org/legacy/ dist/*
15 changes: 9 additions & 6 deletions README.md
@@ -144,11 +144,14 @@ Please read the [documentation](https://stable-baselines.readthedocs.io/) for mo

All the following examples can be executed online using Google colab notebooks:

- [Getting Started](https://colab.research.google.com/drive/1_1H5bjWKYBVKbbs-Kj83dsfuZieDNcFU)
- [Training, Saving, Loading](https://colab.research.google.com/drive/1KoAQ1C_BNtGV3sVvZCnNZaER9rstmy0s)
- [Multiprocessing](https://colab.research.google.com/drive/1ZzNFMUUi923foaVsYb4YjPy4mjKtnOxb)
- [Monitor Training and Plotting](https://colab.research.google.com/drive/1L_IMo6v0a0ALK8nefZm6PqPSy0vZIWBT)
- [Atari Games](https://colab.research.google.com/drive/1iYK11yDzOOqnrXi1Sfjm1iekZr4cxLaN)
- [Full Tutorial](https://github.com/araffin/rl-tutorial-jnrr19)
- [All Notebooks](https://github.com/Stable-Baselines-Team/rl-colab-notebooks)
- [Getting Started](https://colab.research.google.com/github/Stable-Baselines-Team/rl-colab-notebooks/blob/master/stable_baselines_getting_started.ipynb)
- [Training, Saving, Loading](https://colab.research.google.com/github/Stable-Baselines-Team/rl-colab-notebooks/blob/master/saving_loading_dqn.ipynb)
- [Multiprocessing](https://colab.research.google.com/github/Stable-Baselines-Team/rl-colab-notebooks/blob/master/multiprocessing_rl.ipynb)
- [Monitor Training and Plotting](https://colab.research.google.com/github/Stable-Baselines-Team/rl-colab-notebooks/blob/master/monitor_training.ipynb)
- [Atari Games](https://colab.research.google.com/github/Stable-Baselines-Team/rl-colab-notebooks/blob/master/atari_games.ipynb)
- [RL Baselines Zoo](https://colab.research.google.com/github/Stable-Baselines-Team/rl-colab-notebooks/blob/master/rl-baselines-zoo.ipynb)


## Implemented Algorithms
@@ -190,7 +193,7 @@ Some of the baselines examples use [MuJoCo](http://www.mujoco.org) (multi-joint
All unit tests in baselines can be run using pytest runner:
```
pip install pytest pytest-cov
pytest --cov-config .coveragerc --cov-report html --cov-report term --cov=.
make pytest
```

## Projects Using Stable-Baselines
2 changes: 1 addition & 1 deletion docs/common/distributions.rst
@@ -10,7 +10,7 @@ Probability distributions used for the different action spaces:
- ``MultiCategoricalProbabilityDistribution`` -> MultiDiscrete
- ``BernoulliProbabilityDistribution`` -> MultiBinary

The policy networks output parameters for the distributions (named `flat` in the methods).
The policy networks output parameters for the distributions (named ``flat`` in the methods).
Actions are then sampled from those distributions.

For instance, in the case of discrete actions. The policy network outputs probability
4 changes: 2 additions & 2 deletions docs/guide/algos.rst
@@ -51,7 +51,7 @@ Actions ``gym.spaces``:

.. note::

Some logging values (like `ep_rewmean`, `eplenmean`) are only available when using a Monitor wrapper
Some logging values (like ``ep_rewmean``, ``eplenmean``) are only available when using a Monitor wrapper
See `Issue #339 <https://github.com/hill-a/stable-baselines/issues/339>`_ for more info.


@@ -62,7 +62,7 @@ Completely reproducible results are not guaranteed across Tensorflow releases or
Furthermore, results need not be reproducible between CPU and GPU executions, even when using identical seeds.

In order to make make computations deterministic on CPU, on your specific problem on one specific platform,
you need to pass a `seed` argument at the creation of a model and set `n_cpu_tf_sess=1` (number of cpu for Tensorflow session).
you need to pass a ``seed`` argument at the creation of a model and set `n_cpu_tf_sess=1` (number of cpu for Tensorflow session).
If you pass an environment to the model using `set_env()`, then you also need to seed the environment first.

.. note::