how to customize policy network for SAC #349

Closed
yutao-li opened this issue Jun 2, 2019 · 2 comments
Labels: question (Further information is requested), RTFM (Answer is the documentation)

Comments

yutao-li commented Jun 2, 2019

Describe the bug

I followed the instructions for customizing a policy network for SAC, but it does not work. Could you show a brief example of how to do that?

Code example

from stable_baselines import SAC
from stable_baselines.sac.policies import LnMlpPolicy

agent = SAC(LnMlpPolicy, "Pendulum-v0", policy_kwargs=dict(net_arch=[128, 128, dict(pi=[64], vf=[64])]))

Traceback (most recent call last):
  File "/datadrive/yutao/.pycharm_helpers/pydev/pydev_run_in_console.py", line 53, in run_file
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/datadrive/yutao/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/datadrive/yutao/scratch/scratch_8.py", line 4, in <module>
    agent = SAC(LnMlpPolicy, "Pendulum-v0", policy_kwargs=dict(net_arch=[128, 128, dict(pi=[64], vf=[64])]))
  File "/datadrive/yutao/anaconda3/lib/python3.7/site-packages/stable_baselines/sac/sac.py", line 123, in __init__
    self.setup_model()
  File "/datadrive/yutao/anaconda3/lib/python3.7/site-packages/stable_baselines/sac/sac.py", line 145, in setup_model
    **self.policy_kwargs)
  File "/datadrive/yutao/anaconda3/lib/python3.7/site-packages/stable_baselines/sac/policies.py", line 365, in __init__
    feature_extraction="mlp", layer_norm=True, **_kwargs)
  File "/datadrive/yutao/anaconda3/lib/python3.7/site-packages/stable_baselines/sac/policies.py", line 189, in __init__
    self._kwargs_check(feature_extraction, kwargs)
  File "/datadrive/yutao/anaconda3/lib/python3.7/site-packages/stable_baselines/common/policies.py", line 177, in _kwargs_check
    raise ValueError("Unknown keywords for policy: {}".format(kwargs))
ValueError: Unknown keywords for policy: {'net_arch': [128, 128, {'pi': [64], 'vf': [64]}]}

araffin added the RTFM (Answer is the documentation) label Jun 2, 2019

araffin (Collaborator) commented Jun 2, 2019

Hello,
Please read the SAC documentation carefully:

"The SAC model does not support stable_baselines.common.policies because it uses double q-values and value estimation, as a result it must use its own policy models (see SAC Policies)."

The net_arch keyword is only for stable_baselines.common.policies; with the SAC policies you have to use the layers keyword instead.
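
For reference, a minimal sketch of passing layers to a SAC policy. It assumes stable-baselines 2.x (TensorFlow 1.x) and a gym install that provides Pendulum-v0. The names SAC_POLICY_KWARGS and build_agent are illustrative and do not come from the library or this thread. As far as I know, the SAC policies apply the same layers list to both the actor and the critic networks, so there is no pi/vf split as in net_arch.

```python
# Sketch only: assumes stable-baselines 2.x (TensorFlow 1.x) and gym with Pendulum-v0.
# SAC policies take a flat `layers` list instead of `net_arch`.
# SAC_POLICY_KWARGS is a hypothetical name; the sizes are taken from the question.
SAC_POLICY_KWARGS = dict(layers=[128, 128])


def build_agent(env_id="Pendulum-v0"):
    """Create a SAC agent whose policy uses the custom layer sizes (hypothetical helper)."""
    # Imported lazily so the kwargs above can be inspected without TensorFlow installed.
    from stable_baselines import SAC
    from stable_baselines.sac.policies import LnMlpPolicy

    return SAC(LnMlpPolicy, env_id, policy_kwargs=SAC_POLICY_KWARGS)
```

Calling build_agent() should then construct the model without the "Unknown keywords for policy" error, since layers is a keyword the SAC policies accept.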

araffin closed this as completed Jun 2, 2019
araffin added the question (Further information is requested) label Jun 2, 2019

yutao-li (Author) commented Jun 3, 2019

ok, thanks
