Hello, when I use the command to run the pre-trained model you provided, the following issue occurs. Could you please tell me the reason for this?

python main.py --env-id reach --load-from pretrained_models/reach/checkpoint_000575/checkpoint-575 --test

Error:
2023-12-11 17:22:15,948 WARNING ppo.py:395 -- train_batch_size (5000) cannot be achieved with your other settings (num_workers=4 num_envs_per_worker=1 rollout_fragment_length=100)! Auto-adjusting rollout_fragment_length to 1250.
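This warning is just RLlib reconciling the sampling settings: each training iteration collects num_workers * num_envs_per_worker * rollout_fragment_length steps, and 4 * 1 * 100 = 400 cannot reach train_batch_size = 5000, so the fragment length is bumped to 1250 (4 * 1 * 1250 = 5000). A minimal sketch of settings that are already consistent, using the standard RLlib config keys named in the warning; the actual values used by main.py are not shown here and may differ:

    # Hypothetical RLlib config excerpt; key names are taken from the warning above.
    config = {
        "num_workers": 4,
        "num_envs_per_worker": 1,
        # 4 workers * 1 env/worker * 1250 steps = 5000, which matches
        # train_batch_size, so no auto-adjustment would be needed.
        "rollout_fragment_length": 1250,
        "train_batch_size": 5000,
    }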
2023-12-11 17:22:15,949 INFO ppo.py:415 -- In multi-agent mode, policies will be optimized sequentially by the multi-GPU optimizer. Consider setting simple_optimizer=True if this doesn't work for you.
2023-12-11 17:22:15,949 INFO trainer.py:906 -- Current log_level is WARN. For more information, set 'log_level': 'INFO' / 'DEBUG' or use the -v and -vv flags.
(RolloutWorker pid=11459) 2023-12-11 17:22:18,686 WARNING env.py:136 -- Your env doesn't have a .spec.max_episode_steps attribute. This is fine if you have set 'horizon' in your config dictionary, or soft_horizon. However, if you haven't, 'horizon' will default to infinity, and your environment will not be reset.
(RolloutWorker pid=11462) 2023-12-11 17:22:18,719 WARNING env.py:136 -- Your env doesn't have a .spec.max_episode_steps attribute. This is fine if you have set 'horizon' in your config dictionary, or soft_horizon. However, if you haven't, 'horizon' will default to infinity, and your environment will not be reset.
(RolloutWorker pid=11461) 2023-12-11 17:22:18,715 WARNING env.py:136 -- Your env doesn't have a .spec.max_episode_steps attribute. This is fine if you have set 'horizon' in your config dictionary, or soft_horizon. However, if you haven't, 'horizon' will default to infinity, and your environment will not be reset.
(RolloutWorker pid=11460) 2023-12-11 17:22:18,748 WARNING env.py:136 -- Your env doesn't have a .spec.max_episode_steps attribute. This is fine if you have set 'horizon' in your config dictionary, or soft_horizon. However, if you haven't, 'horizon' will default to infinity, and your environment will not be reset.
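These four identical messages are the same warning emitted once per rollout worker: the environment does not expose spec.max_episode_steps, so RLlib cannot infer an episode length on its own. If episodes should be cut off after a fixed number of steps, the limit can be set explicitly; a minimal sketch, assuming the same RLlib config dictionary as above and a purely illustrative cap of 300 steps:

    # Hypothetical: cap episodes at 300 steps so RLlib resets the env itself,
    # which silences the max_episode_steps warning.
    config["horizon"] = 300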
2023-12-11 17:22:19,701 WARNING util.py:65 -- Install gputil for GPU system monitoring.
2023-12-11 17:22:19,708 INFO trainable.py:589 -- Restored on 172.25.216.219 from checkpoint: pretrained_models/reach/checkpoint_000575/checkpoint-575
2023-12-11 17:22:19,708 INFO trainable.py:597 -- Current state after restoring: {'_iteration': 575, '_timesteps_total': None, '_time_total': 24399.69826555252, '_episodes_total': 19362}
2023-12-11 17:22:21,457 WARNING deprecation.py:47 -- DeprecationWarning: compute_action has been deprecated. Use Trainer.compute_single_action() instead. This will raise an error in the future!
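This DeprecationWarning only means the test loop still uses the old single-observation API; it does not change the results below. A minimal sketch of the rename the warning suggests, assuming a restored Trainer object named trainer and an observation obs (both hypothetical names; the actual test loop in main.py is not shown):

    # Old call: still works in this Ray version, but emits the warning above.
    action = trainer.compute_action(obs)
    # Replacement suggested by the warning:
    action = trainer.compute_single_action(obs)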
[Test] 1 reward -0.01 len 85.00, success mean 1.00
[Test] 2 reward -0.01 len 60.00, success mean 1.00
[Test] 3 reward -0.01 len 78.00, success mean 1.00
[Test] 4 reward -0.01 len 300.00, success mean 0.75
[Test] 5 reward -0.01 len 85.00, success mean 0.80
[Test] 6 reward -0.01 len 48.00, success mean 0.83
[Test] 7 reward -0.01 len 95.00, success mean 0.86
[Test] 8 reward -0.01 len 98.00, success mean 0.88
[Test] 9 reward -0.02 len 51.00, success mean 0.89
[Test] 10 reward -0.01 len 35.00, success mean 0.90
QMutex: destroying locked mutex