- Change the loss function
- Analyze the performance of predictor training
- Unify the coordinates of the MuJoCo and ROS Fetch simulations
- Joint training
- Reduce RL training steps
- Baseline training
- Decide whether to use fine-tuning for predictor training
- Decide whether to smooth the training process (two datasets)
- Try different predictor network sizes
- Predict only the end-effector
- GUI (@xuanz)
- python3.6
- tensorflow==1.12
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia-396
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/$YOUR_HOME_DIR/.mujoco/mjpro150/bin
# export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so:/usr/lib/nvidia-396/libGL.so
- copy gym files to your gym directory
cp gym_file/jointvel.xml $GYM_PATH/gym/envs/robotics/assets/fetch/
cp gym_file/shared_xz.xml $GYM_PATH/gym/envs/robotics/assets/fetch/
- install baselines
cd baselines
pip install -e .
- if baselines.logger cannot be imported, remove the old baselines package and reinstall it
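- to verify the installation, the following sanity check (not part of the original instructions) should print the path of the logger module:
python -c "from baselines import logger; print(logger.__file__)"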
- Download the pretrained model
- Jointly train the RL policy with the seq2seq predictor
bash train_cycle.sh ${ITER_STEP} ${PRED_WEIGHT}
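For example (hypothetical values; ITER_STEP is assumed to be the joint-training iteration count and PRED_WEIGHT the weight given to the predictor-based reward):
bash train_cycle.sh 10 0.5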
- visualize the RL training process
python results_plotter.py --log_num=${ITER_STEP}
- Env code
python env_test.py
- RL code
cd baselines/baselines/ppo2
python run.py
For training a policy, please set
--train=True
--display=False
--load=False
For sampling a dataset, please set
--train=False
--display=False
--load=True
--point="$YOUR_CHECKPOINT_NUMBER"
For displaying performance, please set
--train=False
--display=True
--load=True
--point="$YOUR_CHECKPOINT_NUMBER"
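For example, assuming run.py parses these as command-line flags, the three modes look like:
python run.py --train=True --display=False --load=False
python run.py --train=False --display=False --load=True --point="$YOUR_CHECKPOINT_NUMBER"
python run.py --train=False --display=True --load=True --point="$YOUR_CHECKPOINT_NUMBER"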
- LSTM training code
python predictor_new.py
python predictor_new.py --test
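- Example rollout that reads the original observation through env.origin_obs (a minimal sketch; env and actor come from the project's environment and policy code):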
obs = env.reset()
origin_obs = env.origin_obs            # the env also exposes its original observation
done = False
while not done:
    act = actor.act(obs)               # query the policy for an action
    obs, rew, done, _ = env.step(act)  # step the environment
    origin_obs = env.origin_obs        # read the original observation again after the step
- 0.1.0
- complete environment test
- 0.2.0
- complete reward function for env
- complete reset function for env
- 0.3.0
- add reinforcement learning code to train Fetch
- complete training with no predictable reward
- 0.3.5
- add visualization of obs in ppo2.py (example at lines 389 to 402)
- 0.3.6
- change prediction to sequence-to-sequence mode
- use the new TensorFlow seq2seq API
- 0.4.0
- add a script for training
- finish the two reward frameworks
- 0.5.0
- joint training
- 0.6.0
- smooth the training process (two datasets)
- reset entropy for RL training