# Deep Reinforcement Learning

1. **Overview.**

    * Reinforcement Learning [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_1.pdf)] [[lecture note](https://github.com/wangshusen/DeepLearning/blob/master/LectureNotes/DRL/DRL.pdf)] [[Video (in Chinese)](https://youtu.be/vmkRMvhCW5c)].

    * Value-Based Learning [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_2.pdf)] [[Video (in Chinese)](https://youtu.be/jflq6vNcZyA)].

    * Policy-Based Learning [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_3.pdf)] [[Video (in Chinese)](https://youtu.be/qI0vyfR2_Rc)].

    * Actor-Critic Methods [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_4.pdf)] [[Video (in Chinese)](https://youtu.be/xjd7Jq9wPQY)].

    * AlphaGo [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_5.pdf)] [[Video (in Chinese)](https://youtu.be/zHojAp5vkRE)].

2. **TD Learning.**

    * Sarsa [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/2_TD_1.pdf)] [[Video (in Chinese)](https://youtu.be/-cYWdUubB6Q)].

    * Q-learning [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/2_TD_2.pdf)] [[Video (in Chinese)](https://youtu.be/Ymy2w3DGn2U)].

    * Multi-Step TD Target [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/2_TD_3.pdf)] [[Video (in Chinese)](https://youtu.be/UqTP138IATc)].

3. **Advanced Topics on Value-Based Learning.**

    * Experience Replay (ER) & Prioritized ER [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/3_DQN_1.pdf)] [[Video (in Chinese)](https://youtu.be/rhslMPmj7SY)].

    * Overestimation, Target Network, & Double DQN [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/3_DQN_2.pdf)] [[Video (in Chinese)](https://youtu.be/X2-56QN79zc)].

    * Dueling Networks [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/3_DQN_3.pdf)] [[Video (in Chinese)](https://youtu.be/DBux6cA0EoM)].

4. **Policy Gradient with Baseline.**

    * Policy Gradient with Baseline [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_1.pdf)] [[Video (in Chinese)](https://youtu.be/yNEqbptitZs)].

    * REINFORCE with Baseline [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_2.pdf)] [[Video (in Chinese)](https://youtu.be/Ob78ADXTQNo)].

    * Advantage Actor-Critic (A2C) [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_3.pdf)] [[Video (in Chinese)](https://youtu.be/mtT4TSGSon8)].

    * REINFORCE versus A2C [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_Policy_4.pdf)] [[Video (in Chinese)](https://youtu.be/hN9WMIMMeAI)].

5. **Advanced Topics on Policy-Based Learning.**

    * Trust-Region Policy Optimization (TRPO) [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/5_Policy_1.pdf)] [[Video (in Chinese)](https://youtu.be/fcSYiyvPjm4)].

    * Partial Observation and RNNs.

6. **Dealing with Continuous Action Space.**

    * Discrete versus Continuous Control [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/6_Continuous_1.pdf)] [[Video (in Chinese)](https://youtu.be/rRIjgdxSvg8)].

    * Deterministic Policy Gradient (DPG) for Continuous Control [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/6_Continuous_2.pdf)] [[Video (in Chinese)](https://youtu.be/cmWejKRWLA8)].

    * Stochastic Policy Gradient for Continuous Control [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/6_Continuous_3.pdf)] [[Video (in Chinese)](https://youtu.be/McqFyl_W5Wc)].

7. **Multi-Agent Reinforcement Learning.**

    * Basics and Challenges [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/7_MARL_1.pdf)] [[Video (in Chinese)](https://youtu.be/KN-XMQFTD0o)].

    * Centralized VS Decentralized [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/7_MARL_2.pdf)] [[Video (in Chinese)](https://youtu.be/0HV1hsjd1y8)].

8. **Imitation Learning.**

    * Inverse Reinforcement Learning.

    * Generative Adversarial Imitation Learning (GAIL).