This is our weekly reading group, where we discuss reinforcement learning approaches in natural language processing.
Please cite this GitHub repository if you use our slides.
Last updated: 04/04/2019
-
Presented by Yanjun Gao.
Readings:
- Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning by Ronald J. Williams. [1]
- Sequence Level Training with Recurrent Neural Networks by Marc’Aurelio Ranzato, Sumit Chopra, Michael Auli, and Wojciech Zaremba. [2]
Find the presentation here.
-
Presented by Kyriaki Zafeiroudi.
Reading: Playing Atari with Deep Reinforcement Learning by Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin A. Riedmiller. [3]
Find the presentation here.
-
Background of Reinforcement Learning, Imitation Learning, and Human-level control through deep reinforcement learning
Presented by Saptarashmi Bandyopadhyay.
Readings:
- Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto [4], Chapters 1, 2, and 3
- A Course in Machine Learning by Hal Daumé III [5], Chapter 18
- Human-level control through deep reinforcement learning by Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, and Demis Hassabis [6]
- The Atari 2600 game console, as described in the paper The arcade learning environment: An evaluation platform for general agents by Marc G. Bellemare, Yavar Naddaf, Joel Veness, and Michael Bowling [7]
Find the presentation here.
-
Presented by Saptarashmi Bandyopadhyay.
Readings:
- Deep Reinforcement Learning with a Natural Language Action Space by Ji He, Jianshu Chen, Xiaodong He, Jianfeng Gao, Lihong Li, Li Deng, and Mari Ostendorf, University of Washington, Seattle, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 1621–1630, Berlin, Germany, August 7-12, 2016 [10]
- Language Understanding for Text-based Games using Deep Reinforcement Learning by Karthik Narasimhan, Tejas D Kulkarni, and Regina Barzilay, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 1–11, Lisbon, Portugal, 17-21 September 2015 [11]
Find the presentation here.
-
Presented by Yanjun Gao.
Readings:
- Silver, David, et al. "Deterministic policy gradient algorithms." ICML. 2014. [19]
- Lillicrap, Timothy P., et al. "Continuous Control with Deep Reinforcement Learning." (2015). [20]
Find the presentation here.
-
Presented by Yanjun Gao.
Readings: Hausknecht, Matthew, and Peter Stone. "Deep reinforcement learning in parameterized action space." ICLR 2016. [21]
Find the presentation here.
-
Presented by Maryam Zare.
Readings: M. Gasic, N. Mrksic, L. Rojas-Barahona, P.-H. Su, S. Ultes, D. Vandyke, T.-H. Wen, and S. Young. "Dialogue manager domain adaptation using Gaussian process reinforcement learning." Computer Speech and Language. (2017). [18]
Find the presentation here.
-
Presented by Saptarashmi Bandyopadhyay.
Readings: Deep Reinforcement Learning with Double Q-Learning by Hado van Hasselt, Arthur Guez, and David Silver, Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), 2016. [13]
Find the presentation here.
-
Presented by Maryam Zare.
Readings: Gašić, Milica, and Steve Young. "Gaussian processes for POMDP-based dialogue manager optimization." IEEE/ACM Transactions on Audio, Speech, and Language Processing (2014). [17]
Find the presentation here.
-
Presented by Saptarashmi Bandyopadhyay.
Readings: Dueling Network Architectures for Deep Reinforcement Learning by Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, and Nando de Freitas, Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016. JMLR: W&CP volume 48. [14]
Find the presentation here.
-
Presented by Yanjun Gao.
Readings: Mnih, Volodymyr, et al. "Asynchronous Methods for Deep Reinforcement Learning." International Conference on Machine Learning. 2016. [15]
Find the presentation here.
-
Presented by Yanjun Gao.
Readings: Shi, Zhan, et al. "Toward Diverse Text Generation with Inverse Reinforcement Learning." Proceedings of the 27th International Joint Conference on Artificial Intelligence. AAAI Press, 2018. [16]
Find the presentation here.
- A great Video on GP and ML by Richard Turner
- Awesome RL
- Class Resources: David Silver
- Class Resources: UC Berkeley CS 294-112 Deep RL
- Simple Reinforcement Learning with TensorFlow
- Ronald J. Williams. 1992. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning. Mach. Learn. 8, 3-4 (May 1992), 229-256. DOI: https://doi.org/10.1007/BF00992696
- Ranzato, M., Chopra, S., Auli, M., and Zaremba, W. 2015. Sequence level training with recurrent neural networks. CoRR abs/1511.06732.
- Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning. arXiv:1312.5602. NIPS Deep Learning Workshop 2013.
- Reinforcement Learning An Introduction by Richard S. Sutton and Andrew G. Barto, Second Edition, 2018, MIT Press
- A Course in Machine Learning by Hal Daumé III, ciml.info/
- Human-level control through deep reinforcement learning by V Mnih, K Kavukcuoglu, D Silver, A Rusu, J Veness, M G Bellemare, A Graves, M Riedmiller, A Fidjeland, G Ostrovski, S Petersen, C Beattie, A Sadik, I Antonoglou, H King, D Kumaran, D Wierstra, S Legg, D Hassabis, Nature, Volume 518, pp. 529-533, February 2015, DOI: 10.1038/nature14236
- The arcade learning environment: An evaluation platform for general agents by Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael Bowling, Journal of Artificial Intelligence Research 47, pages 253-279, DOI: 10.1613/jair.3912 arXiv:1207.4708
- Silver, David, et al. "Deterministic policy gradient algorithms." ICML. 2014.
- Lillicrap, Timothy P., et al. "Continuous Control with Deep Reinforcement Learning." (2015).
- Hausknecht, Matthew, and Peter Stone. "Deep reinforcement learning in parameterized action space." ICLR 2016.
- Deep Reinforcement Learning with a Natural Language Action Space by Ji He, Jianshu Chen, Xiaodong He, Jianfeng Gao, Lihong Li, Li Deng and Mari Ostendorf, University of Washington, Seattle, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 1621–1630, Berlin, Germany, August 7-12, 2016.
- Language Understanding for Text-based Games using Deep Reinforcement Learning by Karthik Narasimhan, Tejas D Kulkarni and Regina Barzilay, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp 1–11, Lisbon, Portugal, 17-21 September 2015
- Deep Reinforcement Learning with Double Q-Learning by Hado van Hasselt, Arthur Guez, and David Silver, Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), 2016.
- Dueling Network Architectures for Deep Reinforcement Learning by Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, and Nando de Freitas, Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA, 2016. JMLR: W&CP volume 48.
- Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." International conference on machine learning. 2016.
- Shi, Zhan, et al. "Toward diverse text generation with inverse reinforcement learning." Proceedings of the 27th International Joint Conference on Artificial Intelligence. AAAI Press, 2018.
- Gašić, Milica, and Steve Young. "Gaussian processes for POMDP-based dialogue manager optimization." (http://mi.eng.cam.ac.uk/~sjy/papers/gayo14.pdf) IEEE/ACM Transactions on Audio, Speech, and Language Processing (2014).
- M. Gasic, N. Mrksic, L. Rojas-Barahona, P.-H. Su, S. Ultes, D. Vandyke, T.-H. Wen, and S. Young. "Dialogue manager domain adaptation using Gaussian process reinforcement learning." Computer Speech and Language. (2017).
- Silver, David, et al. "Deterministic policy gradient algorithms." ICML. 2014.
- Lillicrap, Timothy P., et al. "Continuous Control with Deep Reinforcement Learning." (2015).
- Hausknecht, Matthew, and Peter Stone. "Deep reinforcement learning in parameterized action space." ICLR 2016.