A Distilled Guide to Deep Learning
TIL: Ilya Sutskever gave John Carmack this reading list of approximately 30 research papers and said, ‘If you really learn all of these, you’ll know 90% of what matters today.’
- The Annotated Transformer. Sasha Rush, et al. [Blog] [Code]
- The First Law of Complexodynamics. Scott Aaronson. [Blog]
- The Unreasonable Effectiveness of Recurrent Neural Networks. Andrej Karpathy. [Blog] [Code]
- Understanding LSTM Networks. Christopher Olah. [Blog]
- Recurrent Neural Network Regularization. Wojciech Zaremba, et al. [ArXiv] [pdf] [Code]
- Keeping Neural Networks Simple by Minimizing the Description Length of the Weights. Geoffrey E. Hinton and Drew van Camp. [Paper] [pdf]
- Pointer Networks. Oriol Vinyals, et al. [Paper] [pdf]
- ImageNet Classification with Deep Convolutional Neural Networks. Alex Krizhevsky, et al. [Paper] [pdf]
- Order Matters: Sequence to sequence for sets. Oriol Vinyals, et al. [ArXiv] [pdf]
- GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism. Yanping Huang, et al. [ArXiv] [pdf]
- Deep Residual Learning for Image Recognition. Kaiming He, et al.
- Multi-Scale Context Aggregation by Dilated Convolutions. Fisher Yu and Vladlen Koltun.
- Neural Message Passing for Quantum Chemistry. Justin Gilmer, et al.
- Attention Is All You Need. Ashish Vaswani, et al.
- Neural Machine Translation by Jointly Learning to Align and Translate. Dzmitry Bahdanau, et al.
- Identity Mappings in Deep Residual Networks. Kaiming He, et al.
- A simple neural network module for relational reasoning. Adam Santoro, et al.
- Variational Lossy Autoencoder. Xi Chen, et al.
- Relational recurrent neural networks. Adam Santoro, et al.
- Quantifying the Rise and Fall of Complexity in Closed Systems: The Coffee Automaton. Scott Aaronson, et al.
- Neural Turing Machines. Alex Graves, et al.
- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin. Dario Amodei, et al.
- Scaling Laws for Neural Language Models. Jared Kaplan, et al.
- A Tutorial Introduction to the Minimum Description Length Principle. Peter Grünwald.
- Machine Super Intelligence. Shane Legg.
- Kolmogorov Complexity and Algorithmic Randomness. A. Shen, V. A. Uspensky, and N. Vereshchagin.
- CS231n: Convolutional Neural Networks for Visual Recognition.