AI Crash Course to help busy builders catch up to the public frontier of AI research in 2 weeks
Intro: I’m Henry Shi. I started Super.com in 2016, grew it to $150MM+ in annual revenue, and recently exited. As a traditional software founder, I needed to quickly catch up to the frontier of AI research to figure out where the next opportunities and gaps were. I compiled a list of the resources that were essential for me; they should get you caught up within 2 weeks.
Start Here:
Neural Network -> LLM Series
Then get up to speed via Survey papers:
- Follow the ideas in the survey paper that interest you and dig deeper
LLM Survey - 2024
Agent Survey - 2023
Prompt Engineering Survey - 2024
AI Papers: (prioritize the ones marked with a star *)
Foundational Modelling:
Transformers* (foundation, self-attention) - 2017
Scaling Laws/GPT3* (conviction to scale up GPT2/3/4) - 2020
LoRA (Fine-tuning) - 2021
Training Compute-Optimal LLMs - 2022
RLHF* (InstructGPT->ChatGPT) - 2022
DPO (No need for RL/Reward model) - 2023
LLM-as-Judge (On par with human evaluations) - 2023
MoE (Mixture of Experts) - 2024
Planning/Reasoning:
AlphaZero/MuZero* (RL without prior knowledge of game or rules) - 2017/2019
CoT* (Chain of Thought)/ToT (Tree of Thoughts)/GoT (Graph of Thoughts) - 2022/2023/2023
ReAct (Generate reasoning traces and task-specific actions in an interleaved manner) - 2022
Let’s Verify Step by Step (Process > Outcome) - 2023
ARC-Prize* (Latest methods for solving ARC-AGI problems) - 2024
Scaling Test-Time Compute (Relationship between inference-time and pre-training compute) - 2024
Applications:
Toolformer (LLMs to use tools) - 2023
GPT4 (Overview of GPT4, but fairly high level) - 2023
Llama3* (In depth details of how Meta built Llama3 and the various configurations and hyperparameters) - 2024
Gemini 1.5 (Multimodal across a 10MM-token context window) - 2024
DeepSeek-V3 (Building a frontier OSS model at a fraction of the cost of everyone else) - 2024
SWE-Agent/OpenHands (Open-source software development agents) - 2024
Benchmarks:
SWE-Bench (Real world software development) - 2023
Chatbot Arena (Live human preference Elo ratings) - 2024
Videos/Lectures:
3Blue1Brown on Foundational Math/Concepts
Build a Large Language Model (from Scratch) #1 Bestseller
Andrej Karpathy: Zero to Hero Series
Yannic Kilcher Paper Explanations
Noam Brown (o1 founder) on Planning in AI
Stanford: Building LLMs
Why You’re Not Too Old to Pivot Into AI (motivation)
Helpful Websites:
Full Stack Deep Learning - courses for building AI products
Prompting Guide - extensive list of prompting techniques and examples
a16z AI Canon - similar list of resources, but longer (slightly dated)
2025 AI Engineer Reading List - longer reading list, broken out by focus area
State of Generative Models 2024 - good simple summary of current state
Others (non LLMs):
Vision Transformer (no need for CNNs) - 2021
Latent Diffusion (Text-to-Image) - 2021
Obvious/easy papers (to get your feet wet if you're new to papers):
CoT (Chain of Thought) - 2022
SELF-REFINE: Iterative Refinement with Self-Feedback - 2023