A log of things I'm learning

amitness/learning

learning

A running log of things I'm learning to build strong core software engineering skills while also expanding my knowledge of adjacent technologies a little bit every day.

Updated: Once a month | Current Focus: Generative AI

Core Skills

Generic skills that are transferable to any sort of software work I do

Python Programming

Resource Progress
Datacamp: Writing Efficient Python Code
Datacamp: Writing Functions in Python
Datacamp: Object-Oriented Programming in Python
Datacamp: Intermediate Object-Oriented Programming in Python
Datacamp: Importing Data in Python (Part 1)
Datacamp: Importing Data in Python (Part 2)
Datacamp: Intermediate Python for Data Science
Datacamp: Python Data Science Toolbox (Part 1)
Datacamp: Python Data Science Toolbox (Part 2)
Datacamp: Developing Python Packages
Datacamp: Conda Essentials
Youtube: Tutorial: Sebastian Witowski - Modern Python Developer's Toolkit
Datacamp: Working with Dates and Times in Python
Datacamp: Command Line Automation in Python
Book: Python 201
Book: Writing Idiomatic Python 3
Article: Python's many command-line utilities
Article: A Programmer’s Introduction to Unicode

Testing & Profiling

Resource Progress
Datacamp: Unit Testing for Data Science in Python
Book: Test Driven Development with Python
Article: Introduction to Memory Profiling in Python
Article: Profiling Python code with memory_profiler
Article: How to Use "memory_profiler" to Profile Memory Usage by Python Code?

Data Structures and Algorithms

Resource Progress
Book: Grokking Algorithms
Book: The Tech Resume Inside Out
Neetcode: Algorithms and Data Structures for Beginners
Udacity: Intro to Data Structures and Algorithms

Linux & Command Line

Resource Progress
Datacamp: Introduction to Shell for Data Science
Datacamp: Introduction to Bash Scripting
Datacamp: Data Processing in Shell
MIT: The Missing Semester
Udacity: Linux Command Line Basics
Udacity: Shell Workshop
Udacity: Configuring Linux Web Servers

Version Control

Resource Progress
Udacity: Version Control with Git
Datacamp: Introduction to Git for Data Science
Udacity: GitHub & Collaboration
Udacity: How to Use Git and GitHub

Databases

Resource Progress
Udacity: Intro to relational database
Udacity: Database Systems Concepts & Design
Datacamp: Database Design
Datacamp: Introduction to Databases in Python
Datacamp: Intro to SQL for Data Science
Datacamp: Intermediate SQL
Datacamp: Joining Data in PostgreSQL
Udacity: SQL for Data Analysis
Datacamp: Exploratory Data Analysis in SQL
Datacamp: Applying SQL to Real-World Problems
Datacamp: Analyzing Business Data in SQL
Datacamp: Reporting in SQL
Datacamp: Data-Driven Decision Making in SQL
Datacamp: NoSQL Concepts
Datacamp: Introduction to MongoDB in Python

Backend Engineering

Resource Progress
Udacity: Authentication & Authorization: OAuth
Udacity: HTTP & Web Servers
Udacity: Client-Server Communication
Udacity: Designing RESTful APIs
Datacamp: Introduction to APIs in Python
Udacity: Networking for Web Developers

Production System Design

Resource Progress
Book: Designing Machine Learning Systems
Neetcode: System Design for Beginners
Neetcode: System Design Interview
Datacamp: Customer Analytics & A/B Testing in Python
Datacamp: A/B Testing in Python
Udacity: A/B Testing
Datacamp: MLOps Concepts
Datacamp: Machine Learning Monitoring Concepts

Maths

Resource Progress
Datacamp: Foundations of Probability in Python
Datacamp: Introduction to Statistics
Datacamp: Introduction to Statistics in Python
Datacamp: Hypothesis Testing in Python
Datacamp: Statistical Thinking in Python (Part 1)
Datacamp: Statistical Thinking in Python (Part 2)
Datacamp: Experimental Design in Python
Datacamp: Practicing Statistics Interview Questions in Python
edX: Essential Statistics for Data Analysis using Excel
Udacity: Intro to Inferential Statistics
MIT 18.06 Linear Algebra, Spring 2005
Udacity: Eigenvectors and Eigenvalues
Udacity: Linear Algebra Refresher
Youtube: Essence of linear algebra

Specialization


Traditional Machine Learning

Resource Progress
Book: Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition
Book: A Machine Learning Primer
Book: Grokking Machine Learning
Book: The StatQuest Illustrated Guide To Machine Learning
Datacamp: Ensemble Methods in Python
Datacamp: Extreme Gradient Boosting with XGBoost
Datacamp: Clustering Methods with SciPy
Datacamp: Unsupervised Learning in Python
Udacity: Segmentation and Clustering
Datacamp: Intro to Python for Data Science
edX: Implementing Predictive Analytics with Spark in Azure HDInsight
Datacamp: Supervised Learning with scikit-learn
Datacamp: Machine Learning with Tree-Based Models in Python
Datacamp: Linear Classifiers in Python
Datacamp: Model Validation in Python
Datacamp: Hyperparameter Tuning in Python
Datacamp: HR Analytics in Python: Predicting Employee Churn
Datacamp: Predicting Customer Churn in Python
Datacamp: Dimensionality Reduction in Python
Datacamp: Preprocessing for Machine Learning in Python
Datacamp: Data Types for Data Science
Datacamp: Cleaning Data in Python
Datacamp: Feature Engineering for Machine Learning in Python
Datacamp: Predicting CTR with Machine Learning in Python
Datacamp: Intro to Financial Concepts using Python
Datacamp: Fraud Detection in Python

Deep Learning

Resource Progress
Article: An overview of gradient descent optimization algorithms
Book: Make Your Own Neural Network
Fast.ai: Practical Deep Learning for Coders (Part 1)
Fast.ai: Practical Deep Learning for Coders (Part 2)
Datacamp: Convolutional Neural Networks for Image Processing
Karpathy: Neural Networks: Zero to Hero
Article: Weight Initialization in Neural Networks: A Journey From the Basics to Kaiming

Natural Language Processing

Resource Progress
Book: Natural Language Processing with Transformers
Stanford CS224U: Natural Language Understanding | Spring 2019
Stanford CS224N: NLP with Deep Learning | Winter 2019
CMU: Low-resource NLP Bootcamp 2020
CMU Multilingual NLP 2020
Datacamp: Feature Engineering for NLP in Python
Datacamp: Natural Language Processing Fundamentals in Python
Datacamp: Regular Expressions in Python
Datacamp: RNN for Language Modeling
Datacamp: Natural Language Generation in Python
Datacamp: Building Chatbots in Python
Datacamp: Sentiment Analysis in Python
Datacamp: Machine Translation in Python
Article: The Unreasonable Effectiveness of Collocations
Article: FuzzyWuzzy: Fuzzy String Matching in Python
Article: Mamba Explained
Article: A Visual Guide to Mamba and State Space Models
Article: Transformers: Origins

Generative AI


LLM Theory

Resource Progress
Book: Hands-On Large Language Models: Language Understanding and Generation
Book: AI Engineering: Building Applications with Foundation Models
Book: Designing Large Language Model Applications
Book: Large Language Models: A Deep Dive: Bridging Theory and Practice
Article: From Digits to Decisions: How Tokenization Impacts Arithmetic in LLMs
Article: SolidGoldMagikarp (plus, prompt generation)
DeepLearning.AI: Pretraining LLMs
Karpathy: Intro to Large Language Models 1hr
Karpathy: Let's build the GPT Tokenizer 2hr13m
Karpathy: Let's reproduce GPT-2 (124M) 4hr1m
Youtube: A Hackers' Guide to Language Models 1hr30m
Youtube: 5 Years of GPTs with Finbarr Timbers 55m
Youtube: Stanford CS229 I Machine Learning I Building Large Language Models (LLMs)
Article: Sampling for Text Generation
DeepLearning.AI: Reinforcement Learning from Human Feedback
Youtube: LLaMA explained: KV-Cache, Rotary Positional Embedding, RMS Norm, Grouped Query Attention, SwiGLU 1h10m
Youtube: CMU Advanced NLP Fall 2024 (7): Prompting and Complex Reasoning
Youtube: CMU Advanced NLP Fall 2024 (6): Instruction Tuning
Youtube: CMU Advanced NLP Fall 2024 (12): Domain Specific Modeling: Code and Math
Youtube: CMU Advanced NLP Fall 2024 (15): Tool Use and LLM Agent Basics
Youtube: CMU Advanced NLP Fall 2024 (14): Ensembling and Mixture of Experts

Multi-modality

Resource Progress
Article: Understanding Multimodal LLMs
Article: GPT-4 Vision Alternatives
Youtube: AI Visions Live | Merve Noyan | Open-source Multimodality 54m
DeepLearning.AI: Building Multimodal Search and RAG
DeepLearning.AI: Prompt Engineering for Vision Models
DeepLearning.AI: How Diffusion Models Work

Information Retrieval / RAG

Resource Progress
Article: Pretrained Transformer Language Models for Search - part 1
Article: Pretrained Transformer Language Models for Search - part 2
Article: Pretrained Transformer Language Models for Search - part 3
Article: Pretrained Transformer Language Models for Search - part 4
Article: How not to use BERT for Document Ranking
Article: Understanding LanceDB's IVF-PQ index
Article: A little pooling goes a long way for multi-vector representations
Article: Levels of Complexity: RAG Applications
Article: Systematically Improving Your RAG
Article: Stop using LGTM@Few as a metric (Better RAG)
Article: Low-Hanging Fruit for RAG Search
Article: What AI Engineers Should Know about Search
Article: Evaluating Chunking Strategies for Retrieval
Article: Sentence Embeddings. Introduction to Sentence Embeddings
Article: LambdaMART in Depth
Article: Guided Generation with Outlines
Course: Fullstack Retrieval
DeepLearning.AI: Building and Evaluating Advanced RAG Applications
DeepLearning.AI: Vector Databases: from Embeddings to Applications
DeepLearning.AI: Advanced Retrieval for AI with Chroma
DeepLearning.AI: Prompt Compression and Query Optimization
DeepLearning.AI: Large Language Models with Semantic Search 1hr
DeepLearning.AI: Building Applications with Vector Databases
DeepLearning.AI: Knowledge Graphs for RAG
DeepLearning.AI: Preprocessing Unstructured Data for LLM Applications
DeepLearning.AI: Embedding Models: From Architecture to Implementation
DeepLearning.AI: Retrieval Optimization - From Tokenization to Vector Quantization
Pinecone: Vector Databases in Production for Busy Engineers 0/6
Pinecone: Retrieval Augmented Generation 0/3
Pinecone: LangChain AI Handbook 0/11
Pinecone: Embedding Methods for Image Search 0/8
Pinecone: Faiss: The Missing Manual 0/7
Pinecone: Vector Search in the Wild 0/4
Pinecone: Natural Language Processing for Semantic Search 0/13
Youtube: Systematically improving RAG applications
Youtube: Back to Basics for RAG w/ Jo Bergum
Youtube: Beyond the Basics of Retrieval for Augmenting Generation (w/ Ben Clavié)
Youtube: RAG From Scratch 0/14
Youtube: CMU Advanced NLP Fall 2024 (10): Retrieval and RAG 1h17m
Guidance: Token Healing

Agentic Pattern

Resource Progress
Article: Tool Invocation - Demonstrating the Marvel of GPT's Flexibility
Anthropic: Building effective agents
OpenAI: Assistants & Agents Build Hour
OpenAI: Function Calling Build Hour
DeepLearning.AI: Functions, Tools and Agents with LangChain
DeepLearning.AI: Building Agentic RAG with LlamaIndex
DeepLearning.AI: Multi AI Agent Systems with crewAI
DeepLearning.AI: AI Agentic Design Patterns with AutoGen
DeepLearning.AI: AI Agents in LangGraph
DeepLearning.AI: Building Your Own Database Agent
DeepLearning.AI: Function-Calling and Data Extraction with LLMs

Prompt Engineering

Resource Progress
Article: OpenAI Prompt Engineering
Article: Prompting Fundamentals and How to Apply them Effectively
Article: How I came in first on ARC-AGI-Pub using Sonnet 3.5 with Evolutionary Test-time Compute
Anthropic Courses
Anthropic: The Claude in Amazon Bedrock Course
Article: Prompt Engineering (Lilian Weng)
Article: Prompt Engineering 201: Advanced methods and toolkits
Article: Optimizing LLMs for accuracy
Article: Primers • Prompt Engineering
Article: Anyscale Endpoints: JSON Mode and Function calling Features
Article: Guided text generation with Large Language Models
Book: Prompt Engineering for LLMs
DeepLearning.AI: Reasoning with o1
OpenAI: Reasoning with o1 Build Hour
DeepLearning.AI: ChatGPT Prompt Engineering for Developers
DeepLearning.AI: Prompt Engineering with Llama 2 & 3
Wandb: LLM Engineering: Structured Outputs
Series: Prompt injection
Youtube: Prompt Engineering Overview 1hr4m
Youtube: Prompt Engineering Workshop 1h

Quantization

Resource Progress
Article: Quantization Fundamentals with Hugging Face
DeepLearning.AI: Quantization in Depth
DeepLearning.AI: Introduction to On-Device AI
Article: A Visual Guide to Quantization
Article: QLoRA and 4-bit Quantization
Article: Understanding AI/LLM Quantisation Through Interactive Visualisations
Youtube: CMU Advanced NLP Fall 2024 (11): Distillation, Quantization, and Pruning

Inference Optimization

Resource Progress
Article: How to make LLMs go fast
Article: In the Fast Lane! Speculative Decoding - 10x Larger Model, No Extra Cost
Article: Accelerating Generative AI with PyTorch II: GPT, Fast
Article: Harmonizing Multi-GPUs: Efficient Scaling of LLM Inference
Article: Multi-Query Attention is All You Need
Article: Transformers Inference Optimization Toolset
DeepLearning.AI: Efficiently Serving LLMs
Article: LLM Inference Series: 3. KV caching explained
Article: LLM Inference Series: 4. KV caching, a deeper look
Article: LLM Inference Series: 5. Dissecting model performance
Article: Transformer Inference Arithmetic
Youtube: SBTB 2023: Charles Frye, Parallel Processors: Past & Future Connections Between LLMs and OS Kernels
Youtube: Deploying Fine-Tuned Models 2h28m

Evals and Guardrails

Resource Progress
Article: Your AI Product Needs Evals
Article: Task-Specific LLM Evals that Do & Don't Work
Article: Evaluation & Hallucination Detection for Abstractive Summaries
Article: LLM-as-a-Judge vs Human Evaluation
DeepLearning.AI: Automated Testing for LLMOps
DeepLearning.AI: Red Teaming LLM Applications
DeepLearning.AI: Evaluating and Debugging Generative AI Models Using Weights and Biases
DeepLearning.AI: Quality and Safety for LLM Applications
OpenAI: Evals Build Hour
Youtube: Instrumenting & Evaluating LLMs 2hr33m
Youtube: LLM Eval For Text2SQL 51m
Youtube: A Deep Dive on LLM Evaluation 49m

Finetuning and Distillation

Resource Progress
Article: Tokenization Gotchas
Article: Practical Tips for Finetuning LLMs Using LoRA (Low-Rank Adaptation)
OpenAI: GPT-4o mini Fine-Tuning Build Hour
OpenAI: Distillation Build Hour
Article: How to Generate and Use Synthetic Data for Finetuning
DeepLearning.AI: Finetuning Large Language Models
Youtube: Fine-Tuning with Axolotl 2h10m
Youtube: Creating, Curating, and Cleaning Data for LLMs 54m
Youtube: Best Practices For Fine Tuning Mistral 23m
Youtube: Fine Tuning OpenAI Models - Best Practices
Youtube: When and Why to Fine Tune an LLM 1h56m
Youtube: Slaying OOMs with PyTorch FSDP and torchao 49m
Youtube: Napkin Math For Fine Tuning Pt. 1 w/Johno Whitaker
Youtube: Napkin Math For Fine Tuning Pt. 2 w/Johno Whitaker
Youtube: Fine Tuning LLMs for Function Calling w/Pawel Garbacki 1h32m
Youtube: From Prompt to Model: Fine-tuning when you've already deployed LLMs in prod w/Kyle Corbitt 32m
Youtube: Why Fine Tuning is Dead w/Emmanuel Ameisen 50m
Benchmarking QLoRA+FSDP

LLM System Design

Resource Progress
Article: What We’ve Learned From A Year of Building with LLMs
Article: Data Flywheels for LLM Applications
Article: LLM From the Trenches: 10 Lessons Learned Operationalizing Models at GoDaddy
Article: Emerging UX Patterns for Generative AI Apps & Copilots
Article: The Novice's LLM Training Guide
Article: Pushing ChatGPT's Structured Data Support To Its Limits
Article: GPTed: using GPT-3 for semantic prose-checking
Article: Don't worry about LLMs
DeepLearning.AI: Building Systems with the ChatGPT API
DeepLearning.AI: LangChain for LLM Application Development
DeepLearning.AI: LangChain: Chat with Your Data
DeepLearning.AI: Building Generative AI Applications with Gradio
DeepLearning.AI: Open Source Models with Hugging Face
DeepLearning.AI: Getting Started with Mistral
Datacamp: Developing LLM Applications with LangChain
LLMOps: Building with LLMs
LLM Bootcamp - Spring 2023
Youtube: A Survey of Techniques for Maximizing LLM Performance
Youtube: Building Blocks for LLM Systems & Products: Eugene Yan
Youtube: Building LLM Applications 0/8
Article: Emerging Architectures for LLM Applications
Article: Patterns for Building LLM-based Systems & Products
DeepLearning.AI: LLMOps
DeepLearning.AI: Serverless LLM apps with Amazon Bedrock
Youtube: Getting the Most Out of Your LLM Experiments 48m

Technical Skills (Libraries/Frameworks/Tools)

AWS

Resource Progress
Udemy: AWS Certified Developer - Associate 2018

CSS

Resource Progress
Pluralsight: CSS Positioning
Pluralsight: Introduction to CSS
Pluralsight: CSS: Specificity, the Box Model, and Best Practices
Pluralsight: CSS: Using Flexbox for Layout
Code School: Blasting Off with Bootstrap
Pluralsight: UX Fundamentals
Codecademy: Learn SASS
CSS for Javascript Developers
Article: Create an illustration in Figma design
Book: Refactoring UI
Youtube: How to Make Your Website Not Ugly: Basic UX for Programmers 48m

Django

Resource Progress
Article: Django, HTMX and Alpine.js: Modern websites, JavaScript optional

HTML

Resource Progress
Codecademy: Learn HTML
Codecademy: Make a website
Article: Alternative Text

JavaScript

Resource Progress
Udacity: ES6 - JavaScript Improved
Udacity: Intro to Javascript
Udacity: Object Oriented JS 1
Udacity: Object Oriented JS 2
Udemy: Understanding Typescript
Codecademy: Learn JavaScript
Codecademy: Jquery Track
Pluralsight: Using The Chrome Developer Tools

Matplotlib

Resource Progress
Datacamp: Introduction to Seaborn
Datacamp: Introduction to Matplotlib

MLflow

Resource Progress
Datacamp: Introduction to MLflow

Next.js

Resource Progress
Docs: Start building with Next.js

Pandas

Resource Progress
Datacamp: Pandas Foundations
Datacamp: Pandas Joins for Spreadsheet Users
Datacamp: Manipulating DataFrames with pandas
Datacamp: Merging DataFrames with pandas
Datacamp: Data Manipulation with pandas
Datacamp: Optimizing Python Code with pandas
Datacamp: Streamlined Data Ingestion with pandas
Datacamp: Analyzing Marketing Campaigns with pandas
Datacamp: Analyzing Police Activity with pandas

PyTorch

Resource Progress
Article: PyTorch internals
Article: Taking PyTorch For Granted
Datacamp: Introduction to Deep Learning with PyTorch
Datacamp: Intermediate Deep Learning with PyTorch
Datacamp: Deep Learning for Text with PyTorch
Datacamp: Deep Learning for Images with PyTorch
Deeplizard: Neural Network Programming - Deep Learning with PyTorch

ReactJS

Resource Progress
Codecademy: Learn ReactJS: Part I
Codecademy: Learn ReactJS: Part II
Next.js: React Foundations

spaCy

Resource Progress
Datacamp: Advanced NLP with spaCy

TensorFlow & Keras

Resource Progress
Datacamp: Introduction to TensorFlow in Python
Datacamp: Deep Learning in Python
Datacamp: Introduction to Deep Learning with Keras
Datacamp: Advanced Deep Learning with Keras
Deeplizard: Keras - Python Deep Learning Neural Network API
Udacity: Intro to TensorFlow for Deep Learning

VSCode

Resource Progress
VSCode Docs: Python Interactive window

Misc

Resource Progress
Google: Technical Writing Course