Thinking Model and RLHF Research Notes

This repository serves as a collection of research notes and resources on training large language models (LLMs) and Reinforcement Learning from Human Feedback (RLHF). It focuses on the latest research, methodologies, and techniques for fine-tuning language models.

Repository Contents

Reinforcement Learning and RLHF Overview

A curated list of materials providing an introduction to RL and RLHF:

  • Research papers and books covering key concepts in reinforcement learning.
  • Video lectures explaining the fundamentals of RLHF.

Methods for LLM Training

An extensive collection of state-of-the-art approaches for optimizing preferences and model alignment:

  • Key techniques such as PPO, DPO, KTO, ORPO, and more (a PPO loss sketch follows this list).
  • The latest ArXiv publications and publicly available implementations.
  • Analysis of effectiveness across different optimization strategies.
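As a point of reference for these objectives, below is a minimal sketch of the PPO clipped surrogate loss in PyTorch. The function name, tensor layout (per-token log-probabilities and advantages), and the default clip range are illustrative assumptions, not code taken from any of the implementations collected here.

```python
# Minimal sketch of the PPO clipped surrogate objective (illustrative only).
# Tensor names and the clip range are assumptions, not taken from any linked repo.
import torch

def ppo_clip_loss(logprobs: torch.Tensor,
                  old_logprobs: torch.Tensor,
                  advantages: torch.Tensor,
                  clip_range: float = 0.2) -> torch.Tensor:
    """Clipped policy-gradient loss: -E[min(r*A, clip(r, 1-eps, 1+eps)*A)]."""
    # Probability ratio between the current policy and the old (behavior) policy.
    ratio = torch.exp(logprobs - old_logprobs)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_range, 1.0 + clip_range) * advantages
    # Take the pessimistic (minimum) surrogate and negate it for gradient descent.
    return -torch.min(unclipped, clipped).mean()
```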

Purpose of this Repository

This repository is designed as a reference for researchers and engineers working on reinforcement learning and large language models. If you're interested in model alignment, experiments with DPO and its variants, or alternative RL-based methods, you will find valuable resources here.

RL overview

Methods for LLM training

Minimal implementation

  • DPO (a minimal loss sketch is shown below)
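For orientation, a minimal sketch of the DPO objective is given below. The interface (summed per-sequence log-probabilities of the chosen and rejected responses under the policy and a frozen reference model) and the `beta` value are assumptions for illustration, not the linked implementation itself.

```python
# Minimal sketch of the DPO loss (Rafailov et al., 2023), illustrative only.
# The interface (per-sequence log-probs for chosen/rejected responses) is an assumption.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO loss: -log sigma(beta * [policy log-ratio - reference log-ratio])."""
    # Log-ratios of chosen vs. rejected responses under the policy and the reference.
    pi_logratios = policy_chosen_logps - policy_rejected_logps
    ref_logratios = ref_chosen_logps - ref_rejected_logps
    # Encourage the policy to prefer chosen over rejected responses
    # more strongly than the frozen reference model does.
    return -F.logsigmoid(beta * (pi_logratios - ref_logratios)).mean()
```

The log-ratio against the reference model acts as an implicit reward, which is what lets DPO optimize preferences without training a separate reward model or running an RL loop.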

Tutorials

  • Notes for learning RL: Value Iteration -> Q-Learning -> DQN -> REINFORCE -> Policy Gradient Theorem -> TRPO -> PPO (a value-iteration sketch follows this list)
  • RLHF training techniques explained
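As an entry point to that learning path, here is a minimal sketch of tabular value iteration. The MDP representation (dense transition and reward arrays `P` and `R`) and the discount factor are hypothetical choices for illustration.

```python
# Minimal sketch of tabular value iteration (illustrative; the MDP arrays are hypothetical).
import numpy as np

def value_iteration(P: np.ndarray,   # transition probabilities, shape (S, A, S)
                    R: np.ndarray,   # expected rewards, shape (S, A)
                    gamma: float = 0.99,
                    tol: float = 1e-8) -> np.ndarray:
    """Bellman optimality backup: V(s) <- max_a [R(s,a) + gamma * sum_s' P(s,a,s') V(s')]."""
    V = np.zeros(P.shape[0])
    while True:
        # Q(s, a) = R(s, a) + gamma * E_{s'}[V(s')]
        Q = R + gamma * (P @ V)       # shape (S, A)
        V_new = Q.max(axis=1)         # greedy backup over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
```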

Training frameworks

RLHF method implementations (only those with detailed explanations)

Articles

Thinking process

  • Repos
  • Articles
  • Papers
  • Open-source project to reproduce DeepSeek R1
  • Datasets for thinking models
  • Evaluation and benchmarks