
Thinking Model and RLHF Research Notes

This repository serves as a collection of research notes and resources on training large language models (LLMs) and Reinforcement Learning from Human Feedback (RLHF). It focuses on the latest research, methodologies, and techniques for fine-tuning language models.

Repository Contents

Reinforcement Learning and RLHF Overview

A curated list of materials providing an introduction to RL and RLHF:

  • Research papers and books covering key concepts in reinforcement learning.
  • Video lectures explaining the fundamentals of RLHF.

Methods for LLM Training

An extensive collection of state-of-the-art approaches to preference optimization and model alignment:

  • Key techniques such as PPO, DPO, KTO, ORPO, and more (the PPO objective is sketched after this list).
  • The latest arXiv publications and publicly available implementations.
  • Analyses of effectiveness across different optimization strategies.
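
For orientation, here is a minimal sketch of PPO's clipped surrogate objective (Schulman et al., 2017) in PyTorch. It is not taken from any implementation linked in this repository; the function name and tensor shapes are illustrative assumptions.

```python
import torch

def ppo_clip_loss(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    # Hypothetical helper, not from the linked repos.
    # Probability ratio r = pi_theta(a|s) / pi_theta_old(a|s), computed in log space.
    ratio = torch.exp(new_logprobs - old_logprobs)
    # Unclipped and clipped surrogate terms.
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic minimum, then negate so a minimizer ascends the objective.
    return -torch.min(unclipped, clipped).mean()
```

In RLHF for LLMs the same objective is typically applied per token, with advantages estimated from a learned value head and a KL penalty against the reference model.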

Purpose of this Repository

This repository is designed as a reference for researchers and engineers working on reinforcement learning and large language models. If you're interested in model alignment, experiments with DPO and its variants, or alternative RL-based methods, you will find valuable resources here.

RL overview

Methods for LLM training

Minimal implementation

  • Method: DPO (a minimal sketch of the DPO loss follows below).
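
To make the idea concrete, below is a minimal sketch of the DPO loss as described by Rafailov et al. (2023), in PyTorch. It is not the implementation linked here; the function name and argument layout (summed per-response log-probabilities from the policy and a frozen reference model) are assumptions for illustration.

```python
import torch.nn.functional as F

def dpo_loss(pi_chosen_logps, pi_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Hypothetical helper, not from the linked implementation.
    # Implicit rewards: beta-scaled log-ratios of policy vs. frozen reference model.
    chosen_reward = beta * (pi_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (pi_rejected_logps - ref_rejected_logps)
    # Bradley-Terry preference loss: push the chosen reward above the rejected one.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```

Note the appeal of DPO: it needs no separate reward model and no sampling during training, only log-probabilities of preference pairs under the policy and the reference model.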

Tutorials

Notes for learning RL, following the path: Value Iteration → Q-Learning → DQN → REINFORCE → Policy Gradient Theorem → TRPO → PPO
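
As a starting point for that path, here is a minimal value iteration sketch for a tabular MDP in NumPy. It is an illustrative assumption, not material from the linked notes; the array shapes are one common convention.

```python
import numpy as np

def value_iteration(P, R, gamma=0.99, tol=1e-8):
    """Tabular value iteration (hypothetical sketch).
    P: (S, A, S) transition probabilities, R: (S, A) rewards."""
    V = np.zeros(P.shape[0])
    while True:
        # Bellman optimality backup: Q(s, a) = R(s, a) + gamma * E[V(s')].
        Q = R + gamma * (P @ V)   # (S, A, S) @ (S,) -> (S, A)
        V_new = Q.max(axis=1)
        if np.abs(V_new - V).max() < tol:
            return V_new, Q.argmax(axis=1)  # optimal values and a greedy policy
        V = V_new
```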

RLHF training techniques explained

Training frameworks

RLHF method implementations (only those with detailed explanations)

Articles

Thinking process

Repos

Articles

Papers

Open-source project to reproduce DeepSeek R1

Datasets for thinking models

Evaluation and benchmarks
