A repository of information about data used in training large language models (LLMs)

kibitzing/awesome-llm-data

awesome-llm-data

Models

Llama 2

GPT

Safety evaluation datasets

Bias:

  • BOLD
    • Used by: Llama 2

Truthfulness:

Toxicity:

Pre-trained model performance evaluation datasets

Code

Commonsense reasoning

World knowledge

Reading comprehension

Math

  • GSM8K
    • Used by: Llama 2
  • MATH
    • Used by: Llama 2

Popular aggregated benchmarks

Other repositories
