
Add tasks to replicate Math-shepherd #1052

Merged
merged 52 commits into develop from math-shepherd on Dec 4, 2024
Conversation

@plaguss plaguss (Contributor) commented Nov 6, 2024

Description

This PR integrates the tasks to replicate:
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations

It integrates structured outputs, as I found while testing that they substantially improve the success rate of the generations.
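For instance, structured outputs can be toggled per task (a minimal sketch, assuming these tasks expose distilabel's usual use_default_structured_output flag):

generator = MathShepherdGenerator(
    name="generator",
    llm=llm_8B,
    use_default_structured_output=True,  # assumption: constrain outputs to the task's default schema
)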

Example pipeline:

from datasets import load_dataset

from distilabel.steps.tasks.math_shepherd.generator import MathShepherdGenerator
from distilabel.steps.tasks.math_shepherd.completer import MathShepherdCompleter
from distilabel.steps.tasks.math_shepherd.utils import FormatPRM
from distilabel.models import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import CombineOutputs, ExpandColumns

ds_name = "openai/gsm8k"

ds = (
    load_dataset(ds_name, "main", split="test")
    .rename_column("question", "instruction")  # the tasks expect an "instruction" column
    .select(range(3))  # small subset for a quick test run
)


with Pipeline(name="Math-Shepherd") as pipe:
    model_id_70B = "meta-llama/Meta-Llama-3.1-70B-Instruct"
    model_id_8B = "meta-llama/Meta-Llama-3.1-8B-Instruct"

    llm_70B = InferenceEndpointsLLM(
        model_id=model_id_70B,  # the stronger model generates the golden solutions
        tokenizer_id=model_id_70B,
        generation_kwargs={"max_new_tokens": 1024, "temperature": 0.6},
    )
    llm_8B = InferenceEndpointsLLM(
        model_id=model_id_8B,
        tokenizer_id=model_id_8B,
        generation_kwargs={"max_new_tokens": 2048, "temperature": 0.6},
    )

    generator_golden = MathShepherdGenerator(
        name="golden_generator",
        llm=llm_70B,  # one reference solution per problem
    )
    generator = MathShepherdGenerator(
        name="generator",
        llm=llm_8B,
        M=5  # Generate 5 sample solutions
    )
    completer = MathShepherdCompleter(
        name="completer",
        llm=llm_8B,
        N=4  # Each solution will be tested with 4 completions during labelling
    )

    combine = CombineOutputs()
    expand = ExpandColumns(
        name="expand_columns",
        columns=["solutions"],  # one row per generated solution
        split_statistics=True,  # also split the LLM statistics across the expanded rows
    )
    formatter = FormatPRM(name="format_prm", format="trl")
    [generator_golden, generator] >> combine >> completer >> expand >> formatter


if __name__ == "__main__":
    distiset = pipe.run(use_cache=False, dataset=ds)
    distiset.push_to_hub("plaguss/test_math_shepherd_prm")
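With format="trl", each expanded solution should be formatted into the stepwise-supervision layout used for PRM training: a prompt column plus parallel completions (one string per step) and labels (one boolean per step) columns. Roughly, a row would look like this (illustrative values, not actual pipeline output):

{
    "prompt": "A problem from GSM8K...",
    "completions": ["Step 1: ...", "Step 2: ...", "Step 3: ... The answer is: 42"],
    "labels": [True, True, False],
}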

A sample dataset can be seen at plaguss/test_math_shepherd_prm
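The pushed distiset can be loaded back for inspection (a minimal sketch; the split name is an assumption about how Distiset.push_to_hub lays out the repo):

from datasets import load_dataset

ds = load_dataset("plaguss/test_math_shepherd_prm", split="train")
print(ds[0].keys())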

@plaguss plaguss added the enhancement label Nov 6, 2024
@plaguss plaguss added this to the 1.5.0 milestone Nov 6, 2024
@plaguss plaguss self-assigned this Nov 6, 2024

github-actions bot commented Nov 6, 2024

Documentation for this PR has been built. You can view it at: https://distilabel.argilla.io/pr-1052/


codspeed-hq bot commented Nov 6, 2024

CodSpeed Performance Report

Merging #1052 will not alter performance

Comparing math-shepherd (750f8d6) with develop (f8e41cd)

Summary

✅ 1 untouched benchmark

@plaguss plaguss marked this pull request as ready for review November 12, 2024 12:02
@plaguss plaguss requested a review from gabrielmbmb November 12, 2024 12:03
…ate to work with the statistics from the llm
@plaguss plaguss merged commit 6bb61d1 into develop Dec 4, 2024
8 checks passed
@plaguss plaguss deleted the math-shepherd branch December 4, 2024 10:52