
[BUG] Error when wrapping the step #1020

Closed
sdiazlor opened this issue Oct 7, 2024 · 4 comments · Fixed by #1022
Labels
bug Something isn't working
Milestone

Comments

sdiazlor commented Oct 7, 2024

Describe the bug
While working on this notebook (#949), I found that the task wasn't working on the dev branch.

Possibly related to #991 as well.

To Reproduce
Code to reproduce

import random

from distilabel.llms import InferenceEndpointsLLM
from distilabel.pipeline import Pipeline
from distilabel.steps import LoadDataFromDicts
from distilabel.steps.tasks import GenerateTextClassificationData

labels_topic = ["World", "Sports", "Sci/Tech", "Business"]
labels_fact_opinion = ["Fact-based", "Opinion-based"]

task_templates = [
    "Determine the news article as {}",
    "Classify news article as {}",
    "Identify the news article as {}",
    "Categorize the news article as {}",
    "Label the news article using {}",
    "Annotate the news article based on {}",
    "Determine the theme of a news article from {}",
    "Recognize the topic of the news article as {}",
]

classification_tasks = [
    {"task": action.format(" or ".join(random.sample(labels_topic, 2)))}
    for action in task_templates for _ in range(4)
] + [
    {"task": action.format(" or ".join(random.sample(labels_fact_opinion, 2)))}
    for action in task_templates
]

difficulties = ["college", "high school", "PhD"]
clarity = ["clear", "understandable with some effort", "ambiguous"]

with Pipeline("texcat-generation-pipeline") as pipeline:

    tasks_generator = LoadDataFromDicts(data=classification_tasks)

    generate_data = []
    for difficulty in difficulties:
        for clarity_level in clarity:
            task = GenerateTextClassificationData(
                language="English",
                difficulty=difficulty,
                clarity=clarity_level,
                num_generations=2,
                llm=InferenceEndpointsLLM(
                    model_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
                    tokenizer_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
                    generation_kwargs={"max_new_tokens": 512, "temperature": 0.7},
                ),
                input_batch_size=5,
            )
            generate_data.append(task)

    for task in generate_data:
        tasks_generator.connect(task)
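For reference, a minimal, distilabel-free sketch of what the reproduction builds (trimmed to a two-template subset for brevity; the counts below apply to this subset, not the full report):

```python
import random
from itertools import product

random.seed(0)  # deterministic for illustration only

labels_topic = ["World", "Sports", "Sci/Tech", "Business"]
labels_fact_opinion = ["Fact-based", "Opinion-based"]
task_templates = [
    "Determine the news article as {}",
    "Classify news article as {}",
]

# Same construction as in the report: each template is filled with two
# randomly sampled labels joined by " or ".
classification_tasks = [
    {"task": t.format(" or ".join(random.sample(labels_topic, 2)))}
    for t in task_templates for _ in range(4)
] + [
    {"task": t.format(" or ".join(random.sample(labels_fact_opinion, 2)))}
    for t in task_templates
]

# 2 templates × 4 topic samples + 2 fact/opinion samples = 10 task dicts
print(len(classification_tasks))  # 10

# The nested difficulty/clarity loops create 3 × 3 = 9 generation steps,
# all connected to the single generator — the fan-out that triggers the bug.
difficulties = ["college", "high school", "PhD"]
clarity = ["clear", "understandable with some effort", "ambiguous"]
print(len(list(product(difficulties, clarity))))  # 9
```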

Expected behaviour
The pipeline should build and run without errors, as it does on 1.3.2.

Screenshots
[Screenshot 2024-09-25 at 12:20:10 showing the error]

Desktop (please complete the following information):

  • Package version: 1.4dev
  • Python version: 3.11.4

Additional context
Note: this was already shared on Slack; I just realized I hadn't created an issue.

@sdiazlor sdiazlor added the bug Something isn't working label Oct 7, 2024
@sdiazlor sdiazlor added this to the 1.4.0 milestone Oct 7, 2024
plaguss commented Oct 7, 2024

Hi @sdiazlor, can you share the classification_tasks data to run the example?

sdiazlor commented Oct 8, 2024

@plaguss thanks, I've updated the example code. It works fine with 1.3.2.

plaguss commented Oct 8, 2024

Can you test with the fix and see if it works?

sdiazlor commented Oct 8, 2024

It works now, thanks for tackling this!
