Filters don't pass to label stream and Random Sampling mode not respected #6617

Closed
CarlosNacher opened this issue Nov 8, 2024 · 2 comments


@CarlosNacher

CarlosNacher commented Nov 8, 2024

Describe the bug
This is the same bug that was supposed to be resolved in version 1.14.0 (see the release notes), but it still happens for me in version 1.14.0.

To Reproduce
Steps to reproduce the behavior:

  1. Apply some filters in the Data Manager view.
  2. Press the "Label All Tasks" button.
  3. LS shows you tasks that are outside the filter.

Or, a very easy way to check this:

  1. Apply filters that leave 0 tasks in the Data Manager (restrictive filters that no task matches).
  2. Press the "Label All Tasks" button. You would expect to see the message "No tasks in queue", but instead LS starts showing you tasks (outside the filters).

Expected behavior
In step 3 of the first list above, LS should show only the tasks that match the filters.

Environment:

  • OS: Windows 10 Pro
  • Label Studio Version 1.14.0
  • Web browser: Google Chrome version 130.0.6723.117 (64 bits)

Additional context

There is another bug: if, instead of pressing the "Label All Tasks" button, you tick the checkbox and press the "Label N Tasks" button (where N is the number of tasks that match the filter, i.e. the tasks the Data Manager shows you), the filters do seem to be applied, but if your project is configured in "Random Sampling" mode, that mode is not respected.

@CarlosNacher CarlosNacher changed the title Filters don't pass to label stream Filters don't pass to label stream and Random Sampling mode not respected Nov 8, 2024
@makseq
Member

makseq commented Nov 8, 2024

Just linking the PR here, so that anyone else who has this issue can follow along:
#6410

  1. The Label All Tasks button refers to all tasks in the project. So, the behavior when you click 'Label All Tasks' seems to work as it was designed. However, I understand your confusion; this behavior requires looking at it from a different angle.

  2. The Label N Tasks button refers to tasks in your filtered subset + (manually selected tasks OR all tasks if all tasks checkbox on the top left near ID is selected).

Originally, we had a bug when the all tasks checkbox was selected: filters weren't passed to the labeling stream, so you saw all tasks in the project instead of all tasks in the filtered subset.
So, you have to use Label N Tasks with the all tasks checkbox selected.
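The difference between the two buttons can be illustrated with a toy model. This is only a sketch of the behavior described in this comment, not Label Studio's actual code: the function `label_stream`, its parameters, and the task dictionaries are all hypothetical, and the "+" in point 2 is read here as "restricted to the filtered subset".

```python
def label_stream(project_tasks, matches_filter, button,
                 selected_ids=None, select_all=False):
    """Toy model (hypothetical) of which tasks enter the labeling stream."""
    if button == "label_all":
        # "Label All Tasks": every task in the project, filters are ignored.
        return list(project_tasks)
    # "Label N Tasks": start from the filtered subset.
    filtered = [t for t in project_tasks if matches_filter(t)]
    if select_all:
        # All tasks checkbox ticked: the whole filtered subset.
        return filtered
    # Otherwise only the manually selected tasks inside the filtered subset.
    chosen = set(selected_ids or [])
    return [t for t in filtered if t["id"] in chosen]


tasks = [{"id": i} for i in range(1, 7)]
none_match = lambda task: False  # a restrictive filter that matches no task

all_stream = label_stream(tasks, none_match, "label_all")
n_stream = label_stream(tasks, none_match, "label_n", select_all=True)
print(len(all_stream), len(n_stream))
```

Under this model, the "0 tasks" check from the report still yields the full project with Label All Tasks, while Label N Tasks with the all tasks checkbox yields an empty stream.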

  1. Regarding your question about sampling + Label N Tasks mode:
    Unfortunately, LS doesn't support combining filters with random sampling. However, you can simulate random sampling by adding a new column of random numbers to your data and ordering by this new column.

  2. To generate a random column in your task data, you can use the experimental action "Add or Modify Data Field". You can check how to do it in this GitHub issue. You have to use random() instead of choice() in the Value field.
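If the tasks are prepared as JSON before import, the same random column could also be added outside Label Studio. The following is a minimal sketch under that assumption, not the "Add or Modify Data Field" action itself: the helper `add_random_field` and the field name `random_order` are hypothetical, and each task is assumed to be a dictionary with its fields under a `data` key.

```python
import random


def add_random_field(tasks, field="random_order", seed=None):
    """Return copies of the tasks with a random number added to task["data"]."""
    rng = random.Random(seed)  # a seed makes the ordering reproducible
    result = []
    for task in tasks:
        data = dict(task.get("data", {}))  # copy so the input is not mutated
        data[field] = rng.random()         # float in [0.0, 1.0)
        result.append({**task, "data": data})
    return result


tasks = [{"data": {"text": f"example {i}"}} for i in range(5)]
shuffled = add_random_field(tasks, seed=0)

# Ordering by the new field gives a pseudo-random task order.
for task in sorted(shuffled, key=lambda t: t["data"]["random_order"]):
    print(task["data"]["text"])
```

After import, ordering the Data Manager by the hypothetical `random_order` column and using Label N Tasks with the all tasks checkbox would then walk through the filtered tasks in that pseudo-random order.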

@makseq makseq closed this as completed Nov 8, 2024
@CarlosNacher
Author

Thank you so much. It works perfectly for me!
