Incorrect unit tests in FAIL_TO_PASS and PASS_TO_PASS #275

Open
WuYff opened this issue Dec 17, 2024 · 1 comment
Labels
bug Something isn't working

Comments

WuYff commented Dec 17, 2024

Describe the bug

Description:
The FAIL_TO_PASS and PASS_TO_PASS fields of some instances contain unrelated strings instead of references to unit tests.

An example is provided below: django__django-16950 has ["If form data is provided, a parent's auto-generated alternate key is"] as its FAIL_TO_PASS and what look like comments as its PASS_TO_PASS.

I haven't checked thoroughly, but django__django-15525 and django__django-14792 appear to have the same problem. I'm not sure whether this affects the actual SWE-bench evaluation.

Steps/Code to Reproduce

Buggy Example: django__django-16950

[image: screenshot of the FAIL_TO_PASS and PASS_TO_PASS fields for django__django-16950]

Expected Results

The actual unit tests

Actual Results

As described above

System Information

No response

@john-b-yang
Member

I think this is the legitimate name of a test. For instance, here's the PASS_TO_PASS test referenced in the image.

I'm not sure this needs fixing. Django has its own custom testing software iirc (it doesn't use pytest). From when I last ran it, I think the test name + result is printed out (e.g. <test name> ... ok or <test name> ... fail).
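To illustrate why a docstring fragment can end up as a test name, here is a minimal sketch using the standard library's `unittest`, which Django's test runner is built on. In verbose mode, `unittest` prints the first line of a test's docstring right before the ` ... ok` status. The test class, the wrapped docstring, and the line-based `parse_statuses` helper are hypothetical stand-ins, not the actual Django test or the SWE-bench log parser.

```python
import io
import unittest


class InlineFormTests(unittest.TestCase):
    """Hypothetical test case; only the docstring shape matters here."""

    def test_alternate_key(self):
        """If form data is provided, a parent's auto-generated alternate key is
        set.
        """
        self.assertTrue(True)


def run_verbose(case):
    """Run one TestCase class with verbosity=2 and return the captured log."""
    stream = io.StringIO()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    unittest.TextTestRunner(stream=stream, verbosity=2).run(suite)
    return stream.getvalue()


def parse_statuses(log):
    """Hypothetical line-based parser: the text before ' ... <status>' on a
    line is taken as the test name (an assumption about how a log parser
    might work, not the real SWE-bench implementation)."""
    suffixes = ((" ... ok", "PASSED"), (" ... FAIL", "FAILED"), (" ... ERROR", "ERROR"))
    statuses = {}
    for line in log.splitlines():
        for suffix, status in suffixes:
            if line.endswith(suffix):
                statuses[line[: -len(suffix)]] = status
    return statuses


if __name__ == "__main__":
    log = run_verbose(InlineFormTests)
    print(log)
    print(parse_statuses(log))
```

Under these assumptions, the test id is printed on one line and the first docstring line on the next, so the only key the parser records is the truncated sentence "If form data is provided, a parent's auto-generated alternate key is", which is the string reported in FAIL_TO_PASS.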

Leaving this open for discussion. My current stance is that this doesn't need fixing + is the expected behavior for Django testing.
