Errors in post-perturbation queries #1
Comments
Hi Irina,
In addition, kindly use taoyds/test-suite-sql-eval#13 for test-suite evaluation until it is merged into the main repository, or remove https://github.com/taoyds/test-suite-sql-eval/blob/master/evaluation.py#L569 from the original test-suite script. The original test-suite script might produce inaccurate results for columns and tables that have "value" as part of their names.
Thanks for the quick fix! Do these changes affect the leaderboard?
After testing the Picard model on the updated data, we found that the execution accuracy was affected by approximately 0.1 points, and the exact set match accuracy was affected by about 0.5 points for DB_schema_abbreviation and DB_schema_synonym. We will update the leaderboard with these new results, as well as the latest results from other models, soon.
Great, thanks a lot!
Could you please check the databases 'world_1_0', 'world_1_3', and 'world_1_4' in the DB_schema_synonym post-perturbation set? I think the actual column names in the sqlite_sequence table differ slightly from the version provided in tables_post_perturbation.json.
Thank you for finding it! The sqlite_sequence table was automatically created by SQLite for its internal use. We will skip the perturbations to the sqlite_sequence table in this PR.
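As a rough illustration of why sqlite_sequence shows up at all, here is a minimal sketch, assuming a throwaway in-memory database with a hypothetical `city` table (not the actual world_1 schema). It shows that SQLite creates sqlite_sequence on its own once a table uses AUTOINCREMENT, and one possible way to leave `sqlite_`-prefixed tables out when collecting table names. The helper `table_names` is made up for this example and is not part of the dataset scripts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; AUTOINCREMENT makes SQLite create sqlite_sequence itself.
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)")


def table_names(connection):
    """Return every table name recorded in the schema table, sorted."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


all_tables = table_names(conn)
# Names starting with "sqlite_" are reserved for SQLite's internal use.
user_tables = [name for name in all_tables if not name.startswith("sqlite_")]
# PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
sequence_columns = [
    row[1] for row in conn.execute("PRAGMA table_info(sqlite_sequence)")
]

print(all_tables)
print(user_tables)
print(sequence_columns)
```

Filtering by the reserved prefix, rather than by the single name, would also cover other internal tables such as sqlite_stat1 if they are present.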
Thank you for releasing the dataset!
For example, gold query #567 from the DB_schema_synonym post-perturbation set fails because SQLite cannot resolve the ambiguous column names (airports.capital). The original SQL query uses aliases T1, T2, T3 and thus does not have this problem.
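The failure mode can be reproduced with a small sketch. The schema and both queries below are made up for illustration (they are not gold query #567 or the real flight database); the only assumption carried over from the report is that the same table is joined twice, once without aliases and once with T1, T2, T3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature schema, not the actual perturbed database.
conn.executescript(
    """
    CREATE TABLE airports (code TEXT PRIMARY KEY, capital TEXT);
    CREATE TABLE flights (src TEXT, dst TEXT);
    INSERT INTO airports VALUES ('AAA', 'Alpha'), ('BBB', 'Beta');
    INSERT INTO flights VALUES ('AAA', 'BBB');
    """
)

# The same table appears twice without aliases, so "airports.capital"
# could refer to either occurrence.
no_alias = """
SELECT airports.capital
FROM airports
JOIN flights ON airports.code = flights.src
JOIN airports ON airports.code = flights.dst
"""

# Aliases give each occurrence of the table its own name.
with_alias = """
SELECT T1.capital, T3.capital
FROM airports AS T1
JOIN flights AS T2 ON T1.code = T2.src
JOIN airports AS T3 ON T3.code = T2.dst
"""


def try_query(sql):
    """Run a query and return its rows, or the SQLite error text."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"


no_alias_result = try_query(no_alias)
with_alias_result = try_query(with_alias)
print(no_alias_result)
print(with_alias_result)
```

In this sketch the unaliased query is rejected when SQLite prepares it, while the aliased one returns the source and destination rows.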
Another query includes museums opened in 2013 and 2008 because the column accessible_from has the text format in the database (so here it is a string comparison, not a numeric one). The original query performs a strict comparison, as it uses the integer values 2013 and 2008, and executes with a different result.
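The query in question is not reproduced here, so the following is only a general sketch of how SQLite treats text and integer operands differently, not a reconstruction of the reported case. The `museum` table is hypothetical apart from the column name accessible_from, and the `scalar` helper is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Return the single value produced by a one-row, one-column query."""
    return conn.execute(sql).fetchone()[0]


# Two literals: neither side has a column affinity, so no conversion is
# applied and SQLite orders any INTEGER value before any TEXT value.
text_vs_int = scalar("SELECT '2008' > 2013")
# Plain numeric comparison.
int_vs_int = scalar("SELECT 2008 > 2013")
# Casting the text to INTEGER restores the numeric comparison.
cast_vs_int = scalar("SELECT CAST('2008' AS INTEGER) > 2013")

# A column declared TEXT stores even an integer input as text.
conn.execute("CREATE TABLE museum (accessible_from TEXT)")
conn.execute("INSERT INTO museum VALUES (2008)")
stored_type = scalar("SELECT typeof(accessible_from) FROM museum")

print(text_vs_int, int_vs_int, cast_vs_int, stored_type)
```

Checking `typeof()` on the perturbed column, and casting explicitly where a numeric comparison is intended, is one way to see whether a gold query and its perturbed counterpart are being compared under the same rules.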