I was running the evaluation script on my predicted SQL queries on the Spider dataset, and I've noticed that for some examples the evaluation script returns an Exact-Match score of 1 instead of 0.
For example:
Pred: select students.first_name from students where students.permanent_address_id != students.permanent_address_id
Gold: select first_name from students where current_address_id != permanent_address_id
In this example, one can notice that in the gold query the where clause uses the current_address_id column in the left expression, while in the predicted query the column is permanent_address_id. This should lead to an EM score of 0 for the where clause, and thus an overall EM score of 0, while your script returns 1.
Another example:
Pred: select count(*) from flights where flights.destairport = 'terminal'
Gold: select count(*) from flights where sourceairport = "apg"
Here, the problem is the same, but with the columns destairport and sourceairport.
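To state the expectation concretely, here is a toy structural comparison (my own sketch, not the repo's actual matching logic) showing that both pairs of where clauses should already fail an exact match on columns alone:

```python
# Toy structural comparison: the WHERE clauses only match exactly when the
# same operands appear in the same positions with the same operator.

def where_exact_match(pred_cond, gold_cond):
    """True only if the (left_operand, op, right_operand) triples agree."""
    return pred_cond == gold_cond

# Example 1: left columns differ (permanent vs. current address id).
print(where_exact_match(
    ("students.permanent_address_id", "!=", "students.permanent_address_id"),
    ("students.current_address_id", "!=", "students.permanent_address_id"),
))  # False

# Example 2: columns differ (destairport vs. sourceairport).
print(where_exact_match(
    ("flights.destairport", "=", "'terminal'"),
    ("flights.sourceairport", "=", '"apg"'),
))  # False
```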
If you change the variable DISABLE_VALUE = True to False in evaluation.py, you should see that the Exact-Match score becomes 0. I think this is because enabling value comparison evaluates the actual values in the conditions instead of only the syntax.
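A rough sketch of why flipping that flag catches the second example. The (column, op, value) triples below are my own simplification of the internal representation, and I assume the columns have already been collapsed onto one canonical name by the mapping step:

```python
def exact_match(pred_cond, gold_cond, disable_value=True):
    """Compare (column, op, value) conditions; optionally ignore the value."""
    if disable_value:
        return pred_cond[:2] == gold_cond[:2]
    return pred_cond == gold_cond

# Hypothetical state after the columns destairport and sourceairport have
# been mapped onto one canonical column (the suspected bug):
pred = ("flights.airport", "=", "terminal")
gold = ("flights.airport", "=", "apg")

print(exact_match(pred, gold, disable_value=True))   # True  (false positive)
print(exact_match(pred, gold, disable_value=False))  # False (values differ)
```

With values ignored, only the already-collapsed column and operator are compared, so the mismatch in the literals goes unnoticed.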
I looked into the code, and my guess is that it relates to the foreign key mapping that is performed right at the beginning of the evaluation of each sample, lines 621-627 here: https://github.com/taoyds/test-suite-sql-eval/blob/master/evaluation.py#L621
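If I understand the suspected behavior correctly, the effect would look roughly like this. Everything below is my illustration (the schema, the map, and canonicalize are assumptions, not the actual code in evaluation.py):

```python
# Hypothetical illustration: if both address columns are foreign keys into
# the same table, a foreign-key map could send them to one canonical id,
# making two different WHERE clauses compare equal.
foreign_key_map = {
    "students.current_address_id": "__addresses.address_id__",
    "students.permanent_address_id": "__addresses.address_id__",
}

def canonicalize(token: str) -> str:
    """Replace a column with its foreign-key canonical form, if any."""
    return foreign_key_map.get(token, token)

pred_where = ("students.permanent_address_id", "!=", "students.permanent_address_id")
gold_where = ("students.current_address_id", "!=", "students.permanent_address_id")

pred_canonical = tuple(canonicalize(t) for t in pred_where)
gold_canonical = tuple(canonicalize(t) for t in gold_where)

# After canonicalization both clauses collapse to the same tuple, so a purely
# structural exact match returns 1 even though the original columns differ.
print(pred_canonical == gold_canonical)  # True
```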
Would love to hear your thoughts on that. @taoyds
Thanks,
Moshe