fix lower case create statement #267

Merged
wangzaistone merged 1 commit into eosphoros-ai:main on May 13, 2024


@initzhang initzhang commented May 13, 2024

The current dbgpt_hub/data_process/sql_data_process.py script does not account for the lowercase create statements that exist in the Spider dataset. For example, activity_1/schema.sql contains create table Activity ..., which is not captured by the case-sensitive r"CREATE\s.*?;" pattern used in the script. As a result, some samples in example_text2sql_dev.json have an empty schema list ([]) in the instruction field when --code_representation is enabled:

{
    "db_id": "pets_1",
    "instruction": "I want you to act as a SQL terminal in front of an example database, you need only to return the sql command to me.Below is an instruction that describes a task,      Write a response that appropriately completes the request.\n\"\n##Instruction:\n[]\n",
    "input": "###Input:\nHow many dog pets are raised by female students?\n\n###Response:",
    "output": "SELECT count(*) FROM student AS T1 JOIN has_pet AS T2 ON T1.stuid  =  T2.stuid JOIN pets AS T3 ON T2.petid  =  T3.petid WHERE T1.sex  =  'F' AND T3.pettype  =  'dog'",
    "history": []
}

This pull request makes the specific pattern case-insensitive to fix this problem.

create_statements = re.findall(
    r"CREATE\s.*?;", schema_content, re.DOTALL | re.IGNORECASE
)
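
To make the effect of the flag easier to see, here is a minimal sketch that runs the same pattern with and without re.IGNORECASE. The schema text and the find_create_statements helper are hypothetical and only for illustration; they are not taken from sql_data_process.py or from the actual Spider schema files.

```python
import re

# Hypothetical schema text mixing lowercase and uppercase CREATE statements
# (illustrative only, not copied from the Spider dataset).
SCHEMA_SAMPLE = """
create table Activity (
  actid INTEGER PRIMARY KEY,
  activity_name varchar(25)
);
CREATE TABLE Student (
  StuID INTEGER PRIMARY KEY
);
"""


def find_create_statements(schema_content, ignore_case=True):
    """Return every CREATE ...; statement found in schema_content.

    Hypothetical helper: it wraps the same pattern as the PR and lets the
    caller toggle re.IGNORECASE to compare the two behaviours.
    """
    flags = re.DOTALL  # let "." span newlines inside a statement
    if ignore_case:
        flags |= re.IGNORECASE  # also match "create", "Create", ...
    return re.findall(r"CREATE\s.*?;", schema_content, flags)


case_sensitive = find_create_statements(SCHEMA_SAMPLE, ignore_case=False)
case_insensitive = find_create_statements(SCHEMA_SAMPLE, ignore_case=True)

# The case-sensitive search misses the lowercase statement.
print(len(case_sensitive), len(case_insensitive))
```

With the flag off, the lowercase create table Activity statement is skipped, which is how a schema written entirely in lowercase ends up producing an empty list.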


@wangzaistone wangzaistone left a comment

r+, it's a good detail.

@wangzaistone wangzaistone merged commit 9c35199 into eosphoros-ai:main May 13, 2024
0 of 4 checks passed