Automatic Column Selection #359
Co-authored-by: Vijay Viswanathan <[email protected]>
Thanks @ritugala, this looks really exciting!
I had several comments, and wherever possible I used GitHub's suggestion feature so you can accept the changes directly through GitHub if you want.
Co-authored-by: Graham Neubig <[email protected]>
…tu-column-selection
I added a few minimal comments at the last minute. Sorry, preparing for my English test has been taking up too much of my time.
Co-authored-by: Eren Chenyang Zhao <[email protected]>
I left a few minor comments - please address them, and then feel free to merge!
This is great work. This PR clearly required a lot of understanding of our system and external components, so this is a great milestone for your first major PR @ritugala!
Co-authored-by: Graham Neubig <[email protected]>
Co-authored-by: Vijay Viswanathan <[email protected]>
Description
A good starting point for reviewing this PR would be: https://github.com/neulab/prompt2model/pull/359/files#diff-952ef426d80e42d77fe849ad91b5ef92bcd907ea702707fb1d9a714d9e31820cR169
Changes:
- Added a file that builds the prompt for column selection; it is similar to the instr_parser_prompt file. The prompt also includes in-context learning examples.
- Created a function in dataset_retriever for automatic column selection.
- Passed the prompt spec to canonicalize_dataset_by_cli so that the instruction can be used in the column selection prompt.
- Added tests for automatic column selection.
- Modified the JSON parser file to handle responses in which JSON keys have lists as values (and not just strings).
- Set an explicit value for max_api in the JSON parser in case it is not set. Without it, the parser got stuck in an infinite loop and kept calling the API.
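The last two changes (list-valued JSON keys and a bounded number of API calls) can be sketched roughly as follows. This is only an illustration and not the code in parse_json_responses.py: the function name `parse_json_with_retries`, the `call_api` callback, the parameter name `max_api_calls`, and the fallback of 5 calls are all assumptions made for the example.

```python
import json
from typing import Callable, Dict, List, Optional, Union

# Hypothetical fallback; the actual default used in the PR is not shown here.
DEFAULT_MAX_API_CALLS = 5

ParsedValue = Union[str, List[str]]


def parse_json_with_retries(
    call_api: Callable[[], str],
    required_keys: List[str],
    max_api_calls: Optional[int] = None,
) -> Optional[Dict[str, ParsedValue]]:
    """Call the API until a well-formed JSON response arrives or the cap is hit."""
    # Without an explicit cap, a stream of malformed responses would keep
    # this loop (and the API calls) going forever.
    if max_api_calls is None:
        max_api_calls = DEFAULT_MAX_API_CALLS

    for _ in range(max_api_calls):
        raw = call_api()
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed JSON: spend one attempt and retry
        if not isinstance(parsed, dict):
            continue
        if not all(key in parsed for key in required_keys):
            continue

        result: Dict[str, ParsedValue] = {}
        for key in required_keys:
            value = parsed[key]
            if isinstance(value, str):
                result[key] = value.strip()
            elif isinstance(value, list):
                # Accept list values, e.g. several selected input columns.
                result[key] = [str(item).strip() for item in value]
            else:
                break  # unsupported value type: retry
        else:
            return result

    return None  # cap reached without a usable response
```

With this shape, a column selection response such as `{"input": ["question", "context"], "output": "answer"}` would be accepted, and a caller that passes `max_api_calls=None` still stops after a fixed number of attempts instead of looping indefinitely.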
Issues/Discussions/Question
prompt2model/prompt2model/utils/parse_json_responses.py
Line 101 in ad238c8