Question about AST evaluation for Java #494

Closed
GeniusYx opened this issue Jul 1, 2024 · 3 comments
Labels
BFCL-General General BFCL Issue

Comments


GeniusYx commented Jul 1, 2024

Hello, I am testing my own model on the Java test set. Here is an example:

The output of my model is {'invokemethod007_runIt': {'args': ['suspend', 'log'], 'out': 'debugLog'}}. When I run the evaluation code, it seems to force all parameter values to be strings: {'invokemethod007_runIt': {'args': "['suspend', 'log']", 'out': 'debugLog'}}, but the expected answer is {'invokemethod007_runIt': {'args': [['suspend','log']], 'out': ['debugLog']}}.

As a result, the evaluation reports a type mismatch error. Do you have a solution? Thank you very much!

@GeniusYx GeniusYx added the hosted-openfunctions-v2 Issues with OpenFunctions-v2 label Jul 1, 2024
HuanzhiMao (Collaborator) commented

Hi @GeniusYx,

All parameters are forced to string type because all evaluation scripts are written in Python, and casting the values to strings prevents errors during processing. We then use tree-sitter (which takes in a string and outputs the converted value in Python syntax) to handle parsing and type-checking for the Java/JS test categories. A detailed explanation can be found in #424.
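To illustrate the string-casting step described above, here is a minimal sketch. It is not the actual BFCL code, and `stringify_params` is a hypothetical name; it only shows why a Python list in the model output ends up as a single string.

```python
def stringify_params(params):
    """Cast every parameter value to a string (hypothetical helper, not BFCL code)."""
    return {name: str(value) for name, value in params.items()}


# The model output from the question, with a Python list as the `args` value.
model_params = {"args": ["suspend", "log"], "out": "debugLog"}

stringified = stringify_params(model_params)
# The list is now the single string "['suspend', 'log']", which is Python
# syntax rather than Java syntax, so a Java parser cannot read it as an array.
print(stringified)
```

Under this assumption, a value that is already a Java-syntax string such as `new String[]{"suspend", "log"}` passes through the cast unchanged and can be parsed afterwards.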

From your description, it seems that the args parameter is supposed to be a list. Since this is the Java category, you should use the Java syntax for lists in the possible answer as well (new String[]{xxxx}). Take a look at this entry from the BFCL possible answer and notice how it creates a list of class Point objects through new Point[]{xxxx}.

Let me know if this solves your issue!

GeniusYx (Author) commented Jul 2, 2024

Thank you very much for your reply!

I noticed that the answers in the possible_answer folder for the Java test set are still in JSON format. Could you please provide the possible answers in Java format?

Thank you very much!

HuanzhiMao (Collaborator) commented

There are no truly Java-format possible answers. The JSON-format possible answers are loaded as Python-type values. We use the Java tree-sitter to parse the model result and convert it into the corresponding Python-type values. The accuracy check is then performed between the two sets of Python-type values. In this way, we can reuse the whole evaluation pipeline for non-Python languages as well.
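The flow described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the BFCL implementation: a small regex stands in for the tree-sitter parsing step and handles only flat `new String[]{...}` literals, and the names `java_literal_to_python` and `matches` are hypothetical.

```python
import json
import re

# Stand-in for the tree-sitter step (assumption): recognizes only flat
# `new String[]{...}` array literals and double-quoted string literals.
_ARRAY = re.compile(r'^\s*new\s+String\s*\[\s*\]\s*\{(.*)\}\s*$', re.DOTALL)
_STRING = re.compile(r'"((?:[^"\\]|\\.)*)"')


def java_literal_to_python(text):
    """Convert a Java literal given as a string into a Python value."""
    array = _ARRAY.match(text)
    if array:
        # Collect the contents of each quoted element (escapes are left as-is).
        return _STRING.findall(array.group(1))
    quoted = _STRING.fullmatch(text.strip())
    if quoted:
        return quoted.group(1)
    return text.strip()  # anything else is kept as a plain string


def matches(model_params, possible_answer):
    """Check each converted value against the accepted values for its parameter."""
    if model_params.keys() != possible_answer.keys():
        return False
    return all(
        java_literal_to_python(value) in possible_answer[name]
        for name, value in model_params.items()
    )


# The possible answer stays in JSON and is loaded as Python values;
# each parameter maps to a list of accepted values.
possible_answer = json.loads('{"args": [["suspend", "log"]], "out": ["debugLog"]}')

# Model output with Java syntax for the array, already cast to strings.
java_style = {"args": 'new String[]{"suspend", "log"}', "out": "debugLog"}
# Model output with a Python-style list, as in the original question.
python_style = {"args": "['suspend', 'log']", "out": "debugLog"}

print(matches(java_style, possible_answer))
print(matches(python_style, possible_answer))
```

In this sketch the Java-style output is converted to the Python list `['suspend', 'log']` and matches the possible answer, while the Python-style string is not recognized as an array and does not match.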

@HuanzhiMao HuanzhiMao added BFCL-General General BFCL Issue and removed hosted-openfunctions-v2 Issues with OpenFunctions-v2 labels Jul 8, 2024