Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Client's handling of Enum Values, to support Python >=3.11 enum handling. #4672

Conversation

tim-win
Copy link
Contributor

@tim-win tim-win commented Mar 19, 2024

Description

The argilla client for v0 Search and Metrics appears to be broken in Python 3.11 and greater. This comes from a simple mis-resolution of enum values, due to some of the poorly documented python 3.11. In short, given this definition of the TaskType enum:

class TaskType(str, Enum):
    text_classification = "TextClassification"
    token_classification = "TokenClassification"
    text2text = "Text2Text"

in Python 3.10 and before, format strings and the format method will return the value of the target enum:

>>> f'{TaskType.text_classification}'
'TextClassification'

However, in python 3.11 and later, format strings will produce a more verbose description:

>>> f'{TaskType.text_classification}'
'TaskType.text_classification'

The client's Search and Metrics code both use the same pattern, producing an incorrect _API_URL and thereby running into 404's:

class Search(AbstractApi):
    _API_URL_PATTERN = "/api/datasets/{name}/{task}:search"

    def search_records(
        self,
        name: str,
        task: TaskType,
        size: Optional[int] = None,
        **query,
    ):
    ...
    url = Search._API_URL_PATTERN.format(name=name, task=task)
    ...
    response = self.http_client.post(
            path=url,
            json={"query": query} if query else None,
        )
    ...

This PR just updates the way task is handled by the url format function in both APIs, a method that produces the correct value in 3.8, 3.10, 3.11, and 3.12.0a7:

>>> f'{TaskType.text_classification.value}'                                                                                            
'TextClassification'
    url = Search._API_URL_PATTERN.format(name=name, task=task.value)

Type of change

(Please delete options that are not relevant. Remember to title the PR according to the type of change)

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested

This has been tested locally using pyenv to swap between python 3.8, 3.10, 3.11, and 3.12.0a7 to confirm 3.11 is the change point. Also, tested this in-situ by trying to run AutoPrompt (which only supports 3.10 and below) with this change, and it appears this change is the last thing stopping autoprompt from supporting 3.11!

Checklist

  • I followed the style guidelines of this project
  • I did a self-review of my code
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I filled out the contributor form (see text above)
  • I have added relevant notes to the CHANGELOG.md file (See https://keepachangelog.com/)

@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. area: python sdk Indicates that an issue or pull request is related to the Python SDK language: python Pull requests or issues that update Python code severity: major Indicates that the issue is blocking for users or needs to be addressed soon team: backend Indicates that the issue or pull request is owned by the backend team type: bug Indicates an unexpected problem or unintended behavior labels Mar 19, 2024
Copy link

codecov bot commented Mar 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.20%. Comparing base (5d2bfd2) to head (73ac678).
Report is 1 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff              @@
##           develop    #4672       +/-   ##
============================================
+ Coverage    45.81%   89.20%   +43.39%     
============================================
  Files          193      193               
  Lines        11814    11814               
============================================
+ Hits          5412    10539     +5127     
+ Misses        6402     1275     -5127     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@frascuchon frascuchon self-requested a review March 20, 2024 10:01
Copy link
Member

@frascuchon frascuchon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tim-win for your contribution!!!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Mar 20, 2024
@frascuchon frascuchon merged commit 7dd6d57 into argilla-io:develop Mar 20, 2024
15 of 16 checks passed
frascuchon pushed a commit that referenced this pull request Mar 20, 2024
…handling. (#4672)

The argilla client for v0 Search and Metrics appears to be broken in
Python 3.11 and greater. This comes from a simple mis-resolution of enum
values, due to some of the poorly documented python 3.11. In short,
given this definition of the TaskType enum:

```
class TaskType(str, Enum):
    text_classification = "TextClassification"
    token_classification = "TokenClassification"
    text2text = "Text2Text"
```

in Python 3.10 and before, format strings and the format method will
return the value of the target enum:

```
>>> f'{TaskType.text_classification}'
'TextClassification'
```

However, in python 3.11 and later, format strings will produce a more
verbose description:

```
>>> f'{TaskType.text_classification}'
'TaskType.text_classification'
```

The client's Search and Metrics code both use the same pattern,
producing an incorrect _API_URL and thereby running into 404's:

```
class Search(AbstractApi):
    _API_URL_PATTERN = "/api/datasets/{name}/{task}:search"

    def search_records(
        self,
        name: str,
        task: TaskType,
        size: Optional[int] = None,
        **query,
    ):
    ...
    url = Search._API_URL_PATTERN.format(name=name, task=task)
    ...
    response = self.http_client.post(
            path=url,
            json={"query": query} if query else None,
        )
    ...
```

This PR just updates the way task is handled by the url format function
in both APIs, a method that produces the correct value in 3.8, 3.10,
3.11, and 3.12.0a7:

```
>>> f'{TaskType.text_classification.value}'
'TextClassification'
```

```
    url = Search._API_URL_PATTERN.format(name=name, task=task.value)
```

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [x] Bug fix (non-breaking change which fixes an issue)

**How Has This Been Tested**

This has been tested locally using pyenv to swap between python 3.8,
3.10, 3.11, and 3.12.0a7 to confirm 3.11 is the change point. Also,
tested this in-situ by trying to run AutoPrompt (which only supports
3.10 and below) with this change, and it appears this change is the last
thing stopping autoprompt from supporting 3.11!

**Checklist**

- [x] I followed the style guidelines of this project
- [x] I did a self-review of my code
- [x] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK)
(see text above)
- [ ] I have added relevant notes to the `CHANGELOG.md` file (See
https://keepachangelog.com/)

---------

Co-authored-by: admin <[email protected]>
@tim-win
Copy link
Contributor Author

tim-win commented Mar 20, 2024

🥴 my first open source PR in over a year, I think! Thank you for accepting my change.

@katsuya katsuya mentioned this pull request Apr 16, 2024
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: python sdk Indicates that an issue or pull request is related to the Python SDK language: python Pull requests or issues that update Python code lgtm This PR has been approved by a maintainer severity: major Indicates that the issue is blocking for users or needs to be addressed soon size:XS This PR changes 0-9 lines, ignoring generated files. team: backend Indicates that the issue or pull request is owned by the backend team type: bug Indicates an unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants