Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make package PEP 561 compliant; add typehint stub file #49

Merged
merged 3 commits into from
Aug 7, 2024

Conversation

jonded94
Copy link
Contributor

@jonded94 jonded94 commented Aug 5, 2024

Has to be merged after #47.
Makes this package properly PEP 561 compliant by introducing a py.typed and stub files for the underlying C extension.

Summary by Sourcery

Make the package PEP 561 compliant by adding type hint stub files and updating setup.py to include package data.

New Features:

  • Introduce type hint stub file crc32c/__init__.pyi for the underlying C extension.

Enhancements:

  • Make the package PEP 561 compliant by adding a py.typed file and including package data in setup.py.

Copy link

sourcery-ai bot commented Aug 5, 2024

Reviewer's Guide by Sourcery

This pull request makes the package PEP 561 compliant by adding a 'py.typed' file and type hint stubs for the underlying C extension. The 'setup.py' file was updated to include the 'crc32c' package and to ensure package data files are included. A new 'crc32c/init.pyi' file was created to provide type hints for the 'big_endian' and 'hardware_based' variables, as well as the 'crc32' and 'crc32c' functions.

File-Level Changes

Files Changes
setup.py
crc32c/__init__.pyi
Made the package PEP 561 compliant by adding a 'py.typed' file and type hint stubs.

Tips
  • Trigger a new Sourcery review by commenting @sourcery-ai review on the pull request.
  • Continue your discussion with Sourcery by replying directly to review comments.
  • You can change your review settings at any time by accessing your dashboard:
    • Enable or disable the Sourcery-generated pull request summary or reviewer's guide;
    • Change the review language;
  • You can always contact us if you have any questions or feedback.

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @jonded94 - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 1 issue found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good
  • 🟢 Documentation: all looks good

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.

Comment on lines 6 to 7
def crc32(inp: Buffer, crc: int = 0, release_gil: bool | int = -1) -> int: ...
def crc32c(inp: Buffer, crc: int = 0, release_gil: bool | int = -1) -> int: ...
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Consider clarifying the release_gil parameter type and usage

The bool | int type hint for release_gil with a default of -1 might be confusing. Consider using an enum or separate boolean flags to make the intended usage more explicit. This would improve the API's clarity and make it easier for users to understand the parameter's purpose.

from enum import Enum, auto

class GILRelease(Enum):
    DEFAULT = auto()
    RELEASE = auto()
    NO_RELEASE = auto()

def crc32(inp: Buffer, crc: int = 0, release_gil: GILRelease = GILRelease.DEFAULT) -> int: ...
def crc32c(inp: Buffer, crc: int = 0, release_gil: GILRelease = GILRelease.DEFAULT) -> int: ...

@jonded94
Copy link
Contributor Author

jonded94 commented Aug 6, 2024

@rtobar rebased onto new master, squashed into one commit. Type hint for gil_release_mode say purely int now.

@jonded94 jonded94 changed the title Draft: Make package PEP 561 compliant; add typehint stub file Make package PEP 561 compliant; add typehint stub file Aug 6, 2024
@rtobar
Copy link
Contributor

rtobar commented Aug 6, 2024

@jonded94 again, thanks for this submission. I am not expert on typing, and have never maintained a stub file myself. However, the fact that you had to add a crc32c package with stuff inside made me suspicious , so I went looking into the PEP, where I read that

This PEP does not support distributing typing information as part of module-only distributions or single-file modules within namespace packages.

You mentioned that this worked for you though, but by the sounds of it, it might be more chance than because all rules are followed.

I do like the idea of distributing typing hints; however, it seems that a pre-requisite for this is that I turn this module-only distribution into a package instead,. This is anyway something I've waned to do for a while, just never got around doing It since there was no pressure to do so.

So: let me try doing that first. It shouldn't be a difficult change though, and it will also be good time to reorganise source files around.

In the meanwhile I do have one question: is there a way we can verify on CI that the typing is somewhat correct? For instance, could we maybe install mypy and run it against the test code and verify it comes back with no issue? I'd do that as a separate job, maybe even workflow, on our GHA.

@jonded94
Copy link
Contributor Author

jonded94 commented Aug 6, 2024

I am not expert on typing, and have never maintained a stub file myself.

First, let me start with that modifying crc32c such that it is a package is not strictly required just to make type hinting work.

It works without problems like this:

site-packages/
   crc32c/
      __init__.pyi
      py.typed
   crc32c.[...].so

Both the interpreter can find the actual library (the .so file), and mypy can find proper type hints.

Why does it not work with a crc32c.pyi file at top level?

The only problem with a stub file would be that the required py.typed file can't be placed anywhere meaningfully. Therefore, these stub files have to be in a proper package folder, with a py.typed file besides it.

Nevertheless:

I just pushed some slight change that transforms crc32c into a proper package. (EDIT: Sorry, current state is a bit bonkers, will clean up what I did; nevertheless, the approach above still would Work) This leads to following folder structure under site-packages:

site-packages/
   crc32c/
      __init__.cpython-[...].so
      __init__.pyi
      py.typed

This is a pretty standard approach, I'd say. This does remove the need of some boilerplat-y __init__.py that just exists to direct calls to an underlying _crc32.so file.

With that, a proper folder for this package exists and it's not longer a single-file distribution. mypy is also happy because it finds both the marker file py.typed and the stub file.

Unfortunately, I'm seeing some weird linker errors on Windows: https://github.com/ICRAR/crc32c/actions/runs/10270554189/job/28418529101?pr=49
Don't have the time to look into that right now, but can do later. I've built wheels locally on my machine though and what I describe here is based on these wheels.

I am not expert on typing, and have never maintained a stub file myself. [...] is there a way we can verify on CI that the typing is somewhat correct? For instance, could we maybe install mypy and run it against the test code and verify it comes back with no issue?

Sure, I definitely can help there since I probably type hinted hundreds of thousands of lines of Python and introduced type hinting in general into many large codebases.

Just as an actual example, this is mypy now working properly on this package:

[crc_mypy.py]
import crc32c

hash: int = crc32c.crc32(b"", value=0)

wrong_thing: str = crc32c.crc32(b"", value=b"123", gil_release_mode="123")
$ mypy --strict crc_mypy.py
crc_mypy.py:5: error: Incompatible types in assignment (expression has type "int", variable has type "str")  [assignment]
crc_mypy.py:5: error: Argument "value" to "crc32" has incompatible type "bytes"; expected "int"  [arg-type]
crc_mypy.py:5: error: Argument "gil_release_mode" to "crc32" has incompatible type "str"; expected "int"  [arg-type]
Found 3 errors in 1 file (checked 1 source file)

One would probably at least want to type check the test_crc32c.py file in the CI. This is a bit tricky though..

For one reason, there are no type hints at all right now in this file, of course. This is not hard to do, but keep in mind that type hints were only introduced in Python 3.5 (https://peps.python.org/pep-0484/). So, type hinting the test file would lead to this file only being executable in Python >3.5.

Also, there are some pretty un-pythonic and unexpeceted things happening in this file. For example, right at the start:

try:
    import crc32c
    sw_mode = os.environ.get('CRC32C_SW_MODE')
    if sw_mode == 'none' and not crc32c.hardware_based:
        raise RuntimeError('"none" should force hardware support')
    elif sw_mode == 'force' and crc32c.hardware_based:
        raise RuntimeError('"force" should force software support')
except ImportError:
    crc32c = None

mypy can't really now what crc32c now really is. Is it a whole module or a simple None?

If you'd want to have this test file type checked in CI, this definitely would be a slightly bigger endeavour. Maybe the easiest if one simply removes that "sw mode" is set through environment variables, but do it only via a kwarg.

Alternatively: One could create another, very simple test file though and only check that in CI.

@rtobar rtobar force-pushed the make-package-pep-561-compliant branch from 775405f to edacda4 Compare August 7, 2024 02:33
@rtobar
Copy link
Contributor

rtobar commented Aug 7, 2024

@jonded many thanks again for all your inputs, I've been learning a few things on the way.

After my message above I also worked on converting the distribution into a package. It builds in all platforms and all tests run successfully. It's not as simple as your approach (I still build a separate extension, then import its members in __init__.py). You mentioned your approach is pretty standard, although I've never seen it used in other packages, so I'm not sure whether it is supposed to work and you forgot some small detail, or whether it's not guaranteed to work. Regardless, since I did my version of the changes (and took the opportunity to introduce a src/ directory, etc) I decided to rebase your changes on top of mine, re-organise things slightly to accommodate for the new structure. Seeing how the stub seems to work fine on my end after all of this, and how you were previously receptive to my suggestions in #47, I decided to force-push as well. The intention was to be pragmatic, save time and focus on the type-hinting discussion, so hopefully this force-push is not taken as an aggressive move from my part.

(EDIT: I forgot to mention that I kept the old contents of your branch on this repo's make-package-pep-561-compliant branch.)

Yes, the code in the test module is far from ideal. I still carry the baggage from the 1.x version of the package that would fail to load if no HW extensions were present. You are right that this is not the greatest approach. The intention was always to give users the ability to ensure they are using the hardware-based algorithm., so they would never accidentally end up with the slow version. If anything, this logic could be supported as a RuntimeError rather than an ImportError.

Also, as you also point out, type hints are "relatively new" (I had in my mind they were 3.7+, but I was clearly confused there too), and we still advertise 2.7+ compatibility. Only a very, very minor percentage of our users use anything other than 3.7+ though, so it's reasonable to bump up that python version requirement for the package, remove support for old python versions, and THEN finally be able to use the test module as another source of type hinting verification. In the meantime some small snippets of code that are known to fail/succeed would be a first good approach. Reading the help we can even use mypy -c 'expressions' with several small expressions?

@jonded94
Copy link
Contributor Author

jonded94 commented Aug 7, 2024

After my message above I also worked on converting the distribution into a package.

Nice! Thank you very much, this looks very nice. Sorry, I just somewhat winged it and was busy with other things.

You mentioned your approach is pretty standard, although I've never seen it used in other packages, so I'm not sure whether it is supposed to work and you forgot some small detail,

mypy for example is such a package that even has its __init__ file as a shared library:

$ find venv/lib/python3.10/site-packages/mypy -type f -name "__init__.*.so"
venv/lib/python3.10/site-packages/mypy/server/__init__.cpython-310-x86_64-linux-gnu.so
venv/lib/python3.10/site-packages/mypy/__init__.cpython-310-x86_64-linux-gnu.so
venv/lib/python3.10/site-packages/mypy/dmypy/__init__.cpython-310-x86_64-linux-gnu.so
venv/lib/python3.10/site-packages/mypy/plugins/__init__.cpython-310-x86_64-linux-gnu.so

Small sidefact: This is because mypy itself is compiled with mypyc, a Python-to-C-to-machine-code compiler that uses type hinting information.

But yeah, definitely I messed something up on my try, doing it like you did introduces a slight overhead but is definitely far more standard and should lead to less headaches.

Baked in C source files

Something small that I noticed: You now include the C source files as a part of your src directory. This leads to them being baked into the python wheel (I also checked the artifacts from the CI jobs, and they indeed are baked in):

adding 'crc32c/__init__.py'
adding 'crc32c/__init__.pyi'
adding 'crc32c/_crc32c.cpython-310-x86_64-linux-gnu.so'
adding 'crc32c/py.typed'
adding 'crc32c/ext/_crc32c.c'
adding 'crc32c/ext/checkarm.c'
adding 'crc32c/ext/checkarm.h'
adding 'crc32c/ext/checksse42.c'
adding 'crc32c/ext/checksse42.h'
adding 'crc32c/ext/common.h'
adding 'crc32c/ext/crc32c.h'
adding 'crc32c/ext/crc32c_adler.c'
adding 'crc32c/ext/crc32c_arm64.c'
adding 'crc32c/ext/crc32c_sw.c'

The approach that I know would be slightly different:

src/
    crc32c/
        __init__.py
        ....
crc32c/
    crc32c.h
    _crc32c.c
    ...

One would include the crc32c directory in the sdist, but for the wheel part, of course only use it to compile the shared library. In the wheel, only the src directory (with the .so file) would be included.

Only a very, very minor percentage of our users use anything other than 3.7+ though, so it's reasonable to bump up that python version requirement for the package, remove support for old python versions, and THEN finally be able to use the test module as another source of type hinting verification.

Very nice! Dropping all of the old baggage actually enables quite a bunch of stuff :) If you are willing to drop support for old versions, I could try typehinting the current test file. This shouldn't take too long.
If I stumble upon things that can only be solved by modifying the actual C library (for example, change how this "sw mode" works), I'd heavily suggest to just annotate a "ignore this error for now" comment instead of further delaying a release.

These ignore comments are not too bad and are very easily cleaned up afterwards.

@jonded94
Copy link
Contributor Author

jonded94 commented Aug 7, 2024

I dropped support for Python <3.7 in the tests file, fully type hinted it and tested that locally with Python 3.7 & 3.10. mypy --strict does not show any problems on my machine; I included a little CI linting job, but for some reason it doesn't trigger? (I'm more used to JetBrains TeamCity and/or GitLab, sorry)

Do you want to drop support for Python <3.7 for the whole package (either in this branch or on master)? I believe the tests should possibly fail now if being executed on Python 2.X or <3.5.

Things to tackle later maybe:

Parametrizing the tests

numbers1 = ('Numbers1', b'123456789', 0xe3069283)
numbers2 = ('Numbers2', b'23456789', 0xBFE92A83)
numbers3 = ('Numbers3', b'1234567890', 0xf3dbd4fe)
[...]
if crc32c is not None:
    for name, val, checksum in (numbers1, numbers2, numbers3, phrase, long_phrase):
        classname = 'Test%s' % name
        locals()[classname] = type(classname, (unittest.TestCase, Crc32cChecks), {'val': val, 'checksum': checksum})

This block could just be replaced by a pytest.mark.parametrize decorator on a proper class in the future.

With that, this also wouldn't be required anymore:

class Crc32cChecks(object):
    checksum: int
    val: bytes

    def assertEqual(self, a: Any, b: Any, msg: Any = None) -> None: ...

Let "sw mode" be set via kwarg

With that, this block

try:
    import crc32c
    sw_mode = os.environ.get('CRC32C_SW_MODE')
    if sw_mode == 'none' and not crc32c.hardware_based:
        raise RuntimeError('"none" should force hardware support')
    elif sw_mode == 'force' and crc32c.hardware_based:
        raise RuntimeError('"force" should force software support')
except ImportError:
    crc32c = None  # type: ignore[assignment]

could also just be replaced entirely with a parameter in pytest.mark.parametrize.

@jonded94
Copy link
Contributor Author

jonded94 commented Aug 7, 2024

Now the linting job was triggered. Brighter minds than me have to understand why the path triggering does not work (does push: paths: - '*.py' check per commit and not changes against branch to merge into?)

@jonded94
Copy link
Contributor Author

jonded94 commented Aug 7, 2024

Sorry for spamming, final remarks:

  • Apparently this package is shipped for Python >=3.6 anyways? Then there really is no problem at all (as can be seen in the tests too, they are all successful).
    • But this still definitely should be reflected in the classifiers which specify which Python versions are supported, currently it says 2.7, 3.2 - 3.13. Which is technically true for the sdist, but the tests wouldn't run anymore I guess.
  • Type checking with mypy now seems to work, I just enabled it to run always.
    • Technically, we're not testing the wheel that comes out of the build jobs but rather build the wheel directly inside the linting job
    • I personally think that's fine.

Should be ready for review! :)

@rtobar rtobar force-pushed the make-package-pep-561-compliant branch from cd6ab5a to 61b34f1 Compare August 7, 2024 12:03
@rtobar rtobar force-pushed the make-package-pep-561-compliant branch from 61b34f1 to f1e12a5 Compare August 7, 2024 12:04
@rtobar
Copy link
Contributor

rtobar commented Aug 7, 2024

(I had written a reply before squashing and force-pushing, I think I missed pressing "Comment" 😢)

Sorry for spamming, final remarks:

I love this kind of spam!

@jonded94 this has been a very useful and positive discussion, and I thank you for that. The changes are in a state that I'll be merging them, only that I've squashed the CI setup commits into a single one, and added a scheduled run in there too, so that we don't bitrot against newer versions of mypy.

Yes, we advertise 2.7+ support because of the source distribution. As you point out, it's a pretty wild guess whether it still works or not, although the subset of the Python C API we use is very limited and has been stable throughout the whole 3.x series. I'll probably officially drop 2.7 support (testing changes right now), but won't bother limiting the 3.x versions, since it doesn't gain us anything. After this, I'll release a new version on PyPI.

Future work, all coming after a new release:

  • I'll have a look into the C code baked into the wheels. I hadn't realised about it, thanks for pointing out. I've seen other packages do something similar to what you propose, and never thought this could be a motivation to do it that way.
  • Using pytest is something that I learned later in my Python journey. This is definitely in my TODO list, and is actually something I'm already toying with in other library I maintain, https://github.com/ICRAR/ijson/
  • SW mode will indeed be introduced as a parameter usable on a call-by-call basis. I'll probably still keep the environment variable anyway
  • I'll remove the "ability" to fail importing the module in certain cases. Technically a backwards-incompatible change that would require a new major version (and thus might put people off from upgrading), but I don't think anyone will miss it -- can be considered a bug in the design :P.

@rtobar rtobar merged commit f1e12a5 into ICRAR:master Aug 7, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants