gh-99108: add HACL*-based 1-shot HMAC implementation #126359
HACL* is at revision fc2e38f4d899ba28665c5b91caedaf35b3b37452.
@msprotz MSVC does not like the following code:
uint32_t l = 64U;
KRML_CHECK_SIZE(sizeof (uint8_t), l);
uint8_t key_block[l]; // <- this
I wonder whether you can do something (like inlining the variable instead), or tell me whether it's possible to make MSVC happy.
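For readers following along: `key_block[l]` is a C99 variable-length array because `l` is a runtime variable, and MSVC does not support VLAs. A minimal sketch of the "inline the size" idea follows. `KEY_BLOCK_LEN` and `fill_key_block` are hypothetical names used only for illustration; this is not the code HACL* generates.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name; 64 is the value of `l` in the snippet above. */
#define KEY_BLOCK_LEN 64U

/* Copy a short key into a zero-padded, fixed-size block.
 * Returns the block size, or 0 if the key does not fit. */
static uint32_t
fill_key_block(const uint8_t *key, uint32_t key_len, uint8_t *out)
{
    /* The size is an integer constant expression, so this is a plain
     * array rather than a VLA. */
    uint8_t key_block[KEY_BLOCK_LEN];
    if (key_len > KEY_BLOCK_LEN) {
        return 0;
    }
    memset(key_block, 0, sizeof(key_block));
    memcpy(key_block, key, key_len);
    memcpy(out, key_block, sizeof(key_block));
    return (uint32_t)sizeof(key_block);
}
```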
I'm marking this PR as ready for review even though it fails to build on Windows. We'll address that later (for now, I want to get feedback on the build itself and how to get rid of |
Sometimes I forget that you're a cryptography genius 😅
PyMODINIT_FUNC
PyInit__hmac(void)
{
    return PyModuleDef_Init(&_hmacmodule);
(This is primarily a note for myself when this is closer to being ready.)
Do modules without an m_traverse get to use DRC to ease contention (which will be especially bad here because of the critical sections)? If not, force the module to do that, or consider making this immortal somehow. (I'll follow up on this. It will take some work, because I don't think we're doing it anywhere else in the stdlib.)
(For others: DRC means deferred reference counting.) This is something I also wondered about, namely whether I should explicitly call Py_VISIT(Py_TYPE(mod)) even if I don't have a state. I assume not, because that's what multiple modules with m_size = 0 do as well. I don't know where to find an exact reference on that behaviour, though.
Ok, I need to think of a way to easily map hashlib<->openssl dispatch for HMAC. Converting into a draft for now.
Just to set expectations, @R1kM and I looked at what it would take to add streaming HMAC, and we're relatively confident that it can be done within a few weeks. We have a couple more urgent things; then we'll get to it and post here as soon as we have an update. It'll then simplify everything to have a single implementation that can do streaming too.
Ok, so I managed to fix the compilation and find a nice way to resolve OpenSSL names. However, since I need to bind OpenSSL for Windows builds, I'd like someone to create the project files for Windows since I don't have .NET installed (and honestly I don't want to install it; I could use a VM if I have time). So I'll just ignore the Windows build failures for now.
No hurry, take your time! It'd be nice if the API for all HMAC-HASH functions were the same (or at least, if the functions could be stored using the same function pointer type).
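To illustrate why a shared function pointer type helps, here is a minimal sketch of a name-based dispatch table. The signature mirrors the `(dst, key, key_len, data, data_len)` shape of one-shot HMAC functions, but `hmac_oneshot_func`, `hmac_entry`, `find_hmac` and the `stub_*` functions are hypothetical; the stubs only write a tag byte and do not compute an HMAC.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed common signature for every one-shot HMAC-HASH function. */
typedef void (*hmac_oneshot_func)(uint8_t *dst,
                                  uint8_t *key, uint32_t key_len,
                                  uint8_t *data, uint32_t data_len);

/* Stand-in backends: they write a tag byte, they do NOT compute HMAC. */
static void
stub_hmac_md5(uint8_t *dst, uint8_t *key, uint32_t key_len,
              uint8_t *data, uint32_t data_len)
{
    (void)key; (void)key_len; (void)data; (void)data_len;
    dst[0] = 0x01;
}

static void
stub_hmac_sha1(uint8_t *dst, uint8_t *key, uint32_t key_len,
               uint8_t *data, uint32_t data_len)
{
    (void)key; (void)key_len; (void)data; (void)data_len;
    dst[0] = 0x02;
}

/* One table row per algorithm; every row uses the same pointer type. */
typedef struct {
    const char *name;
    uint32_t digest_size;   /* in bytes */
    hmac_oneshot_func compute;
} hmac_entry;

static const hmac_entry hmac_table[] = {
    {"md5", 16, stub_hmac_md5},
    {"sha1", 20, stub_hmac_sha1},
};

/* Look up an algorithm by name; returns NULL if it is unknown. */
static const hmac_entry *
find_hmac(const char *name)
{
    for (size_t i = 0; i < sizeof(hmac_table) / sizeof(hmac_table[0]); i++) {
        if (strcmp(hmac_table[i].name, name) == 0) {
            return &hmac_table[i];
        }
    }
    return NULL;
}
```

With such a table, the module only needs one code path that resolves the name and calls `entry->compute(...)`.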
There are two styles of APIs, and we're not sure yet which one we want to go for. Your feedback is most welcome. Agile APIs are of the form
Plain APIs are of the form
Thoughts on which one you'd prefer?
Thank you for the reply! Some questions:
I assume, however, that performance is slightly better since you don't have indirection, right? How much better do they fare in general? Are plain APIs really better than agile ones? The agile API seems good because I only need to specify the
Generally, there's no noticeable performance impact unless you repeatedly HMAC a lot of very small data and repeatedly create/destroy the underlying Python object, meaning you end up calling malloc/free a lot.
The plain APIs exist for engineering reasons: you can package a single HMAC algorithm along with its plain API, but if the agile API is the only API offered, a client like Python has to take all of the HMAC algorithms in one go (since the top-level agile API now contains references to every individual algorithm).
If you intend to overhaul all of the HMAC algorithms in one go, then I would go with the agile API.
FTR, I have a branch with streaming HMAC: https://github.com/picnixz/cpython/tree/hacl/HMAC-stream-99108. For now, I'm assuming that I have a plain API that can be used as follows:
/* Function pointer type for HACL* streaming HMAC state allocation */
typedef void *(*HACL_HMAC_state_malloc_func)(void);
/* Function pointer type for HACL* streaming HMAC state deallocation */
typedef void (*HACL_HMAC_state_free_func)(void *state);
/* Function pointer type for HACL* streaming HMAC state copy */
typedef void *(*HACL_HMAC_state_copy_func)(void *state);
/* Function pointer type for HACL* streaming HMAC setup functions. */
typedef hacl_exit_code
(*HACL_HMAC_setup_func)(void *state, uint8_t *key, uint32_t len);
/* Function pointer type for HACL* streaming HMAC update functions. */
typedef hacl_exit_code
(*HACL_HMAC_update_func)(void *state, uint8_t *buf, uint32_t len);
/* Function pointer type for HACL* streaming HMAC digest functions. */
typedef hacl_exit_code
(*HACL_HMAC_digest_func)(void *state, uint8_t *out);
Ideally, I would like to expose the transformation function For now, the branch I mentioned assumes that the state consists of the message until now and the key (just to make it work). Now, using an agile API, I could lighten the current interface to avoid storing those function pointers and only store the |
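As a rough illustration of how these typedefs could be bundled per algorithm, here is a self-contained sketch. The typedefs are repeated so the block compiles on its own. `hacl_exit_code` being a `uint8_t` with 0 meaning success is an assumption, and `hmac_vtable` and the `toy_*` names are hypothetical. The toy state only stores the key and the message so far (as the branch currently assumes) and its "digest" is not an HMAC.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint8_t hacl_exit_code;  /* assumption: 0 means success */

typedef void *(*HACL_HMAC_state_malloc_func)(void);
typedef void (*HACL_HMAC_state_free_func)(void *state);
typedef void *(*HACL_HMAC_state_copy_func)(void *state);
typedef hacl_exit_code
(*HACL_HMAC_setup_func)(void *state, uint8_t *key, uint32_t len);
typedef hacl_exit_code
(*HACL_HMAC_update_func)(void *state, uint8_t *buf, uint32_t len);
typedef hacl_exit_code
(*HACL_HMAC_digest_func)(void *state, uint8_t *out);

/* Hypothetical per-algorithm bundle of the function pointers. */
typedef struct {
    HACL_HMAC_state_malloc_func state_malloc;
    HACL_HMAC_state_free_func state_free;
    HACL_HMAC_state_copy_func state_copy;
    HACL_HMAC_setup_func setup;
    HACL_HMAC_update_func update;
    HACL_HMAC_digest_func digest;
} hmac_vtable;

/* Toy state: fixed-capacity key and message buffers, no real hashing. */
#define TOY_CAP 256U
typedef struct {
    uint8_t key[TOY_CAP];
    uint32_t key_len;
    uint8_t msg[TOY_CAP];
    uint32_t msg_len;
} toy_state;

static void *toy_malloc(void) { return calloc(1, sizeof(toy_state)); }
static void toy_free(void *state) { free(state); }

static void *
toy_copy(void *state)
{
    toy_state *dup = (toy_state *)malloc(sizeof(toy_state));
    if (dup != NULL) {
        memcpy(dup, state, sizeof(toy_state));
    }
    return dup;
}

static hacl_exit_code
toy_setup(void *state, uint8_t *key, uint32_t len)
{
    toy_state *s = (toy_state *)state;
    if (len > TOY_CAP) {
        return 1;
    }
    memcpy(s->key, key, len);
    s->key_len = len;
    s->msg_len = 0;
    return 0;
}

static hacl_exit_code
toy_update(void *state, uint8_t *buf, uint32_t len)
{
    toy_state *s = (toy_state *)state;
    if (len > TOY_CAP - s->msg_len) {
        return 1;
    }
    memcpy(s->msg + s->msg_len, buf, len);
    s->msg_len += len;
    return 0;
}

/* Toy "digest": writes the low byte of each length; NOT an HMAC. */
static hacl_exit_code
toy_digest(void *state, uint8_t *out)
{
    toy_state *s = (toy_state *)state;
    out[0] = (uint8_t)s->key_len;
    out[1] = (uint8_t)s->msg_len;
    return 0;
}

static const hmac_vtable toy_vtable = {
    toy_malloc, toy_free, toy_copy, toy_setup, toy_update, toy_digest,
};
```

The Python-level object would then store a pointer to one such bundle plus the opaque state, which is the bookkeeping an agile API would make unnecessary.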
I'm coming back online after being off last week. Made some good progress on an agile HMAC interface; hopefully a first prototype by the end of the week. It will be agile, meaning you won't have to deal with function pointers: the dynamic dispatch will be internal. The malloc function will take an algorithm and a key, and you will also have access to the copy function you mentioned. This should all greatly simplify your life. Hoping to send a link to the PR soon.
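To make the shape of such an interface concrete, here is a purely illustrative sketch in which the algorithm tag is stored in the state and the dispatch happens inside each function. All names (`agile_hmac_alg`, `agile_hmac_state`, `agile_hmac_malloc`, and so on) are hypothetical and are not the HACL* API; the key is ignored and no hashing is performed.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical algorithm tags; the real HACL* names may differ. */
typedef enum {
    AGILE_HMAC_MD5,
    AGILE_HMAC_SHA1,
    AGILE_HMAC_SHA2_256
} agile_hmac_alg;

/* The state remembers its algorithm, so callers keep no function pointers.
 * A real state would also hold the keyed hash state. */
typedef struct {
    agile_hmac_alg alg;
} agile_hmac_state;

/* Allocation takes an algorithm and a key (the key is ignored in this stub). */
static agile_hmac_state *
agile_hmac_malloc(agile_hmac_alg alg, const uint8_t *key, uint32_t key_len)
{
    agile_hmac_state *s = (agile_hmac_state *)malloc(sizeof(*s));
    (void)key; (void)key_len;
    if (s != NULL) {
        s->alg = alg;
    }
    return s;
}

/* Internal dispatch: one switch on the tag stored in the state. */
static uint32_t
agile_hmac_digest_size(const agile_hmac_state *s)
{
    switch (s->alg) {
        case AGILE_HMAC_MD5: return 16;
        case AGILE_HMAC_SHA1: return 20;
        case AGILE_HMAC_SHA2_256: return 32;
    }
    return 0;
}

/* Copy the state, including its algorithm tag. */
static agile_hmac_state *
agile_hmac_copy(const agile_hmac_state *s)
{
    agile_hmac_state *dup = (agile_hmac_state *)malloc(sizeof(*dup));
    if (dup != NULL) {
        *dup = *s;
    }
    return dup;
}

static void agile_hmac_free(agile_hmac_state *s) { free(s); }
```

The trade-off mentioned earlier in the thread still applies: a switch like this references every algorithm, so a client has to take all of them in one go.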
That would be perfect! Thanks in advance! I still need to know how I can make the Windows build work :( But I should have prepared everything to support the agile API.
Just for those subscribed to this pull request: I plan to re-open a fresh one using picnixz/cpython@feat/hmac/hacl-99108 in order to incorporate both streaming and 1-shot HMAC at the same time. However, we can continue the design discussion on this PR until the HACL* part is done.
I'm opening a PR, but I'm pretty sure that there are multiple things to discuss:
- […] Hacl_HMAC.h, but they are included in the .c; you might want to check that on your side @msprotz).
- […] Setup.stdlib.in, I get a bunch of warnings "Makefile:3471: warning: overriding recipe for target 'Modules/_hacl/Hacl_Hash_MD5.o'" for instance.
- […] libHacl_Hash_Blake2.a (I get "gcc: error: Modules/_hacl/libHacl_Hash_Blake2.a: No such file or directory").
The build I suggest compiles on my machine, and I would appreciate it if @erlend-aasland could help me clean it up.
I'm leaving my dev session in a few hours and won't be back until Wednesday. I need to test the implementation as well (for SHA1, I think something is weird because I get different results, mainly junk at the end, but I'll need to check again when I'm back).