DOC Add intelex inference example #303
Conversation
Thanks for adding this example. This is not an in-depth review yet, just what I found on first look:
The example is not actually being run but just included as-is in the docs, as can be seen here:
I'm not completely sure why that happens here but not with the other examples. Possibly this example needs to be linked explicitly in the docs somewhere? A good place would be in:
(which would be a good addition regardless of whether that is the root of the issue or not)
In general, I think this guide could be more useful if it were made a bit more accessible. E.g. explain in a few words what intelex does, link to its docs, and avoid too many abbreviations like "skl" and "sklex".
Hi @BenjaminBossan, let me know what else you want to see!
The example is still not building correctly, but I believe I have figured out the issue. It seems like the file name should always start with ``plot_``. Btw., you can build the docs locally by going into the docs folder and building them there.
Hi @BenjaminBossan, thanks for the tip. The latest commit worked locally, so I hope this time the example runs. I had an interesting thought: at the bottom of my example I show that loading a pickled non-optimized model has no speedup, even when ``patch_sklearn()`` is applied before loading it.
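As a rough illustration of that observation, here is a minimal sketch that fits and pickles a plain scikit-learn model, patches afterwards, and then times prediction on the loaded model. The dataset, sizes, and variable names are made up for illustration and are not taken from the PR's example; per the discussion above, the loaded model shows no sklearnex speedup.

import pickle
import time

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=20_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit and pickle the model *before* patching, i.e. with plain scikit-learn.
clf = KNeighborsClassifier(3).fit(X_train, y_train)
blob = pickle.dumps(clf)

# The import is deliberately placed here to make the order explicit:
# patching happens only after the model was pickled.
from sklearnex import patch_sklearn
patch_sklearn()

# Per the observation above, predicting with the re-loaded, non-optimized
# model does not benefit from the sklearnex speedup.
loaded = pickle.loads(blob)
start = time.perf_counter()
loaded.predict(X_test)
print(f"prediction took {time.perf_counter() - start:.3f} s")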
The doc build is failing, but it's still progress :) The trouble now is that building the docs has a dependency on sklearnex, which is not installed. Could you please add it to the dependencies as a "doc" dependency?
I think this will depend on how exactly the patching works, which I don't know. Ideally, there would be an easy way to check on each estimator whether it was patched, something like a function that, given an estimator instance, tells you whether it is patched.
Progress indeed. Edit: There is an unreleased function that allows you to see the global patching status in the latest sklearnex.
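For reference, a small sketch of what checking the global patching status could look like; the function name ``sklearn_is_patched`` is an assumption based on how later sklearnex releases expose this, and may differ from the unreleased version mentioned above.

from sklearnex import patch_sklearn, sklearn_is_patched

print(sklearn_is_patched())  # False while plain scikit-learn is in use
patch_sklearn()
print(sklearn_is_patched())  # True once the patch is active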
By the way, the pipeline now complains about the missing HF token. I followed the approach that I found in plot_hf_hub.py, so I think the error is nothing new. How is this expected to work?
Thanks Andreas for adding this example. I have now given it a more thorough review. Overall it looks good, but I think a few things could be improved further. Please take a look at my comments.
By the way, the pipeline now complains about the missing HF token. I followed the approach that I found in plot_hf_hub.py, so I think the error is nothing new. How is this expected to work?
Yes, that's unfortunately expected, see #47. As soon as this is merged, it should work, as it would use the HF token set up for this repo. When I tested this locally with my token, it worked.
Hi Benjamin, thanks! I hope all your comments are addressed with the latest commit.
Thanks for the PR @ahuber21, it's in pretty good shape.
I left a few minor comments. I haven't made sure the line lengths are correct. You don't need to "accept suggestion" here; you can take the texts and apply them the way you like.
examples/plot_intelex.py (outdated)

from sklearn.neighbors import KNeighborsClassifier

clf = KNeighborsClassifier(3)
clf.fit(X_train, y_train)

# %%
# Training the optimized model
# ============================
# We apply ``patch_sklearn()`` and reimport the model to load the patched
# version. A message is shown, telling us that Intel(R) Extension for
# Scikit-learn* has been enabled.
patch_sklearn()
from sklearn.neighbors import KNeighborsClassifier

clf_opt = KNeighborsClassifier(3)
clf_opt.fit(X_train, y_train)
I think in a script, this is very confusing. It'd be nicer if we explicitly imported the class from sklearnex, so it is clear which one it is, and left a comment that users don't have to change their imports; they can simply add the call to patch at the top of the file, before they run the imports.
We could also show the training times here and compare them, as sketched below.
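A rough sketch of that suggestion, reusing the ``X_train``/``y_train`` variables from the example above; the alias name is made up here and not part of the PR.

# Importing the patched class directly from sklearnex makes it obvious which
# implementation is used. Note for users: you normally don't have to change
# imports at all; calling patch_sklearn() at the top of the file, before the
# sklearn imports, is enough.
from sklearnex.neighbors import KNeighborsClassifier as PatchedKNeighborsClassifier

clf_opt = PatchedKNeighborsClassifier(3)
clf_opt.fit(X_train, y_train)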
I changed the imports. Training for kNN takes <1 s, and in many cases sklearnex comes with a few percent of overhead. We're working on that, but for now I think it wouldn't help to add it.
In that case, it does help to add it, since right now we're only selectively showing data, not the whole truth, and that's not what we wanna do here.
I have added a perf_counter for the fit stage for completeness. On my machine, both fit times were ~0.05 s. The numbers are printed, but doing anything else with them (like calculating ratios) won't help if we don't do multiple runs. I hope you're fine with how it is now. Let me know :)
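For reference, a minimal sketch of what that timing looks like, assuming the ``clf``/``clf_opt`` estimators and training data from the example above; the exact print format in the example may differ.

import time

start = time.perf_counter()
clf.fit(X_train, y_train)
print(f"unpatched fit took {time.perf_counter() - start:.3f} s")

start = time.perf_counter()
clf_opt.fit(X_train, y_train)
print(f"patched fit took {time.perf_counter() - start:.3f} s")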
Build is failing; once you're done with edits, please ping me for a second review.
I don't quite understand what happened with the Ubuntu tests. To me it looked unrelated. Anyway, I just pushed an update without the "Download model and re-evaluate" section, so there'll be a new run.
Other than the nits, LGTM. WDYT @BenjaminBossan
Force-pushed again because in the intro I still said that we download the models again. Nothing else changed.
@ahuber21 we squash and merge anyway, so please avoid force-pushing since it makes reviewing changes harder :)
Yes, looks good, thanks Andreas
@BenjaminBossan this feature is now added in the latest release.

import pickle

from sklearnex import is_patched_instance

my_instance = pickle.load(open("some_file.pkl", "rb"))
is_patched_instance(my_instance)  # True if it was trained using sklearnex, False otherwise
Nice addition, thx for letting us know.
I want to give a self-contained example of how scikit-learn-intelex speeds up model inference times. This effort was kicked off in #251, and the PR in this form is basically just to align the requirements.
I'm new to the discussion, so I'm including just the most basic things. Please let me know if this example is a step in the right direction.
Output from running on my machine: