-
Notifications
You must be signed in to change notification settings - Fork 611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement sparse covariance_eigh
PCA using Dask
#3263
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3263 +/- ##
==========================================
+ Coverage 77.01% 77.18% +0.16%
==========================================
Files 110 111 +1
Lines 12492 12582 +90
==========================================
+ Hits 9621 9711 +90
Misses 2871 2871
|
if isinstance(adata.X, DaskArray) and issparse(adata.X._meta): | ||
reason = "TruncatedSVD is not supported for sparse Dask yet" | ||
request.applymarker(pytest.mark.xfail(reason=reason)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Intron7 you said it would be possible to adapt the code to do TruncatedSVD as well?
I think this PR is pretty clean, so maybe we can extend it in another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, PR into this PR if we start pre-merge
Co-authored-by: Ilan Gold <[email protected]>
covariance_eigh
PCA using Dask
Can we add a test here to find out (or to at least know) how far our implementation here is from #3296 for sparse? |
Done! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If everything passes CI, good to go from me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
An alternative would be to subclass
PCA
, but that would involve erroring out or reimplementing all of its options.Ideally #3267 would be merged first and this one integrated into its improved decision tree.