PERF: put BlockManager constructor in cython #40842

Merged: 17 commits, Apr 19, 2021

jbrockmendel
Member

No idea why mypy started complaining about the io.parsers stuff

@jreback
Contributor

jreback commented Apr 14, 2021

can you merge master

@jreback jreback added the Performance (memory or execution speed performance) label Apr 14, 2021
@jreback
Contributor

jreback commented Apr 14, 2021

any perf on this vs master (now that the ndarray stuff is merged)? (ideally put the relevant timeits in the cython code, that is a great idea generally)

@jbrockmendel
Member Author

> any perf on this vs master (now that the ndarray stuff is merged)? (ideally put the relevant timeits in the cython code, that is a great idea generally)

On the benchmark I've been using for these I'm seeing about a 4% gain (though the worst case showed a 0.2% slowdown). The big win I'm expecting comes in the next step, which puts BlockManager.get_slice in Cython and lets us chain together several calls that no longer go through Python.
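
For readers who want to reproduce this kind of comparison, here is a minimal sketch of how a percent gain between two timings can be computed. It uses only the standard-library `timeit` module. The helper names `best_time` and `percent_gain` and the example timings are hypothetical, chosen only to mirror the figures quoted above; this is not the benchmark used in this PR.

```python
import timeit


def best_time(func, repeat=5, number=1000):
    """Return the best per-call time in seconds over `repeat` runs."""
    runs = timeit.repeat(func, repeat=repeat, number=number)
    return min(runs) / number


def percent_gain(baseline, candidate):
    """Percent reduction in time; positive when `candidate` is faster."""
    return (baseline - candidate) / baseline * 100.0


# Hypothetical per-call timings in seconds (not measured on pandas).
print(round(percent_gain(1.00, 0.96), 2))    # 4.0
print(round(percent_gain(1.000, 1.002), 2))  # -0.2
```

In practice one would pass the same callable to `best_time` on each branch (for example master and the PR branch) and feed the two results to `percent_gain`.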

@jbrockmendel
Member Author

rebased + green

@jbrockmendel
Member Author

Just measured the follow-up branch (which implements BlockManager.get_slice in Cython) and it gets a 15% improvement.

@jreback jreback added this to the 1.3 milestone Apr 16, 2021
@jreback jreback merged commit 69d7663 into pandas-dev:master Apr 19, 2021
sthagen added a commit to sthagen/pandas-dev-pandas that referenced this pull request Apr 19, 2021
PERF: put BlockManager constructor in cython (pandas-dev#40842)
@jbrockmendel jbrockmendel deleted the cy-mgr-3 branch April 19, 2021 17:29
@jorisvandenbossche
Member

> On the benchmark I've been using for these I'm seeing about a 4% gain

As I have mentioned before (e.g. on the mailing list, in #40263), this benchmark is basically useless / unrealistic.
I think you should benchmark with an actual applied function to check whether it is worth moving things to Cython for this.
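
To illustrate why the choice of applied function matters, here is a rough arithmetic sketch. It assumes (an assumption, not something stated in this thread) that the change only speeds up fixed per-call overhead, and shows how the end-to-end gain shrinks as the applied function does more real work. The function name and all numbers are hypothetical.

```python
def end_to_end_gain(overhead, work, overhead_speedup_pct):
    """Percent reduction in total time when only `overhead` gets faster.

    `overhead` and `work` are times in the same (arbitrary) unit.
    """
    saved = overhead * overhead_speedup_pct / 100.0
    return saved / (overhead + work) * 100.0


# No-op applied function: all time is overhead, so the full 4% is visible.
print(round(end_to_end_gain(1.0, 0.0, 4.0), 2))  # 4.0

# Applied function doing 9x as much work as the overhead (hypothetical ratio).
print(round(end_to_end_gain(1.0, 9.0, 4.0), 2))  # 0.4
```

Under these assumed ratios, a 4% gain on an overhead-only benchmark corresponds to about 0.4% once the applied function dominates the runtime.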

yeshsurya pushed a commit to yeshsurya/pandas that referenced this pull request Apr 21, 2021
yeshsurya pushed a commit to yeshsurya/pandas that referenced this pull request May 6, 2021
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021