Data API equivalent for HoloMaps #347

Closed
jlstevens opened this issue Dec 11, 2015 · 4 comments
Labels
type: feature A major new feature
Comments

@jlstevens
Contributor

Now that the recent Data API PR for Chart elements has been merged, this issue proposes implementing something similar for HoloMaps and NdMappings.

The API would not be quite the same as for charts, because HoloMap is designed to be dictionary-like. However, instead of always using ordered dictionaries as .data, you could imagine using pandas DataFrames. This could greatly improve the speed of certain operations on HoloMaps, such as groupby.

@jlstevens jlstevens added the type: feature A major new feature label Dec 11, 2015
@philippjfr
Member

Note that the slow speed of groupby is the major motivator here. I've just come up with an alternative implementation for the NdMapping groupby that temporarily converts to a DataFrame. Here's the performance profiling for a three-dimensional HoloMap grouped by two of the dimensions with both implementations:

[Image: performance profile comparing the two groupby implementations]

Until we find a way around the slow groupby, I think we should allow HoloMap to use the DataFrame-based implementation when pandas is available.
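A rough sketch of the idea, not the actual implementation: the `data`, `kdims`, `groupby_dict` and `groupby_pandas` names below are hypothetical stand-ins, with a plain OrderedDict keyed by tuples playing the role of a three-dimensional HoloMap's .data. One version groups in pure Python; the other temporarily builds a DataFrame of the keys and lets pandas do the grouping. It makes no claim about relative speed.

```python
from collections import OrderedDict
import itertools

import pandas as pd

# Hypothetical stand-in for a 3D HoloMap's .data: tuple keys -> items.
kdims = ['a', 'b', 'c']
data = OrderedDict(
    ((a, b, c), 'item_%d_%d_%d' % (a, b, c))
    for a, b, c in itertools.product(range(2), range(3), range(2))
)


def groupby_dict(data, kdims, group_dims):
    """Pure-Python grouping: split every key into a group part and an inner part."""
    group_idx = [kdims.index(d) for d in group_dims]
    inner_idx = [i for i in range(len(kdims)) if i not in group_idx]
    groups = OrderedDict()
    for key, value in data.items():
        gkey = tuple(key[i] for i in group_idx)
        ikey = tuple(key[i] for i in inner_idx)
        groups.setdefault(gkey, OrderedDict())[ikey] = value
    return groups


def groupby_pandas(data, kdims, group_dims):
    """Build a temporary DataFrame of the keys and let pandas group them."""
    inner_dims = [d for d in kdims if d not in group_dims]
    values = list(data.values())
    # One row per key; the default integer index is the position in `values`.
    df = pd.DataFrame(list(data.keys()), columns=kdims)
    groups = OrderedDict()
    for gkey, sub in df.groupby(group_dims, sort=False):
        if not isinstance(gkey, tuple):
            gkey = (gkey,)  # some pandas versions yield a scalar for one dim
        inner_keys = sub[inner_dims].itertuples(index=False, name=None)
        groups[gkey] = OrderedDict(
            (ikey, values[pos]) for pos, ikey in zip(sub.index, inner_keys)
        )
    return groups


by_dict = groupby_dict(data, kdims, ['a', 'b'])
by_pandas = groupby_pandas(data, kdims, ['a', 'b'])
```

Both functions return a mapping from the grouped key to an inner mapping over the remaining dimension, so either could sit behind the same groupby call, with the pandas path used only when the import succeeds.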

@philippjfr
Member

Just some quick thoughts on this. Fundamentally, the distinction between NdMapping and Columns types lies in how the data is indexed. NdMapping types provide multi-dimensional indexing over dense chunks of data, e.g. Elements. Elements, on the other hand, hold dense chunks of data directly, whether as columns or as dense 2D arrays. It would therefore make sense to implement a separate API for NdMapping types that can work with either the current OrderedDict-based implementation or pandas MultiIndexes.

This second, index-based base class would also be useful for more powerful composite Element types. For example, you could have a new Element base class whose data maps multi-dimensional keys to values that match the Columns data format. For a list of polygons for each country, the data format would be {'Australia': {'x': xs, 'y': ys}, 'Austria': ...}, where {'x': xs, 'y': ys} is a valid definition for an individual Curve. This would be very similar to what NdOverlay provides, but without the need to nest the data, and it would provide a much more optimized alternative for storing and plotting collections of artists.
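To illustrate how such a nested format could be flattened into a single index-based table, here is a minimal pandas sketch. The coordinates are made up, and the `country` and `vertex` level names are assumptions chosen for this example only.

```python
import pandas as pd

# Made-up coordinates, not real country outlines.
polygons = {
    'Australia': {'x': [0.0, 1.0, 1.0], 'y': [0.0, 0.0, 1.0]},
    'Austria': {'x': [2.0, 3.0], 'y': [2.0, 3.0]},
}

# Each value is a columns-style dict, so it converts to a DataFrame directly.
frames = {country: pd.DataFrame(cols) for country, cols in polygons.items()}

# Concatenating a dict of frames uses the dict keys as the outer index level,
# giving one flat table instead of nested containers.
flat = pd.concat(frames, names=['country', 'vertex'])

# Selecting on the outer level recovers the columns for a single key.
austria = flat.loc['Austria']
```

The flat table holds all vertices in two columns, which is the kind of layout that avoids nesting one container per polygon.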

This example from the pandas docs should make it clear what the data format is:

                     A         B         C
first second                              
bar   one     0.895717  0.410835 -1.413681
      two     0.805244  0.813850  1.607920
baz   one    -1.206412  0.132003  1.024180
      two     2.565646 -0.827317  0.569605
foo   one     1.431256 -0.076467  0.875906
      two     1.340309 -1.187678 -2.211372
qux   one    -1.170299  1.130127  0.974466
      two    -0.226169 -1.436737 -2.006747

Here first and second would be the dimensions mapping to the NdMapping-style n-dimensional key, and A, B and C would be the dimensions of the column-based format of the values in this new Element type. If we added support for this, we could combine the general NdMapping improvements proposed in this issue with the work on the data API to get a high-performance Element type for collections of data, without some of the hacky workarounds (e.g. padding with delimiters) we discussed to improve the Paths and Polygons types.
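As a sketch of that mapping, the snippet below rebuilds a frame with the same shape as the pandas docs example (random values, so the numbers will differ) and reads the key and value dimensions off it. The `kdims` and `vdims` variable names are only illustrative here, not a proposed API.

```python
import numpy as np
import pandas as pd

# Same index structure as the pandas docs example; the values are random.
index = pd.MultiIndex.from_product(
    [['bar', 'baz', 'foo', 'qux'], ['one', 'two']],
    names=['first', 'second'],
)
df = pd.DataFrame(np.random.randn(8, 3), index=index, columns=list('ABC'))

# Index levels play the role of the n-dimensional key,
# columns play the role of the value dimensions.
kdims = list(df.index.names)
vdims = list(df.columns)

# Dictionary-like lookup by a full key returns the A, B, C values for that key.
row = df.loc[('baz', 'two')]

# Grouping by one key dimension leaves frames indexed by the remaining one.
by_first = {
    key: sub.droplevel('first') for key, sub in df.groupby(level='first')
}
```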

@philippjfr
Member

I no longer think this is required now that I've optimized groupby with pandas and we are building out DynamicMap support, so I'm going to close this issue.

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 25, 2024