Dimensional Objects, sampling, splitting and collapsing #2

Merged
merged 43 commits into from
May 31, 2014

philippjfr
Member

This pull request is not to be merged immediately but the core of everything already works, i.e. sampling is implemented within this new modular framework. This framework will eventually make Views easily sampleable, collapseable and convertible in addition to being sliceable and deepindexable.

The Concept

  • The core of this pull request is factoring all convenience methods for dealing with Dimensions on an NdMapping into a mix-in class called Dimensional. This mix-in class is then used not only by NdMapping as before but also by Views, eliminating the xlabel, ylabel and cyclic_range parameters and adding string unit support to the dimensions.
  • Additionally, most Views now allow a dimension to be set as .label. If a string is supplied, it is turned into a dimension and is accessible via ._label_dim. This attribute should be made public and renamed (suggestions welcome).
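A minimal sketch of the mix-in idea, not the code in this pull request. The class names ToyMapping and ToyView, the label_with_unit property and the dim_index helper are hypothetical and only illustrate how a mapping and a view could share the same dimension helpers, with plain strings promoted to Dimension objects.

```python
class Dimension:
    """A named axis with an optional unit and cyclic range (hypothetical minimal form)."""

    def __init__(self, name, unit=None, cyclic_range=None):
        self.name = name
        self.unit = unit
        self.cyclic_range = cyclic_range

    @property
    def label_with_unit(self):
        # Append the unit in parentheses when one is defined.
        return f"{self.name} ({self.unit})" if self.unit else self.name


class Dimensional:
    """Mix-in holding dimension helpers shared by mappings and views."""

    dimensions = ()  # subclasses assign a sequence of str or Dimension

    def as_dimensions(self):
        # Promote plain strings to Dimension objects so both forms are accepted.
        return [d if isinstance(d, Dimension) else Dimension(d)
                for d in self.dimensions]

    @property
    def dimension_labels(self):
        return [d.name for d in self.as_dimensions()]

    def dim_index(self, label):
        # Look a dimension up by its label and return its position.
        return self.dimension_labels.index(label)


class ToyMapping(Dimensional, dict):
    """Stand-in for a mapping keyed by (Time, Orientation)."""
    dimensions = (Dimension("Time", unit="ms"), "Orientation")


class ToyView(Dimensional):
    """Stand-in for a view carrying its own axis dimensions."""

    def __init__(self, data, dimensions):
        self.data = data
        self.dimensions = dimensions
```

Because both toy classes inherit the same helpers, per-view parameters such as xlabel and ylabel become unnecessary in this sketch: the axis names and units live on the dimensions themselves.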

More importantly this puts us on a path of massively extending the power of DataViews by going beyond the two operations we already know about:

  • sliceable: We are already familiar with this feature of NdMappings: you can slice along any dimension your atomic View elements are indexed by, reducing the amount of data you look at in any one chunk.
  • deepindexable: This has proven to be useful if you want to focus on a particular part of your data. However, so far it was sort of mysterious because it was hard to tell what you were actually slicing into. I've added a .deep_dimensions property, which descends to the atomic View and reports all dimensions that are sliceable.

To a number of even more useful properties of DataViews:

  • sampleable: My crazy sampling ViewOperation was the first step towards this but it was impenetrable because it was all happening in one place. I have now implemented methods to break this into simpler and more easily understood chunks. The Stack is first .split by the supplied sample dimension, resulting in nested stacks. Once the stack is split, .sample is then called on each of the split out Stacks, and the sample coordinates are passed to the Views, which return ScatterPoints, which are then combined by passing them to the Curve constructor. Overlaying is handled by a separate method called .overlay_dimensions. This also allows any number of dimensions to be sampled, i.e. you can sample at a particular X coordinate on a SheetView to get a Curve of orientation preference varying by x.
  • collapseable: Very similar concept to sampling but instead of sampling along a particular dimension/axis you collapse it using a supplied function like np.mean, np.std.
  • convertible: Views of equal dimensionality should be convertible. It's perfectly valid to go from ScatterPoints to a Curve or from a Curve to a bar chart; the same goes for SheetView -> SurfaceView. I suggest accepting different Views and Stacks in the constructor, and have started implementing this. In most cases it is a simple matter of copying .data and .get_param_values(), while np.concatenate will deal with assembling ScatterPoints into a Curve, or in fact dense Curves into a sheet (although not necessarily a SheetView).
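The sampling steps described above (split, sample each view, combine into a curve) can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the implementation in this pull request: the stack is a plain dict keyed by tuples, each view is a 2D list of rows, and the helper names split_by, sample_view and sample_stack are hypothetical.

```python
from collections import defaultdict


def split_by(stack, dims, dim):
    """Group a {key_tuple: view} mapping by every key dimension except `dim`."""
    i = dims.index(dim)
    groups = defaultdict(dict)
    for key, view in stack.items():
        outer = key[:i] + key[i + 1:]   # remaining key dimensions
        groups[outer][key[i]] = view    # inner mapping keyed by `dim`
    return dict(groups)


def sample_view(grid, row, col):
    """Sample a 2D view (a list of rows) at a single coordinate."""
    return grid[row][col]


def sample_stack(stack, dims, sample_dim, row, col):
    """Return one curve per outer key: sorted (sample_dim value, sample) pairs."""
    curves = {}
    for outer, inner in split_by(stack, dims, sample_dim).items():
        points = [(k, sample_view(v, row, col)) for k, v in inner.items()]
        curves[outer] = sorted(points)
    return curves


# Toy stack indexed by (Time, Orientation); each view is a 2x2 grid.
dims = ["Time", "Orientation"]
stack = {(t, o): [[t + o, 0], [0, t * o]] for t in (1, 2) for o in (0, 90)}
curves = sample_stack(stack, dims, "Orientation", 0, 0)
```

Here curves holds one list of (orientation, value) points per Time value, which is the shape of data a Curve of response against orientation would need.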

The Implementation

Note: * indicates what is not yet implemented

View/Stack Types of different dimensions [indicates shape or type of data]


  • Categorical: TableView, MapView [{k: v}]
  • 1D: DataLayer/Stack: Curve, ScatterPoints, Histogram [np.zeros(x, 2)*], Bars* [edges, values]
  • 2D: PhaseStack, SheetLayer/Stack: SheetView, SurfaceView* [np.zeros(x,y)]
  • 2D (Sparse): PhaseSpace* [np.zeros(x, y, 3)]
  • 3D: VolumetricLayer/Stack: VolumetricView* [np.zeros(x, y, z)]

Stack Methods


  • .split_dimensions([dim_list]) - Basic operation underlying almost every operation, splits a Stack of multiple dimensions into a nested NdMapping containing the original Stack types
  • .sample(dimsample_map[, sample_dim]) - Reduces View by sampling the specified dimensions and then combines them if sample_dim is supplied (passing the new dimension and value to the View, looking up the correct type in a dimensional conversion map and passing the split stack to the new View type)
  • .collapse(dimfn_map[, sample_dim])* - Same as .sample but reduces View dimension using a supplied function
  • .overlay_dimensions(dim_list) - As the name suggests, it will split out the supplied dimensions and then iterate over them, overlaying them using the * operator.
  • .split_overlays() - Splits the overlays in the stack into separate Stacks
  • .flatten()* - Flattens an NdMapping containing Stacks into higher-dimensionality Stacks.
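As a rough illustration of how .split_dimensions and .flatten could relate, the sketch below treats a Stack as a plain dict keyed by tuples. The function signatures and return values are assumptions made for this example, not the API in this pull request.

```python
def split_dimensions(stack, dims, split_dims):
    """Nest a {key_tuple: view} mapping: outer keys come from split_dims,
    inner keys from the remaining dimensions."""
    outer_idx = [dims.index(d) for d in split_dims]
    inner_idx = [i for i in range(len(dims)) if i not in outer_idx]
    nested = {}
    for key, view in stack.items():
        outer = tuple(key[i] for i in outer_idx)
        inner = tuple(key[i] for i in inner_idx)
        nested.setdefault(outer, {})[inner] = view
    inner_dims = [dims[i] for i in inner_idx]
    return nested, list(split_dims), inner_dims


def flatten(nested, outer_dims, inner_dims):
    """Inverse of split_dimensions: merge outer and inner keys into one tuple."""
    flat = {}
    for outer, inner_map in nested.items():
        for inner, view in inner_map.items():
            flat[outer + inner] = view
    return flat, list(outer_dims) + list(inner_dims)


# Toy stack with three key dimensions; strings stand in for views.
dims = ["Time", "Orientation", "Phase"]
stack = {(t, o, p): f"view-{t}-{o}-{p}"
         for t in (0, 1) for o in (0, 90) for p in (0, 180)}
nested, outer_dims, inner_dims = split_dimensions(stack, dims, ["Orientation"])
flat, flat_dims = flatten(nested, outer_dims, inner_dims)
```

Note that in this sketch flattening puts the split-out dimensions first, so the round trip preserves every view but may reorder the key dimensions.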

Data/Sheet/TableStack Methods


  • .collapse_stack(dimfn_map)* - Collapses dimensions in the Stack using the supplied {Dimension: Function} pairs. This has to be implemented for a Stack of every dimensionality, as only they know how to combine .data on the Views. It may be better to implement these as StackOperations.
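One possible shape for .collapse_stack on a stack of 1D data, sketched with plain Python containers. The signature, the dict-of-tuples stack and the use of statistics.mean (standing in for np.mean or any other reducing callable) are assumptions for illustration only.

```python
from statistics import mean


def collapse_stack(stack, dims, dimfn_map):
    """Collapse 1D views (equal-length lists of y values) across the key
    dimensions named in dimfn_map, a {dimension: function} mapping."""
    for dim, fn in dimfn_map.items():
        i = dims.index(dim)
        groups = {}
        for key, ys in stack.items():
            # Views that differ only in `dim` end up in the same group.
            groups.setdefault(key[:i] + key[i + 1:], []).append(ys)
        # Apply fn element-wise across the grouped views.
        stack = {k: [fn(col) for col in zip(*views)]
                 for k, views in groups.items()}
        dims = dims[:i] + dims[i + 1:]
    return stack, dims


# Toy stack indexed by (Time, Trial); average over trials.
dims = ["Time", "Trial"]
stack = {(0, 0): [1.0, 2.0], (0, 1): [3.0, 4.0],
         (1, 0): [5.0, 6.0], (1, 1): [7.0, 8.0]}
collapsed, remaining = collapse_stack(stack, dims, {"Trial": mean})
```

The element-wise zip is the part that depends on dimensionality: a stack of 2D views would need a different way of combining .data, which is why each Stack type would need its own version.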

View Methods (implemented on the dimensionality subclass, i.e. currently on DataLayer and SheetView)


  • .sample(dimsample_map[, add_dimension]) - Reduces dimensionality of the View using the supplied sample. By supplying optional new {Dimension: value} pairs, reducing View dimensionality can be avoided (useful for Stack sampling)
  • .collapse(dimfn_map[, add_dimension]) - Reduces dimensionality using the supplied collapse fn, which is applied along the right axis. add_dimension behaves the same as for .sample
  • .__getitem__(slices) - Maintains the dimensionality of the data, i.e. SheetViews can be sliced but stay SheetViews, DataLayers may change type but stay the same dimensionality, i.e. a Curve indexed by an x value becomes a ScatterPoints object, which internally still has shape (x, 2)
  • .__init__(data) - Data can be data of the right dimensionality, a View of the right dimensionality or a list or stack of Views, which can simply be concatenated
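The indexing and constructor behaviour listed above might look roughly like the sketch below. The toy DataLayer, ScatterPoints and Curve classes only borrow the names from this pull request; their bodies, the single-argument sample method and the list-of-pairs data layout are assumptions, and the add_dimension option is left out.

```python
class DataLayer:
    """Toy 1D view storing sorted (x, y) pairs."""

    def __init__(self, data):
        # Accept raw (x, y) pairs, another 1D view, or a list of 1D views.
        if isinstance(data, DataLayer):
            data = data.data
        elif data and all(isinstance(d, DataLayer) for d in data):
            # Concatenate the points of all supplied views.
            data = [pt for view in data for pt in view.data]
        self.data = sorted(tuple(pt) for pt in data)

    def __getitem__(self, x):
        # Indexing keeps the (n, 2) layout of the data.
        if isinstance(x, slice):
            lo = float("-inf") if x.start is None else x.start
            hi = float("inf") if x.stop is None else x.stop
            return type(self)([(px, py) for px, py in self.data
                               if lo <= px < hi])
        # A single x value yields ScatterPoints rather than a bare number.
        return ScatterPoints([(px, py) for px, py in self.data if px == x])

    def sample(self, x):
        # Sampling reduces dimensionality: only the y value is returned.
        return dict(self.data)[x]


class ScatterPoints(DataLayer):
    pass


class Curve(DataLayer):
    pass


curve = Curve([(0, 1.0), (1, 2.0), (2, 4.0)])
point = curve[1]                       # ScatterPoints holding one (x, y) pair
rebuilt = Curve([curve[0], curve[2]])  # list of views concatenated into a Curve
```

The contrast between curve[1] and curve.sample(1) is the point of the sketch: indexing keeps the two-column layout, while sampling drops down to a plain value.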

Further notes:


  • .dense property to check whether View can safely be converted to other types of Views
  • A ViewOp to resample sparsely sampled points or phase space into a Surface/SheetView and Curve respectively, would use scipy.interpolate.griddata
  • >=3D Views may want to implement split methods to split the data into Stacks of lower dimensional data

The timeline

Basically this pull request will get us to a stage where we can leave it and not worry about having to massively change the API or the data structures later. The real issue is that right now, we won't be able to work on this any more. However, we don't want to do all of our science based on a system that will completely change in a few months' time. Changing over to the Dimensional object means that everything is in place to get all of the above working in time. The system is modular, so we can implement the missing methods as we need them until eventually every single dimension is truly sampleable, collapseable, and convertible.

@@ -66,6 +69,30 @@ def __mul__(self, other):
roi_bounds=roi_bounds)


def dimension_values(self, dim_index):
Contributor

Why not dim_label instead of dim_index?

jlstevens added a commit that referenced this pull request May 31, 2014
Dimensional Objects, sampling, splitting and collapsing

Merging now as the PR is supposed to be 100% up-to-date. We will have a few things to tidy up shortly to get the tests working.
@jlstevens jlstevens merged commit 0e77366 into holoviz:master May 31, 2014

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 26, 2024
2 participants