Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: return fortran arrays in original order in python high-level interface #1681

Closed
wants to merge 6 commits into from

Conversation

germasch
Copy link
Contributor

So this is what I originally set out to do:

  • adds a m_OriginalLayout member to core::Variable, which defaults to Unknown, but BP3 and BP4 engines will set it to the original layout the data was written with it.
  • bpls behavior is changed slightly (only): If the data was originally written from Fortran, it'll add the note (transposed) to the shape, but other than that, it still shows the dimensions reversed / dumps the data transposed. (This could be changed in the future to present the data how it was originally written, if that's considered more useful than current behavior)
  • The Fortran HL interface is changed to return arrays in their original index order. Nothing changes for arrays written in row-major, but arrays written as column-major will be read back in their original order. No transposing / rearranging of the data occurs, one just needs to tell numpy what the layout is.

More specifically, the read methods gain an additional order parameter. 'F' means ,interpret the data as column-major (so if it was written by Fortran, it'll show up in its original order. If it was originally row-major, this makes it show up transposed. 'C' means, interpret the data as row-major, which is what current behavior is always. 'K' means, interpret the data as the layout that it was saved as (if known, default row-major).

If no order parameter is provided, it defaults to 'K', which means both data written by C and Fortran will show up in their original order, which is what most people would want. However, it is a change in behavior. The default order, if not provided as an argument, can be easily changed using

import adios2
adios2.file.read_order = 'C'

will make restore current behavior. So while this PR changes behavior, it's at least easy to restore the original behavior if anyone turns out to be affected. I chose this way because the default behavior will avoid confusion, at the cost of potentially causing issues to someone relying on the old behavior. One could prioritize the opposite way, ie., keep old behavior by default and everyone who would like to see their Fortran arrays presented in original order would need to explicitly set read_order to 'K'.

Fixes #1665

@williamfgc williamfgc requested a review from pnorbert August 14, 2019 02:31
@williamfgc
Copy link
Contributor

@germasch overall this looks great. Just made a few comments and I'd like to hear from @pnorbert on the changes to bpls.

Copy link
Contributor

@williamfgc williamfgc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing about the use of static and minor things.

bindings/Python/py11File.cpp Outdated Show resolved Hide resolved
bindings/Python/py11File.cpp Show resolved Hide resolved
bindings/Python/py11File.h Show resolved Hide resolved
bindings/Python/py11File.h Outdated Show resolved Hide resolved
bindings/Python/py11File.h Show resolved Hide resolved
bindings/Python/py11File.tcc Outdated Show resolved Hide resolved
bindings/Python/py11glue.cpp Outdated Show resolved Hide resolved
@germasch
Copy link
Contributor Author

I added some documentation (good point). I've also added a test to show that adios2.File.read_mode works to set the default globally.

I'm now passing the order string by value, and changed the static to an anonymous namespace, and fixed the camelcase variable.

If you would prefer it, I could remove the configurable default for order, and just hardcode order='K', ie, 'K' always being default. It would certainly remove one bit of complexity (which is good), but also the ability to set the default back to the old behavior (always row-major), which personally I wouldn't miss, but someone already using the HL interface might.

@williamfgc
Copy link
Contributor

williamfgc commented Aug 15, 2019

@germasch thanks for addressing all the points. Looks good.

If you would prefer it, I could remove the configurable default for order, and just hardcode order='K', ie, 'K' always being default. It would certainly remove one bit of complexity (which is good), but also the ability to set the default back to the old behavior (always row-major), which personally I wouldn't miss, but someone already using the HL interface might.

Let me discuss with more people and find out more. In the meantime we can remove it and think more about it, while current users are not affected.

@williamfgc
Copy link
Contributor

williamfgc commented Aug 15, 2019

one minor thing, can we have a separate PR for changes in bpls (it's a different topic) just so I can merge the changes in Python once this PR is done. Thanks!

@germasch
Copy link
Contributor Author

one minor thing, can we have a separate PR for changes in bpls (it's a different topic) just so I can merge the changes in Python once this PR is done. Thanks!

I'll drop he bpls changes from this PR.

@germasch
Copy link
Contributor Author

If you would prefer it, I could remove the configurable default for order, and just hardcode order='K', ie, 'K' always being default. It would certainly remove one bit of complexity (which is good), but also the ability to set the default back to the old behavior (always row-major), which personally I wouldn't miss, but someone already using the HL interface might.

Let me discuss with more people and find out more. In the meantime we can remove it and think more about it, while current users are not affected.

The thing is, the way things are right now, people are (potentially) affected. The old behavior is order='C'. If I remove the File.read_order thing and just hardcode order='K' as default argument, the behavior when reading Fortran files will be different (arguably, it will be "correct" now, as the array will show up with its original indexing, but people's definition of "correct" may vary. I'd say, it's definitely less confusing to a new user when indices don't reverse.)

I think that's generally a difficult situation -- one wants to change behavior, because it turns out that some old behavior is not really how it should be. There are basically a few options:

  1. Stick with the old behavior, at least by default. The good thing about that is that one doesn't break any existing users. The disadvantage is that the confusing behavior gets carried forward indefinitely, and as more people use the API, the harder it becomes to change.
  2. Introduce a new API. Like, read2, which changes behavior to how it should have been. That works, but it's ugly (and also kinda confusing).
  3. Fix the "buggy" behavior. That's the best solution going forward, only one API, without the confusion. But it breaks code that has to become to rely on the "buggy" behavior.

My suggestion on this (and what the current PR does) is to follow the 3rd option. To limit the breakage (if any), one can put into the release notes:

  • adios2.File.read() now honors the original data layout that the data was written with, that is, Fortran arrays will be read in their original indexing order, as opposed to previous behavior, which reverse dimensions. In order to restore original behavior, set adios2.File.read_mode = 'F' after importing adios2.

I really dislike option 2. Option 1 wouldn't be unreasonable, though I suspect it won't really fix the issue that this PR is trying to address, since people don't tend to read documentation in much detail:

Someone doing x = fh.read('x') and noticing that the dimension are reversed from their original data is likely to think "Hmmh, looks like adios2 sucks just as bad h5py in that it reverses the dimensions of my array just because I dare using Fortran...", and somehow work around it themselves. Not many people will look through the documentation and notice that if they used x = fh.read('x', order='C') things will start behaving as they should.

@germasch germasch force-pushed the pr/layout branch 2 times, most recently from dea9f90 to 69d3736 Compare August 16, 2019 03:12
@williamfgc
Copy link
Contributor

williamfgc commented Aug 16, 2019

@germasch I agree with your points. I just want to hear from other stakeholders before changing current behavior that could affect their current workflows.

@germasch
Copy link
Contributor Author

@williamfgc Sounds good to me. I dropped the bpls related changes from this PR.

@NAThompson
Copy link
Contributor

NAThompson commented Aug 22, 2019

@germasch : Would it be sensible to add a few words about this feature into the ADIOS2/docs/user_guide files? This seems like a very useful feature and I don't want it to be buried.

More specifically, the read methods gain an additional order parameter. 'F' means ,interpret the data as column-major (so if it was written by Fortran, it'll show up in its original order. If it was originally row-major, this makes it show up transposed. 'C' means, interpret the data as row-major, which is what current behavior is always. 'K' means, interpret the data as the layout that it was saved as (if known, default row-major).

Why don't we add the ordering as a metadata parameter in bp3 and bp4? Then we simply interpret the data in the correct order without adding a user parameter.

Another question is whether we could do an in-place matrix transpose. Of course we'd need some performance numbers, but assuming higher RAM->CPU bandwidth than Disk->Ram, we should be able to do a matrix transpose on a lower timescale than reading the data. Assuming the performance numbers are reasonable, this seems like a very robust solution to me; no user parameters are required to do the right thing.

@germasch
Copy link
Contributor Author

@germasch : Would it be sensible to add a few words about this feature into the ADIOS2/docs/user_guide files? This seems like a very useful feature and I don't want it to be buried.

This PR adds a bit of documentation, on the order parameter. I didn't feel the need to write too much explaining the new behavior, since it's really what I'd consider the natural behavior in the first place, ie, you access the data in the same index order that you used when writing it.

Why don't we add the ordering as a metadata parameter in bp3 and bp4? Then we simply interpret the data in the correct order without adding a user parameter.

Well, it's difficult to change existing file formats, though I did suggest to add the information into the metadata in future incarnations. For now the info is on ordering is global per file, though one probably could improve on that in a backwards compatible way.

Another question is whether we could do an in-place matrix transpose. Of course we'd need some performance numbers, but assuming higher RAM->CPU bandwidth than Disk->Ram, we should be able to do a matrix transpose on a lower timescale than reading the data. Assuming the performance numbers are reasonable, this seems like a very robust solution to me; no user parameters are required to do the right thing.

I think one can argue either way on this. People here reasonably say they want to focus on performance, and obviously a transpose op may not be cheap, in particular, if it happens on data which is already in memory (e.g., because it's using one of in-memory engines). I don't think it'd necessarily be a bad idea, as long as the user explicitly says "I know what I'm doing". But modern languages like python or c++ (and in some sense even C) have the ability to handle different layouts on the fly, and I'd say that's preferable if available, so that's the low-hanging fruit I'm going for (specifically, I'm thinking from the point of view of my codes, and what I need for them.)

@NAThompson
Copy link
Contributor

Ok, so as I understand it:

  • The abstraction of .bp3 as being self-describing is broken, because array ordering is ill-defined. There is currently no mitigation.
  • This PR accepts that the abstraction is broken, but provides mitigation. As such, it is a clear improvement over the status quo, and should be accepted if no other solutions can be provided on a reasonable timescale.

However, let me float an alternative: Fortran writes and reads arrays in C ordering. In this case, we have the context that we are about the perform an expensive disk operation (though the in-memory stores are an important caveat as you mention), and hence the question is whether this operation is expensive relative to the disk ops. I suspect it's not (but haven't written a benchmark yet, so grain of salt), but my tolerance for expensive operations is rather high, because in my view the truly expensive operations are the ones that are performed between your ears when an abstraction leaks.

Now, I don't want to make the perfect the enemy of the good, but the abstraction still leaks: The format is not self-describing. Hence every user is going to hit this same bug, and debug a couple days, and finally find your mitigation-which they'll appreciate! But better that they not hit it in the first place.

Finally, I'm not convinced that the total cost of Fortran performing the transpose is positive. It is amortized over the reads. For instance, your bug report shows you wrote Fortran, and then used Python for reading and subsequent analysis. This seems like a common pattern for many users. Hence if the data is simply written in C ordering, most of the operations are done correctly.

So @pnorbert @germasch @williamfgc is this a sensible solution?

@germasch
Copy link
Contributor Author

  • The abstraction of .bp3 as being self-describing is broken, because array ordering is ill-defined. There is currently no mitigation.

I don't think the file format is broken for the common case. That is, in my area of science, Fortran is clearly dominant, and always uses column-major. When writing from Fortran, it is correctly recorded that the data is column major.

Adios2 is also doing the right thing when writing row-major data from C/C++. However, there are people (plenty, I suspect, but definitely me), who write column-major data from C/C++, and unless they transpose the data themselves, there is no way to get it onto disk with the correct metadata. That's a separate issue to the one here (it can be addressed to some extent, and I'm planning to do that). I think when writing column-major data from Python, Python will do a transpose for you to row-major, so that's expensive, but what ends up on disk is correctly described.

What this PR addresses is the reading side, in particular, actually using the available information about the storage layout. When using a C/C++ tool like bpls on data written by Fortran, it will present the data with reversed index order, and hence transposed. This PR added a patch to bpls to just at least add a note about the presented data being transposed in this case, but I took that part out as requested.

So this PR only addresses the reading side of the Python HL interface where it uses the original layout information from the metadata to create the Python/numpy array with the original ordering, so that both row-major and column-major array show up in their natural order. On the read side with Python, a transpose is never required, because it supports arbitrary strided layouts. That is also true in C++ if one uses a multi-d array class that supports both data layouts (which I think are most or all of the popular ones). But currently on the C++ side, the information is not available (I'm planning to change that).

The only place where transposes really would be required is if writing both row-major and col-major into the same file, because there is no way to mark individual datasets with separate layouts, so one would have to settle on one or the other and transpose the other data. I'm not sure that's a real life issue, so I don't really have any plans about those. Everything else can be addressed within the current framework, and I'm planning to do that -- this is the first step.

Finally, I'm not convinced that the total cost of Fortran performing the transpose is positive. It is amortized over the reads. For instance, your bug report shows you wrote Fortran, and then used Python for reading and subsequent analysis. This seems like a common pattern for many users. Hence if the data is simply written in C ordering, most of the operations are done correctly.

Fortran would only have to perform a transpose if reading data that was originally in row-major format (and where the code can't instead just reverse the indices). I suspect that's a very uncommon case. The issue is much more about generic visualization / analysis tools which tend to be written in C/C++/Python, and which will work better if the index order is not reversed when reading column-major data.

So @pnorbert @germasch @williamfgc is this a sensible solution?

@pnorbert
Copy link
Contributor

Kai, can you describe the current state of this PR? I am trying to run the tutorial's heat2d example, Fortran output, python reader. The script will see the array in one ordering and make the selection in that, but the underlying C++ code will complain about out of bounds in the other ordering (it is confusing to me which sees in which order). Then, if I try to add adios2.file.read_order = 'C' to the script, I get a complaint about adios2.file not defined. So how am I supposed to make this example work?

adios@adiosVM:~/work/adiosvm/Tutorial/heat2d/fortran$ mpirun -n 12 simulation/heatSimulation_adios2 heat.bp  4 3    256 256   6 500
Process decomposition  : 4 x 3
Array size per process : 256 x 256
Number of output steps : 6
Iterations per step    : 500
Simulation step    0: initialization
Using BP4Writer       engine for output
Simulation step    1
Simulation step    2
Simulation step    3
Simulation step    4
Simulation step    5
Total runtime =     3.952s

adios@adiosVM:~/work/adiosvm/Tutorial/heat2d/fortran$ mpirun -n 1 analysis/heatAnalysis_adios2_file  heat.bp
 Input file: heat.bp
 Using BP4Reader engine for input
 Global array size: 1024x768
 Available steps      = 6
 ...

adios@adiosVM:~/work/adiosvm/Tutorial/heat2d/fortran$ bpls -l heat.bp/
  double   T     6*{768, 1024} = 0 / 200

adios@adiosVM:~/work/adiosvm/Tutorial/heat2d/fortran$ ../python/heat_plot.py -i heat.bpAvailable variables: 
variable_name: T
	SingleValue: false
	AvailableStepsCount: 6
	Shape: 768, 1024
	Max: 200
	Type: double
	Min: 0

Traceback (most recent call last):
  File "../python/heat_plot.py", line 93, in <module>
    data = fr.read(args.varname, start, size)
ValueError: ERROR: selection Start Dims(2):[0, 0] and Count Dims(2):[1024, 768] (requested) is out of bounds of (available) Shape Dims(2):[768, 1024] , when reading global array variable T, in call to Get

After adding adios2.file.read_order = 'C':

adios@adiosVM:~/work/adiosvm/Tutorial/heat2d/fortran$ ../python/heat_plot.py -i heat.bp
Traceback (most recent call last):
  File "../python/heat_plot.py", line 82, in <module>
    adios2.file.read_order = 'C'
AttributeError: module 'adios2' has no attribute 'file'

The shape of the variable in Python is calculated from available_variables()

        var = fp.available_variables()
        print("Available variables: ")
        for name, info in var.items():
            print("variable_name: " + name)
            for key, value in info.items():
                print("\t" + key + ": " + value)
            print("\n")
    
        dshape = var[args.varname]['Shape'].split(',')
        for i in range(len(dshape)):
            datashape[i] = int(dshape[i])

So maybe you did not take care of this function? How do you know what is the shape of the variable before you read?

@germasch
Copy link
Contributor Author

Kai, can you describe the current state of this PR?

Well, I was thinking it was ready to be merged, but thanks to your review, there's some work to be done now ;)

I am trying to run the tutorial's heat2d example, Fortran output, python reader. The script will see the array in one ordering and make the selection in that, but the underlying C++ code will complain about out of bounds in the other ordering (it is confusing to me which sees in which order). Then, if I try to add adios2.file.read_order = 'C' to the script, I get a complaint about adios2.file not defined. So how am I supposed to make this example work?

I made a typo in the description above, it should be adios2.File.read_order -- the File class is pre-existing, and it's upper-case. So using that a least you should easily get the original behavior back.

So ideally, the Python interface (and really any interface) should stick with just one index order. I knew that I didn't change anything about writing yet (which I think is relatively orthogonal), but I missed the shape in available_variables(). I did otherwise hopefully catch all the other start, count, etc instances and made those consistent with the ordering presented. I will fix that, and hopefully there's not too much more I missed.

So maybe you did not take care of this function? How do you know what is the shape of the variable before you read?

Well, I guess I haven't hit the case where I need to know the shape beforehand (because I already generally know things about my simulation in my use). But FWIW what h5py does use is
var = file["/my/var"]; print(var.shape), which is actually a very nice way of doing it. (And while it looks like that'd read the whole dataset, it actually doesn't.)

@chuckatkins
Copy link
Contributor

chuckatkins commented Aug 27, 2019

Over all, super great to have this feature, so thanks.

I know it's not the most popular opinion and I may be dismissed by others here but I would strongly be in favor of using the actual data layout in the public API, i.e. RowMajor ColMajor, and move away from a language based label, i.e. C or Fortran, as much as possible. Fortran is pretty much always ColMajor but equating C ordering to RowMajor as a nomenclature can be ambiguous and sometimes just wrong. It's certainly common practice to equate the two but especially in the world of HPC with the intermixing of C, C++, Fortran, Python, etc., not to mention GPU data layouts, "C" based codes often intertwine their data ordering and use ColMajor, RowMajor, or both

I do really like having the data layout known in the variable. Maybe I'm not seeing it but what's the use case for Unknown and Original in the ::Layout enum? It's never really unknown since adios has to use something to base it's index calculations off of.

@pnorbert
Copy link
Contributor

pnorbert commented Aug 27, 2019 via email

@germasch
Copy link
Contributor Author

I know it's not the most popular opinion and I may be dismissed by others here but I would strongly be in favor of using the actual data layout in the public API, i.e. RowMajor ColMajor, and move away from a language based label, i.e. C or Fortran, as much as possible. Fortran is pretty much always ColMajor but equating C ordering to RowMajor as a nomenclature can be ambiguous and sometimes just wrong. It's certainly common practice to equate the two but especially in the world of HPC with the intermixing of C, C++, Fortran, Python, etc., not to mention GPU data layouts, "C" based codes often intertwine their data ordering and use ColMajor, RowMajor, or both

Agree 100%, exactly my point.

I do really like having the data layout known in the variable. Maybe I'm not seeing it but what's the use case for Unknown and Original in the ::Layout enum? It's never really unknown since adios has to use something to base it's index calculations off of.

Original is for the API only, so one can see, e.g., in python, "I want the array in its original order" (though in python that's done by saying order='K', to be consistent with numpy.

Unknown is meant to be as the answer to GetOriginalLayout() when it's not known. BP3/BP4 (kinda) know the answer. But HDF5 really doesn't know whether the data is row-major, because it was written by C, or if it just pretends to be row-major because it was originally col-major but the user reversed the dims when writing the Fortran data. The actual layout, of course, is known in HDF5, it's always row-major. I guess one can argue whether one should just return RowMajor for the original layout, too. For now, Unknown is also used as a way to ease the transition, ie, it's not necessary to touch every engine, as Unknown is used as default.

@germasch
Copy link
Contributor Author

@pnorbert, can you try whether the latest commit fixes your shape-related issue?

@germasch germasch force-pushed the pr/layout branch 2 times, most recently from 6515c95 to 71fcf29 Compare August 27, 2019 17:54
@pnorbert pnorbert changed the title return fortran arrays in original order in python high-level interface WIP: return fortran arrays in original order in python high-level interface Aug 27, 2019
Only works for BP3 and BP4 for now, will default to Layout::Unknown
otherwise. This function is not exposed in the public API as of now, 
it'll only be used by the python interface and bpls for now.
An additional, optional, `order` argument to read will determine how
the data that is being read is represented as a numpy array. Options
are 'C', 'F', 'K', which are row-major, column-major and original layout,
respectively. If original layout is not known, 'C' will be used, which
is previous behavior.

If the order argument is not provided, the read will default to
`adios2.File.read_order`, which itself defaults to 'K', but user code
can override that default.
This is kinda tricky -- the goal is the user is able to work with the
data without knowing it's original storage order, and this achieves
that, though it requires a custom stride calculation.
This piece was missing to provide apps a consistent view of a variable
in its original data layout / ordering.

Adds a test for it, too. The implementation isn't pretty...
@chuckatkins
Copy link
Contributor

@germasch, The functionality and features implemented in here are great. Given it's significance to the API though it's really something that needs to be implemented within the core and then propagated to all the APIs bindings rather than being implemented just within the python bindings. While there's currently not complete feature parity across all the APIs, there is for the most part. A change like this in particular though would need to be available throughout. Given the disruption of such a change internally and the change to all the public APIs we've decided to push such an implementation to after the end-of-september release. We do have developer resources on the team to put towards implementing it though but the timeline will be pushed back a bit.

@germasch
Copy link
Contributor Author

Actually, I don't mind dropping this PR for the time being. Maybe not quite for the same reason -- the core functionality needed for the aspect covered here is a very minor addition, but I agree that it leave other aspects to-be-done for now, like HL writing and the low-level Python interface entirely. I wouldn't think having this only available in Python to start is too bad, since Python is the only language that natively supports different layouts. In fact, I think one can learn from it what a modern API for the other bindings might look like.

Anyway, the major reason I agree that this PR may not be the best idea is that it either

  • introduces an incompatible change in behavior (with the current default of 'K')
  • or, if one were to change the default to 'C', compatibility would be maintained, but it'd also be very unlikely to be of much benefit, except for experts that know to change the default.

Not really directly related, but the python HL API is not very pythonic at all. So there is an opportunity to introduce a more pythonic API and at the same time introduce new, more natural handling of layouts with that.

@germasch germasch closed this Aug 29, 2019
@williamfgc
Copy link
Contributor

Not really directly related, but the python HL API is not very pythonic at all.

@germasch feel free to propose more pythonic interfaces. Personally, I am not a Python experienced user. Most of the API design came from balancing the input from Python practitioners and what pybind11 v2.0 could support at the time for Python 2 and 3, which feels ancient today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reading Fortran data using python interface
5 participants