Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in conversion of TH2D with fSumW2 to boost-histogram (uproot4 0.1.2) #198

Closed
HDembinski opened this issue Nov 20, 2020 · 7 comments · Fixed by #196
Closed

Bug in conversion of TH2D with fSumW2 to boost-histogram (uproot4 0.1.2) #198

HDembinski opened this issue Nov 20, 2020 · 7 comments · Fixed by #196
Labels
bug The problem described is something that must be fixed

Comments

@HDembinski
Copy link
Member

I found a bug in the conversion from a TH2D with fSumW2 to a boost-histogram. The variance gets incorrectly transposed. Demo code:

import ROOT
import uproot4

h = ROOT.TH2D("h", "", 2, 0, 1, 2, 0, 1)
h.SetBinContent(1, 1, 1)
h.SetBinContent(1, 2, 2)
h.SetBinContent(2, 1, 3)
h.SetBinContent(2, 2, 4)
h.SetBinError(1, 1, 1)
h.SetBinError(1, 2, 2)
h.SetBinError(2, 1, 3)
h.SetBinError(2, 2, 4)

f = ROOT.TFile.Open("foo.root", "recreate")
h.Write()
f.Close()

with uproot4.open("foo.root") as f:
    h = f["h"]

hb = h.to_boost()

val, err = h.values_errors()
print("values")
print(val)
print(hb.view(True).value)

print("errors")
print(err)
print(hb.view(True).variance ** 0.5)

Returns

values
[[0. 0. 0. 0.]
 [0. 1. 2. 0.]
 [0. 3. 4. 0.]
 [0. 0. 0. 0.]]
[[0. 0. 0. 0.]
 [0. 1. 2. 0.]
 [0. 3. 4. 0.]
 [0. 0. 0. 0.]]
errors
[[0. 0. 0. 0.]
 [0. 1. 2. 0.]
 [0. 3. 4. 0.]
 [0. 0. 0. 0.]]
[[0. 0. 0. 0.]
 [0. 1. 3. 0.]
 [0. 2. 4. 0.]
 [0. 0. 0. 0.]]

The values are correct, but in the errors 2 and 3 are flipped.

@HDembinski
Copy link
Member Author

It is not a simple transpose, of course, the displacement gets more bizarre if h is not a square matrix.

@jpivarski jpivarski added the bug (unverified) The problem described would be a bug, but needs to be triaged label Nov 20, 2020
@jpivarski jpivarski added bug The problem described is something that must be fixed and removed bug (unverified) The problem described would be a bug, but needs to be triaged labels Nov 20, 2020
@jpivarski jpivarski linked a pull request Nov 20, 2020 that will close this issue
@jpivarski
Copy link
Member

I'm working on the histogram interface right now (PR #196), so I wasn't sure that this would still be true. Here's my reproducer (using a non-square histogram—unfortunately, all the examples in scikit-hep-testdata are square):

>>> import ROOT
>>> h = ROOT.TH2D("h", "", 5, -10, 10, 7, -28, 28)
>>> h.SetBinContent(1, 1, 1)
>>> h.SetBinContent(1, 2, 2)
>>> h.SetBinContent(2, 1, 3)
>>> h.SetBinContent(2, 2, 4)
>>> h.SetBinError(1, 1, 1)
>>> h.SetBinError(1, 2, 2)
>>> h.SetBinError(2, 1, 3)
>>> h.SetBinError(2, 2, 4)
>>> f = ROOT.TFile.Open("issue-198.root", "recreate")
>>> h.Write()
302
>>> f.Close()

Reading it back in Uproot (PR #196, flow=False is now the default):

>>> import uproot4
>>> h = uproot4.open("issue-198.root:h")
>>> h.values()
array([[1., 2., 0., 0., 0., 0., 0.],
       [3., 4., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])
>>> h.errors()
array([[1., 2., 0., 0., 0., 0., 0.],
       [3., 4., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])
>>> hb = h.to_boost()
>>> hb.view().value
array([[1., 2., 0., 0., 0., 0., 0.],
       [3., 4., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])
>>> hb.view().variance**0.5
array([[0., 0., 0., 0., 0., 2., 4.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])

Yup. There it is. This is probably related to #93. After fixing this, I'll check across all the histogram types to be sure the transposition is applied generally.

jpivarski added a commit that referenced this issue Nov 20, 2020
@jpivarski
Copy link
Member

This commit, afe8940, results in:

>>> import uproot4
>>> h = uproot4.open("issue-198.root:h")
>>> h.values()
array([[1., 2., 0., 0., 0., 0., 0.],
       [3., 4., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])
>>> h.errors()
array([[1., 2., 0., 0., 0., 0., 0.],
       [3., 4., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])
>>> hb = h.to_boost()
>>> hb.view().value
array([[1., 2., 0., 0., 0., 0., 0.],
       [3., 4., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])
>>> hb.view().variance**0.5
array([[1., 2., 0., 0., 0., 0., 0.],
       [3., 4., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0.]])

@jpivarski
Copy link
Member

I also made sure that all output arrays are in native endianness. (I think boost-histogram required that, but it's good for values and errors to return the same thing.)

@HDembinski
Copy link
Member Author

Thank you for the immediate fix! Yes, we require native endianness, but so does ROOT itself, no? The memory buffer in a TH* is always in native endianness.

Also good to hear that flow=False is the default.

@HDembinski
Copy link
Member Author

Should I try out the develop branch to see whether this fixed my problem before we close this?

@jpivarski
Copy link
Member

This and #167 are both linked to #196, so they'll be closed when that gets merged. I'm attempting to get the histogram interface finalized—it would be a generalization of what you and @henryiii are converging to (i.e. additional methods and arguments are allowed).

You can try out the development branch, but be aware that the interface has changed in it.

jpivarski added a commit that referenced this issue Nov 25, 2020
* Move 'edges' to the TAxis and provide accessors to it.

* Uniform implementation of values, errors, variances, and 2-tuple functions.

* I've gone as far as I can without understanding WeightedMean Storage.

* Fixed #198.

* Implemented all the changes @henryiii and I discussed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The problem described is something that must be fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants