Error reading STL vectors inside TClonesArrays #374
Comments
Quick question: is this reproducer an example of an "in-principle problem" or a simplified version of a problem you're actually facing in physics analysis? The reason I ask is because uproot's main mode of use is on simple data, in which the processing can be entirely in Numpy and it can be quick. Some complex data structures are handled: the infrastructure is available for any complex data, but many of these structures defy the general rule and have to be handled specially. (There's already special logic in there to handle (Maintaining uproot is something I do in addition to my normal work, so I have to find a balance somewhere.)
This is for my day-to-day work, yes. In this case I'm trying to read the track parameters and the reco hits from which this track came to see if the track fit has worked correctly. Then I need the residuals to perform software alignment for our sensor modules. I understand that this is not a trivial task and I really appreciate all the work you're doing. If you see an easy way to access the
I've looked into the vector-in-TClonesArray and it's surprising: the

import uproot, struct, numpy

tree = uproot.open("issue374.root")["asdf"]
branch = tree["myarr.foobars"]

class B(uproot.rootio.ROOTStreamedObject):
    _n_format = struct.Struct(">i")
    _foo_dtype = numpy.dtype(">f8")
    _bar_dtype = numpy.dtype(">i4")

    @classmethod
    def _readinto(cls, self, source, cursor, context, parent):
        start, cnt, self._classversion = uproot.rootio._startcheck(source, cursor)
        cursor.skip(6)                                   # the std::vector header
        self.n = cursor.field(source, self._n_format)    # the std::vector size
        self.foo = cursor.array(source, self.n, self._foo_dtype)
        self.bar = cursor.array(source, self.n, self._bar_dtype)
        uproot.rootio._endcheck(start, cursor, cnt)
        return self

    def __repr__(self):
        return "<B n={0} foo={1} bar={2}>".format(
            self.n, self.foo.tolist(), self.bar.tolist())

interp = uproot.asgenobj(B, branch._context, skipbytes=0)
branch.array(interp)

The output of the last line is
(for a file that I made with three

One aspect of this serialization that I've used (and didn't have to) is the fact that it gives a cross-check on the object size. This is common for objects outside of TTrees, so I'm using the

class B(uproot.rootio.ROOTStreamedObject):
    _n_format = struct.Struct(">i")
    _foo_dtype = numpy.dtype(">f8")
    _bar_dtype = numpy.dtype(">i4")

    @classmethod
    def _readinto(cls, self, source, cursor, context, parent):
        self.n = cursor.field(source, self._n_format)
        self.foo = cursor.array(source, self.n, self._foo_dtype)
        self.bar = cursor.array(source, self.n, self._bar_dtype)
        return self

    def __repr__(self):
        return "<B n={0} foo={1} bar={2}>".format(
            self.n, self.foo.tolist(), self.bar.tolist())

interp = uproot.asgenobj(B, branch._context, skipbytes=12)

This gives the same output, does less checking, and may be a little bit faster.

I know that your real data doesn't have

If you want to turn this into a jagged array, see the

Good luck!
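To make the byte layout in the two versions above easier to follow, here is a minimal, standalone sketch that does not use uproot at all. It assumes each record is laid out as the comment describes: 12 bytes of headers (the 6 bytes consumed by the start check plus the 6-byte std::vector header, which is what `skipbytes=12` steps over), then a big-endian 32-bit size `n`, then `n` doubles and `n` 32-bit integers. The helper names `make_record` and `read_record`, the zero-filled header, and the sample values are all hypothetical; a real ROOT file stores meaningful header bytes there. The last few lines show one generic way to flatten the per-record lists into content plus offsets, which is the usual shape of a jagged array; it is not necessarily the method the comment refers to.

```python
import struct
from itertools import accumulate

# Assumption taken from the discussion above: 6 bytes (byte count + version)
# plus 6 bytes of std::vector header precede the vector size.
HEADER_BYTES = 12


def make_record(foo, bar):
    """Build one synthetic big-endian record: header, n, foo[n], bar[n]."""
    n = len(foo)
    assert len(bar) == n, "foo and bar must have the same length"
    return (
        b"\x00" * HEADER_BYTES              # placeholder header bytes (hypothetical)
        + struct.pack(">i", n)              # vector size, like struct.Struct(">i")
        + struct.pack(">%dd" % n, *foo)     # n doubles, like numpy.dtype(">f8")
        + struct.pack(">%di" % n, *bar)     # n int32s, like numpy.dtype(">i4")
    )


def read_record(buf, pos):
    """Read one record starting at byte offset pos; return (foo, bar, new_pos)."""
    pos += HEADER_BYTES                     # plays the role of skipbytes=12
    (n,) = struct.unpack_from(">i", buf, pos)
    pos += 4
    foo = list(struct.unpack_from(">%dd" % n, buf, pos))
    pos += 8 * n
    bar = list(struct.unpack_from(">%di" % n, buf, pos))
    pos += 4 * n
    return foo, bar, pos


# Three synthetic records of different lengths, concatenated into one buffer.
data = (
    make_record([1.5, 2.5], [1, 2])
    + make_record([3.5], [3])
    + make_record([4.5, 5.5, 6.5], [4, 5, 6])
)

records = []
pos = 0
while pos < len(data):
    foo, bar, pos = read_record(data, pos)
    records.append((foo, bar))

# Flatten into content + offsets: record i owns content[offsets[i]:offsets[i + 1]].
counts = [len(foo) for foo, _ in records]
offsets = [0] + list(accumulate(counts))
foo_content = [x for foo, _ in records for x in foo]
bar_content = [x for _, bar in records for x in bar]

print(offsets)
print(foo_content)
print(bar_content)
```

The first version of class B corresponds to reading the first 6 header bytes explicitly and checking the byte count afterwards; the second simply skips all 12 bytes, which is why it does less checking.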
As it turns out, the error above is because I was unaware of ROOT's "memberwise splitting," and (if I said anything to the contrary above), it has nothing to do with Boost serialization. This same error came up in 6 different issues, so further discussion on it will be consolidated into scikit-hep/uproot5#38. (This comment is a form message I'm writing on all 6 issues.) As of PR scikit-hep/uproot5#87, we can now detect such cases, so at least we'll raise a
Hi,
this issue is an addition or refinement to #373, but we think we found the root cause. It appears uproot crashes when it tries to access a vector inside a struct/class which was written to a TClonesArray. My colleague wrote a minimal example using only ROOT and uproot, with no Boost at all.
rootFileTest.C:
readTree.py:
Compile and run with:
The file output.root is now created and can be read with ROOT's TBrowser.
However, uproot cannot read this branch: