This repository has been archived by the owner on Jun 21, 2022. It is now read-only.

How can I follow a TRef? #324

Closed
HenryDayHall opened this issue Aug 30, 2019 · 6 comments

@HenryDayHall

Apologies for the naive question, but I cannot seem to grasp how to follow a TRef to find the component it points to.

For example, I have a ROOT file created with Delphes (a detector simulator). It has particles (MC truth objects), plus tracks and towers (observations from the simulated detector). The tracks and towers keep a TRef that points to the particle that created them. I want to follow this TRef and learn the index of the particle that made them. Getting the particle itself would probably be enough to figure out its index (with some fuzzy comparisons).

The only method I can see on the TRef object is tref.read(source, cursor, context, parent) which has no doc-string available.

How should I follow a TRef in uproot?

@jpivarski
Member

Actually, I've never thought about how to follow a TRef, so in that sense, the feature does not exist in uproot. However, if you give me the file and point out where the TRef is that you're interested in, I'll see if it can easily be added. I'm sure the TRef contains a number, but the question is how to interpret that number. It would also help if you could tell me what that TRef is supposed to point to, which would help me figure out what relationship that number has with the object it references.

@HenryDayHall
Author

HenryDayHall commented Aug 30, 2019

Wow, that would be very kind of you.
https://mega.nz/#F!afB2SSwC!3C8bp5xY_d01VvXmkh1nJQ
So there are two files on the other end of the link (hosted elsewhere because I cannot upload .root files here).

Inside the ROOT file there are TRefs at "Delphes"->"Track"->"Track.Particle".
These should point to parts of "Delphes"->"Particle", indicating which particle created each track.
For example, in event 0, track 0 points to particle 751.

event_num | track_num | particle_num
0 | 0 | 751
0 | 1 | 765
0 | 2 | 779
0 | 3 | 782
0 | 4 | 797

The other file is an SQLite database that I created from the ROOT file, because I couldn't find anything that could read TRefs directly in Python 3. I used ExRootAnalysis (part of Delphes) in C++ to make it. The database may be a bit imperfect because it was constructed with a lot of fuzzy equalities, but it should be 99.9% accurate. I'm including it because there are lots of GUI SQL browsers that you could use if you wanted a quick look at the relations.

@jpivarski
Member

From your file, I implemented TRef as an object that can live in TTrees. The patch is in PR #326.

Here's an example of how to use them:

import uproot
t = uproot.open("issue324.root")["Delphes"]
refs = t["Track.Particle"].array()
refs
# <JaggedArray [
#     [<TRef 752> <TRef 766> <TRef 780> ... <TRef 1813> <TRef 1367> <TRef 1666>]
#     ...
#     [<TRef 745> <TRef 762> <TRef 783> ... <TRef 1863> <TRef 1713> <TRef 1717>]]>

These TRef objects each have an id:

refs[0][0].id
# 752

and following the normal rules of jagged arrays, you can ask for all of those ids in a single swipe:

refs.id
# <JaggedArray [
#      [752 766 780 ... 1813 1367 1666]
#      ...
#      [745 762 783 ... 1863 1713 1717]]>

Also following the normal rules of jagged arrays, you can use them to index a particle attribute and pick out the particles they refer to, except that these references start at 1 instead of 0, so you have to subtract 1 first.

pt = t["Particle.PT"].array()
pt[refs.id - 1]
# <JaggedArray [
#      [0.7637838 1.1044897 5.463864 ... 4.252923 1.9702696 9.213475]
#      ...
#      [1.2523094 0.37887865 0.7390242 ... 1.0288503 3.4785874 1.804613]]>

We can see this by looking at the particles' fUniqueID:

t["Particle.fUniqueID"].array()
# <JaggedArray [
#      [1 2 3 ... 1811 1812 1813]
#      ...
#      [1 2 3 ... 1871 1872 1873]]>

and verify that they are strictly off-by-one (in each event, for all events):

(t["Particle.fUniqueID"].array() - 1 ==
 t["Particle.fUniqueID"].array().localindex
 ).all().all()
# True

I don't know if it is always true that the fUniqueIDs are strictly counting upward, starting at 1. In the general case, you'd have to inverse the lookup (i.e. "for which particle is the ID equal to REF?"), but as long as you're in this case, you can do forward indexing (i.e. "get the particle at position REF - 1").
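For the general case described above, a minimal sketch of the inverted lookup in plain Python might look like the following. It assumes the per-event fUniqueID values and TRef ids have already been converted to nested Python lists; the function name refs_to_indices and the sample numbers are made up for illustration and are not part of uproot.

```python
def refs_to_indices(unique_ids, ref_ids):
    """For each event, map every TRef id to the position of the particle
    whose fUniqueID equals it (None if no particle matches)."""
    result = []
    for event_uids, event_refs in zip(unique_ids, ref_ids):
        # Inverted lookup table for this event: fUniqueID -> position.
        position_of = {uid: i for i, uid in enumerate(event_uids)}
        result.append([position_of.get(ref) for ref in event_refs])
    return result


# Hypothetical data: two events whose fUniqueIDs are not simply 1, 2, 3, ...
unique_ids = [[5, 9, 12, 40], [3, 4, 8]]
ref_ids = [[12, 5], [8, 8, 3]]

indices = refs_to_indices(unique_ids, ref_ids)
print(indices)  # [[2, 0], [2, 2, 0]]
```

When the fUniqueIDs do count upward from 1, this gives the same positions as REF - 1, so it can also serve as a cross-check on the forward-indexing shortcut.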

@jpivarski
Member

This is now uproot 3.9.1.

@cshimmin

Hi,
Sorry for necromancing but I just wanted to address your comment here, since I was stuck on it for a bit and it may be useful to someone else:

> I don't know if it is always true that the fUniqueIDs are strictly counting upward, starting at 1. In the general case, you'd have to inverse the lookup (i.e. "for which particle is the ID equal to REF?"), but as long as you're in this case, you can do forward indexing (i.e. "get the particle at position REF - 1").

It is not always the case; in fact, it's just a lucky coincidence that is specific to a particular configuration of Delphes. In general, TObjects are initialized with fUniqueID=-1. Any time a TObject is added to a TRefArray, it is assigned an fUniqueID, unless it already has a nonnegative one. ROOT maintains a static counter via TProcessID and by default increments the next fUniqueID every time a new one is issued. This is supposed to be unique at the process level, which is why a pid-dependent UID called "ProcessID0" is written to any ROOT file containing a TRefArray.

Delphes overrides this behavior by resetting the counter at the beginning of every event it processes. It also has a global Factory object that preemptively assigns fUniqueIDs to any objects created that could potentially be written to file (whether or not they are ultimately referenced by a TRefArray or written out).

The lucky coincidence is that in the default configuration, usually the references to particles are references to the event-level collection Delphes/stableParticles, which happen to be the very first objects instantiated by the DelphesFactory, and are stored in the same order they are created.

However, if you were to, say, filter out some of these particles before writing them out, their fUniqueIDs would have no particular relationship with their TObjectArray indices in the file.
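To make the filtering scenario concrete, here is a toy illustration in plain Python. The particle records, the pt cut, and the helper names forward_index and lookup_by_id are invented for this example; they only mimic the situation and do not come from Delphes or uproot.

```python
# Toy event: particles created in order, so fUniqueID == position + 1.
particles = [
    {"fUniqueID": 1, "pt": 0.4},
    {"fUniqueID": 2, "pt": 5.1},
    {"fUniqueID": 3, "pt": 0.2},
    {"fUniqueID": 4, "pt": 7.8},
]
track_ref = 4  # a track that points at the particle with fUniqueID 4

# Hypothetical filter applied before writing: keep only pt > 1.
kept = [p for p in particles if p["pt"] > 1.0]


def forward_index(collection, ref):
    """Return the particle at position REF - 1, or None if out of range."""
    pos = ref - 1
    return collection[pos] if 0 <= pos < len(collection) else None


def lookup_by_id(collection, ref):
    """Return the particle whose fUniqueID equals REF, or None."""
    return next((p for p in collection if p["fUniqueID"] == ref), None)


# Before filtering, both approaches agree.
print(forward_index(particles, track_ref) == lookup_by_id(particles, track_ref))  # True

# After filtering, forward indexing fails or silently picks the wrong particle.
print(forward_index(kept, track_ref))          # None: position 3 no longer exists
print(forward_index(kept, 2)["fUniqueID"])     # 4, not the particle that was asked for
print(lookup_by_id(kept, track_ref))           # {'fUniqueID': 4, 'pt': 7.8}
```

The second failure mode is the more dangerous one, because an in-range but wrong position produces no error at all.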

What's worse (and actually the genesis of issue #513) is that it's impossible to use this method to obtain a list of Tower objects from jets. Since Tower objects are created much later on during the event, they have large, random fUniqueID values, and something like Jet.Constituents will be a TRefArray containing those values. Instead, you need to go through the IDs in the TRefArray and pick out the corresponding objects from whatever array contains the object.
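The procedure in the last sentence could be sketched as follows for a single event, in plain Python. The tower ids, the energies, and the function name resolve_constituents are hypothetical; in a real file the ids would come from the TRefArray (such as Jet.Constituents) and from the fUniqueID values of the referenced collection.

```python
def resolve_constituents(jet_constituent_ids, tower_unique_ids):
    """Turn each jet's list of referenced ids into tower positions.

    Ids with no match in the tower collection are skipped; that choice is
    an assumption of this sketch, not something the file format dictates.
    """
    position_of = {uid: i for i, uid in enumerate(tower_unique_ids)}
    return [
        [position_of[ref] for ref in jet if ref in position_of]
        for jet in jet_constituent_ids
    ]


# Hypothetical single event: towers have large, non-contiguous fUniqueIDs.
tower_unique_ids = [20417, 20311, 20999, 20550]
tower_energy = [12.5, 3.1, 40.2, 7.7]

# Two jets; the made-up id 1234 does not match any tower in this collection.
jet_constituent_ids = [[20999, 20311], [20550, 1234, 20417]]

jet_tower_positions = resolve_constituents(jet_constituent_ids, tower_unique_ids)
print(jet_tower_positions)  # [[2, 1], [3, 0]]

# Use the resolved positions to sum the tower energy in each jet.
jet_energy = [sum(tower_energy[i] for i in jet) for jet in jet_tower_positions]
print(jet_energy)
```

Building the dictionary once per event keeps the lookup cheap even when a jet has many constituents.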

@jpivarski
Member

@cshimmin Thank you for writing this! Working on your issue got me thinking, "Maybe that's what those fUniqueIDs are for, after all: TRefs," and you confirmed it. I might need to formalize this connection with an Awkward IndexedArray, mapping TRef values to fUniqueID values, because otherwise I've been hiding the fUniqueIDs along with fBits. (Most of the time, users want this: half of the data in a TLorentzArray is in these two fields, but it only matters when those objects are being referred to by TRefs.)

Or maybe it should be an option (keep_uniqueids)? Something like that.


3 participants