Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: HistStack for #169 #244

Merged
merged 20 commits into from
Jul 7, 2021
Merged

feat: HistStack for #169 #244

merged 20 commits into from
Jul 7, 2021

Conversation

LovelyBuggies
Copy link
Collaborator

@LovelyBuggies LovelyBuggies commented Jun 28, 2021

To solve #169.

@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Jun 29, 2021

Sourcery Code Quality Report

❌  Merging this PR will decrease code quality in the affected files by 0.05%.

Quality metrics Before After Change
Complexity 0.67 ⭐ 0.67 ⭐ 0.00
Method Length 27.00 ⭐ 28.00 ⭐ 1.00 👎
Working memory 3.89 ⭐ 3.89 ⭐ 0.00
Quality 92.16% 92.11% -0.05% 👎
Other metrics Before After Change
Lines 38 40 2
Changed files Quality Before Quality After Quality Change
src/hist/init.py 92.16% ⭐ 92.11% ⭐ -0.05% 👎

Here are some functions in these files that still need a tune-up:

File Function Complexity Length Working Memory Quality Recommendation

Legend and Explanation

The emojis denote the absolute quality of the code:

  • ⭐ excellent
  • 🙂 good
  • 😞 poor
  • ⛔ very poor

The 👍 and 👎 indicate whether the quality has improved or gotten worse with this pull request.


Please see our documentation here for details on how these metrics are calculated.

We are actively working on this report - lots more documentation and extra metrics to come!

Help us improve this quality report!

@LovelyBuggies
Copy link
Collaborator Author

Maybe with more plotting options here. Any suggestions?

@LovelyBuggies LovelyBuggies marked this pull request as ready for review June 29, 2021 03:28
@LovelyBuggies LovelyBuggies force-pushed the nino/feat/stack-hist branch from 1613223 to 5851c1e Compare June 30, 2021 02:56
@henryiii
Copy link
Member

henryiii commented Jul 1, 2021

h =  Hist.new.Reg(..., name="x").StrCat(["one", "two"] name="y").Double()

s = h.stack("y")
s = Stack({"one": h[:, "one"], "two":h[:, "two"]})

# Another way?
h1 = h[:, "one"]
h1.name = "one"
h2 = h[:, "two"]
h2.name = "two"
s = Stack(h1, h2)

h.plot() # -> shortcut for h.stack("y").plot()

Thought: Could we above a initial memory copy for stacking an existing histogram? May not be better, due to making copy every access?

self._data = h
self[x] # -> pull x from histogram _OR_ list/dict?

Copy link
Collaborator Author

@LovelyBuggies LovelyBuggies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need some tests for Stack(axes) and Hist.stack(name/idx)

Done.

src/hist/basehist.py Outdated Show resolved Hide resolved

if isinstance(args[0], int):
stack_args = [self.axes[i] for i in args]
return Stack(*stack_args)
Copy link
Member

@henryiii henryiii Jul 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should only support one arg, which is the axis to stack over. Multiple args would make a ND stack, which we don't want to support, at least for now.

The argument should be the axis to stack over, which should be expanded. For example:

stack_histograms = self[{ax:i}] for i in range(len(self.axis[ax]))]
return Stack(*stack_histograms)

Though I think we also need to keep track of the bin this came from too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the ax in this example?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my case, it was the integer axis number to expand. Though if it was a string axis name, since that is supported natively in both h[{...}] and h.axis[...], that should work too. I didn't expand the dict correctly above, though.

Copy link
Member

@henryiii henryiii Jul 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the example above - it now no longer "remembers" the cell name, though. Ideally, each histogram would also have at least "h.name" set to the original cell "name". Maybe:

for name, h in zip(self.axes[ax], stack_histograms)
    h.name = name

Copy link
Member

@henryiii henryiii Jul 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If self.axes[i] is a StrCategory, which is most common, then h.name will be the category that this histogram came from. If it's an integer, then name is an integer - slightly weird, but not too bad. If it's a Regular/Variable, then "name" is a tuple with two floats, which is quite weird for something called "name". But I think it's a reasonable tradeoff - having the value is still useful.

@henryiii henryiii force-pushed the nino/feat/stack-hist branch from d5c4c60 to 337648e Compare July 7, 2021 01:21
@henryiii
Copy link
Member

henryiii commented Jul 7, 2021

Quick summary of changes: axes are not stackable, only histograms. Added a little bit of container-like support so stack[0] gets the 0'th histogram in the stack. Added .name which gets set from the category (but currently unused by mplhep or us). Did a little parametrization of the tests (more could be done eventually). I'm reserving overlay= because I forgot what it was supposed to do. I also used stack.py to match Stack, and used self.__class__.__name__ in the repr.

@henryiii henryiii merged commit 5d9edac into main Jul 7, 2021
@henryiii henryiii deleted the nino/feat/stack-hist branch July 7, 2021 01:31
@henryiii
Copy link
Member

henryiii commented Jul 7, 2021

Thanks everyone!

@LovelyBuggies
Copy link
Collaborator Author

LovelyBuggies commented Jul 7, 2021

Added .name which gets set from the category (but currently unused by mplhep or us).

How to use .name? And h.stack() is all forbidden? There's no passed tests for .stack.

@henryiii
Copy link
Member

henryiii commented Jul 7, 2021

I thought there were some. This is what it looks like:

h = hist.Hist.new.Reg(10,0,1, name="x").StrCategory(["good", "bad"], name="quality").Double()
# Fill with data tagged with quality="good" or "bad"
stack = h.stack("quality")
assert stack[0].name == "good"
assert stack[1].name == "bad"

Ideally, calling stack.plot(); plt.legend() would produce a legend that labels these with "good" and "bad", but that doesn't happen yet, might need to be added to mplhep. Unless there's a labels keyword or something, which I didn't check for.

Could you add a test like the one above, actually? I'm working on three other features.

@henryiii
Copy link
Member

henryiii commented Jul 7, 2021

Oops, forgot, StrCat. Allowing both StrCat and StrCategory is allowed on the branch I'm working on - and accidentally passing the quick constructor to the Hist as an axis is allowed, too!

@LovelyBuggies
Copy link
Collaborator Author

FYI,

image

@henryiii
Copy link
Member

henryiii commented Jul 7, 2021

This is why we write tests! 😳

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants