Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use std::lower_bound to add/search for userFloats/Ints/Cands/KinResolutions (76X) #11574

Merged
merged 6 commits into from
Oct 7, 2015

Conversation

lgray
Copy link
Contributor

@lgray lgray commented Sep 30, 2015

Change PATObject so that it uses std::lower_bound to do the accounting of stored user information.

This should result in a very minor slow down in adding user data, and a huge speed up when retrieving it when there is a large number of user floats.
Also of note, previously if the same user Float/Int was added twice it would simply be appended to the end of the list. When retrieved a search from the front of the list was performed, so you'd only ever get the first value entered.

I've made it so that the original behavior is preserved but a complaint is generated if an overwrite is tried but not allowed (by boolean argument to the addUserFloat/Int() functions).

Also of note, the behavior of userFloat(key) and userInt(key) in the case that key does not exist has changed, both throw an exception if the key is not found.

This is a technical PR, no physics changes expected/observed in short matrix.

@lgray lgray changed the title Use std::lower_bound to add/search for userFloats/Ints/Cands/KinResolutions Use std::lower_bound to add/search for userFloats/Ints/Cands/KinResolutions (76X) Sep 30, 2015
@lgray
Copy link
Contributor Author

lgray commented Sep 30, 2015

@cmsbuild please test

@cmsbuild cmsbuild added this to the Next CMSSW_7_6_X milestone Sep 30, 2015
@cmsbuild
Copy link
Contributor

The tests are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @lgray (Lindsey Gray) for CMSSW_7_6_X.

Use std::lower_bound to add/search for userFloats/Ints/Cands/KinResolutions (76X)

It involves the following packages:

DataFormats/PatCandidates

@cmsbuild, @vadler, @monttj can you please review it and eventually sign? Thanks.
@gpetruc this is something you requested to watch as well.
You can sign-off by replying to this message having '+1' in the first line of your reply.
You can reject by replying to this message having '-1' in the first line of your reply.
If you are a L2 or a release manager you can ask for tests by saying 'please test' in the first line of a comment.
@Degano you are the release manager for this.
You can merge this pull request by typing 'merge' in the first line of your comment.

@lgray
Copy link
Contributor Author

lgray commented Sep 30, 2015

@slava77 @cvuosalo FYI

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

@lgray
Copy link
Contributor Author

lgray commented Oct 1, 2015

@monttj @slava77 I think I have a better implementation, I may update this PR in the near future after testing.

@gpetruc I think I have a nice way to remove the linear component of insert operations, so it will retain speed on adding variables as well.

@lgray
Copy link
Contributor Author

lgray commented Oct 1, 2015

@slava77 @monttj @gpetruc Disregard previous message about new impl. It is not faster in the end.

@slava77
Copy link
Contributor

slava77 commented Oct 1, 2015

Do we care much about faster insert?
I think faster reading is more essential.
On Oct 1, 2015 4:24 AM, "Lindsey Gray" [email protected] wrote:

@slava77 https://github.com/slava77 @monttj https://github.com/monttj
@gpetruc https://github.com/gpetruc Disregard previous message about
new impl. It is not faster in the end.


Reply to this email directly or view it on GitHub
#11574 (comment).

@lgray
Copy link
Contributor Author

lgray commented Oct 1, 2015

Fast read is guaranteed by this change. Was just trying to optimize as much as possible the writing too. At worst it is log + linear in the distance to end. Really not bad.

@lgray
Copy link
Contributor Author

lgray commented Oct 1, 2015

Lookup here is always log.

@gpetruc
Copy link
Contributor

gpetruc commented Oct 1, 2015

The only thing I'm not totally convinced is the change of the default for
the userInt and userFloat, since people's code may be using the current
return of zero as an indication of failure. So, if we do this change in 76X
it should be advertised.

Giovanni

On Thu, Oct 1, 2015 at 12:28 PM, Lindsey Gray [email protected]
wrote:

Lookup here is always log.


Reply to this email directly or view it on GitHub
#11574 (comment).

@lgray
Copy link
Contributor Author

lgray commented Oct 1, 2015

@gpetruc thanks for the comment. I think giving wild (but calculable) values is better indicative of failure than returning zero, this way you force most cuts to fail (specifically with the choice of NaN for floats), or if storing indices to arrays in an object you'll likely go out of bounds. This means no quiet failures in analysis code, which I think is helpful.

I agree that the change should be advertised, where should I be advertising this?

This could be back ported into 74X just requiring an iorule to sort the names and stored values. Should I prepare this? (Perhaps 74X version leaves 0's for returned values when no label available.)

@gpetruc
Copy link
Contributor

gpetruc commented Oct 1, 2015

Hi,

Depending on how the thing is used, quiet failures may still happen, e.g.
if people use the userFloats or userInt to cut away objects.
So, if we really want to be safe we should throw on missing entries, maybe
we should directly switch to that behvaiour (after all, there are other
getters to test whether the values are available or not, in case somebody
really mixes filled and un-filled objects in their code)

For 74X, I'd limit backports to fixes or features really needed for
analysis of miniAODs.

Giovanni

On Thu, Oct 1, 2015 at 12:46 PM, Lindsey Gray [email protected]
wrote:

@gpetruc https://github.com/gpetruc thanks for the comment. I think
giving wild values is better indicative of failure than returning zero,
this way you force most cuts to fail (specifically with the choice of NaN
for floats), or if storing indices to arrays in an object you'll likely go
out of bounds. This means no quiet failures in analysis code, which I think
is helpful.

I agree that the change should be advertised, where should I be
advertising this?

This could be back ported into 74X just requiring an iorule to sort the
names and stored values. Should I prepare this? (Perhaps 74X version leaves
0's for returned values when no label available.


Reply to this email directly or view it on GitHub
#11574 (comment).

@lgray
Copy link
Contributor Author

lgray commented Oct 1, 2015

Agreed. In that case I will update the PR so that we throw on missing ints/floats.

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 4, 2015

Pull request #11574 was updated. @cmsbuild, @vadler, @monttj can you please check and sign again.

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 4, 2015

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 4, 2015

@lgray
Copy link
Contributor Author

lgray commented Oct 5, 2015

@gpetruc OK with you now?

@monttj Please review this PR!

@lgray
Copy link
Contributor Author

lgray commented Oct 6, 2015

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 6, 2015

The tests are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 6, 2015

Pull request #11574 was updated. @cmsbuild, @vadler, @monttj can you please check and sign again.

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 6, 2015

@lgray
Copy link
Contributor Author

lgray commented Oct 6, 2015

@gpetruc any further comments?

@monttj please review!

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 6, 2015

davidlange6 added a commit that referenced this pull request Oct 7, 2015
Use std::lower_bound to add/search for userFloats/Ints/Cands/KinResolutions (76X)
@davidlange6 davidlange6 merged commit 165438d into cms-sw:CMSSW_7_6_X Oct 7, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants