Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace TFormula in SimpleJetCorrector #11584

Merged

Conversation

Dr15Jones
Copy link
Contributor

Threaded performance measurements using Intel's VTune showed that
TFormula use was requiring many locks and was causing significant
inefficiencies in CMSSW_7_4 with ROOT 6.02. Although much of the
problem goes away in ROOT 6.04 with the new TFormula implementation,
moving away from TFormula and to the completely thread efficient
reco::FormulaEvaluator also makes construction of the formula
thread efficient.

SimpleJetCorrector places its variables and parameters into arrays
instead of std::vector. Adding an adaptor for arrays allows for
direct uses of the array without having to fill a temporary std::vector.
Threaded performance measurements using Intel's VTune showed that
TFormula use was requiring many locks and was causing significant
inefficiencies in CMSSW_7_4 with ROOT 6.02. Although much of the
problem goes away in ROOT 6.04 with the new TFormula implementation,
moving away from TFormula and to the completely thread efficient
reco::FormulaEvaluator also makes construction of the formula
thread efficient.
@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 1, 2015

A new Pull Request was created by @Dr15Jones (Chris Jones) for CMSSW_7_6_X.

Replace TFormula in SimpleJetCorrector

It involves the following packages:

CommonTools/Utils
CondFormats/JetMETObjects

@cvuosalo, @monttj, @cmsbuild, @slava77, @ggovi, @vadler can you please review it and eventually sign? Thanks.
@TaiSakuma, @jdolen, @makortel, @mmarionncern, @rappoccio, @ahinzmann, @nhanvtran, @apfeiffer1, @schoef, @mariadalfonso this is something you requested to watch as well.
You can sign-off by replying to this message having '+1' in the first line of your reply.
You can reject by replying to this message having '-1' in the first line of your reply.
If you are a L2 or a release manager you can ask for tests by saying 'please test' in the first line of a comment.
@Degano you are the release manager for this.
You can merge this pull request by typing 'merge' in the first line of your comment.

@Dr15Jones
Copy link
Contributor Author

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 1, 2015

The tests are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 1, 2015

-1
Tested at: 83aa621
When I ran the RelVals I found an error in the following worklfows:
135.4 step1

runTheMatrix-results/135.4_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS/step1_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS.log
----- Begin Fatal Exception 01-Oct-2015 04:30:33 CEST-----------------------
An exception of category 'FormulaEvaluatorParseError' occurred while
   [0] Processing run: 1 lumi: 1 event: 1
   [1] Running path 'HLT_PFJet40_v2'
   [2] Calling event method for module LXXXCorrectorProducer/'hltAK4PFRelativeCorrector'
Exception Message:
While parsing '((x>=[6])*(([0]+([1]/((log10(x)^2)+[2])))+([3]*exp(-([4]*((log10(x)-[5])*(log10(x)-[5])))))))+((x<[6])*[7])' could not parse beyond '((x'
----- End Fatal Exception -------------------------------------------------

1306.0 step2

runTheMatrix-results/1306.0_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15/step2_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15.log
----- Begin Fatal Exception 01-Oct-2015 04:37:51 CEST-----------------------
An exception of category 'FormulaEvaluatorParseError' occurred while
   [0] Processing run: 1 lumi: 1 event: 1
   [1] Running path 'MC_PFMET_v1'
   [2] Calling event method for module LXXXCorrectorProducer/'hltAK4PFRelativeCorrector'
Exception Message:
While parsing '((x>=[6])*(([0]+([1]/((log10(x)^2)+[2])))+([3]*exp(-([4]*((log10(x)-[5])*(log10(x)-[5])))))))+((x<[6])*[7])' could not parse beyond '((x'
----- End Fatal Exception -------------------------------------------------

1330.0 step2

runTheMatrix-results/1330.0_ZMM_13+ZMM_13+DIGIUP15+RECOUP15+HARVESTUP15/step2_ZMM_13+ZMM_13+DIGIUP15+RECOUP15+HARVESTUP15.log
----- Begin Fatal Exception 01-Oct-2015 04:41:00 CEST-----------------------
An exception of category 'FormulaEvaluatorParseError' occurred while
   [0] Processing run: 1 lumi: 1 event: 1
   [1] Running path 'HLT_IsoMu20_eta2p1_CentralPFJet30_BTagCSV07_v2'
   [2] Calling event method for module LXXXCorrectorProducer/'hltAK4PFRelativeCorrector'
Exception Message:
While parsing '((x>=[6])*(([0]+([1]/((log10(x)^2)+[2])))+([3]*exp(-([4]*((log10(x)-[5])*(log10(x)-[5])))))))+((x<[6])*[7])' could not parse beyond '((x'
----- End Fatal Exception -------------------------------------------------

50202.0 step2

runTheMatrix-results/50202.0_TTbar_13+TTbar_13+DIGIUP15_PU50+RECOUP15_PU50+HARVESTUP15_PU50/step2_TTbar_13+TTbar_13+DIGIUP15_PU50+RECOUP15_PU50+HARVESTUP15_PU50.log
----- Begin Fatal Exception 01-Oct-2015 04:48:14 CEST-----------------------
An exception of category 'FormulaEvaluatorParseError' occurred while
   [0] Processing run: 1 lumi: 1 event: 1
   [1] Running path 'HLT_PFHT350_PFMET100_NoiseCleaned_v1'
   [2] Calling event method for module LXXXCorrectorProducer/'hltAK4PFRelativeCorrector'
Exception Message:
While parsing '((x>=[6])*(([0]+([1]/((log10(x)^2)+[2])))+([3]*exp(-([4]*((log10(x)-[5])*(log10(x)-[5])))))))+((x<[6])*[7])' could not parse beyond '((x'
----- End Fatal Exception -------------------------------------------------

25202.0 step2

runTheMatrix-results/25202.0_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25/step2_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25.log
----- Begin Fatal Exception 01-Oct-2015 04:50:02 CEST-----------------------
An exception of category 'FormulaEvaluatorParseError' occurred while
   [0] Processing run: 1 lumi: 1 event: 1
   [1] Running path 'HLT_PFJet40_v2'
   [2] Calling event method for module LXXXCorrectorProducer/'hltAK4PFRelativeCorrector'
Exception Message:
While parsing '((x>=[6])*(([0]+([1]/((log10(x)^2)+[2])))+([3]*exp(-([4]*((log10(x)-[5])*(log10(x)-[5])))))))+((x<[6])*[7])' could not parse beyond '((x'
----- End Fatal Exception -------------------------------------------------

you can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-11584/8534/summary.html

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
9087f9b
5fae603
3106b78
b8ac33e
You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-11584/8534/git-log-recent-commits
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-11584/8534/git-merge-result

SimpleJetCorrector uses TFormula expressions which include comparison
operators.
@Dr15Jones
Copy link
Contributor Author

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 1, 2015

The tests are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 1, 2015

Pull request #11584 was updated. @cvuosalo, @monttj, @cmsbuild, @slava77, @ggovi, @vadler can you please check and sign again.

@cmsbuild
Copy link
Contributor

cmsbuild commented Oct 1, 2015

@slava77
Copy link
Contributor

slava77 commented Oct 6, 2015

Martin,
which test is it?

    --slava

On 10/6/15 12:26 PM, Martin Grunewald wrote:

Hmm, I get a crash now in 76X which I suspect is due to this PR:

|Begin processing the 1st record. Run 191718, Event 128717928,
LumiSection 109 a\ t 06-Oct-2015 08:32:40.213 CEST ----- Begin Fatal
Exception 06-Oct-2015 08:32:58 CEST----------------------- An exception
of category 'FormulaEvaluatorParseError' occurred while [0] Processing
run: 191718 lumi: 109 event: 128717928 [1] Running path
'reconstruction_step' [2] Calling event method for module
LXXXCorrectorProducer/'ak4PFCHSResidualC\ orrector' Exception Message:
While parsing
'[2]([3]+[4]TMath::Log(max([0],min([1],x))))1./([5]+[6]100./3
.
(TMath::Max(0.,1.03091-0.051154_pow(x,-0.154227))-TMath::Max(0.,1.03091-0.051
154_TMath::Power(208.,-0.154227)))+[7]
((-2.36997+0.413917_TMath::Log(x))/x-(-2
.36997+0.413917_TMath::Log(208))/208))' could not parse beyond
'[2]
([3]+[4]
' ----- End Fatal Exception
------------------------------------------------- Another exception was
caught while trying to clean up files after the primary f\ atal exception. |

Any idea?


Reply to this email directly or view it on GitHub
#11584 (comment).

@Dr15Jones
Copy link
Contributor Author

This change would fail this function since 'TMath::Power' is not a known function. Easy to add though.

@Martin-Grunewald
Copy link
Contributor

A std addON test (which works in plain 76X IB), but with some added PRs on top of a 76X IB and then it fails. I need to find out which PR it is; probably #11609 (are these formulas DB objects?)

@Martin-Grunewald
Copy link
Contributor

The formula uses both *pow(x,-0.154227) and *TMath::Power(208.,-0.154227)
is there a mathematical difference under the hood?

@Dr15Jones
Copy link
Contributor Author

I've got a unit test that fails now for that specific formula and I'm working on modifying the FormulaEvaluator to handle this expression.

@slava77
Copy link
Contributor

slava77 commented Oct 6, 2015

On 10/6/15 1:38 PM, Martin Grunewald wrote:

A std addON test (which works in plain 76X IB), but with some added PRs
on top of a 76X IB and then it fails. I need to find out which PR it is;
probably #11609 #11609 (are these
formulas DB objects?)

Yes, the formulas for jet corrections are from DB.
Makes sense then why we missed them.

    --slava


Reply to this email directly or view it on GitHub
#11584 (comment).

@diguida
Copy link
Contributor

diguida commented Oct 6, 2015

@slava77 @Martin-Grunewald I am a bit confused.
It this failing only because of the changes in the JEC included in #11609 (i.e. it is a payload issue) or the failures were not spotted before?

@diguida
Copy link
Contributor

diguida commented Oct 6, 2015

mh, maybe I can find the answer by myself by dumping the JEC payload for ak4PFCHS

@diguida
Copy link
Contributor

diguida commented Oct 6, 2015

There it is:

$ conddb dump 5cf35077b6e3fa06ed3980fea1be4cb239c07633 > 5cf35077b6e3fa06ed3980fea1be4cb239c07633.xml
$ less 5cf35077b6e3fa06ed3980fea1be4cb239c07633.xml
...
<mFormula>[2]*([3]+[4]*TMath::Log(max([0],min([1],x))))*1./([5]+[6]*100./3.*(TMath::Max(0.,1.03091-0.051154*pow(x,-0.154227))-TMath::Max(0.,1.03091-0.051154*TMath::Power(208.,-0.154227)))+[7]*((-2.36997+0.413917*TMath::Log(x))/x-(-2.36997+0.413917*TMath::Log(208))/208))</mFormula>

where mFormula is a string representing a TFormula expression in the JetCorrectorParameters class.
https://github.com/cms-sw/cmssw/blob/CMSSW_7_6_X/CondFormats/JetMETObjects/interface/JetCorrectorParameters.h#L47

@diguida
Copy link
Contributor

diguida commented Oct 6, 2015

And that's why you did not get it before: the payload consumed by the tests run by @Martin-Grunewald are not containing TMath::Power.

$ conddb dump a17e756380da6bc973761b00fded7f4968d80b35 > a17e756380da6bc973761b00fded7f4968d80b35.xml
$ less a17e756380da6bc973761b00fded7f4968d80b35.xml | grep Power

@diguida
Copy link
Contributor

diguida commented Oct 6, 2015

correction to myself: it would have failed anyhow:

$ less a17e756380da6bc973761b00fded7f4968d80b35.xml | grep TMath
                    <mFormula>[0]*([1]+[2]*TMath::Log(x))</mFormula>

and TMath::Log is not supported either...

@diguida
Copy link
Contributor

diguida commented Oct 6, 2015

@mmusich this is something you might want to watch as well

@Martin-Grunewald
Copy link
Contributor

Yeah, re-running the jenkins test in #11609 now shows the same errors...

@diguida
Copy link
Contributor

diguida commented Oct 6, 2015

@Dr15Jones what should we do? I see two options:

  1. we ask JetMet to regenerate the payloads removing TMath.
  2. you fix this PR by supporting TMath functions.

I'd go for 1., but maybe 2. is quicker...

@diguida
Copy link
Contributor

diguida commented Oct 6, 2015

comment to myself: TMath::Log is supported as per unit test in https://github.com/cms-sw/cmssw/pull/11584/files#diff-e5f314b2a32168a31a9b3a395f8d490cR436

@Dr15Jones
Copy link
Contributor Author

I found another bug in the parser which I'm tracking down. It would help if someone could dump all of the formulas from the database so I can add them as unit tests.

@diguida
Copy link
Contributor

diguida commented Oct 7, 2015

@Dr15Jones I can dump the payloads currently in the DB of type JetCorrectorParameters. Will take a while...

@slava77
Copy link
Contributor

slava77 commented Oct 7, 2015

@davidlange6 I think we should not make a pre7 without a fix to the problem.
Currently, the run2 matrix workflows are broken in addition to the addon tests mentioned in this thread.

Chris, is the fix coming some time today?

@davidlange6
Copy link
Contributor

Right - we are in any case waiting for a GT with the right PF calibrations

On Oct 7, 2015, at 7:45 PM, Slava Krutelyov [email protected] wrote:

@davidlange6 I think we should not make a pre7 without a fix to the problem.
Currently, the run2 matrix workflows are broken in addition to the addon tests mentioned in this thread.

Chris, is the fix coming some time today?


Reply to this email directly or view it on GitHub.

@Dr15Jones
Copy link
Contributor Author

I hope to have a fix today

slava77 pushed a commit to slava77/cmssw that referenced this pull request Oct 7, 2015
…lInSimpleJetCorrector"

This reverts commit d302bd9, reversing
changes made to fd763e1.
slava77 pushed a commit to slava77/cmssw that referenced this pull request Oct 7, 2015
…lInSimpleJetCorrector"

This reverts commit d302bd9, reversing
changes made to fd763e1.
slava77 pushed a commit to slava77/cmssw that referenced this pull request Oct 7, 2015
…lInSimpleJetCorrector"

This reverts commit d302bd9, reversing
changes made to fd763e1.
@Dr15Jones
Copy link
Contributor Author

#11684 has the fix

@Dr15Jones Dr15Jones deleted the replaceTFormualInSimpleJetCorrector branch October 19, 2015 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants