This repository contains the recommended scale factors (SFs) for several tau discriminators, and tools to read them. More detailed recommendations can be found on this TWiki page: https://twiki.cern.ch/twiki/bin/viewauth/CMS/TauIDRecommendationForRun2
⚠️ Please note that in the near future the SFs in the format of ROOT files will be phased out, and in the long term superseded by thecorrectionlib
tool and JSON files provided centrally by the XPOG. More detailed instructions for tau corrections are here.
To install the tool for reading the tau ID SFs, do
export SCRAM_ARCH=slc6_amd64_gcc700 # for CMSSW_10_3_3, check "scram list"
CMSSW_BASE=CMSSW_10_3_3 # or whichever release you desire
cmsrel $CMSSW_BASE
cd $CMSSW_BASE/src
git clone https://github.com/cms-tau-pog/TauIDSFs TauPOG/TauIDSFs
cmsenv
scram b -j8
After compiling with this respective directory hierarchy, you can acces the tool
(python/TauIDSFTool.py
) in python as
from TauPOG.TauIDSFs.TauIDSFTool import TauIDSFTool
A test of the tool in python can be run with
./test/testTauIDSFTool.py
A similar C++ implementation is available in src/TauIDSFTool.cc
,
with a simple example of usage in (test/testTauIDSFTool.cc
).
This is also an installation test that can be compiled and run with
scram b runtests -j8
Alternatively, if you want to use the python tool standalone without CMSSW,
clone the repository and assure that your PYTHONPATH
points to the TauIDSFTool
module.
export PYTHONPATH=<path to python directory>:$PYTHONPATH
Afterwards, you should be able to do:
from TauIDSFTool import TauIDSFTool
This is a rough summary of the available SFs for DeepTau2017v2p1
and DeepTau2018v2p5
in data/
:
Tau component | genmatch |
DeepTau2017v2p1 VSjet |
DeepTau2017v2p1 VSe |
DeepTau2017v2p1 VSmu |
DeepTau2018v2p5 VSjet |
energy scale |
---|---|---|---|---|---|---|
real tau | 5 |
vs pT and DM (for MC) or vs. pT, or vs. DM (for Embed.) | – (*) | – (*) | vs pT and DM (for MC), no Embed. corrections derived yet | vs. DM |
e -> tau fake | 1 , 3 |
– | vs. eta | – | vs. DM and eta | |
mu -> tau fake | 2 , 4 |
– | – | vs. eta | – (±1% unc.) |
(*) The scale factors are provided only for a sub-set of the working points. For the VSele discriminator, they are measured for the VVLoose and Tight WPs - users are strongly encoraged to use one of these two working points and should report to the TauPOG for approval if another working point is used. For the VSmu, they are measured for the Tight WP but we don't expect a large dependence on the chosen VSmu WP in this case so you are free to use any available WP you like for the muon rejection.
The gen-matching is defined as:
1
for prompt electrons2
for prompt muons3
for electrons from tau decay4
for muons from tau decay5
for real taus6
for no match, or jets faking taus. For more info on gen-matching of taus, please see here. Note that in nanoAOD this is available asTau_GenPartFlav
, but jet or no match correspond toTau_GenPartFlav==0
instead of6
.
The SFs are meant for the following campaigns:
Year label | MC campaign | Data campaign |
---|---|---|
2016Legacy (*) |
RunIISummer16MiniAODv3 |
17Jul2018 |
2017ReReco (*) |
RunIIFall17MiniAODv2 |
31Mar2018 |
2018ReReco (*) |
RunIIAutumn18MiniAOD |
17Sep2018 /22Jan2019 |
UL2016_preVFP |
RunIISummer20UL16*APV |
(HIPM_)UL2016_MiniAODv* |
UL2016_postVFP |
RunIISummer20UL16 |
UL2016_MiniAODv* |
UL2017 |
RunIISummer20UL17 |
UL2017_MiniAODv* |
UL2018 |
RunIISummer20UL18 |
UL2018_MiniAODv* |
(*) The SFs provided for pre-UL samples follow the old conventions for the binning by either pT or DM, and follow the old uncertainty scheme where only total uncertainties are reported
A simple script is given to dump the corrections saved in histograms or functions of the files in data/
. Use for example
./test/dumpTauIDSFs.py data/TauID_SF_*_DeepTau2017v2p1VSjet_*.root
The DM and pT dependent SFs are provided as TF1 functions in the "TauID_SF_dm_DeepTau2017v2p1VSjet_VSjetX_VSeleY_Mar07.root" ROOT files for DeepTau2017v2p1 and "TauID_SF_dm_DeepTau2018v2p5VSjet_VSjetX_VSeleY_Jul18.root" for DeepTau2018v2p5, where X corresponds to the VSjet WP and Y corresponds to the VSele WP.
The ROOT files contain several functions. The central values are obtained from the functions named like "DM$DM_$ERA_fit" where $DM is the decay mode = 0, 1, 10, or 11, and $ERA = 2016_preVFP, 2016_postVFP, 2017, or 2018.
For example to obtain the central value of the SFs for the Medium VSjet and VVLoose VSele WPs of the 'DeepTau2017v2p1VSjet'
discriminator for DM=1 in 2018, use
file = TFile("data/TauID_SF_dm_DeepTau2017v2p1VSjet_VSjetMedium_VSeleVVLoose_Mar07.root")
func = file.Get('DM1_2018_fit')
sf = func.Eval(pt)
There are also functions that correspond to systematic variations that can be accessed in the same way. The table below gives a summary of the function names and what uncertainties they correspond to for DeepTau2017v2p1:
Uncertainty | Function name in ROOT files | String to pass to the tool | Notes | Correlated by era | Correlated by DM |
---|---|---|---|---|---|
Statistical uncertainty 1 |
DM$DM_$ERA_fit_uncert0_{up,down} |
uncert0_{up,down} |
Statistical uncertainty on linear fit parameters from eigendecomposition of covariance matrix. |
✗ | ✗ |
Statistical uncertainty 2 |
DM$DM_$ERA_fit_uncert1_{up,down} |
uncert1_{up,down} |
Statistical uncertainty on linear fit parameters from eigendecomposition of covariance matrix. |
✗ | ✗ |
Systematic alleras |
DM$DM_$ERA_syst_alleras_{up,down}_fit |
syst_alleras_{up,down} |
The component of the systematic uncertainty that is correlated across DMs and eras |
✓ | ✓ |
Systematic by-era |
DM$DM_$ERA_syst_$ERA_{up,down}_fit |
syst_$ERA_{up,down} |
The component of the systematic uncertainty that is correlated across DMs but uncorrelated by eras |
✗ | ✓ |
Systematic by-era and by-DM |
DM$DM_$ERA_syst_dm$DM_$ERA_{up,down}_fit |
syst_dm$DM_$ERA_{up,down} |
The component of the systematic uncertainty that is uncorrelated across DMs and eras |
✗ | ✗ |
The table below gives a summary of the function names and what uncertainties they correspond to for DeepTau2018v2p5:
Uncertainty | Function name in ROOT files | String to pass to the tool | Notes | Correlated by era | Correlated by DM |
---|---|---|---|---|---|
Statistical uncertainty 1 |
DM$DM_$ERA_fit_uncert0_{up,down} |
uncert0_{up,down} |
Statistical uncertainty on linear fit parameters from eigendecomposition of covariance matrix. |
✗ | ✗ |
Statistical uncertainty 2 |
DM$DM_$ERA_fit_uncert1_{up,down} |
uncert1_{up,down} |
Statistical uncertainty on linear fit parameters from eigendecomposition of covariance matrix. |
✗ | ✗ |
Systematic alleras |
DM$DM_$ERA_syst_alleras_{up,down}_fit |
syst_alleras_{up,down} |
The component of the systematic uncertainty that is correlated across DMs and eras |
✓ | ✓ |
Systematic by-era |
DM$DM_$ERA_syst_alldms_$ERA_{up,down}_fit |
syst_alldms_$ERA_{up,down} |
The component of the systematic uncertainty that is correlated across DMs but uncorrelated by eras |
✗ | ✗ |
Systematic Tau Energy scale |
DM$DM_$ERA_TES{Up,Down}_fit |
TES_{up,down} |
The uncertainty due to the tauenergy scale systematic uncertainty |
✗ | ✗ |
The SFs can also be accessed using the tool:
from TauPOG.TauIDSFs.TauIDSFTool import TauIDSFTool
tauSFTool = TauIDSFTool(year='UL2018',id='DeepTau2017v2p1VSjet',wp='Medium',wp_vsele='VVLoose',ptdm=True)
sf = tauSFTool.getSFvsDMandPT(pt,dm,genmatch)
And uncertainty variations can be accessed using:
sf = tauSFTool.getSFvsDMandPT(pt,dm,genmatch,unc)
where the unc
string is used to identify the systematic variation as given in the third column in the above table
Analyses that are sensitive to taus with pT>140 GeV should switch to the dedicated high pT SFs measured in bins of pT above 140 GeV
The SFs are provided as TGraphAsymmErrors objects in the "TauID_SF_Highpt_DeepTau2017v2p1VSjet_VSjetX_VSeleY_Mar07.root" ROOT files, where X corresponds to the VSjet WP and Y corresponds to the VSele WP.
The ROOT files contain several graphs. The central values are obtained from the graphs named like "DMinclusive_$ERA" where $ERA = 2016_preVFP, 2016_postVFP, 2017, or 2018. These graphs contain 2 pT bins with pT 100-200, and pT>200 GeV. You should only use these as binned values. For taus between 140-200 GeV use the first bin, and for taus with pT>200 GeV use the second bin.
The SFs can also be accessed using the tool:
from TauPOG.TauIDSFs.TauIDSFTool import TauIDSFTool
tauSFTool = TauIDSFTool(year='UL2018',id='DeepTau2017v2p1VSjet',wp='Medium',wp_vsele='VVLoose',highpT=True)
sf = tauSFTool.getHighPTSFvsPT(pt,genmatch)
And uncertainty variations can be accessed using:
sf = tauSFTool.getHighPTSFvsPT(pt,genmatch,unc)
where "unc" dependends on the uncertainty source. The table below describes the uncertainty sources and the string you need to pass to the tool to retrieve them:
Uncertainty | String to pass to the tool | Notes | Correlated by era | Correlated by pT |
---|---|---|---|---|
Statistical uncertainty 1 |
stat_bin1_{up,down} |
Statistical uncertainty on the pT 140-200 GeV bin. Note this also includes systematic uncertainties that are decorrelated by pT bin and era (since they also behave like statistical uncertainties) |
✗ | ✗ |
Statistical uncertainty 2 |
stat_bin2_{up,down} |
Statistical uncertainty on the pT >200 GeV bin. Note this also includes systematic uncertainties that are decorrelated by pT bin and era (since they also behave like statistical uncertainties) |
✗ | ✗ |
Systematic |
syst_{up,down} |
The systematic uncertainty that is correlated across pT regions and eras |
✓ | ✓ |
Extrapolation Systematic |
extrap_{up,down} |
The systematics uncertainty due to the extrapolation of the SF to higher pT regions |
✓ | ✓ |
Deprecated for UL MC - use DM and pT-dependent SFs instead!
The embedded scale factors still follow the old prescriptions for pT or DM binned SFs so these instructions still apply in this case
The pT-dependent SFs are provided as TF1
functions. For example, to obtain those for the medium WP of the 'DeepTau2017v2p1VSjet'
discriminator for 2016, use
file = TFile("data/TauID_SF_pt_DeepTau2017v2p1VSjet_2016Legacy.root")
func = file.Get('Medium_cent')
sf = func.Eval(pt)
The tool can be used as
from TauPOG.TauIDSFs.TauIDSFTool import TauIDSFTool
tauSFTool = TauIDSFTool('2016Legacy','DeepTau2017v2p1VSjet','Medium',ptdm=False)
and to retrieve the SF for a given tau pT, do
sf = tauSFTool.getSFvsPT(pt)
The SF should only be applied to tau objects that match "real" taus at gen-level (genmatch==5
).
You can pass the optional genmatch
argument and the function will return the appropriate SF if genmatch==5
, and 1.0
otherwise,
sf = tauSFTool.getSFvsPT(pt,genmatch)
The recommended uncertainties can be retrieved as
sf_up = tauSFTool.getSFvsPT(pt,genmatch,unc='Up')
sf_down = tauSFTool.getSFvsPT(pt,genmatch,unc='Down')
or, all three in one go:
sf_down, sf, sf_up = tauSFTool.getSFvsPT(pt,genmatch,unc='All')
For the tau ID SF of the embedded samples, set the emb
flag to True
:
tauSFTool = TauIDSFTool('2017ReReco','DeepTau2017v2p1VSjet','Medium',emb=True)
If your analysis uses a DeepTauVSe WP looser than VLoose and/or DeepTauVSmu looser than medium discriminators,
should add additional uncertainty using the otherVSlepWP
flag:
tauSFTool = TauIDSFTool('2017ReReco','DeepTau2017v2p1VSjet','Medium',otherVSlepWP=True)
Deprecated for UL MC - use DM and pT-dependent SFs instead!
The embedded scale factors still follow the old prescriptions for pT or DM binned SFs so these instructions still apply in this case
Analyses using ditau triggers and tau pT > 40 GeV, may use DM-dependent SFs.
Please note that no SFs are available for decay modes 5 and 6, and the tool will return 1 by default, please read this
TWiki section.
They are provided as TH1
histograms. For example, to obtain those for the medium WP of the 'DeepTau2017v2p1VSjet'
discriminator for 2016, use
file = TFile("data/TauID_SF_dm_DeepTau2017v2p1VSjet_2016Legacy.root")
hist = file.Get('Medium')
sf = hist.GetBinContent(hist.GetXaxis().FindBin(dm))
or with the tool,
from TauPOG.TauIDSFs.TauIDSFTool import TauIDSFTool
tauSFTool = TauIDSFTool('2017ReReco','MVAoldDM2017v2','Tight',dm=True,ptdm=False)
sf = tauSFTool.getSFvsDM(pt,dm,genmatch)
sf_up = tauSFTool.getSFvsDM(pt,dm,genmatch,unc='Up')
sf_down = tauSFTool.getSFvsDM(pt,dm,genmatch,unc='Down')
where genmatch
is optional.
To apply SFs to electrons or muons faking taus, use the eta of the reconstructed tau and the genmatch
code.
They are provided as TH1
histograms:
file = TFile("data/TauID_SF_eta_DeepTau2017v2p1VSmu_2016Legacy.root")
hist = file.Get('Medium')
sf = hist.GetBinContent(hist.GetXaxis().FindBin(eta))
or with the tool,
python/TauIDSFTool.py
antiEleSFTool = TauIDSFTool('2017ReReco','antiEleMVA6','Loose')
antiMuSFTool = TauIDSFTool('2017ReReco','antiMu3','Tight')
antiEleSF = antiEleSFTool.getSFvsEta(eta,genmatch)
antiMuSF = antiMuSFTool.getSFvsEta(eta,genmatch)
The uncertainty is obtained in a similar way as above.
Usage for DeepTau2018v2p5
The tau energy scale (TES) corrections for taus with pT<140 GeV are provided in the files data/TauES_dm_DeepTau2018v2p5VSjet_$ERA_VSjet$X_VSele$Y_Jul18.root
, where $X corresponds to the VSjet WP, $Y corresponds to the VSele WP, and $ERA = UL2016_preVFP, UL2016_postVFP, UL2017, or UL2018
Each file contains one histogram ('tes'
) with the TES centered around 1.0
measured in bins of the tau decay mode.
It should be applied to a genuine tau by multiplying the tau TLorentzVector
, or equivalently, the tau energy, pT and mass as follows:
file = TFile("data/TauES_dm_DeepTau2018v2p5VSjet_UL2018_VSjetMedium_VSeleVVLoose_Jul18.root")
hist = file.Get('tes')
tes = hist.GetBinContent(hist.GetXaxis().FindBin(dm))
# scale the tau's TLorentzVector
tau_tlv *= tes
# OR, scale the energy, mass and pT
tau_E *= tes
tau_pt *= tes
tau_m *= tes
The uncertainties are equal to 1.5% for decay modes 0, 1, and 10, and 2% for decay mode 11. The uncertainties should be decorrelated by decay modes and eras.
For taus with pT>140 GeV, no corrections should be applied to the nominal TES value from MC but a larger 3% uncertainty should be included.
A simple class, TauESTool
, is provided to obtain the TES as
from TauPOG.TauIDSFs.TauIDSFTool import TauESTool
testool = TauESTool('UL2018','DeepTau2018v2p5VSjet',wp='Medium', wp_vsele='VVLoose')
tes = testool.getTES(pt,dm,genmatch)
tesUp = testool.getTES(pt,dm,genmatch,unc='Up')
tesDown = testool.getTES(pt,dm,genmatch,unc='Down')
This method computes the central values and uncertainty for low pT (20 GeV < pT < 140 GeV) and higher pT values (pT > 140 GeV).
Usage for DeepTau2017v2p1
The tau energy scale (TES) is provided in the files data/TauES_dm_*.root
.
Each file contains one histogram ('tes'
) with the TES centered around 1.0
.
It should be applied to a genuine tau by multiplying the tau TLorentzVector
, or equivalently, the tau energy, pT and mass as follows:
file = TFile("data/TauES_dm_DeepTau2017v2p1VSjet_UL2018.root")
hist = file.Get('tes')
tes = hist.GetBinContent(hist.GetXaxis().FindBin(dm))
# scale the tau's TLorentzVector
tau_tlv *= tes
# OR, scale the energy, mass and pT
tau_E *= tes
tau_pt *= tes
tau_m *= tes
A simple class, TauESTool
, is provided to obtain the TES as
from TauPOG.TauIDSFs.TauIDSFTool import TauESTool
testool = TauESTool('2017ReReco','DeepTau2017v2p1VSjet')
tes = testool.getTES(pt,dm,genmatch)
tesUp = testool.getTES(pt,dm,genmatch,unc='Up')
tesDown = testool.getTES(pt,dm,genmatch,unc='Down')
This method computes the right uncertainty at intermediate (34 GeV < pT < 170 GeV) and higher pT values (pT > 170 GeV). Analyses that only want to use the TES at high pT, can use the following instead:
tes = testool.getTES_highpt(dm,genmatch)
The e -> tau fake energy scale (FES) is provided in the files data/TauFES_eta-dm_*.root
.
Each file contains one graph ('fes'
) with the FES centered around 1.0
.
It should only be applied to reconstructed taus that are faked by electrons (i.e. genmatch==1
or 3
) and have DM 0 or 1.
The application is the similar as for the TES above.
A simple class, TauFESTool
, is provided to obtain the FES as
from TauPOG.TauIDSFs.TauIDSFTool import TauFESTool
festool = TauESTool('2017ReReco')
fes = festool.getFES(eta,dm,genmatch)
fesUp = festool.getFES(eta,dm,genmatch,unc='Up')
fesDown = festool.getFES(eta,dm,genmatch,unc='Down')