-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
First implementation of an nT-like likelihood #32
Conversation
Pull Request Test Coverage Report for Build 5534394288
💛 - Coveralls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remaining comments which cannot be posted as a review comment to avoid GitHub Rate Limit
pep8
alea/blueice_extended_model.py|142 col 1| S101 Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
alea/blueice_extended_model.py|151 col 1| D102 Missing docstring in public method
alea/blueice_extended_model.py|154 col 1| D102 Missing docstring in public method
alea/blueice_extended_model.py|157 col 1| S101 Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
alea/blueice_extended_model.py|177 col 1| D102 Missing docstring in public method
alea/blueice_extended_model.py|178 col 9| WPS221 Found line with high Jones Complexity: 17 > 14
alea/utils.py|266 col 1| D103 Missing docstring in public function
alea/utils.py|744 col 1| D103 Missing docstring in public function
alea/utils.py|750 col 1| S307 Use of possibly insecure function - consider using safer ast.literal_eval.
alea/statistical_model.py|152 col 1| D102 Missing docstring in public method
alea/statistical_model.py|168 col 22| E231 missing whitespace after ':'
alea/statistical_model.py|168 col 30| E231 missing whitespace after ','
alea/parameters.py|218 col 1| D200 One-line docstring should fit on one line with quotes
alea/parameters.py|221 col 9| WPS221 Found line with high Jones Complexity: 16 > 14
alea/parameters.py|229 col 9| WPS221 Found line with high Jones Complexity: 16 > 14
Pull Request Test Coverage Report for Build 5572804010
💛 - Coveralls |
Tried shortly :)! I hit some path issues (it could not find the templates) that lead me to this: This will then fail unless you're running it in the base alea folder. |
The basic functionality of the blueice-based likelihood is now fully implemented (and now also cleaned up). There are a few open points that I marked with TODO or IDEA that I would implement in a dedicated PR since I don't think they are crucial for now and this way we can continue implementing things collaboratively. Below you can find an example code snippet on how to run the likelihood and what one can do with it. from alea.blueice_extended_model import BlueiceExtendedModel
import numpy as np
import scipy.stats as sps
import matplotlib.pyplot as plt
import os
import alea
from tqdm import tqdm
alea_dir = os.path.dirname(alea.__file__)
config_path = os.path.join(alea_dir, "examples/unbinned_wimp_statistical_model.yaml")
# Initialize the statistical model
statistical_model = BlueiceExtendedModel.from_config(config_path)
##### 1) Simple example #####
# Generate and assign some data
data = statistical_model.generate_data()
statistical_model.data = data
# Fit the model
best_result, best_ll = statistical_model.fit(verbose=True)
# Plot the data
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
for i, ax in enumerate(axes):
for source in range(2):
source_selection = data[i]["source"] == source
source_name = statistical_model._likelihood.likelihood_list[i].source_name_list[source]
ax.plot(data[i]["cs1"][source_selection],
data[i]["cs2"][source_selection],
"o", label=source_name)
ax.legend()
ax.semilogy()
ax.set_xlabel("cs1 [PE]")
ax.set_ylabel("cs2 [PE]")
ax.set_ylim(1e2, 1e4)
ax.set_title(f"{statistical_model.likelihood_names[i]}")
plt.show()
##### 2) Compute the confidence interval #####
dl, ul = statistical_model.confidence_interval("wimp_rate_multiplier",
parameter_interval_bounds=[0, 50],
)
# Check the result
rate_vals = np.linspace(max(0, dl - .5), ul + .5, 200)
lls = []
for rate in rate_vals:
res, ll = statistical_model.fit(wimp_rate_multiplier=rate)
lls.append(ll)
lls = np.array(lls)
plt.plot(rate_vals, 2 * (best_ll - lls))
plt.axhline(sps.chi2(1).ppf(.9), c="orange", ls="--")
plt.axvline(best_result["wimp_rate_multiplier"], c="r")
plt.axvline(dl, c="orange")
plt.axvline(ul, c="orange")
plt.ylim(0, 4)
plt.xlim(0)
plt.xlabel("wimp_rate_multiplier")
plt.ylabel("-2 * LLR")
plt.show()
##### 3) Compute median upper limit distribution for b-only toys #####
n_mc = 200
dls = []
uls = []
for i in tqdm(range(n_mc), desc="Computing confidence intervals"):
# generate background only data
data = statistical_model.generate_data(wimp_rate_multiplier=0)
statistical_model.data = data
dl, ul = statistical_model.confidence_interval(
"wimp_rate_multiplier",
parameter_interval_bounds=[0, 50]
)
dls.append(dl)
uls.append(ul)
dls = np.array(dls)
uls = np.array(uls)
fig, axes = plt.subplots(1, 2, figsize=(10, 5))
dls[dls == -np.inf] = 0
axes[0].hist(dls, bins=20)
axes[0].set_xlabel("dl [wimp_rate_multiplier]")
axes[1].hist(uls, bins=20)
axes[1].set_xlabel("ul [wimp_rate_multiplier]")
axes[1].axvline(np.median(uls), c="crimson")
for ax in axes:
ax.semilogy()
ax.set_xlim(0) |
eval_analysis_space.append(eval_element) | ||
return eval_analysis_space | ||
|
||
def find_file_resource(file: str, possible_locations: list) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
E302 expected 2 blank lines, found 1
|
||
def find_file_resource(file: str, possible_locations: list) -> str: | ||
""" | ||
Function to look for a file in a list of folders (the file name may include folder structure too). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
E501 line too long (102 > 100 characters)
fname = Path(loc, file) | ||
if fname.is_file(): | ||
return str(fname) | ||
raise FileNotFoundError("{:s} not found in any of {:s}".format(file, ",".join(possible_locations))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
E501 line too long (103 > 100 characters)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Runs sensibly, code sensible, thanks!
if isinstance(likelihood_config["template_folder"], str): | ||
template_folder_list = [likelihood_config["template_folder"]] | ||
elif isinstance(likelihood_config["template_folder"], list): | ||
template_folder_list = likelihood_config["template_folder"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about adding the alea example data folder as a default search folder here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you mean always just add "alea/examples/templates/wimp_templates" to the end of the list? It probably doesn't hurt adding it, though I'm not sure how often this will be used (for the examples I think it is good to write things explicitly so that people know what to change when implementing their own model).
… efficiency parameters. This way, blueice will not cache pdfs for different values of these parameters
Removes all hash for parameters not used for each source, and for all…
ret = dict() | ||
for ll in self._likelihood.likelihood_list[:-1]: # ancillary likelihood does not contribute | ||
|
||
ll_pars = list(ll.rate_parameters.keys()) + list(ll.shape_parameters.keys()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
WPS221 Found line with high Jones Complexity: 15 > 14
|
||
# add all parameters to extra_dont_hash for each source unless it is used: | ||
for i, source in enumerate(config["sources"]): | ||
parameters_to_ignore: List[str] = [p.name for p in self.parameters if (p.ptype == "shape") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
WPS441 Found control variable used after block: p
|
||
# add all parameters to extra_dont_hash for each source unless it is used: | ||
for i, source in enumerate(config["sources"]): | ||
parameters_to_ignore: List[str] = [p.name for p in self.parameters if (p.ptype == "shape") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
WPS465 Found likely bitwise and boolean operation mixup
|
||
# add all parameters to extra_dont_hash for each source unless it is used: | ||
for i, source in enumerate(config["sources"]): | ||
parameters_to_ignore: List[str] = [p.name for p in self.parameters if (p.ptype == "shape") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
WPS441 Found control variable used after block: p
|
||
# add all parameters to extra_dont_hash for each source unless it is used: | ||
for i, source in enumerate(config["sources"]): | ||
parameters_to_ignore: List[str] = [p.name for p in self.parameters if (p.ptype == "shape") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
E501 line too long (106 > 100 characters)
assert efficiency_parameter.ptype == "efficiency", "The parameter {:s} must" \ | ||
" be an efficiency".format(efficiency_name) | ||
limits = efficiency_parameter.fit_limits | ||
assert 0 <= limits[0], 'Efficiency parameters including {:s} must be' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
N400: Found backslash that is used for line breaking
assert efficiency_parameter.ptype == "efficiency", "The parameter {:s} must" \ | ||
" be an efficiency".format(efficiency_name) | ||
limits = efficiency_parameter.fit_limits | ||
assert 0 <= limits[0], 'Efficiency parameters including {:s} must be' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
WPS309 Found reversed compare order
limits = efficiency_parameter.fit_limits | ||
assert 0 <= limits[0], 'Efficiency parameters including {:s} must be' \ | ||
' constrained to be nonnegative'.format(efficiency_name) | ||
assert np.isfinite(limits[1]), 'Efficiency parameters including {:s} must be' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
S101 Use of assert detected. The enclosed code will be removed when compiling to optimised byte code.
limits = efficiency_parameter.fit_limits | ||
assert 0 <= limits[0], 'Efficiency parameters including {:s} must be' \ | ||
' constrained to be nonnegative'.format(efficiency_name) | ||
assert np.isfinite(limits[1]), 'Efficiency parameters including {:s} must be' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
N400: Found backslash that is used for line breaking
@@ -150,39 +163,33 @@ def store_data( | |||
kw = {'metadata': metadata} if metadata is not None else dict() | |||
toydata_to_file(file_name, data_list, data_name_list, **kw) | |||
|
|||
def get_expectations(self): | |||
return NotImplementedError("get_expectation is optional to implement") | |||
def get_expectation_values(self, **parameter_values): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[pep8] reported by reviewdog 🐶
D102 Missing docstring in public method
This PR adds the nT-like blueice-based likelihood structure to alea (solves #23 ) and also introduces a suggestion for split configs (#15)
The model seems to work for my first basic tests. I'll test it again with the changes on the current master and then add some docstrings, clean up, and some examples for how to use it 😊 But of course, if you can already spot something that you would suggest doing differently: Let me know! 😄