Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/samples summary in result #981

Merged
merged 41 commits into from
Apr 8, 2024
Merged

Conversation

Jammy2211
Copy link
Collaborator

@Jammy2211 Jammy2211 commented Mar 22, 2024

Make it so that all internal results loading use the SamplesSummary object instead of the Samples object wherever possible.

This has two benefits:

-Loading samples.csv / search_internal.dill in order to load results when resuming a model-fit was very slow.

  • Recent updates to output.yaml mean autofit can now run without outputting a samples.csv file or search_internal.dill file at all (to save hard disk space). The samples summary file cannot be customized to not be output, meaning the information for resuming runs is always available.

This is motivated for search chaining prior passing, which by using the SamplesSummary resumes previous fits much faster.

@Jammy2211 Jammy2211 requested a review from rhayes777 March 22, 2024 16:53
@Jammy2211 Jammy2211 mentioned this pull request Mar 22, 2024
Comment on lines +54 to +60
median_pdf_sample = Sample.from_lists(
model=self.model,
parameter_lists=[self.median_pdf(as_instance=False)],
log_likelihood_list=[self.max_log_likelihood_sample.log_likelihood],
log_prior_list=[self.max_log_likelihood_sample.log_prior],
weight_list=[self.max_log_likelihood_sample.weight],
)[0]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems silly to use a function called 'from_lists' only to take the first item

@rhayes777
Copy link
Owner

I guess it makes a lot of sense for Result to take Paths; it starts to look a lot like the SearchOutput class used by the agregator.

What happens when a search has no name?

Comment on lines +129 to +131
# sample_list=samples_with_log_likelihood_list(
# self.sample_multiplier * fit, _make_samples(model)
# ),
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented code

Comment on lines +286 to +287
if self._samples is not None:
return self._samples
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use cached_property

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite, this is because samples can be passed into __init__ meaning we dont load via csv (e.g. to avoid I/O)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ofc

Comment on lines +289 to +295
try:
Samples.from_csv(
paths=self.paths,
model=self.model,
)
except FileNotFoundError:
pass
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the return type be Optional?

@Jammy2211 Jammy2211 merged commit c540320 into main Apr 8, 2024
0 of 4 checks passed
@rhayes777 rhayes777 deleted the feature/samples_summary_in_result branch April 15, 2024 11:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants