Quantile Delta Mapping #200

Merged
merged 131 commits into from
Apr 17, 2024

Conversation

castelao
Member

@castelao castelao commented Mar 21, 2024

Implementing Quantile Delta Mapping correction.

Note that this implementation tries to stay as consistent as possible with the rest of the library, such as LinearCorrection.
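For readers unfamiliar with the method, here is a minimal NumPy sketch of the Quantile Delta Mapping idea. It is not the code merged in this PR (the commits below indicate the core calculation is delegated to rex); the function names `empirical_quantiles` and `qdm`, the 20-quantile default, and the linear interpolation between quantiles are all assumptions made for illustration.

```python
import numpy as np


def empirical_quantiles(sample, n_quantiles=20):
    """Return evenly spaced probabilities and the matching sample quantiles."""
    probs = np.linspace(0.0, 1.0, n_quantiles)
    return probs, np.quantile(sample, probs)


def qdm(obs_hist, mod_hist, mod_fut, values, n_quantiles=20, relative=False):
    """Hypothetical QDM sketch: correct `values` taken from the future model.

    obs_hist: historical observations (reference)
    mod_hist: historical model output (biased)
    mod_fut:  future model output (biased)
    """
    probs, q_oh = empirical_quantiles(obs_hist, n_quantiles)
    _, q_mh = empirical_quantiles(mod_hist, n_quantiles)
    _, q_mf = empirical_quantiles(mod_fut, n_quantiles)

    # Non-exceedance probability of each value in the future model CDF
    tau = np.interp(values, q_mf, probs)
    # Same probability mapped back through the two historical distributions
    x_mh = np.interp(tau, probs, q_mh)
    x_oh = np.interp(tau, probs, q_oh)

    if relative:
        # Multiplicative delta, suited to strictly positive variables
        return x_oh * (values / x_mh)
    # Additive delta
    return x_oh + (values - x_mh)


rng = np.random.default_rng(0)
obs = rng.gamma(2.0, 3.0, 5000) + 1.0
biased_hist = obs + 2.0           # model with a constant +2 offset
biased_fut = biased_hist.copy()   # no trend between periods
corrected = qdm(obs, biased_hist, biased_fut, biased_fut[:5])
print(corrected)
```

The absolute form adds the model's own change signal to the observed quantile, while the relative form scales by it; the identity and constant-offset cases mirror the tests listed in the commits below.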

@castelao castelao requested review from grantbuster and bnb32 March 21, 2024 02:15
@castelao castelao self-assigned this Mar 21, 2024
castelao added 24 commits March 28, 2024 14:08
Handle analytical distributions using scipy or empirical ones.
An empty placeholder for now. Let's try a different approach.
Isolating QDM method since bias_calc was already getting too large.
Simplifies get_base_data().
Allows an alternative handler to deal with multiple bias datasets with
the very same get_bias_data().
The QuantileDeltaMapping() requires a third dataset, the biased future,
thus requiring a modified instantiation to receive such dataset.
An MVP of empirical distributions estimate for historical observations,
historical modeled, and future modeled. Running serial only.
Let's ignore for now the requirement on seasonal estimates.
Following the standard in the library, pre-allocate out holder
(currently a dictionary).
For now, hardcoded to linear only.
Just trying to mimic the linear calibration. It's not clear the indices
used to select and slice the quantiles.

There is a weakness here: since the quantiles are estimated in a previous
step, there is no lock to guarantee that the chosen coefficients file
is the correct pair for the data to be corrected.
An interface to local_qdm_bc using np.array .
Key aspects here are to minimize memory footprint and allow transparent
concurrency.
A different implementation for QDM that mimics as much as possible
LinearCorrection.
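The commit notes above mention estimating empirical distributions for three datasets (historical observations, historical modeled, future modeled) in serial and pre-allocating the output holder as a dictionary. A rough sketch of that pattern follows, assuming each dataset is a `(time, site)` array; the function names and dictionary keys are hypothetical and are not taken from the library.

```python
import numpy as np

# Hypothetical output keys, one per dataset
KEYS = ("base_params", "bias_params", "bias_fut_params")


def init_out(n_sites, n_quantiles):
    """Pre-allocate one NaN-filled (site, quantile) array per dataset."""
    return {
        key: np.full((n_sites, n_quantiles), np.nan, dtype=np.float32)
        for key in KEYS
    }


def estimate_quantiles(base, bias, bias_fut, n_quantiles=20):
    """Estimate empirical quantiles per site, one site at a time (serial)."""
    probs = np.linspace(0.0, 1.0, n_quantiles)
    datasets = dict(zip(KEYS, (base, bias, bias_fut)))
    out = init_out(base.shape[1], n_quantiles)
    for key, data in datasets.items():
        for site in range(data.shape[1]):
            out[key][site] = np.quantile(data[:, site], probs)
    return out


rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
params = estimate_quantiles(base, base + 1.0, base + 2.0, n_quantiles=10)
print({key: value.shape for key, value in params.items()})
```

Filling a pre-allocated holder keeps the per-site loop independent, which is what would later allow the same loop body to be dispatched to parallel workers.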
@castelao
Member Author

@bnb32 , I added your rules (just the ignore for now) for ruff with pyproject.toml. If you have a chance, please double check it.

@castelao castelao merged commit b74d837 into main Apr 17, 2024
8 checks passed
@castelao castelao deleted the Gui/QDM branch April 17, 2024 22:12
github-actions bot pushed a commit that referenced this pull request Apr 17, 2024
* Core distribution classes

Handle analytical distributions using scipy or empirical ones.

* Initiating a new QuantileDeltaMapping

An empty placeholder for now. Let's try a different approach.

* refactor: Reorganizing module

Isolating QDM method since bias_calc was already getting too large.

* feat: Implementing from_fit() for EmpiricalDistribution

* Reducing default empirical quantiles to 20 chunks

* feat: QDM.get_base_data()

Simplifies get_base_data().

* feat: Optional alternative handler for get_bias_data()

Allows an alternative handler to deal with multiple bias datasets with
the very same get_bias_data().

* doc: Example for EmpiricalDistribution.from_quantiles()

* feat: Custom QuantileDeltaMapping.__init__ to deal with biased future

The QuantileDeltaMapping() requires a third dataset, the biased future,
thus requiring a modified instantiation to receive such dataset.

* fix: Missing imports

* feat: QuantileDeltaMapping.run() to estimate distributions

An MVP of empirical distributions estimate for historical observations,
historical modeled, and future modeled. Running serial only.

* Renaming NT to NQ (Number of quantiles)

Let's ignore for now the requirement on seasonal estimates.

* Prototype for saving quantiles

* fix: Missing imports

* Temporary solution for number of quantiles

* feat: _init_out()

Following the standard in the library, pre-allocate out holder
(currently a dictionary).

* Renaming output items

* Saving metadata for sampling method

For now, hardcoded to linear only.

* fix: Using 'filename' here

* cleaning: output collector now created at __init__out()

* feat: bias_transforms.get_spatial_bc_quantiles()

Just trying to mimic the linear calibration. It's not clear the indices
used to select and slice the quantiles.

There is a weakness here: since the quantiles are estimated in a previous
step, there is no lock to guarantee that the chosen coefficients file
is the correct pair for the data to be corrected.

* feat: bias_transforms.get_spatial_bc_quantiles()

* feat: bias_transforms.local_qdm_bc_as_nparray()

An interface to local_qdm_bc using np.array .

* feat: [MVP] bias_transforms.local_qdm_bc()

Key aspects here are to minimize memory footprint and allow transparent
concurrency.

* feat: bias_calc.QuantileDeltaMappingCorrection

A different implementation for QDM that mimics as much as possible
LinearCorrection.

* test: serial vs parallel

* test: basic run of QuantileDeltaMappingCorrection

* Requirements for testing

* Renaming test file

* clean: Unnecessary imports

* style: Arguments alignment

* style:

* test: Save distributions in a valid HDF5

* Avoiding xr.Dataset to conform with library

* Using QuantileDeltaMapping from rex

DRY.

* QDM using rex's implementation

* fix: get_spatial_bc_quantiles() requires base dataset name

* feat: local_qdm_bc based on rex's QDM

* Getting distribution definitions from saved HDF5

* Removing module distribution

By using rex's QDM we don't need to know about distributions here
anymore.

* Removing my QDM

I'm now using rex's QDM, so we don't need this anymore.

* Making QuantileDeltaMappingCorrection available in the lib

* clean: _quantile_delta_mapping() is not used anymore

QDM core calculation moved to use rex.

* feat: Implementing DataHandler.qdm_bc()

Keeping it as close as possible to .lin_bc().

* test: DataHandler.qdm_bc()

* style: Removing unused variables

* feat: Custom distributions

Quantiles configuration is not hardcoded anymore, but defined when
instantiating the class.

* fix: Must load n_quantiles before initializing 'out'

* fix: Left behind an `NQ` variable

* style: Matching the library style

* Setup black to follow the 79 chars

* style: Matching the library style

* doc: QuantileDeltaMappingCorrection()

* test: Refactoring common test dataset

* fix: typo

* doc: Extending documentation for __init_out__

* doc: Adding documentation to standard test dataset

* A testing sample without trend

* test: Saving some standard distribution params

Help to speed up tests. Re-use these standard params as much as possible
and isolate other tests.

* test: Simplifying tests

Re-use params if the goal is to test something else.

* test: Using pytest's tmp_path

Reduce coding.

* test, fix: Handler doesn't accept Path, only string

* test: Simplifying test_handler_qdm_bc()

* doc: More documentation on tests

* test: identity QDM

* fix: Must copy reference or it was a softlink

* test: Constant model, offset with reference

* test: Standard setup should result in some correction

* clean: Unused import

* test: identity relative & absolute

Both cases should result in no change correction.

* Improving documentation

* doc: Expanding documentation for QuantileDeltaMappingCorrection

* doc: Constant model tests

* style: For now, a single line doc

* doc: More on QuantileDeltaMappingCorrection

* test, refactor: Just moving tests around

The constant model is a more intuitive case and good next case after
identity.

* doc: __init_out__()

* doc: bias_transforms.local_qdm_bc()

* Initiating ruff to keep consistent with what is used here

* typo, doc: local_qdm_bc()

* test, doc: More description on the tests concepts

* test: test_bc_trend_same_hist()

* refactor: Renaming '*_CDF' to '*_params'

* Extending ruff's setup

* doc: QuantileDeltaMappingCorrection.get_qdm_params()

* doc: [WIP] run()

* style:

* fix: Removing empty line

* doc: Correct syntax to function

* fix: Exit context and return

* doc: Improving links to other resources (DataRetrievalBase)

* typo:

* refactor: Clarify transformations required to use rex

Since rex assumes a different data structure and we use regular numpy
arrays, we have to orient our data when sending it and re-orient it on the
way back. This commit just makes these transformations a little easier to
follow, at the price of a somewhat larger memory footprint.

* test: Adding range check as suggested by @grantbuster

* test: All finite or none, can't be both

* test: Downgrading scope to module level

* doc: Documenting qdm_bc()

* doc: Fixing link/reference to Cannon 2015

* doc: Improving QuantileDeltaMappingCorrection documentation

* doc: Using Reference

* doc: Improving documentation everywhere on bias_calc

* doc: QuantileDeltaMappingCorrection.run()

* doc: Minimalist example for local_qdm_bc()

* fix, doc: Wrong syntax for rst

* feat: _expand_paths()

Used to expand (from wildcards) single or multiple paths.

* style: super-linter wasn't happy with lambda

* doc: Changing example to better illustrate possibilities

* Adding option no_trend to QDM

This allows using the same procedure for an ordinary Delta Mapping,
reproducing rex's QDM design.

* test: Increasing noise and changing offset

The offset was too close to the bias offset, so this will help to
distinguish between both.

* test: Using a normal random noise instead

* test, doc: Better info on the reference data

* test: test_qdm_transform_notrend()

* doc: A warning on the concept of no trend

* style:

* fix: Remove type hint

The easiest way to allow running with older Python.

* style: Breaking single line in multiple steps

* meta: Adding info on datasets path

As suggested by @grantbuster.

* fix: Misleading log statement

It doesn't correct at this point, but just estimates the statistical
distributions.

* fix: Making Python-3.8 happy (removing type)

* style: Combining multiple `isinstance`

* refactor: Isolating common part of get_factors()

* style:

* test: Validating get_spatial_bc_factors() transition

I'm getting some strange errors testing locally. Let's check this.

* fix, test: Forgot to add equal_nan

* refactor: Diverting to _get_factors()

* doc: Documenting get_spatial_bc_quantiles()

* doc: Improving doc for get_spatial_bc_quantiles()

* clean: get_spatial_bc_factors()

* clean: get_spatial_bc_quantiles()

* Adding ruff rules to ignore

Just copy-n-paste @bnb32 's definitions.

* refactor: Using rex's property to load distributions metadata
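One of the commits above introduces `_expand_paths()`, described as expanding wildcards for single or multiple paths. The sketch below shows how such a helper could look using only the standard library; the name `expand_paths`, its signature, and the sorted, de-duplicated return value are assumptions, not the library's actual implementation.

```python
import os
import tempfile
from glob import glob


def expand_paths(fps):
    """Expand wildcards in one path or a list of paths (hypothetical helper).

    Returns a sorted list of unique matching paths.
    """
    if isinstance(fps, (str, os.PathLike)):
        fps = [fps]
    expanded = []
    for fp in fps:
        # os.fspath() lets both str and pathlib.Path patterns through
        expanded.extend(glob(os.fspath(fp)))
    return sorted(set(expanded))


# Usage with throwaway files, so the example does not depend on real data
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a_2020.h5", "a_2021.h5", "b_2020.h5"):
        open(os.path.join(tmp, name), "w").close()
    print(expand_paths(os.path.join(tmp, "a_*.h5")))
```

Accepting either a single pattern or a list keeps the call site the same whether the bias data lives in one file or is spread across several.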
castelao added a commit that referenced this pull request Apr 18, 2024
Missed those issues on PR #200. Somehow those didn't show up in the last
check before committing it.
castelao added a commit that referenced this pull request Apr 18, 2024
Fixing minor issues left from PR #200
github-actions bot pushed a commit that referenced this pull request Apr 18, 2024
Fixing minor issues left from PR #200
3 participants