add budget module and regional tracer budget computation #12
Usage

from pop_tools import regional_tracer_budget

# directory with output
basepath = '/glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast'
# base string for filename
filebase = 'g.e11_LENS.GECOIAF.T62_g16.009.pop.h'
tracer = 'DIC'
# max depth of budget from surface, in m
budget_depth = 100
ds = regional_tracer_budget(basepath, filebase, tracer, 'POP_gx1v7',
                            mask_int=6, budget_depth=budget_depth)

Testing

Below is the sum of the integrated budget terms vs. integrating the saved tendency output:

Budget Sum: 0.0012 +/- 0.0123 PgC/yr
DIC Tendency Output: 0.0011 +/- 0.0123 PgC/yr

[Plots of the individual budget terms, all on the same colorbar, omitted.]
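The closure check amounts to a residual of the summed budget terms against the saved tendency; roughly something like this (term names here are illustrative, not necessarily the module's actual output variables):

budget_sum = ds['ladv'] + ds['vadv'] + ds['lmix'] + ds['vmix'] + ds['sms']
residual = budget_sum - ds['tend_zint_100m_DIC']
print(float(residual.mean()), float(residual.std()))  # expect ~0 within noise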
intake-esm's documentation could use some love :). For the last 5 months, intake-esm's development has been a moving target. It's time for a thorough, overdue refactoring, since there's a growing number of stakeholders interested in using it. We are planning on overhauling intake-esm in the coming weeks, which will hopefully alleviate some pain points of using it. It would be nice to have clear documentation around to save users time spent messing around with it.
The current implementation assumes that all input files live in the same directory. Wouldn't it be more beneficial to decouple the I/O component from the computation itself? Is it possible for the functions to allow passing in a dataset instead of file paths? With this approach, we could dedicate efforts to handling the I/O component via intake-esm.
Glad to hear that about intake-esm. I'm excited to see that overhaul so I can get a better sense of how to use it in my work.
So the single directory is really forced by my current I/O setup.
Yes, and I agree with your ultimate suggestion here because of this. The current I/O setup was to get the ball rolling for my use case, but I think it's best to not even handle I/O here, as you mention, and just have the pre-processing done by the user.
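Roughly, the decoupled call could look like this (the ds-first signature is hypothetical, not today's API):

import xarray as xr
from pop_tools import regional_tracer_budget

# the user handles I/O and pre-processing themselves...
ds = xr.open_mfdataset('/path/to/pop/output/*.nc', combine='by_coords')
# ...then passes the assembled dataset straight to the computation
budget = regional_tracer_budget(ds, tracer='DIC', mask_int=6, budget_depth=100)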
This should be a pretty easy implementation. Since we know POP dimension names, operations are done on named dimensions here (e.g., z_t, nlat, nlon).
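For example, with POP's fixed names, the reductions stay dimension-aware no matter how the user assembled the dataset (variable and mask names here are just for illustration):

# z_t is in cm in POP output, so the upper 100 m is z_t=slice(0, 100e2)
upper = ds['DIC'].sel(z_t=slice(0, 100e2))
regional = upper.where(ds['REGION_MASK'] == 6).mean(['nlat', 'nlon'])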
Yep, this seems like the right path. The only thing that concerns me is the amount of pre-processing demanded of the user in this case. But budget calculations are not a 'mindless' analysis task, so to speak, that one just cranks out; they generally demand a lot of prep work and time on the user's end, which is exactly why this is a module Matt and I talked about implementing. I think with careful documentation/examples, we can ensure that the user passes in the right dataset for this function, although it might require some renaming and digging around by the user. I'll see if I can mock up this approach tomorrow.
Force-pushed from 11c5867 to 0c1a450.
We are currently looking into ways to address this dataset issue for testing purposes. Not sure how long it will take to get a working, long-term solution. @matt-long, do you know if there is any data we could pull from an SVN repo for the time being?
@andersy005, I removed the dataset loading procedure in the most recent commit. However, I can't get this to run without a memory error. My output comes in at size (816, 60, 384, 320), so I chunk it into smaller pieces. I can get the previous commit (prior to this force-push) to run, since it does I/O for each computation task, i.e., there's never a point where it's managing the full dataset at once. I figure this should be well within dask's wheelhouse. Do you have much experience managing something like this? There might be a simple fix to handle all this; I've tried a few things without luck. I'd be happy to point you to a directory/load script to load in the files I've been using here.
@bradyrx, I remember changing the dask config to get this working when I was doing budgets with my CESM tracer budget notebook. Try turning off "spill to disk" in your dask config and give these settings a go:
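Something along these lines; the spill behavior lives under distributed.worker.memory, and the fractions here are illustrative, not necessarily what I used:

import dask

dask.config.set({
    'distributed.worker.memory.target': False,     # disable the spill target
    'distributed.worker.memory.spill': False,      # never spill to disk
    'distributed.worker.memory.pause': 0.80,       # pause worker at 80% memory
    'distributed.worker.memory.terminate': 0.95,   # restart worker at 95% memory
})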
You may need to monitor the dask workers' memory via the dashboard. Most likely the error is coming from chunks too big to fit in an individual dask worker's memory. What's the memory size for an individual worker in the cluster?
@sridge, I've had all these settings set for these tests in my dask config already.
@andersy005, I tried the following cluster configuration:

from dask_jobqueue import PBSCluster

cluster = PBSCluster(
    processes=18,
    memory="6GB",
    queue='premium',
    resource_spec='select=1:ncpus=36:mem=109G',
    walltime='1:00:00')
cluster.start_workers(16)

So this implies 6GB per worker, which should be plenty for the chunks I'm giving it. I can double-check the dashboard and see their memory limits.
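Back-of-envelope on those sizes (my arithmetic, assuming float64 output):

full_gb = 816 * 60 * 384 * 320 * 8 / 1e9    # ~48 GB for the whole array
step_mb = 60 * 384 * 320 * 8 / 1e6          # ~59 MB per single time step

So even generous time chunks of ~50 steps are only ~3GB, well under 6GB per worker.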
This could fall under this open issue: pydata/xarray#1832
Yep, this looks pretty similar, and it looks like there are potential solutions there. I just threw a bunch of workers at the old script (96298ed) and got a memory error again. This is using the following command, yielding 36 workers with a 5GB memory_limit each. I borrowed this from a notebook from @emaroon. I'm getting confused by some of these PBSCluster keywords. I'll try again with a few workers with a huge memory allocation if I can figure it out. I don't think my chunks should exceed 5GB, though.

from dask_jobqueue import PBSCluster
from dask.distributed import Client

numnodes = 4
cluster = PBSCluster(local_directory='/glade/scratch/rbrady/spill',
                     cores=36,
                     processes=9,
                     memory="50GB",
                     queue='regular',
                     walltime='1:00:00')
client = Client(cluster)
cluster.scale(numnodes * 9)
Okay, did a lot of work on this today, and it's certainly pydata/xarray#1832. This must be why you commented out the annual resampling line. With

ds = ds.groupby('time.year').mean('time').rename({'year': 'time'})

it breaks. I gave it 9 workers with 11GB memory each, and it seems like it threw this all on one worker and tried to load the whole thing into memory (outlined in @rabernat's issue). I watched one worker's memory climb rapidly up to 11GB and beyond before crashing.

The odd thing is that this seemed to work fine on a single node with dask arrays. I ran 96298ed on one node with 12 CPUs (no extra memory requested), and it handled the California Current mask in ~3 minutes. This includes the temporal resampling of the SMS term. I also ran the code with the 9-worker/11GB distributed setup and it crashed as before.

So, in short, the distributed cluster approach breaks with monthly 3D data, although a single big-memory node approach with chunking seems to handle it fine. This is only an issue for SMS terms, which are included in a lot of tracer budgets. What's the best path forward? We could potentially ask the user to pre-process their SMS term into annual means. Note that the above pertains to 96298ed; I haven't gotten the most recent commit to run on a single node, due to memory errors from having the full dataset in play at once.
I would be happy to have a quick look at the resampling issue you’re having. There are a couple tricks one can try. Can you explain concisely how I could reproduce the problem case?
@rabernat, thanks for the willingness here. Working on a reproducible example helped me figure out what was going on. It came down to how I was constructing my dataset. With my original approach, it seems it was trying to load the entire dataset into memory and was putting the resampling load onto one worker; constructing the dataset differently avoided that. Not exactly sure why this works, but the problem is solved. The time resampling moves quickly enough for the purposes of this module.
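For anyone hitting this later, the general pattern is to chunk at open time so the dataset is lazy from the start (a sketch, not necessarily my exact change):

import xarray as xr

# chunk while opening: every downstream op builds a lazy dask graph
ds = xr.open_mfdataset('/path/to/files/*.nc', combine='by_coords',
                       chunks={'time': 12})
annual = ds.groupby('time.year').mean('time')   # still lazy at this point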
🏆 An important lesson here! 😄 In general, a lot of care needs to be taken in properly setting up large aggregated multifile datasets. (Coordinates can be another source of trouble.) pydata/xarray#1832 has largely been resolved by recent improvements to dask and should probably be closed; it can be a red herring.
@bradyrx, for testing purposes, can we use data from glade?
I just had a look through this code. Overall I like the approach, and I think this will be very useful. However, I wanted to comment that you seem to be reinventing lots of what xgcm is designed to do. Xgcm is meant to help work with finite-volume data on Arakawa grids via a few simple operators; see its budget example. Several people have used xgcm successfully with POP. It does require some modifications to the datasets, however; there is a gist which shows how that works. I know it would require some additional effort to refactor your code to use xgcm, but in the long run I believe it could have major benefits. Ultimately our goal with xgcm is to be able to write the same (or at least very similar) analysis code against many different models. As long as effort is being made to develop a new pop-tools package, I think it could make sense to have xgcm as an upstream dependency. Otherwise, you are going to be reinventing lots of wheels. I would be happy to consult on how to approach this if you're interested.
vadv = _compute_vertical_advection(ds, grid, mask, kmax=kmax)
lmix = _compute_lateral_mixing(ds, grid, mask, kmax=kmax)
vmix = _compute_vertical_mixing(ds, grid, mask, kmax=kmax)
reg_budget = xr.merge([ladv, vadv, lmix, vmix])
If you were to use xgcm, you could replace the above with something like
grid = xgcm.Grid(grid_ds)
ladv = -grid.diff(adv_flux_x, 'X') - grid.diff(adv_flux_y, 'Y')
vadv = -grid.diff(adv_flux_z, 'Z')
etc.
@andersy005, this won't work, unfortunately. We need a specific set of tracer flux variables, since this is a pretty specialized case. To test all functions, we could use DIC. We'd need ['UE_DIC', 'VN_DIC', 'WT_DIC', 'HDIFN_DIC', 'HDIFE_DIC', 'HDIFB_DIC', 'KPP_SRC_DIC', 'DIA_IMPVF_DIC', 'FG_CO2', 'J_DIC', 'FvPER_DIC', 'FvICE_DIC']. One option would be for me to subset a small equatorial Pacific domain, say 5S-5N, the upper 100m, and some small span of longitude. That would bring the file size down pretty small, and we could test on that file specifically. We'd also add 'tend_zint_100m_DIC' over the same domain, and have an assert clause in the testing that our approximation is within some epsilon of the tendency term.
@rabernat, thanks for taking the time to look through the code here and for the suggestion. This looks like it could be very useful and would smooth things out for future development on this module (e.g., global budgets, overflow parameterizations, GM/submesoscale). I'll look through the examples you gave and branch off of this PR to see if I can make it work. I'll follow up if it seems promising or if I hit any roadblocks.
@matt-long, @andersy005, I'm going to branch off this PR to try out xgcm.
@bradyrx, sounds good. Could you point me to some dataset on glade that I can use to put together sample datasets for testing purposes?
@andersy005, some single-simulation output is available here: /glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast. I'd say to start by grabbing just DIC, since if it works, all other BGC tracers should work. You can also build a T/S one once I implement the proper conversions there (still need to do this). Files needed: the monthly history files for each of the DIC flux variables listed earlier (UE_DIC, VN_DIC, WT_DIC, etc.).
To assert that the budget is within epsilon of the saved-out tendency, also grab tend_zint_100m_DIC.
I would just subset some box over the equatorial Pacific and cut down to just the upper 100m (since the tendency assertion is over 100m). You also probably only need a few years of output.
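Roughly, the subsetting could look like this (the glob pattern, index ranges, and output filename are made up; z_t is in cm, so 100 m is 100e2):

import xarray as xr

basepath = '/glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast'
ds = xr.open_mfdataset(basepath + '/*DIC*.nc', combine='by_coords')

# small equatorial Pacific box in index space, upper 100 m, three years
test = (ds.isel(nlat=slice(150, 170), nlon=slice(180, 200), time=slice(0, 36))
          .sel(z_t=slice(0, 100e2)))
test.to_netcdf('dic_budget_test_data.nc')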
@andersy005, I added an example notebook to be compiled by the docs build. Here's a gist of it for reference: https://nbviewer.jupyter.org/gist/bradyrx/d09ee61672e503721f20ef161ee98ade
@bradyrx, thank you for the dataset pointers. I will look into putting together a temporary dataset we can use for testing while we wait for a long-term solution.
Can you update docs/source/api.rst with an entry for regional_tracer_budget?
ds = ds.rename({'z_w_bot': 'z_t'})

# Force datetime
if not np.issubdtype(ds.time.dtype, np.datetime64):
@bradyrx, we can extend this to support datasets with a time coordinate decoded using cftime._cftime.datetime, which covers calendars other than the proleptic Gregorian calendar. Xarray now supports cftime._cftime.datetime-based data types for groupby and resample operations, to the same extent as for datetime/numpy.datetime64-based data types.
In [26]: ds.time
Out[26]:
<xarray.DataArray 'time' (time: 24)>
array([cftime.DatetimeNoLeap(1, 2, 1, 0, 0, 0, 0, 4, 32),
In [26]: ds.time
Out[26]:
<xarray.DataArray 'time' (time: 24)>
array([cftime.DatetimeNoLeap(1, 2, 1, 0, 0, 0, 0, 4, 32),
cftime.DatetimeNoLeap(1, 3, 1, 0, 0, 0, 0, 4, 60),
cftime.DatetimeNoLeap(1, 4, 1, 0, 0, 0, 0, 0, 91),
cftime.DatetimeNoLeap(1, 5, 1, 0, 0, 0, 0, 2, 121),
cftime.DatetimeNoLeap(1, 6, 1, 0, 0, 0, 0, 5, 152),
cftime.DatetimeNoLeap(1, 7, 1, 0, 0, 0, 0, 0, 182),
.........
cftime.DatetimeNoLeap(2, 9, 1, 0, 0, 0, 0, 0, 244),
cftime.DatetimeNoLeap(2, 10, 1, 0, 0, 0, 0, 2, 274),
cftime.DatetimeNoLeap(2, 11, 1, 0, 0, 0, 0, 5, 305),
cftime.DatetimeNoLeap(2, 12, 1, 0, 0, 0, 0, 0, 335),
cftime.DatetimeNoLeap(3, 1, 1, 0, 0, 0, 0, 3, 1)], dtype=object)
Coordinates:
* time (time) object 0001-02-01 00:00:00 ... 0003-01-01 00:00:00
Attributes:
long_name: model time
bounds: time_bounds
Cftime time values are stored in a numpy array with dtype='object':
In [27]: ds.time.dtype
Out[27]: dtype('O')
The only way I know of to test for the cftime class is to retrieve one value from the array:
In [28]: isinstance(ds.time.data[0], cftime._cftime.datetime)
Out[28]: True
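Putting that together, the existing np.datetime64 check could be extended roughly like so (helper name is just for illustration):

import cftime
import numpy as np

def time_is_decoded(ds):
    # numpy-backed datetimes pass the existing check
    if np.issubdtype(ds.time.dtype, np.datetime64):
        return True
    # cftime values hide behind dtype('O'); inspect a single element
    return isinstance(ds.time.values[0], cftime.datetime)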
Note, from a conversation with @matt-long: this would be better moved to a helper function that helps users construct the dataset. I think this was only here to handle temporal resampling, which will be taken out in this PR.
Here is a nice notebook by @jbusecke using xgcm on CMIP6 models: https://nbviewer.jupyter.org/github/jbusecke/cmip6-hackathon-demos/blob/master/notebooks/xgcm_exmples.ipynb. This might be helpful for a future PR to convert this over to xgcm in coordination with #21. For now, it's probably best for me to incorporate suggestions and get this working with some test data, then convert to xgcm-compatible code and ensure that the results are the same.
EDIT: Sorry, I had not grasped the whole story of this issue from an email.
We are planning to get curl/div/grad as operators into xgcm. Please raise an issue over at xgcm and let's discuss there. I am planning to generalize this so we can use it as a package! Where is the best way to discuss this centrally?
Compute kmax; deduce if using full depth. Convert all units to tendencies; only load subset of variables.
Add vertical advection, lateral mixing, vertical mixing, and SMS.
Make REGION_MASK default.
Force-pushed from e2f7133 to 301e9a3.
@dcherian, @klindsay28, @mnlevy1981: ignore the auto-generated review request for now. I just rebased onto master, since this is very old. I'm going to go through the comments, revise this, and prepare for merging.
This will probably be a lot easier if we use xgcm with metrics, so maybe tackle #40 first?
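For context, metrics attach cell geometry to the grid so integrals and averages are weighted automatically; roughly something like this (the POP metric variable names are assumed, not checked against the grid file):

import xgcm

grid = xgcm.Grid(ds, metrics={
    ('X',): ['DXT'],   # zonal cell width
    ('Y',): ['DYT'],   # meridional cell width
    ('Z',): ['dz'],    # layer thickness
})
# metric-aware volume integral, no manual cell-volume bookkeeping
dic_inventory = grid.integrate(ds['DIC'], ['X', 'Y', 'Z'])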
@dcherian, I'll dig into that. I figured I'd open a new PR and switch to xgcm there.
Closing this for now, since it's been polluting the PR list for a while. I will take a look at #40 as a path forward and will address the comments here when putting together a new budget PR.
Per discussions with @matt-long, this is a first implementation of a tracer budget on the POP grid. This first PR just computes a regional budget, i.e., it expects the user to input some POP horizontal mask and some depth level (or full depth) to calculate the tracer budget over.
I'm looking for a first-pass code review here, with any technique/optimization suggestions. I'll generalize this for all tracers (DIC, DIC_ALT_CO2, DOC, Fe, O2, temperature, salinity) while implementing the suggested changes from this PR.
Reviews/comments would be helpful from @matt-long, @andersy005, @sridge and any others.