
add budget module and regional tracer budget computation #12

Closed
bradyrx wants to merge 17 commits into master from add_regional_tracer_budget

Conversation

@bradyrx (Contributor) commented Jun 3, 2019

Per discussions with @matt-long, this is a first implementation of a tracer budget on the POP grid. This first PR just computes a regional budget. I.e., it expects the user to input some POP horizontal mask and some depth level (or full depth) to calculate the tracer budget over.

I'm looking for a first pass code review here with any technique/optimization suggestions. I'll generalize this for all tracers (DIC, DIC_ALT_CO2, DOC, Fe, O2, temperature, salinity) while implementing suggested changes for this PR.

Reviews/comments would be helpful from @matt-long, @andersy005, @sridge and any others.

@bradyrx (Contributor Author) commented Jun 3, 2019

Discussion

  1. I'm looking for suggestions on how to handle the file input better. I figure intake-esm would be ideal here, but I was having trouble following the documentation to get it working. I currently have it hard-coded to expect that all tracer flux files are in one directory and belong to one simulation. Along these lines, there should probably be a file check at the beginning of the function run to make sure all needed output files are in the directory.
  2. This doesn't explicitly handle the overflow parameterization. That's probably better to implement in another PR.
  3. Not sure how to implement reasonable testing (i.e., pytest) without bogging down the CI. Perhaps having some tracer flux output subset to a small grid and loading that in?
  4. This currently doesn't calculate GM or submesoscale terms. It doesn't seem common to have these terms (isop/subm) saved out for tracers. If someone could point me to a directory with tracer flux output for these terms, I could implement it, or we could do that in another PR.
  5. Should we add something like progressbar for the user? This runs reasonably long (2.5 - 5 minutes from my tests) and might be helpful to see where it's at along the way.
  6. This should be pretty easy to adapt to a global budget, but might also be best for another PR to avoid ballooning this one too much more.

Issues

  1. I currently force dask arrays, since this should always be dealing with a large volume of data. I do this using open_mfdataset, allowing xarray to figure out the chunks on its own. I was able to make this work just through implicit dask on a single node (with .load() functions). However, using dask-jobqueue kills workers and doesn't seem to work here.
  2. dask (and xarray, for that matter) don't seem to handle monthly -> annual resampling too well. A few files tend to come as monthly output, so we have to resample to annual to match the resolution of the other budget terms. I had this resampling early in the pipeline and kept getting MemoryErrors. .groupby('time.year').mean('time') essentially just runs a for loop over the dataset, so it's actually more efficient currently to compute the budget at monthly resolution and then handle temporal resampling at the end (a minimal sketch of this resampling step is below).
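For reference, here is a minimal sketch of the monthly-to-annual reduction in question, assuming ds is a monthly, dask-backed xarray Dataset (not the actual module code):

import xarray as xr


def to_annual(ds: xr.Dataset) -> xr.Dataset:
    # groupby('time.year') effectively loops over the years, which is why it's
    # cheaper to defer this step until after the budget terms have been reduced.
    return ds.groupby('time.year').mean('time').rename({'year': 'time'})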

Usage

from pop_tools import regional_tracer_budget

# directory with output
basepath = '/glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast' 
# base string for filename
filebase = 'g.e11_LENS.GECOIAF.T62_g16.009.pop.h'
tracer = 'DIC'
# max depth of budget from surface, in m
budget_depth = 100 

ds = regional_tracer_budget(basepath, filebase, tracer, 'POP_gx1v7', mask_int=6, budget_depth=budget_depth)

Testing

  1. California Current LME to 100m

Below is the sum of the integrated budget terms vs. integrating tend_zint_100m_DIC over the same domain.

[Figure: california_current_dic_budget]

Budget Sum: 0.0012 +/- 0.0123 PgC/yr

DIC Tendency Output: 0.0011 +/- 0.0123 PgC/yr

I also tested the REGION_MASK Pacific and Atlantic oceans, and an equatorial Pacific mask. I don't have plots/numbers for those since Cheyenne and the compute nodes are unreliable right now.

  2. California Current LME spatial budget

There is a keyword sum_area that defaults to True. When set to False, the user gets the depth-integrated budget for their region instead of the full volume-integrated budget.

Each plot is on the same colorbar, just using xarray's default plotting method. The first row includes lateral and vertical advection, while the second drops those two terms.

[Screenshots: depth-integrated budget maps on a shared colorbar, with and without the advection terms]

@codecov-io commented Jun 3, 2019

Codecov Report

Merging #12 into master will decrease coverage by 17.86%.
The diff coverage is 13.87%.


@@             Coverage Diff             @@
##           master      #12       +/-   ##
===========================================
- Coverage   61.01%   43.14%   -17.87%     
===========================================
  Files           6        6               
  Lines         495      598      +103     
===========================================
- Hits          302      258       -44     
- Misses        193      340      +147
Impacted Files Coverage Δ
pop_tools/__init__.py 80% <100%> (ø) ⬆️
pop_tools/budget.py 13.37% <13.37%> (ø)
pop_tools/config.py 91.48% <0%> (-0.18%) ⬇️
pop_tools/region_masks.py

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@andersy005 (Contributor)

@bradyrx

I'm looking for suggestions on how to handle the file input better. I figure intake-esm would be ideal here, but I was having trouble following the documentation to get it working. I currently have it hard-coded to expect that all tracer flux files are in one directory and belong to one simulation.

intake-esm's documentation could use some love :). For the last 5 months, intake-esm's development has been a moving target. It's time for a thorough, overdue refactoring since there's a growing number of stakeholders who are interested in using it. We are planning on overhauling intake-esm in the coming weeks, and hopefully this will alleviate some of the pain points of using it. It would be nice to have some clear documentation around to save users time messing around with it.

Along these lines, there should probably be a file check at the beginning of the function run to make sure all needed output files are in the directory.

The current implementation assumes that all input files live in the same directory (due to open_mfdataset()). Doesn't this assumption introduce a burden for the user in case their input files are in different locations, since they would need to move all their input files into the same directory? How extensible would this approach be in cases where a user is interested in computing a tracer budget for more than one simulation?

Wouldn't it be more beneficial to decouple the I/O component from the computation itself? Is it possible for the functions to allow passing in a dataset instead of file paths? With this approach, we could dedicate effort to handling the I/O component via intake-esm, which would be an ideal solution since it doesn't care about file co-location.

@bradyrx (Contributor Author) commented Jun 4, 2019

Glad to hear that about intake-esm. I'm excited to see that overhaul so I can get a better sense of how to use it in my work.

The current implementation assumes that all input files live in the same directory (due to open_mfdataset())

So the single directory is really forced by my basepath implementation. I use open_mfdataset since it forces dask arrays and autochunks.

Doesn't this assumption introduce a burden for the user in case their input files are in different locations, since they would need to move all their input files into the same directory?

Yes, and I agree with your ultimate suggestion here because of this. The current I/O setup was to get the ball rolling for my use case, but I think it's best not to handle I/O here at all, as you mention, and just have the pre-processing done by the user.

How extensible would this approach be in cases where a user is interested in computing a tracer budget for more than one simulation?

This should be a pretty easy implementation. Since we know POP dimension names, operations are done on named dimensions here (e.g., z_t, nlat, nlon) so it should work out of the box right now with an ensemble dimension, for instance.

Wouldn't it be more beneficial to decouple the I/O component from the computation itself? Is it possible for the functions to allow passing in a dataset instead of file paths? With this approach, we could dedicate effort to handling the I/O component via intake-esm, which would be an ideal solution since it doesn't care about file co-location.

Yep, this seems like the right path. The only thing that concerns me is the amount of pre-processing demanded of the user in this case. But budget calculations aren't the kind of 'mindless' analysis task that one cranks out; they generally demand a lot of prep work and time on the user's end, which is why this is a module Matt and I talked about implementing.

I think with careful documentation/examples, we can ensure that the user passes in the right dataset for this function, although it might require some renaming and digging around by the user. I'll see if I can mock up this approach tomorrow.

@bradyrx bradyrx force-pushed the add_regional_tracer_budget branch from 11c5867 to 0c1a450 Compare June 4, 2019 18:25
@andersy005 (Contributor)

Not sure how to implement reasonable testing (i.e., pytest) without bogging down the CI. Perhaps having some tracer flux output subset to a small grid and loading that in?

We are currently looking into ways to address this dataset issue for testing purposes. Not sure how long it will take to get a working, long-term solution.

@matt-long, do you know if there is any data we could pull from an SVN repo for the time being?

@bradyrx (Contributor Author) commented Jun 4, 2019

@andersy005, I removed the dataset loading procedure in the most recent commit. However, I can't get this to run without a MemoryError at this point.

My output comes in with size (816, 60, 384, 320), so I chunk it into time: 10, z_t: 10, nlat: 100, nlon: 100 (chunks of 1,000,000 elements). I've tried a few chunking configurations, and no matter what, I get a MemoryError during the computation.
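For reference, a minimal sketch of the chunking setup described above; the glob pattern is hypothetical, and the chunk sizes match the description (10 x 10 x 100 x 100, about 1,000,000 elements per chunk):

import xarray as xr

ds = xr.open_mfdataset(
    '/path/to/output/*DIC*.nc',  # hypothetical location of the tracer flux files
    chunks={'time': 10, 'z_t': 10, 'nlat': 100, 'nlon': 100},
)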

The PBSCluster approach doesn't work either. Some way this is configured causes a KilledWorker every time.

I can get the previous commit (prior to this force-push) to run, since it does I/O for each computation task. I.e., there's never a point where it's managing something like this:

[Screenshot: dask representation of the full chunked dataset]

I figure this should be well within dask's wheelhouse. Do you have much experience managing something like this? There might be a simple fix to handle all this... I tried using .load() at each procedure, at the end, etc. I tried moving the monthly -> annual resampling to different points. Not much seems to work.

I'd be happy to point you to a directory/load script to load in the files I've been using here.

@sridge commented Jun 4, 2019

@bradyrx I remember changing dask config to get this working when I was doing budgets with my CESM tracer budget notebook. Try turning off "spill to disk" in your dask config and give these settings a go:

dask/dask#3565
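For reference, a sketch of those settings applied programmatically rather than through the YAML config; the threshold values are illustrative, but the keys are standard dask.distributed worker-memory settings:

import dask

dask.config.set({
    'distributed.worker.memory.target': False,    # don't spill based on a target fraction
    'distributed.worker.memory.spill': False,     # disable spill-to-disk
    'distributed.worker.memory.pause': 0.80,      # pause work at 80% of worker memory
    'distributed.worker.memory.terminate': 0.95,  # kill the worker at 95% of worker memory
})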

@andersy005 (Contributor)

You may need to monitor the dask workers' memory via the dashboard. It's most likely that the error is coming from chunks that are too big to fit in an individual dask worker's memory. What's the memory size for an individual worker in the cluster?

@bradyrx (Contributor Author) commented Jun 4, 2019

I remember changing dask config to get this working when I was doing budgets with my CESM tracer budget notebook. Try turning off "spill to disk" in your dask config and give these settings a go

@sridge, I've had all these settings set for these tests in ~/.config/dask/jobqueue.yml. So that doesn't seem to be the problem here.

You may need to monitor the dask workers' memory via the dashboard. It's most likely that the error is coming from chunks that are too big to fit in an individual dask worker's memory. What's the memory size for an individual worker in the cluster?

@andersy005 , I tried the PBSCluster command that Sean used in his notebook:

from dask_jobqueue import PBSCluster

# PBSCluster setup borrowed from Sean's notebook (older dask-jobqueue API):
cluster = PBSCluster(processes=18,
                     memory="6GB",
                     queue='premium',
                     resource_spec='select=1:ncpus=36:mem=109G',
                     walltime='1:00:00')
cluster.start_workers(16)

So this implies 6GB per worker which should be plenty for the chunk I'm giving it. I can double-check the dashboard and see their memory limits.
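For reference, a sketch of checking the per-worker memory limit from the client instead of the dashboard; it assumes the cluster object from the snippet above:

from dask.distributed import Client

client = Client(cluster)  # `cluster` is the PBSCluster defined above
for addr, info in client.scheduler_info()['workers'].items():
    # memory_limit is reported in bytes
    print(addr, info['memory_limit'] / 1e9, 'GB')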

@sridge commented Jun 4, 2019

This could fall under this open issue: pydata/xarray#1832

@bradyrx (Contributor Author) commented Jun 4, 2019

This could fall under this open issue: pydata/xarray#1832

Yep this looks pretty similar. Looks like there are potential solutions there. I just threw a bunch of workers at the old script (96298ed) and got a KilledWorker during the _compute_SMS which is probably due to the ds.groupby() to convert from monthly to annual (which is referenced in the issue you linked).

This is using the following command, yielding 36 workers with a 5GB memory_limit each. I borrowed this from a notebook from @emaroon. I'm getting confused by some of these PBSCluster keywords. I'll try again with a few workers with a huge memory allocation if I can figure it out. I don't think my chunks should exceed 5GB, though.

from dask_jobqueue import PBSCluster
from dask.distributed import Client

numnodes = 4
cluster = PBSCluster(local_directory='/glade/scratch/rbrady/spill',
                     cores=36,
                     processes=9,
                     memory="50GB",
                     queue='regular',
                     walltime='1:00:00')

client = Client(cluster)
cluster.scale(numnodes * 9)  # 4 jobs x 9 processes = 36 workers


@bradyrx (Contributor Author) commented Jun 4, 2019

Okay, I did a lot of work on this today and it's certainly pydata/xarray#1832. This must be why you commented out J_ in your notebook, @sridge (http://pangeo.io/use_cases/climate-modeling/CESM-Large-Ensemble-Tracer-Budget-Example.html#gallery-use-cases-climate-modeling-cesm-large-ensemble-tracer-budget-example-ipynb).

xarray/dask doesn't handle temporal resampling (via groupby) very well at all. So when we hit this line:

ds = ds.groupby('time.year').mean('time').rename({'year': 'time'})

It breaks. I gave it 9 workers with 11GB memory each, and it seems like it threw this all on one worker and tried to load the whole thing into memory (outlined in @rabernat's issue). I watched one worker's memory climb rapidly up to 11GB and beyond before crashing.

The odd thing is that this seemed to work fine on a single node with dask arrays. I ran 96298ed on one node with 12 CPUs (no extra memory requested) and it handled the California Current mask in ~3 minutes. This includes the temporal resampling of J_DIC.

I also ran the code with the 9-worker 11GB PBSCluster and commented out the SMS (J_DIC) calculation and got it to run in ~1 minute.

So, in short, the distributed cluster approach breaks with monthly 3D data, although a single big-memory node approach with chunking seems to handle it fine. This is only an issue for SMS terms, which are included in a lot of tracer budgets. What's the best path forward? We could potentially ask the user to pre-process their SMS term into annual means.

Note that the above pertains to 96298ed, though. I haven't gotten the most recent commit to run on a single node due to memory errors from having the full ds pre-loaded. I still need to try it with a PBSCluster and the SMS computation commented out.

@rabernat (Contributor) commented Jun 5, 2019

I would be happy to have a quick look at the resampling issue you’re having. There are a couple tricks one can try.

Can you explain concisely how I could reproduce the problem case?

@bradyrx (Contributor Author) commented Jun 5, 2019

@rabernat, thanks for the willingness here. Working on a reproducible example helped me figure out what was going on. Initially, I was constructing my dataset using open_mfdataset (with the pre-process script) to auto-chunk and then concatenating from there. For some reason, when I build it up by using open_dataset with manual chunking, everything runs smoothly.

With the former, it seems it was trying to load the entire dataset into memory and put the resampling load onto one worker. I'm not exactly sure why the latter works, but the problem is solved. The time resampling moves quickly enough for the purposes of this module.
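Roughly, the change looked like the following; the file paths and chunk sizes are illustrative, not the actual module code:

import xarray as xr

chunks = {'time': 10, 'z_t': 10, 'nlat': 100, 'nlon': 100}

# Before (problematic): let open_mfdataset auto-chunk and concatenate everything.
# ds = xr.open_mfdataset('/path/to/*_DIC.*.nc')

# After (works): open each file with explicit chunks, then merge.
files = ['/path/to/UE_DIC.nc', '/path/to/VN_DIC.nc']  # hypothetical paths
ds = xr.merge([xr.open_dataset(f, chunks=chunks) for f in files])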

@rabernat (Contributor) commented Jun 6, 2019

Working on a reproducible example helped me figure out what was going on.

🏆 An important lesson here! 😄

In general, a lot of care needs to be taken in properly setting up large aggregated multi-file datasets. (Coordinates can be another source of trouble; I often call .reset_coords(drop=True) before computations.)

pydata/xarray#1832 has largely been resolved by recent improvements to dask and should probably be closed. It can be a red herring.
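A minimal sketch of the coordinate tip above, i.e., dropping non-index coordinates before heavy computation:

import xarray as xr


def strip_coords(ds: xr.Dataset) -> xr.Dataset:
    # Drop non-index coordinates so dask doesn't carry them through the graph.
    return ds.reset_coords(drop=True)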

@andersy005 (Contributor)

@bradyrx, for testing purposes, can we use data in https://github.com/NCAR/pop-tools/blob/master/tests/data/cesm_pop_monthly.T62_g17.nc?

@rabernat (Contributor) commented Jun 7, 2019

I just had a look through this code. Overall I like the approach, and I think this will be very useful.

However, I wanted to comment that you seem to be reinventing lots of what xgcm is designed to do. Xgcm is meant to help work with finite volume data on Arakawa grids with a few simple operators. For a budget, see this example:

Several people have used xgcm successfully with POP. It does require some modifications to the datasets however. Here is a gist which shows how that works:

I know it would require some additional effort to refactor your code to use xgcm. However, in the long run I believe it could have major benefits. Ultimately our goal with xgcm is to be able to write the same (or at least very similar) analysis code against many different models. As long as effort is being made to develop a new "pop-tools" model, I think it could make sense to have xgcm as an upstream dependency. Otherwise, you are going to be reinventing lots of wheels.

I would be happy to consult on how to approach this if you're interested.

vadv = _compute_vertical_advection(ds, grid, mask, kmax=kmax)
lmix = _compute_lateral_mixing(ds, grid, mask, kmax=kmax)
vmix = _compute_vertical_mixing(ds, grid, mask, kmax=kmax)
reg_budget = xr.merge([ladv, vadv, lmix, vmix])

If you were to use xgcm, you could replace the above with something like

grid = xgcm.Grid(grid_ds)
ladv = -grid.diff(adv_flux_x, 'X') - grid.diff(adv_flux_y, 'Y')
vadv = -grid.diff(adv_flux_z, 'Z')

etc.
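For context, a hedged sketch of what constructing that grid object might involve; the axis and dimension names here are assumptions about the POP-to-xgcm renaming (the gist mentioned above shows the real mapping):

import xgcm

# `grid_ds` is the renamed POP dataset from the example above.
grid = xgcm.Grid(
    grid_ds,
    coords={
        'X': {'center': 'nlon_t', 'right': 'nlon_u'},
        'Y': {'center': 'nlat_t', 'right': 'nlat_u'},
        'Z': {'center': 'z_t', 'outer': 'z_w'},
    },
    periodic=['X'],
)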

@bradyrx (Contributor Author) commented Jun 7, 2019

for testing purposes, can we use data in https://github.com/NCAR/pop-tools/blob/master/tests/data/cesm_pop_monthly.T62_g17.nc?

@andersy005, this won't work unfortunately. We need a specific set of tracer flux variables, since this is a pretty specialized case here. To test all functions, we could use DIC. We'd need ['UE_DIC', 'VN_DIC', 'WT_DIC', 'HDIFN_DIC', 'HDIFE_DIC', 'HDIFB_DIC', 'KPP_SRC_DIC', 'DIA_IMPVF_DIC', 'FG_CO2', 'J_DIC', 'FvPER_DIC', 'FvICE_DIC'].

One option would be for me to subset a small equatorial Pacific domain, say 5°S to 5°N, the upper 100 m, and some small span of longitude. That would bring the file size down pretty small and we could test on that file specifically. We'd also add 'tend_zint_100m_DIC' over the same domain. We could have an assert clause in the testing that our approximation is within some epsilon of the tendency term.
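A hedged sketch of what such a test might look like; the file names, mask index, and tolerance below are placeholders, not existing test data:

import numpy as np
import xarray as xr

from pop_tools import regional_tracer_budget  # called as in the Usage example above


def test_regional_dic_budget_closes():
    # Hypothetical equatorial-Pacific subset shipped with the test suite.
    budget = regional_tracer_budget(
        'tests/data', 'eq_pacific_subset', 'DIC', 'POP_gx1v7',
        mask_int=2, budget_depth=100,
    )
    budget_sum = budget.to_array().sum('variable')  # sum of the budget terms

    # Placeholder for tend_zint_100m_DIC pre-integrated over the same domain
    # and converted to the same units as the budget terms.
    expected = xr.open_dataarray('tests/data/eq_pacific_tend_integrated.nc')

    # The reconstructed budget should close to within some small epsilon.
    np.testing.assert_allclose(budget_sum, expected, atol=1e-3)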

@bradyrx (Contributor Author) commented Jun 7, 2019

I know it would require some additional effort to refactor your code to use xgcm. However, in the long run I believe it could have major benefits. Ultimately our goal with xgcm is to be able to write the same (or at least very similar) analysis code against many different models. As long as effort is being made to develop a new "pop-tools" model, I think it could make sense to have xgcm as an upstream dependency. Otherwise, you are going to be reinventing lots of wheels.

@rabernat, thanks for taking the time to look through the code here and for the suggestion. This looks like it could be very useful and would smooth things out for future development on this module (e.g., global budget, overflow parameterizations, GM/submesoscale).

I'll look through the examples you gave and branch off of this PR to see if I can make it work. I'll follow up if it seems promising and I hit any roadblocks.

@bradyrx (Contributor Author) commented Jun 9, 2019

@matt-long @andersy005 , I'm going to branch off this PR to try out xgcm this week.

@andersy005 (Contributor)

@bradyrx, sounds good. Could you point me to some dataset on glade that I can use to put together sample datasets for testing purposes?

@bradyrx (Contributor Author) commented Jun 10, 2019

@andersy005, some single-simulation output is available here: /glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast

I'd say to start by grabbing just DIC, since if it works, all other BGC tracers should work. You can also build a T/S one once I implement the proper conversions there (still need to do this).

Files needed:

  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.ecosys.nyear1.UE_DIC.0249-0316.nc
  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.ecosys.nyear1.VN_DIC.0249-0316.nc
  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.ecosys.nyear1.WT_DIC.0249-0316.nc
  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.ecosys.nyear1.HDIFN_DIC.0249-0316.nc
  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.ecosys.nyear1.HDIFE_DIC.0249-0316.nc
  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.ecosys.nyear1.HDIFB_DIC.0249-0316.nc
  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.ecosys.nyear1.DIA_IMPVF_DIC.0249-0316.nc
  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.ecosys.nyear1.KPP_SRC_DIC.0249-0316.nc
  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.FvICE_DIC.024901-031612.nc
  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.FvPER_DIC.024901-031612.nc
  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.J_DIC.024901-031612.nc

To assert that budget is within epsilon of the saved out tendency:

  • g.e11_LENS.GECOIAF.T62_g16.009.pop.h.tend_zint_100m_DIC.024901-031612.nc

I would just subset some box over the equatorial Pacific and cut down to just the top 100 m (since the tendency assertion is over 100 m). You also probably only need a few years of output.
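A rough sketch of how one of those files could be subset; the index ranges and output path are illustrative (POP's grid is curvilinear, so the box is selected by index rather than by lat/lon here):

import xarray as xr

basepath = '/glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast'
fname = 'g.e11_LENS.GECOIAF.T62_g16.009.pop.h.ecosys.nyear1.UE_DIC.0249-0316.nc'
ds = xr.open_dataset(f'{basepath}/{fname}')

subset = ds.isel(
    time=slice(0, 5),       # only a few years
    z_t=slice(0, 10),       # roughly the upper 100 m
    nlat=slice(180, 200),   # hypothetical equatorial band
    nlon=slice(100, 140),   # hypothetical span of longitude
)
subset.to_netcdf('UE_DIC.eq_pacific_subset.nc')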

@bradyrx (Contributor Author) commented Jun 10, 2019

@andersy005, I added an example notebook to be compiled by nbsphinx. It should hold through the xgcm edits I try, since those will just be internal (and shouldn't affect high-level calls). This passed the make html build since I kept the cells executed (nbsphinx won't handle dask-jobqueue, and the compile time would otherwise be quite slow).

Here's a gist of it now for reference: https://nbviewer.jupyter.org/gist/bradyrx/d09ee61672e503721f20ef161ee98ade

@andersy005 (Contributor) left a comment

@bradyrx, thank you for the dataset pointers. I will look into putting together a temporary dataset we can use for testing as we wait for a long term solution.

Can you update docs/source/api.rst with an entry for regional_tracer_budget?

ds = ds.rename({'z_w_bot': 'z_t'})

# Force datetime
if not np.issubdtype(ds.time.dtype, np.datetime64):

@bradyrx, we can extend this to support datasets with a time coordinate decoded using cftime._cftime.datetime, which extends to calendars other than the proleptic Gregorian calendar.

Xarray now supports cftime._cftime.datetime based data types for groupby and resample operations to the same extent as for datetime/numpy.datetime based data types.

In [26]: ds.time
Out[26]: 
<xarray.DataArray 'time' (time: 24)>
array([cftime.DatetimeNoLeap(1, 2, 1, 0, 0, 0, 0, 4, 32),
       cftime.DatetimeNoLeap(1, 3, 1, 0, 0, 0, 0, 4, 60),
       cftime.DatetimeNoLeap(1, 4, 1, 0, 0, 0, 0, 0, 91),
       cftime.DatetimeNoLeap(1, 5, 1, 0, 0, 0, 0, 2, 121),
       cftime.DatetimeNoLeap(1, 6, 1, 0, 0, 0, 0, 5, 152),
       cftime.DatetimeNoLeap(1, 7, 1, 0, 0, 0, 0, 0, 182),
      .........
       cftime.DatetimeNoLeap(2, 9, 1, 0, 0, 0, 0, 0, 244),
       cftime.DatetimeNoLeap(2, 10, 1, 0, 0, 0, 0, 2, 274),
       cftime.DatetimeNoLeap(2, 11, 1, 0, 0, 0, 0, 5, 305),
       cftime.DatetimeNoLeap(2, 12, 1, 0, 0, 0, 0, 0, 335),
       cftime.DatetimeNoLeap(3, 1, 1, 0, 0, 0, 0, 3, 1)], dtype=object)
Coordinates:
  * time     (time) object 0001-02-01 00:00:00 ... 0003-01-01 00:00:00
Attributes:
    long_name:  model time
    bounds:     time_bounds

cftime time values are stored in a numpy array with dtype='object'.

In [27]: ds.time.dtype
Out[27]: dtype('O')

The only way to test for the cftime class that I know of is to retrieve one value from the array:

In [28]: isinstance(ds.time.data[0], cftime._cftime.datetime)
Out[28]: True
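A hedged sketch of how the two checks (numpy datetime64 and cftime) could be combined into one helper; the function name is a placeholder:

import cftime
import numpy as np
import xarray as xr


def time_is_decoded(ds: xr.Dataset) -> bool:
    # True if the time coordinate is either numpy datetime64 or cftime-based.
    if np.issubdtype(ds.time.dtype, np.datetime64):
        return True
    return isinstance(ds.time.values.flat[0], cftime.datetime)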

@bradyrx (Contributor Author) replied:
Note: from a conversation with @mclong, this would be better moved to a helper function that helps users construct the dataset. I think this was only here to handle temporal resampling, which will be removed in this PR.

@bradyrx (Contributor Author) commented Oct 16, 2019

Here is a nice notebook by @jbusecke using xgcm on CMIP6 models. This might be helpful for a future PR to convert this over to xgcm in coordination with #21.

https://nbviewer.jupyter.org/github/jbusecke/cmip6-hackathon-demos/blob/master/notebooks/xgcm_exmples.ipynb

For now, it's probably best for me to incorporate the suggestions and get this working with some test data, then convert it to be xgcm-compatible and ensure that the results are the same.

@jbusecke commented Oct 16, 2019 via email

@bradyrx bradyrx force-pushed the add_regional_tracer_budget branch from e2f7133 to 301e9a3 Compare May 28, 2020 18:42
@bradyrx (Contributor Author) commented May 28, 2020

@dcherian, @klindsay28, @mnlevy1981: ignore the auto-generated review request for now. I just rebased to master since this is very old.

Going to go through the comments and revise this and prepare for merging.

@dcherian (Contributor)

This will probably be a lot easier if we use xgcm with metrics, so maybe tackle #40 first?

@bradyrx (Contributor Author) commented May 28, 2020

@dcherian, I'll dig into that. I figured I'd open a new PR and switch to an xgcm-based implementation now that we have pop_tools.to_xgcm_grid_dataset(), and check that it's BFB with the solutions I got here manually.

@bradyrx (Contributor Author) commented May 29, 2020

Closing this for now, since it's been polluting the PR list for a while. I'll be taking a look at #40 as a path forward, and will address the comments here when putting together a new budget PR.

@bradyrx bradyrx closed this May 29, 2020