add budget module and regional tracer budget computation #12
Usage

from pop_tools import regional_tracer_budget

# directory with output
basepath = '/glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast'
# base string for filename
filebase = 'g.e11_LENS.GECOIAF.T62_g16.009.pop.h'
tracer = 'DIC'
# max depth of budget from surface, in m
budget_depth = 100
ds = regional_tracer_budget(basepath, filebase, tracer, 'POP_gx1v7',
                            mask_int=6, budget_depth=budget_depth)

Testing

Below is the sum of the integrated budget terms vs. integrating the saved tendency output:

Budget Sum: 0.0012 +/- 0.0123 PgC/yr
DIC Tendency Output: 0.0011 +/- 0.0123 PgC/yr

[Plots of the individual budget terms, all on the same colorbar, omitted.]
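The closure check amounts to a residual of the summed budget terms against the saved tendency; roughly something like this (term names here are illustrative, not necessarily the module's actual output variables):

budget_sum = ds['ladv'] + ds['vadv'] + ds['lmix'] + ds['vmix'] + ds['sms']
residual = budget_sum - ds['tend_zint_100m_DIC']
print(float(residual.mean()), float(residual.std()))  # expect ~0 within noise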
intake-esm's documentation could use some love :). For the last 5 months, intake-esm's development has been a moving target. It's time for a thorough, overdue refactoring, since there's a growing number of stakeholders interested in using it. We are planning on overhauling intake-esm in the coming weeks, which will hopefully alleviate some pain points of using it. It would be nice to have clear documentation around to save users time spent messing around with it.
The current implementation assumes that all input files live in the same directory. Wouldn't it be more beneficial to decouple the I/O component from the computation itself? Is it possible for the functions to allow passing in a dataset instead of file paths? With this approach, we could dedicate efforts to handling the I/O component via intake-esm.
Glad to hear that about intake-esm. I'm excited to see that overhaul so I can get a better sense of how to use it in my work.
So the single directory is really forced by my current I/O setup.
Yes, and I agree with your ultimate suggestion here because of this. The current I/O setup was to get the ball rolling for my use case, but I think it's best to not even handle I/O here, as you mention, and just have the pre-processing done by the user.
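Roughly, the decoupled call could look like this (the ds-first signature is hypothetical, not today's API):

import xarray as xr
from pop_tools import regional_tracer_budget

# the user handles I/O and pre-processing themselves...
ds = xr.open_mfdataset('/path/to/pop/output/*.nc', combine='by_coords')
# ...then passes the assembled dataset straight to the computation
budget = regional_tracer_budget(ds, tracer='DIC', mask_int=6, budget_depth=100)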
This should be a pretty easy implementation. Since we know POP dimension names, operations are done on named dimensions here (e.g., z_t, nlat, nlon).
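For example, with POP's fixed names, the reductions stay dimension-aware no matter how the user assembled the dataset (variable and mask names here are just for illustration):

# z_t is in cm in POP output, so the upper 100 m is z_t=slice(0, 100e2)
upper = ds['DIC'].sel(z_t=slice(0, 100e2))
regional = upper.where(ds['REGION_MASK'] == 6).mean(['nlat', 'nlon'])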
Yep, this seems like the right path. The only thing that concerns me is the amount of pre-processing demanded of the user in this case. But budget calculations are not a 'mindless' analysis task, so to speak, that one just cranks out; they generally demand a lot of prep work and time on the user's end, which is exactly why this is a module Matt and I talked about implementing. I think with careful documentation/examples, we can ensure that the user passes in the right dataset for this function, although it might require some renaming and digging around by the user. I'll see if I can mock up this approach tomorrow.
Force-pushed from 11c5867 to 0c1a450.
We are currently looking into ways to address this dataset issue for testing purposes. Not sure how long it will take to get a working, long-term solution. @matt-long, do you know if there is any data we could pull from an SVN repo for the time being?
@andersy005, I removed the dataset loading procedure in the most recent commit. However, I can't get this to run without a memory error. My output comes in at size (816, 60, 384, 320), so I chunk it into smaller pieces. I can get the previous commit (prior to this force-push) to run, since it does I/O for each computation task, i.e., there's never a point where it's managing the full dataset at once. I figure this should be well within dask's wheelhouse. Do you have much experience managing something like this? There might be a simple fix to handle all this; I've tried a few things without luck. I'd be happy to point you to a directory/load script to load in the files I've been using here.
@bradyrx, I remember changing the dask config to get this working when I was doing budgets with my CESM tracer budget notebook. Try turning off "spill to disk" in your dask config and give these settings a go:
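Something along these lines; the spill behavior lives under distributed.worker.memory, and the fractions here are illustrative, not necessarily what I used:

import dask

dask.config.set({
    'distributed.worker.memory.target': False,     # disable the spill target
    'distributed.worker.memory.spill': False,      # never spill to disk
    'distributed.worker.memory.pause': 0.80,       # pause worker at 80% memory
    'distributed.worker.memory.terminate': 0.95,   # restart worker at 95% memory
})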
You may need to monitor the dask workers' memory via the dashboard. Most likely the error is coming from chunks too big to fit in an individual dask worker's memory. What's the memory size for an individual worker in the cluster?
@sridge, I've had all these settings set for these tests in my dask config already.
@andersy005, I tried the following cluster configuration:

from dask_jobqueue import PBSCluster

cluster = PBSCluster(
    processes=18,
    memory="6GB",
    queue='premium',
    resource_spec='select=1:ncpus=36:mem=109G',
    walltime='1:00:00')
cluster.start_workers(16)

So this implies 6GB per worker, which should be plenty for the chunks I'm giving it. I can double-check the dashboard and see their memory limits.
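Back-of-envelope on those sizes (my arithmetic, assuming float64 output):

full_gb = 816 * 60 * 384 * 320 * 8 / 1e9    # ~48 GB for the whole array
step_mb = 60 * 384 * 320 * 8 / 1e6          # ~59 MB per single time step

So even generous time chunks of ~50 steps are only ~3GB, well under 6GB per worker.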
This could fall under this open issue: pydata/xarray#1832
Yep, this looks pretty similar, and it looks like there are potential solutions there. I just threw a bunch of workers at the old script (96298ed) and got a memory error again. This is using the following command, yielding 36 workers with a 5GB memory_limit each. I borrowed this from a notebook from @emaroon. I'm getting confused by some of these PBSCluster keywords. I'll try again with a few workers with a huge memory allocation if I can figure it out. I don't think my chunks should exceed 5GB, though.

from dask_jobqueue import PBSCluster
from dask.distributed import Client

numnodes = 4
cluster = PBSCluster(local_directory='/glade/scratch/rbrady/spill',
                     cores=36,
                     processes=9,
                     memory="50GB",
                     queue='regular',
                     walltime='1:00:00')
client = Client(cluster)
cluster.scale(numnodes * 9)
Okay, did a lot of work on this today, and it's certainly pydata/xarray#1832. This must be why you commented out the annual resampling line. With

ds = ds.groupby('time.year').mean('time').rename({'year': 'time'})

it breaks. I gave it 9 workers with 11GB memory each, and it seems like it threw this all on one worker and tried to load the whole thing into memory (outlined in @rabernat's issue). I watched one worker's memory climb rapidly up to 11GB and beyond before crashing.

The odd thing is that this seemed to work fine on a single node with dask arrays. I ran 96298ed on one node with 12 CPUs (no extra memory requested), and it handled the California Current mask in ~3 minutes. This includes the temporal resampling of the SMS term. I also ran the code with the 9-worker/11GB distributed setup and it crashed as before.

So, in short, the distributed cluster approach breaks with monthly 3D data, although a single big-memory node approach with chunking seems to handle it fine. This is only an issue for SMS terms, which are included in a lot of tracer budgets. What's the best path forward? We could potentially ask the user to pre-process their SMS term into annual means. Note that the above pertains to 96298ed; I haven't gotten the most recent commit to run on a single node, due to memory errors from having the full dataset in play at once.
I would be happy to have a quick look at the resampling issue you’re having. There are a couple tricks one can try. Can you explain concisely how I could reproduce the problem case?
@rabernat, thanks for the willingness here. Working on a reproducible example helped me figure out what was going on. It came down to how I was constructing my dataset. With my original approach, it seems it was trying to load the entire dataset into memory and was putting the resampling load onto one worker; constructing the dataset differently avoided that. Not exactly sure why this works, but the problem is solved. The time resampling moves quickly enough for the purposes of this module.
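For anyone hitting this later, the general pattern is to chunk at open time so the dataset is lazy from the start (a sketch, not necessarily my exact change):

import xarray as xr

# chunk while opening: every downstream op builds a lazy dask graph
ds = xr.open_mfdataset('/path/to/files/*.nc', combine='by_coords',
                       chunks={'time': 12})
annual = ds.groupby('time.year').mean('time')   # still lazy at this point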
🏆 An important lesson here! 😄 In general, a lot of care needs to be taken in properly setting up large aggregated multifile datasets. (Coordinates can be another source of trouble.) pydata/xarray#1832 has largely been resolved by recent improvements to dask and should probably be closed; it can be a red herring.
@bradyrx, for testing purposes, can we use data from glade?
I just had a look through this code. Overall I like the approach, and I think this will be very useful. However, I wanted to comment that you seem to be reinventing lots of what xgcm is designed to do. Xgcm is meant to help work with finite-volume data on Arakawa grids via a few simple operators; see its budget example. Several people have used xgcm successfully with POP. It does require some modifications to the datasets, however; there is a gist which shows how that works. I know it would require some additional effort to refactor your code to use xgcm, but in the long run I believe it could have major benefits. Ultimately our goal with xgcm is to be able to write the same (or at least very similar) analysis code against many different models. As long as effort is being made to develop a new pop-tools package, I think it could make sense to have xgcm as an upstream dependency. Otherwise, you are going to be reinventing lots of wheels. I would be happy to consult on how to approach this if you're interested.
vadv = _compute_vertical_advection(ds, grid, mask, kmax=kmax)
lmix = _compute_lateral_mixing(ds, grid, mask, kmax=kmax)
vmix = _compute_vertical_mixing(ds, grid, mask, kmax=kmax)
reg_budget = xr.merge([ladv, vadv, lmix, vmix])
If you were to use xgcm, you could replace the above with something like
grid = xgcm.Grid(grid_ds)
ladv = -grid.diff(adv_flux_x, 'X') - grid.diff(adv_flux_y, 'Y')
vadv = -grid.diff(adv_flux_z, 'Z')
etc.
@andersy005, this won't work, unfortunately. We need a specific set of tracer flux variables, since this is a pretty specialized case. To test all functions, we could use DIC. We'd need ['UE_DIC', 'VN_DIC', 'WT_DIC', 'HDIFN_DIC', 'HDIFE_DIC', 'HDIFB_DIC', 'KPP_SRC_DIC', 'DIA_IMPVF_DIC', 'FG_CO2', 'J_DIC', 'FvPER_DIC', 'FvICE_DIC']. One option would be for me to subset a small equatorial Pacific domain, say 5S-5N, the upper 100m, and some small span of longitude. That would bring the file size down pretty small, and we could test on that file specifically. We'd also add 'tend_zint_100m_DIC' over the same domain, and have an assert clause in the testing that our approximation is within some epsilon of the tendency term.
@rabernat, thanks for taking the time to look through the code here and for the suggestion. This looks like it could be very useful and would smooth things out for future development on this module (e.g., global budgets, overflow parameterizations, GM/submesoscale). I'll look through the examples you gave and branch off of this PR to see if I can make it work. I'll follow up if it seems promising or if I hit any roadblocks.
@matt-long, @andersy005, I'm going to branch off this PR to try out xgcm.
@bradyrx, sounds good. Could you point me to some dataset on glade that I can use to put together sample datasets for testing purposes?
@andersy005, some single-simulation output is available here: /glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast. I'd say to start by grabbing just DIC, since if it works, all other BGC tracers should work. You can also build a T/S one once I implement the proper conversions there (still need to do this). Files needed: the monthly history files for each of the DIC flux variables listed earlier (UE_DIC, VN_DIC, WT_DIC, etc.).
To assert that the budget is within epsilon of the saved-out tendency, also grab tend_zint_100m_DIC.
I would just subset some box over the equatorial Pacific and cut down to just the upper 100m (since the tendency assertion is over 100m). You also probably only need a few years of output.
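Roughly, the subsetting could look like this (the glob pattern, index ranges, and output filename are made up; z_t is in cm, so 100 m is 100e2):

import xarray as xr

basepath = '/glade/p/cesm/community/CESM-DPLE/CESM-DPLE_POPCICEhindcast'
ds = xr.open_mfdataset(basepath + '/*DIC*.nc', combine='by_coords')

# small equatorial Pacific box in index space, upper 100 m, three years
test = (ds.isel(nlat=slice(150, 170), nlon=slice(180, 200), time=slice(0, 36))
          .sel(z_t=slice(0, 100e2)))
test.to_netcdf('dic_budget_test_data.nc')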
@andersy005, I added an example notebook to be compiled by the docs build. Here's a gist of it for reference: https://nbviewer.jupyter.org/gist/bradyrx/d09ee61672e503721f20ef161ee98ade
@bradyrx, thank you for the dataset pointers. I will look into putting together a temporary dataset we can use for testing while we wait for a long-term solution.
Can you update docs/source/api.rst with an entry for regional_tracer_budget?
ds = ds.rename({'z_w_bot': 'z_t'})

# Force datetime
if not np.issubdtype(ds.time.dtype, np.datetime64):
@bradyrx, we can extend this to support datasets with a time coordinate decoded using cftime._cftime.datetime, which covers calendars other than the proleptic Gregorian calendar. Xarray now supports cftime._cftime.datetime-based data types for groupby and resample operations, to the same extent as for datetime/numpy.datetime64-based data types.
In [26]: ds.time
Out[26]:
<xarray.DataArray 'time' (time: 24)>
array([cftime.DatetimeNoLeap(1, 2, 1, 0, 0, 0, 0, 4, 32),
In [26]: ds.time
Out[26]:
<xarray.DataArray 'time' (time: 24)>
array([cftime.DatetimeNoLeap(1, 2, 1, 0, 0, 0, 0, 4, 32),
cftime.DatetimeNoLeap(1, 3, 1, 0, 0, 0, 0, 4, 60),
cftime.DatetimeNoLeap(1, 4, 1, 0, 0, 0, 0, 0, 91),
cftime.DatetimeNoLeap(1, 5, 1, 0, 0, 0, 0, 2, 121),
cftime.DatetimeNoLeap(1, 6, 1, 0, 0, 0, 0, 5, 152),
cftime.DatetimeNoLeap(1, 7, 1, 0, 0, 0, 0, 0, 182),
.........
cftime.DatetimeNoLeap(2, 9, 1, 0, 0, 0, 0, 0, 244),
cftime.DatetimeNoLeap(2, 10, 1, 0, 0, 0, 0, 2, 274),
cftime.DatetimeNoLeap(2, 11, 1, 0, 0, 0, 0, 5, 305),
cftime.DatetimeNoLeap(2, 12, 1, 0, 0, 0, 0, 0, 335),
cftime.DatetimeNoLeap(3, 1, 1, 0, 0, 0, 0, 3, 1)], dtype=object)
Coordinates:
* time (time) object 0001-02-01 00:00:00 ... 0003-01-01 00:00:00
Attributes:
long_name: model time
bounds: time_bounds
Cftime time values are stored in a numpy array with dtype='object':
In [27]: ds.time.dtype
Out[27]: dtype('O')
The only way I know of to test for the cftime class is to retrieve one value from the array:
In [28]: isinstance(ds.time.data[0], cftime._cftime.datetime)
Out[28]: True
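Putting that together, the existing np.datetime64 check could be extended roughly like so (helper name is just for illustration):

import cftime
import numpy as np

def time_is_decoded(ds):
    # numpy-backed datetimes pass the existing check
    if np.issubdtype(ds.time.dtype, np.datetime64):
        return True
    # cftime values hide behind dtype('O'); inspect a single element
    return isinstance(ds.time.values[0], cftime.datetime)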
Note, from a conversation with @matt-long: this would be better moved to a helper function that helps users construct the dataset. I think this was only here to handle temporal resampling, which will be taken out in this PR.
Here is a nice notebook by @jbusecke using xgcm on CMIP6 models: https://nbviewer.jupyter.org/github/jbusecke/cmip6-hackathon-demos/blob/master/notebooks/xgcm_exmples.ipynb. This might be helpful for a future PR to convert this over to xgcm in coordination with #21. For now, it's probably best for me to incorporate suggestions and get this working with some test data, then convert to xgcm-compatible code and ensure that the results are the same.
EDIT: Sorry, I had not grasped the whole story of this issue from an email.
We are planning to get curl/div/grad as operators into xgcm. Please raise an issue over at xgcm and let's discuss there. I am planning to generalize this so we can use it as a package! Where is the best way to discuss this centrally?
Compute kmax; deduce if using full depth. Convert all units to tendencies; only load subset of variables.
Add vertical advection, lateral mixing, vertical mixing, and SMS.
Make REGION_MASK default.
Force-pushed from e2f7133 to 301e9a3.
@dcherian, @klindsay28, @mnlevy1981: ignore the auto-generated review request for now. I just rebased onto master, since this is very old. I'm going to go through the comments, revise this, and prepare for merging.
This will probably be a lot easier if we use xgcm with metrics, so maybe tackle #40 first?
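For context, metrics attach cell geometry to the grid so integrals and averages are weighted automatically; roughly something like this (the POP metric variable names are assumed, not checked against the grid file):

import xgcm

grid = xgcm.Grid(ds, metrics={
    ('X',): ['DXT'],   # zonal cell width
    ('Y',): ['DYT'],   # meridional cell width
    ('Z',): ['dz'],    # layer thickness
})
# metric-aware volume integral, no manual cell-volume bookkeeping
dic_inventory = grid.integrate(ds['DIC'], ['X', 'Y', 'Z'])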
@dcherian, I'll dig into that. I figured I'd open a new PR and switch to xgcm there.
Closing this for now, since it's been polluting the PR list for a while. I will take a look at #40 as a path forward and will address the comments here when putting together a new budget PR.
Per discussions with @matt-long, this is a first implementation of a tracer budget on the POP grid. This first PR just computes a regional budget, i.e., it expects the user to input some POP horizontal mask and some depth level (or full depth) to calculate the tracer budget over.
I'm looking for a first-pass code review here, with any technique/optimization suggestions. I'll generalize this for all tracers (DIC, DIC_ALT_CO2, DOC, Fe, O2, temperature, salinity) while implementing the suggested changes from this PR.
Reviews/comments would be helpful from @matt-long, @andersy005, @sridge and any others.