This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

spatial and/temporal subset problem #693

Closed
Evadzi opened this issue Jun 26, 2018 · 5 comments

@Evadzi

Evadzi commented Jun 26, 2018

What I expected Cate to do.

For most ESA datasets it is possible to subset ingested data (at least in my past experiments). I tried to load monthly cloud (esacci-l......AASTR2) and aerosol (esacci_AEROSOLS_mon_L3C.AER_PRODUCTS_AATSR_Envisat_SU_04_21_R1.nc) data from the open data portal for analysis (spatial and temporal subsetting, co-registration, and correlation analysis). Cate is expected to display the data correctly for the subset region.

What Cate does instead or doesn't do at all (3 main issues).

  1. The subset cloud data is displayed for the wrong spatial region (see Fig. A1; check the lon/lat values). The same happened with the aerosol data.

[Figure A1: spatial_subset]

  2. I suspected the issue was related to the reference to 0° longitude, so I swapped the position of the negative values (which shouldn't be necessary); this did cover the region. However, it produces the display error from item 1 when I perform a temporal subset on this spatially adjusted subset.

  3. I later reduced the lon/lat values further to -140,-80,140,80 for the cloud and aerosol data, which surprisingly was displayed correctly, and I continued with my co-registration and correlation analysis.
    I saved my workspace and reopened it; as expected, the data was displayed at the wrong location (see Fig. A2).

[Figure A2: subset_changed_after_restart]

Steps to reproduce the problem (see my Python script below).

Step 1

CFC = cate.ops.open_dataset(ds_id="local.cloud.mon")

Step 2

aerosol = cate.ops.open_dataset(ds_id="local.Aerosol.mon")

Step 3

AAOD = cate.ops.select_var(ds=aerosol, var="AAOD550_mean")

Step 4

cfc_temp_subset = cate.ops.subset_temporal(ds=CFC, time_range="2004-12-01,2010-11-01")

Step 5

aaod_temp_subset = cate.ops.subset_temporal(ds=AAOD, time_range="2004-12-16,2010-11-16")

Step 6

cfc_spa_temp_subset = cate.ops.subset_spatial(ds=cfc_temp_subset, region="-140,-80,140,80")

Step 7

aaod_spa_temp_subset = cate.ops.subset_spatial(ds=aaod_temp_subset, region="-140,-80,140,80")

Step 8

cfc_coregis_spa_temp_subset = cate.ops.coregister(ds_master=aaod_spa_temp_subset, ds_slave=cfc_spa_temp_subset, method_us="nearest")

Step 9

cfc_aaod_correlation = cate.ops.pearson_correlation(ds_x=cfc_coregis_spa_temp_subset, ds_y=aaod_spa_temp_subset, var_x="cfc", var_y="AAOD550_mean")

Specifications
Cate Desktop 2.0.0-dev 15

@JanisGailis
Member

The issue here is the following: at some point we decided to use shapely polygons for all operations that take a region as input, so that we can also work with, say, country outlines. So we have a PolygonLike helper class that converts everything it can (WKT polygons, lat/lon pairs, etc.) to a shapely polygon. In this process we inherently lose information about which lon value was the min and which was the max, so we lose the information whether the polygon was expected to cross the anti-meridian (-180/180) or not. There are some heuristics in place to determine this. In general, for simple 'box polygons' the last line of defence is to select the smaller of the two possible rectangles. Because, who would ever try to select a very large area that crosses the anti-meridian? :)
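To illustrate the point above, here is a minimal, library-free sketch. It is not Cate's actual PolygonLike code; `box_vertices` and `smallest_box_crosses_antimeridian` are hypothetical names made up for this example. It shows that the two possible readings of a lon/lat box share the same corner points, and what a "smaller rectangle wins" rule would decide for the region from this issue.

```python
def box_vertices(lon1, lat1, lon2, lat2):
    """Corner points of a lon/lat box as an unordered set (hypothetical helper)."""
    return {(lon1, lat1), (lon1, lat2), (lon2, lat1), (lon2, lat2)}


def smallest_box_crosses_antimeridian(lon_a, lon_b):
    """Apply a 'smaller rectangle wins' rule to two longitudes.

    Returns True when the rectangle wrapping through +/-180 is narrower
    than the one that stays inside [-180, 180].
    """
    direct = abs(lon_b - lon_a)   # width without crossing +/-180
    wrapped = 360 - direct        # width when crossing +/-180
    return wrapped < direct


# The same corners describe both the wide box and the wrapping box,
# so the corner set alone cannot tell the two apart.
wide = box_vertices(-140, -80, 140, 80)
wrapping = box_vertices(140, -80, -140, 80)
print(wide == wrapping)

# For -140..140 the wrapping rectangle (80 degrees wide) is smaller
# than the direct one (280 degrees wide).
print(smallest_box_crosses_antimeridian(-140, 140))
print(smallest_box_crosses_antimeridian(-10, 30))
```

Under this assumed rule, a request for the wide -140..140 box would be read as the narrow strip across the anti-meridian, which is the opposite of what was asked for.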

A workaround is to use a 'hand-drawn' polygon instead of lat/lon values. This should 'almost always' work as expected.

If this is a common use case for some reason, we can try to fix this somehow, but it will be ugly. E.g., creating a separate subset operation for such cases, or adding a boolean switch that lets the user say explicitly whether the intended output should cross the anti-meridian.
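A rough sketch of what such an explicit switch could look like, independent of Cate's real operations. The function names and the `crosses_antimeridian` parameter are assumptions made up for this example, not an existing Cate API.

```python
def lon_intervals(lon_min, lon_max, crosses_antimeridian=False):
    """Longitude intervals to keep for a box with sorted bounds.

    With the switch off, the box stays inside [lon_min, lon_max].
    With the switch on, the selection wraps through +/-180 instead.
    """
    if lon_min > lon_max:
        raise ValueError("expected lon_min <= lon_max")
    if crosses_antimeridian:
        return [(lon_max, 180.0), (-180.0, lon_min)]
    return [(lon_min, lon_max)]


def in_selection(lon, intervals):
    """True if a longitude falls into any of the given intervals."""
    return any(lo <= lon <= hi for lo, hi in intervals)


inside = lon_intervals(-140, 140)
across = lon_intervals(-140, 140, crosses_antimeridian=True)

# Longitude 0 belongs only to the wide box, longitude 170 only to the wrapping one.
print(in_selection(0, inside), in_selection(0, across))
print(in_selection(170, inside), in_selection(170, across))
```

With the intent stated explicitly, no heuristic is needed to choose between the two rectangles.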

@JanisGailis JanisGailis self-assigned this Jun 26, 2018
@JanisGailis
Member

Wait, no, scratch the above, I had forgotten we should now treat explicit coordinates correctly.

@JanisGailis
Member

JanisGailis commented Jul 16, 2018

I think I've pinpointed the problem. Doing open, subset temporal, subset spatial in a clean workspace yields the expected result:
[image]
However, when the workspace is saved, the region for the spatial_subset step becomes a WKT polygon instead of the original lon1, lat1, lon2, lat2 tuple, resulting in a garbled output:
[image]
Which becomes a clean subset after normalization, but the opposite of what was originally required:
[image]

This, by the way, is the expected output for the given polygon: POLYGON ((140 -80, 140 80, -140 80, -140 -80, 140 -80))

This should be fixed by making sure the original tuple is saved as the input of that operation.
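As a sketch of that idea only: the step layout and the helpers `save_step` and `load_region` below are hypothetical and do not reflect Cate's real workspace format. The point is to persist the region exactly as the user entered it, so the lon1/lon2 order survives a save and reload.

```python
import json


def save_step(op_name, region):
    """Serialize a workflow step, storing the region text as entered."""
    return json.dumps({"op": op_name, "inputs": {"region": region}})


def load_region(serialized):
    """Restore the region as a (lon1, lat1, lon2, lat2) tuple of floats."""
    text = json.loads(serialized)["inputs"]["region"]
    lon1, lat1, lon2, lat2 = (float(value) for value in text.split(","))
    return lon1, lat1, lon2, lat2


saved = save_step("subset_spatial", "-140,-80,140,80")
restored = load_region(saved)

# The order of the two longitudes is preserved, so the reloaded step
# still describes the wide box rather than a polygon open to reinterpretation.
print(restored)
```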

@JanisGailis
Member

@Evadzi This should be fixed on master now.

@JanisGailis
Member

Reopen if it persists!
