Finding examples of openly accessible images #3
Comments
I'm not sure what you mean about this image having only two bands; it has 11 bands if you count panchromatic: |
Oh, I think I see the confusion: The information from the different bands is in different files. You need all the files, if you want all the bands. So I don't think we actually need a new example here, you just have to read all the different bands in, from different files. |
As the current example does have sufficient bands (across the relevant files) I've renamed this issue to reflect that we would still like a variety of good, publicly accessible image examples regardless. |
The JPEG-formatted images are not suitable for calculating things like
NDVI (vegetation). There are a couple of sites to download the images
from, like EarthExplorer and GloVis, but the how and why of that I hope
is beyond the scope of the examples other than providing a suitable
download location. I went on to download the same two LANDSAT images
from NASA's EarthExplorer and have started a notebook with the intent to
go through the entire exercise step-by-step (which includes things like
stacking the individual bands, generating masks from NoData and
non-overlapping images, calculating NDVI, visualizing, etc.). The
images are 7961x7241 and 7821x7941 for the 1988/10/22 and 2017/10/22
images respectively. These images also do not exactly overlap, so they
have to be clipped, etc., if you are going to use the conventional "open
as an xarray" approach. This is a very common workflow for people doing
geospatial analysis, and it is just what it is. While we could slice
them down as Jean-Luc suggests, that sidesteps and hides a lot of issues
that users have to deal with. This has contributed to my own
confusion and difficulty learning your tools -- because there are enough
differences between xarray, dask, rasterio, and numpy that none of my
intuition on how they work together actually holds.
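A minimal sketch of the NDVI and NoData-masking steps mentioned above, using NumPy only. The tiny arrays stand in for the red and near-infrared bands, and the NoData value of 0 is an assumption for illustration; real LANDSAT products document their own fill value and band numbering.

```python
import numpy as np

def ndvi(red, nir, nodata=0):
    """Compute NDVI = (NIR - red) / (NIR + red), masking NoData pixels.

    `nodata` is an assumed fill value; check the metadata of the real product.
    """
    red = np.asarray(red, dtype="float64")
    nir = np.asarray(nir, dtype="float64")
    # A pixel is invalid if either band holds the fill value
    # or if the denominator would be zero.
    invalid = (red == nodata) | (nir == nodata) | ((nir + red) == 0)
    out = np.full(red.shape, np.nan)
    valid = ~invalid
    out[valid] = (nir[valid] - red[valid]) / (nir[valid] + red[valid])
    return out

# Made-up 2x2 bands; the top-left pixel is NoData in both.
red = np.array([[0, 50], [100, 30]])
nir = np.array([[0, 150], [100, 90]])
result = ndvi(red, nir)
```

Invalid pixels come back as NaN, so later steps (plotting, differencing two dates) can skip them explicitly instead of treating the fill value as data.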
One other problem I stumbled on to is the ordering of the data -- is
this (bands, height, width) or (height, width, bands) or (height, bands,
width)... I think it is rasterio that has a helper function which
reorders the bands from the stacked 2D images to 3D image ordering.
Thinking about this a little I realized that I rarely have control over
how the data come to me, and this would suggest that a general
reordering/mapping method may be in order.
…On Apr 25 2018 8:27 AM, Jean-Luc Stevens wrote:
The original version of
[landsat_spectral_clustering.ipynb](https://github.com/pyviz-topics/EarthML/blob/master/examples/landsat_spectral_clustering.ipynb)
used ``redding.tif`` which was obtained originally from
[planet.com](https://www.planet.com/docs/api-quickstart-examples/step-2-download/).
As I wasn't sure whether this image could be made available, I
updated the notebook to use a landsat example taken from a
[datashader example](http://datashader.org/topics/landsat.html).
@ebo then informed me by e-mail that this image is not really
suitable as it only has two bands. We would like some example images
that can be made public that also have a decent number of bands. This
is important as we will then be able to compute the various indices
that we might then want to learn on.
One suggestion by @ebo was to use [these images of a disappearing
lake](https://earthobservatory.nasa.gov/IOTD/view.php?id=91921)
although I only see a link to download them in JPG format?
Lastly, a [new
notebook](1545c61)
has been committed referencing 'Midwest_Mosaic.tif' which I don't
think we have discussed yet. Is this something we could slice down
and
add to the repo as an example?
|
On Apr 25 2018 8:33 AM, James A. Bednar wrote:
> I'm not sure what you mean about this image having only two bands; it
> has 11 bands if you count panchromatic:
> http://datashader.org/topics/landsat.html
That one might, but the image
<https://github.com/pyviz-topics/EarthML/blob/master/examples/landsat-sample.tiff>
added to pyviz-topics/EarthML repository only has two:
=================
# gdalinfo ~/Downloads/landsat-sample.tiff
Driver: GTiff/GeoTIFF
Files: /home/jldavid3/Downloads/landsat-sample.tiff
Size is 2500, 2500
Coordinate System is `'
Metadata:
TIFFTAG_DOCUMENTNAME=/Users/jstevens/Desktop/development/EarthML/examples/cropped.tiff
TIFFTAG_IMAGEDESCRIPTION=Created with GIMP
TIFFTAG_RESOLUTIONUNIT=2 (pixels/inch)
TIFFTAG_XRESOLUTION=72
TIFFTAG_YRESOLUTION=72
Image Structure Metadata:
COMPRESSION=LZW
INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left ( 0.0, 0.0)
Lower Left ( 0.0, 2500.0)
Upper Right ( 2500.0, 0.0)
Lower Right ( 2500.0, 2500.0)
Center ( 1250.0, 1250.0)
Band 1 Block=2500x64 Type=Byte, ColorInterp=Gray
Mask Flags: PER_DATASET ALPHA
Band 2 Block=2500x64 Type=Byte, ColorInterp=Alpha
|
I wonder if that is what is responsible. It seemed to be the quickest way to slice the image but I guess it might not have preserved the bands. |
Right. @jlstevens, please rename the example file to indicate which original image you started with; the filename indicates which bands it was. And then we could consider stacking the various images into a single merged image, but the separate images are how the files were provided from LANDSAT. |
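As a rough sketch of the stacking step, assuming each band file has already been read into a 2D NumPy array of the same shape (the file reading itself, e.g. with rasterio or xarray, is left out here), the per-band arrays can be merged along a new leading axis. The band names and array contents below are made up for illustration.

```python
import numpy as np

# Hypothetical per-band arrays; in practice each would come from its own file.
height, width = 4, 5
bands = {
    "B4": np.full((height, width), 10, dtype="uint16"),
    "B5": np.full((height, width), 20, dtype="uint16"),
    "B6": np.full((height, width), 30, dtype="uint16"),
}

# Fix the band order explicitly so the first axis is predictable.
names = sorted(bands)
# np.stack requires identical shapes and adds a new axis (here: axis 0).
stacked = np.stack([bands[name] for name in names], axis=0)
```

The result has shape `(bands, height, width)`; keeping the `names` list alongside it records which index corresponds to which band.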
On Apr 25 2018 8:40 AM, James A. Bednar wrote:
> Oh, I think I see the confusion: The information from the different
> bands is in different files. You need all the files, if you want all
> the bands. So I don't think we actually need a new example here, you
> just have to read all the different bands in, from different files.
Why I was advocating a different image/example is that if you are doing
a change detection you have to compare two images which are not
guaranteed to exactly overlap, have different spectral characteristics
(LANDSAT-5 vs 8), and other such geospatial issues. Whatever we choose,
if we pick an example and use it again and again, then we can leverage
it for a half-dozen lessons. The 90% loss of Walker Lake water volume
is not only timely, but people care about the background story.
If it is used for a tutorial on end-to-end how you do this stuff, then
we can demonstrate what actually happens when you work with these images
for real.
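
To make the overlap problem concrete, here is a small sketch of clipping two grids to their common extent. It assumes both images are north-up, share the same projection and pixel size, and are described by hypothetical `(left, bottom, right, top)` bounds; the numbers are invented, not taken from the actual scenes. Real scenes may additionally need reprojection or resampling.

```python
import numpy as np

def intersection(a, b):
    """Return the overlapping (left, bottom, right, top) of two bounds, or None."""
    left, bottom = max(a[0], b[0]), max(a[1], b[1])
    right, top = min(a[2], b[2]), min(a[3], b[3])
    if left >= right or bottom >= top:
        return None
    return (left, bottom, right, top)

def window(bounds, common, pixel_size):
    """Row and column slices covering `common` inside an image with `bounds`."""
    left, bottom, right, top = bounds
    col0 = int(round((common[0] - left) / pixel_size))
    col1 = int(round((common[2] - left) / pixel_size))
    # Row 0 is the top of a north-up image, so rows are measured down from `top`.
    row0 = int(round((top - common[3]) / pixel_size))
    row1 = int(round((top - common[1]) / pixel_size))
    return slice(row0, row1), slice(col0, col1)

pixel_size = 30.0
bounds_a = (0.0, 0.0, 300.0, 240.0)    # 8 rows x 10 columns
bounds_b = (60.0, 30.0, 360.0, 270.0)  # 8 rows x 10 columns, shifted
image_a = np.arange(80).reshape(8, 10)
image_b = np.arange(80).reshape(8, 10)

common = intersection(bounds_a, bounds_b)
clip_a = image_a[window(bounds_a, common, pixel_size)]
clip_b = image_b[window(bounds_b, common, pixel_size)]
```

After clipping, both arrays have the same shape and cover the same ground, which is the precondition for a pixel-by-pixel comparison between the two dates.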
|
I agree that if we can tell a compelling story with the data we have then we should. |
Sounds good. It will be great to have additional examples showing other topics, and it will be great to keep the overall number of data files that people have to download and that we have to document low. But it's not the number of bands that would invalidate this example (as in Jean-Luc's original title), just that some other example may subsume it for other reasons. When that happens, great! |
On Apr 25 2018 9:18 AM, Jean-Luc Stevens wrote:
> > TIFFTAG_IMAGEDESCRIPTION=Created with GIMP
> I wonder if that is what is responsible. It seemed to be the quickest
> way to slice the image but I guess it might not have preserved the
> bands.
While that is the offending tool, the real problem is that the image was
processed in a way that lost most or all of the projection information and
converted a multispectral image into a grayscale (panchromatic) one.
|
There are some high-resolution CC BY-SA-licensed examples at https://info.planet.com/download-free-high-resolution-skysat-image-samples/, though none of them cover the same region of the earth at different times that I can see. |
In case it helps, the original files from the current example are available from: https://github.com/bokeh/datashader/blob/master/examples/datasets.yml#L36 |
On Apr 25 2018 9:22 AM, Jean-Luc Stevens wrote:
> I agree that if we can tell a compelling story with the data we have
> then we should.
I'm working on that... Not only was the study highlighted by NASA's
Earth Observatory
<https://earthobservatory.nasa.gov/IOTD/view.php?id=91921>, the study
was also published in Nature Geoscience
<https://www.nature.com/articles/ngeo3052>. That is a huge deal.
What I am proposing is to replace the original example image with these
two, unless there is a compelling reason or similar compelling story
about that image. There may be one and I just missed it...
Oh, another interesting thing is that we might be able to get access to
the salinity data to replicate the graphs
<https://www.nature.com/articles/ngeo3052/figures/2> in the Nature
paper.
|
Sounds good! No, there wasn't a compelling story about the original image, though there was intended to be. :-) It was originally chosen to try to show the differences between the actual coastline of Southern Louisiana around New Orleans and what standard maps show as the outline of Louisiana, but the wrong image was selected, and so it ended up being the wrong bit of coastline, not telling that story at all. So there's no problem replacing it with something that has a better story. |
Can I ask you to expand on this a bit further? I suspect that Numpy and XArray both have mechanisms to help you here. In particular you might want to look into parts of their respective APIs that include `stack` and `transpose` functions depending on what you want. |
On Apr 25 2018 9:35 AM, James A. Bednar wrote:
> Sounds good! No, there wasn't a compelling story about the original
> image, though there was intended to be. :-) It was originally chosen
> to try to show the differences between the actual coastline of
> Southern Louisiana around New Orleans and what standard maps show as
> the outline of Louisiana, but the wrong image was selected, and so it
> ended up being the wrong bit of coastline, not telling that story at
> all. So there's no problem replacing it with something that has a
> better story.
I would have to check on some things, but my wife is a senior research
ecologist with the USGS, and has permanently monitored sites in the
cypress swamps. The water extraction in Texas is causing the cypress
swamps to die in the area around Beaumont TX. If we had an example that
focused on that area I may be able to get you some really compelling
stories (I hesitate to call them good stories). I would have to check
if that would cause a conflict of interest, etc. She also has sites in
Jean Lafitte National Park, which is around where you were talking
about, and we can engage quite a number of folks there as well.
|
On Apr 25 2018 9:49 AM, Matthew Rocklin wrote:
> One other problem I stumbled on to is the ordering of the data -- is
> this (bands, height, width) or (height, width, bands) or (height,
> bands,
> width)... I think it is rasterio that has a helper function which
> reorders the bands from the stacked 2D images to 3D image ordering.
> Thinking about this a little I realized that I rarely have control
> over
> how the data come to me, and this would suggest that a general
> reordering/mapping method may be in order.
> Can I ask you to expand on this a bit further? I suspect that Numpy
> and XArray both have mechanisms to help you here. In particular you
> might want to look into parts of their respective APIs that include
> `stack` and `transpose` functions depending on what you want.
In the last example I sent to Jean-Luc I used transpose and stack. I
am not sure I did it right, but yes I am marginally aware of the base
functionality.
What I had meant by the mapping function is, it is intuitive what
(band, y, x) means, as well as (x, y, z) in 3D data. Either documenting
a couple of examples using stack and transpose to show how it is done,
or developing a helper function that does an intuitive mapping, would be
useful I think. There is a precedent with rasterio's reshape_as_image
which will take an image stack (3, 718, 791) and remap it to (718, 791,
3) or in my example above (orient={'bands':'z'}) or some such. (see
https://github.com/mapbox/rasterio/blob/master/docs/topics/image_processing.rst)
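
One possible shape for such a helper, sketched with NumPy's `transpose`. The function name `reorder_axes` and its arguments are hypothetical, not part of rasterio, NumPy, or xarray; it simply maps axis names to positions so that the caller states the current and desired ordering explicitly.

```python
import numpy as np

def reorder_axes(data, current, desired):
    """Transpose `data` from the `current` axis names to the `desired` order.

    Hypothetical helper: `current` and `desired` are sequences of axis names
    such as ("band", "y", "x").
    """
    current, desired = list(current), list(desired)
    if sorted(current) != sorted(desired):
        raise ValueError("current and desired must contain the same axis names")
    if len(current) != data.ndim:
        raise ValueError("need one axis name per array dimension")
    # For each desired axis, look up where it sits in the current layout.
    order = [current.index(name) for name in desired]
    return np.transpose(data, order)

# Same shapes as the rasterio example: (3, 718, 791) -> (718, 791, 3).
stack = np.zeros((3, 718, 791))
image = reorder_axes(stack, ("band", "y", "x"), ("y", "x", "band"))
```

When the data already carries named dimensions, xarray's `DataArray.transpose` accepts the dimension names directly, which gives the same effect without a separate helper.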
|
BTW, the author of the Nature Geoscience article sent me the data to replicate several of the published graphs. I have a number of other things on my plate at the moment, but if we extend the Walker Lake example to include the Great Salt Lake, or cover it in a second notebook, we should be able to replicate the work in https://www.fs.fed.us/rm/pubs_journals/2017/rmrs_2017_wurtsbaugh_w001.pdf. Also, I have permission to publicly release the data, provided we properly cite and credit the work. I would also want to have them review this before formally releasing it, if possible, to make sure I/we do not make a mistake that would offend. |
A recently announced source of freely available hi-res imagery: |
The original version of landsat_spectral_clustering.ipynb used redding.tif which was obtained originally from planet.com. As I wasn't sure whether this image could be made available, I updated the notebook to use a landsat example taken from a datashader example. @ebo then informed me by e-mail that this image is not really suitable as it only has two bands. We would like some example images that can be made public that also have a decent number of bands. This is important as we will then be able to compute the various indices that we might then want to learn on.
One suggestion by @ebo was to use these images of a disappearing lake although I only see a link to download them in JPG format?
Lastly, a new notebook has been committed referencing 'Midwest_Mosaic.tif' which I don't think we have discussed yet. Is this something we could slice down and add to the repo as an example?