-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Turn Download_LIDAR_data into a parameterized script #4
Comments
Hi Scott, I think Karthik's script is the latest and greatest we have which we should work on for further improvements :) @kvenkman please upload your python script to internal repo, and then we can take a stab. I am getting an account on MAAP, NASA's internal planetary computer. Making the 3DEP DSM there using the pdal pipeline is going to be one of my first tests of the platform 😊 I'll keep you all posted! |
The projection info in the EPT file is an issue. For the Mt. Baker site, we did not have to specify the NAVD88 geoid in the a_srs vertical projection info, and the output DEM had no geoid offset with respect to WGS84 ellipsoid products from ASP. |
Finally, from David's slack message, which I agree with:
|
We need to regroup on this, centralize, perform a detailed review, and define updated requirements. I'm not sure this should live in the DeepDEM repo. Moving it elsewhere eliminates need for pdal dependency. It could end up in coincident repo, as we will need to prepare 3DEP (and other) lidar data. Or it could be a standalone repo. The output lidar DSM products are needed by Shashank's co-registration scripts, to prepare aligned inputs for DeepDEM. |
I don't understand why points classified as roads are removed: https://github.com/uw-cryo/stv-aux-data/blob/main/pdal_pipeline.py#L117. This may be specific to Mt. Baker 3DEP data, and should be removed from the general processing pipeline. |
I have some concerns about the percentile filtering implementation and default use of idw gridding. I believe this is the cause of the unusual/unexpected offsets between the lidar "DSM" and other DSM sources over vegetation (@ShashankBice @Jack-Hayes). I see notes about filtering noisy Z values: https://github.com/uw-cryo/stv-aux-data/blob/main/pdal_pipeline.py#L252C3-L252C81. That is not what this is supposed to do. The objective of this percentile filter option was to allow users to create a gridded raster with output values using a relative height percentile (like RH98, where output value is the 98th percentile of the distribution for all height values for points within some radius of that output grid cell). This is a workaround, as the core functionality is not currently available in PDAL: PDAL/PDAL#1596. @kvenkman , the workaround we discussed was to remove the upper X% of point height values (not necessarily noise, these could be real returns from highest part of canopy within the cell), and then use If we are creating a DTM, and filters have isolated ground returns, then Hopefully that makes sense. This is critical and should be fixed ASAP, as it is affecting multiple projects. |
@dshean I wrote in the option to filter out road classifications as in some scenes they seemed to contribute to noise. We can switch it off for scenes where unnecessary - or remove the code altogether if its extraneous. For the current iteration of DeepDEM experiments, the percentile filter has been switched OFF (running tests to double-check), so we have been using RH100. We can extend the notebook to have the option to specify gridding methods as well. Lastly, I agree with the general consensus that the code to download, process and generate DSMs for various projects should not live in the DeepDEM repository - I wrote up this notebook (borrowing heavily from the OT version) to serve only as a starting point for folks who jump directly to using the repo, since there was no code for this purpose. Ideally this code would live/be maintained in another repo, and referenced / periodically be updated here. |
OK, but we don't want to use RH100. We want to use RH98, which can be accomplished using the PDAL percentile filter workaround and |
If you switched the percentile filter OFF, but are still using idw, then you are not creating an output DSM with RH100, and the values will be biased low over vegetation or any areas with multiple returns. Hopefully that is clear. |
Understood - the ground returns would be biasing the DSM low when using idw. I'll rerun the DSM processing |
OK. Can you please confirm what you mean by "rerun the DSM processing"? Want to make sure we are on the same page around the filters and parameters to use, so we don't have to do this again. It's not just the ground returns. The number of ground returns is relatively low. Most of the valid returns are coming from within the canopy. We want to limit our DSM values to our best estimate for the top of canopy returns. I appreciate centralization, and a notebook with single function and flexible options, but we really just need need separate pipelines for these products (DSM, DTM, intensity raster). We need to isolate 'first,only' returns and use idw for DSM creation. Or if using all returns, use a metric like RH98 and max. I've shared this in the past, but here it is again https://github.com/uw-cryo/EarthLab_AirQuality_UAV/blob/d9b37831b6b6b3aab4cbd476735fdf55b0379560/notebooks/EarthLab_AQ_lidar_download_processing_function.ipynb: |
Not a priority at the moment, but for everyone to better understand these issues, it would be valuable to run some tests for small bboxes over different vegetation types at PCD sites using different combinations of return filters, class filters, percentile filters and gridding methods. Analyzing differences between these outputs will illustrate the importance of these processing choices. Can then compare the PDAL output with the vendor-provided DTM products, and when available, the vendor-provided DSM products. For example, see observed differences in Cells 57 and 58 here: https://github.com/uw-cryo/EarthLab_AirQuality_UAV/blob/d9b37831b6b6b3aab4cbd476735fdf55b0379560/notebooks/EarthLab_AQ_lidar_analysis_viz_old.ipynb. |
Re. reprocessing the DSM, I was going to turn ON the percentile filter with a threshold of 98%, and set the gridding method to 'max'. Does that sound right? |
You can try that approach (remove upper 2% of points, then compute max), but I also think the more traditional ALS processing approach (isolating 'first,only' returns, then idw) should be implemented and tested. How much the two differ will depend on a number of factors, including point density, grid cell size, vegetation characteristics. |
After extensive testing over the Mt Baker site, I can confirm that using only the first returns along with idw gridding yields a DSM that is on average 10 m higher over dense vegetation, with very little change over bare terrain. The previous DSM I had generated was gridded using idw but considered all returns. I have reprocessed the DSM for the entire site and will be re-running training of the models tomorrow. |
Thanks everyone for all the clarifications and workflow refinement! Revisiting the initial focus of this thread, I think there is consensus to move the LiDAR processing into it's own repo, that will greatly simplify environment management for the PDAL dependency (#14), and also we can have tighter scope around each piece of software. If it's to be a standalone installable package we should have a short and sweet unique name (not one of these https://pypi.org/search/?q=lidar), and be sure to clarify the focus of the workflows compared to other existing tools (e.g. https://github.com/opentopography). @dshean, @kvenkman, or @ShashankBice if one of you could create a new repo in this org with a suggested name I'll start migrating the latest code into script form. |
Here is a repo. Feel free to experiment with the different functions and parameters, tweak names etc 😊 |
In order to generate DeepDEM products for a variety of STV study sites it would be great to turn the notebooks into separate parameterized scripts, starting with this one: https://github.com/uw-cryo/DeepDEM/blob/main/notebooks/0a_Download_LIDAR_data.ipynb
Took a quick stab at this here, https://github.com/uw-cryo/stv-aux-data/blob/main/pdal_pipeline.py, but I think it should live in the repo under the scripts folder.
Would be good to also document practical considerations, such as recommended computational resources for a given AOI size and result posting (e.g. 10m vs 1m).
The text was updated successfully, but these errors were encountered: