Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rudimentary support for geopandas geoseries #5

Merged
merged 12 commits into from
Sep 5, 2022
Merged

Rudimentary support for geopandas geoseries #5

merged 12 commits into from
Sep 5, 2022

Conversation

nmandery
Copy link
Owner

Fixes #4

@metasim
Copy link

metasim commented Jul 28, 2022

This is not at all qualified for public use, but it was a quick (and ugly) hack to support either Series (has geometry) or a whole GeoDataFrame (has feature collections), basically relying on the FromPyObject derive machinery to do the heavy lifting. In my use case I wanted the direct Geometry<f64> results directly. Maybe there's a useful idea to steal here... maybe not.

pub(crate) trait GeoCollect {
    fn collect_geom(&self) -> Result<Vec<Geometry<f64>>>;
}

#[derive(FromPyObject, Debug)]
struct Feat {
    #[pyo3(item)]
    geometry: GeoInterface,
}

#[derive(FromPyObject, Debug)]
struct FeatColl {
    #[pyo3(item)]
    features: Vec<Feat>,
}

impl GeoCollect for FeatColl {
    fn collect_geom(&self) -> Result<Vec<Geometry<f64>>> {
        Ok(self.features.iter().map(|f| f.geometry.0.to_owned()).collect())
    }
}

#[derive(FromPyObject, Debug)]
struct GeoProto {
    __geo_interface__: FeatColl,
}

impl GeoCollect for GeoProto {
    fn collect_geom(&self) -> Result<Vec<Geometry<f64>>> {
        self.__geo_interface__.collect_geom()
    }
}

#[derive(FromPyObject, Debug)]
enum DynamicGeom {
    Proto(GeoProto),
    FC(FeatColl),
}

impl GeoCollect for DynamicGeom {
    fn collect_geom(&self) -> Result<Vec<Geometry<f64>>> {
        match self {
            DynamicGeom::Proto(coll) => coll.collect_geom(),
            DynamicGeom::FC(coll) => coll.collect_geom()
        }
    }
}

#[derive(Debug)]
struct PyGeoms(pub Vec<Geometry<f64>>);

impl FromPyObject<'_> for PyGeoms {
    fn extract(obj: &'_ PyAny) -> PyResult<Self> {
        let geo: DynamicGeom = obj.extract()?;
        let geo = geo.collect_geom();
        geo.map(PyGeoms).map_err(Into::into)
    }
}

#[derive(FromPyObject, Debug)]
/// NewType for Python object assumed to be a GeoPandas DataFrame or Series.
pub(crate) struct GeoDataFrame(PyObject);

impl GeoCollect for GeoDataFrame {
    fn collect_geom(&self) -> Result<Vec<Geometry<f64>>> {
        let g = Python::with_gil(|py| {
            self.0.extract::<PyGeoms>(py)
        });
        g.map(|p| p.0).map_err(Into::into)
    }
}

src/lib.rs Outdated
@@ -72,6 +72,9 @@ pub mod from_py;
pub mod to_py;
pub mod wrappers;

#[cfg(feature = "f64")]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you consider this being a different feature flag?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used the f64 feature as usescase of geopandas use the f64 coordinates anyways. The second thought was that this implementation would not cause much compilation overhead anyways, so I omitted creating an own feature for it.

@nmandery
Copy link
Owner Author

Nice - looks good 👍

I also removed the wrapping into GeoInterface for the individual geometries later, but have not yet pushed/commited that.

Would you thing it would be helpful for users to have the Feature and FeatureCollection wrappers available in this library? I would think it would help to get started. Especially as the wording in the Readme is not really clear that in reality only geometries are supported currently and not the complete geo-interface spec.

@metasim
Copy link

metasim commented Jul 29, 2022

Would you thing it would be helpful for users to have the Feature and FeatureCollection wrappers available in this library? I would think it would help to get started. Especially as the wording in the Readme is not really clear that in reality only geometries are supported currently and not the complete geo-interface spec.

I think something like them might be helpful, but if it were me I'd want to be careful in indicating it's not true Feature/FeatureCollection support (unless you want to do that). Dealing with Properties, CRS, etc. seems like a bigger scope effort, especially since geo crate doesn't have types already defined for them, and defining them yourself opens the door to impedance/conversion issues in the larger ecosystem (like geo type vs. gdal vector types).

However, the geojson crate does have Feature/FeatureCollection, so perhaps relying on those via an optional feature would be a compromise?

@metasim
Copy link

metasim commented Jul 29, 2022

PS: Have you considered getting your crate listed here: https://georust.org/ ? I think it deserves to be 😄

@metasim
Copy link

metasim commented Jul 29, 2022

I forked the project and was going to poke at it myself, but it doesn't look like you use maturin for builds, and I'm struggling to get cargo test to run (link error: Library not loaded: @rpath/libpython3.10.dylib). I looked at the ci.yaml file and there's no magic there... do you do anything special in your environment to be able to run the tests?

@nmandery
Copy link
Owner Author

Thats a good point. When I created this I had only passing single geometries between the two languages in the scope as I only needed that. I should make that clear in the readme. Supporting the whole spec is quite a bit more - did not remember that CRS is also part of this.

Using geojson maybe complicates things, instead of converting between PyAny and rust types, it would require dealing with the JSON Value types. In the end its quite hard to provide a single common implementation for Feature, as everybody has its own struct types where the members need to be exposed as properties. Probably an example how to do this would be helpful - I will look into that.

Regarding submitting this project to georust: My thought was waiting a bit if the project gets any usage in the community to see if it is something of general interest. Glad you like it.

Doesn`t maturin only help when building python extensions? I do not have any special setup in my environment, maybe what you are encountering is Mac-specific, CI and myself are using Linux. Something similar - but without solution - is here. Maybe there is a better way to use pyo3 from inside the unittests than what I am doing. At least on Mac the test binaries do not seem to be linked as expected.

@nmandery
Copy link
Owner Author

With #6 the tests hopefully work now when running using cargo test --feature test

@metasim
Copy link

metasim commented Jul 31, 2022

maybe what you are encountering is Mac-specific,

Yep... pretty sure that's it. Some combination of that, and trying to use a sandboxed python environment (conda in my case). Looks suspiciously similar to what you referenced as well as this open issue :/. In my other code I am writing an extension, so libpython... is already loaded by the time pyo3 code is called, so this is the first time I've run into it. I'm sure I can figure out a workaround.

@metasim
Copy link

metasim commented Jul 31, 2022

Using geojson maybe complicates things, instead of converting between PyAny and rust types, it would require dealing with the JSON Value types.

I'm not sure that is the case here, as the georust/geojson crate has stand-alone structs for Feature and FeatureCollection. You would end up linking in more code than one might want (the JSON value stuff you referenced, etc.), but I do like the author's implementations as they look well informed by the specification. At least it's informative to how one's own implementation might look.

@metasim
Copy link

metasim commented Jul 31, 2022

did not remember that CRS is also part of this.

I probably overstated the complexity implied by that part of the support. it's basically just a (oddly formatted) string in GeoJson with weak support in most tooling anyway.

@metasim
Copy link

metasim commented Jul 31, 2022

FWIW, I have tests running after adding export DYLD_LIBRARY_PATH=/opt/homebrew/Caskroom/miniconda/base/envs/py_geo_interface/lib. I didn't realize until now that conda doesn't do that for you automatically (but also makes sense why not).

@nmandery
Copy link
Owner Author

nmandery commented Aug 2, 2022

Glad you got the tests to work. Such linking issues are always annoying.

I think I want to avoid more dependencies for now if possible - geojson covers the spec nicely, as you said. I see the strength of the python integration to also to work outside of what geointerface specifies. Like the wkb feature which provides an additional path for data exchange between rust and python using shapely for exchanging geometries using WKB.

All other parts of the spec can be added when the need arises.

@@ -2,6 +2,11 @@

## Unreleased

* Rename `GeometryInterface` struct to `Geometry` - that is the last rename of this struct - promised.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, its is impossible to pack everything into a name ;)

@nmandery nmandery merged commit 1ec8784 into main Sep 5, 2022
@nmandery nmandery deleted the geopandas branch September 5, 2022 18:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

How to use with GeoPandas?
2 participants