diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7bdcf5b78..b44e6f95d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -28,6 +28,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- ContextualCountEmbedder
- (CI) Changelog Enforcer
- Utility plotting module based on Folium and Plotly
+- Project README
- Documentation for srai library
- Citation information
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 753d5b8e8..53323b66a 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -6,8 +6,6 @@
## Contributing to the code base
-### What belongs in srai?
-
### Getting started
To make changes to srai's code base, you need to fork and then clone the GitHub repository.
@@ -18,40 +16,40 @@ For first setup of the project locally, the following commands have to be execut
1. Install [PDM](https://pdm.fming.dev/latest) (only if not already installed)
-```sh
-pip install pdm
-```
+ ```sh
+ pip install pdm
+ ```
2. Install package locally (will download all dev packages and create a local venv)
-```sh
-# Optional if you want to create venv in a specific version. More info: https://pdm.fming.dev/latest/usage/venv/#create-a-virtualenv-yourself
-pdm venv create 3.8 # or any higher version of Python
+ ```sh
+ # Optional if you want to create venv in a specific version. More info: https://pdm.fming.dev/latest/usage/venv/#create-a-virtualenv-yourself
+ pdm venv create 3.8 # or any higher version of Python
-pdm install -G:all
-```
+ pdm install -G:all
+ ```
3. Activate pdm venv
-```sh
-eval $(pdm venv activate)
+ ```sh
+ eval $(pdm venv activate)
-# or
+ # or
-source ./venv/bin/activate
-```
+ source ./venv/bin/activate
+ ```
4. Activate [pre-commit](https://pre-commit.com/) hooks
-```sh
-pre-commit install && pre-commit install -t commit-msg
-```
+ ```sh
+ pre-commit install && pre-commit install -t commit-msg
+ ```
### Testing
For testing, [tox](https://tox.wiki/en/latest/) is used to allow testing on multiple Python versions.
-To test code locally before committing, run
+To test code locally before committing, run:
```sh
tox -e python3.8 # put your python version here
@@ -60,14 +58,19 @@ tox -e python3.8 # put your python version here
### Documentation
- This repository uses [MkDocs](https://www.mkdocs.org) as a documentation generator. To use it locally, run `pdm install -G docs` to download all required packages.
- Docstrings should be written following the [google convention](https://gist.github.com/redlotus/3bc387c2591e3e908c9b63b97b11d24e). To ease development one can use [autoDocstring extension](https://marketplace.visualstudio.com/items?itemName=njpwerner.autodocstring) to generate the docstrings. -->
+This repository uses [MkDocs](https://www.mkdocs.org) as a documentation generator. To build and serve the documentation locally, run:
+
+```bash
+mkdocs serve
+```
+
+Docstrings should be written following the [Google convention](https://gist.github.com/redlotus/3bc387c2591e3e908c9b63b97b11d24e). To ease development, one can use the [autoDocstring extension](https://marketplace.visualstudio.com/items?itemName=njpwerner.autodocstring) to generate the docstrings.
-### Fixing bugs
+
### Python conventions
All Python code must be written **compatible with Python 3.8+**.
-
## Deployment
+
### Releasing a new version
+
To release a new version:
+
```sh
bumpver update --patch
```
+
This command will update the version strings across the project, commit and tag the commit with the new version. All you need to do is to push the changes.
diff --git a/README.md b/README.md
index a24c9657f..8bbc11377 100644
--- a/README.md
+++ b/README.md
@@ -26,4 +26,263 @@
-Spatial Representations for Artificial Intelligence
+# Spatial Representations for Artificial Intelligence
+
+⚠️🚧 This library is under HEAVY development. Expect breaking changes between `minor` versions 🚧⚠️
+
+💬 Feel free to open an issue if you find anything confusing or not working 🗨️
+
+Project **Spatial Representations for Artificial Intelligence** (`srai`) aims to provide simple and efficient solutions to geospatial problems that are accessible to everybody and reusable in various contexts where geospatial data can be used. It is a Python module integrating many geo-related algorithms in a single package with a unified API. Please see getting started for installation and quick start instructions.
+
+## Use cases
+
+In the current state, `srai` provides the following functionalities:
+
+* **OSM data download** - downloading OpenStreetMap data for a given area using different sources
+* **OSM data processing** - processing OSM data to extract useful information (e.g. road network, buildings, POIs, etc.)
+* **GTFS processing** - extracting features from GTFS data
+* **Regionization** - splitting a given area into smaller regions using different algorithms (e.g. Uber's H3[1], Voronoi, etc.)
+* **Embedding** - embedding regions into a vector space based on different spatial features, and using different algorithms (e.g. hex2vec[2], etc.)
+* Utilities for spatial data visualization and processing
+
+For future releases, we plan to add more functionalities, such as:
+
+* **Pre-computed embeddings** - pre-computed embeddings for different regions and different embedding algorithms
+* **Full pipelines** - full pipelines for different embedding approaches, pre-configured from `srai` components
+* **Image data download and processing** - downloading and processing image data (e.g. OSM tiles, etc.)
+
+## Installation
+
+To install `srai` simply run:
+
+```bash
+pip install srai
+```
+
+This will install the `srai` package and dependencies required by most of the use cases. There are several optional dependencies that can be installed to enable additional functionality. These are listed in the [optional dependencies](#optional-dependencies) section.
+
+### Optional dependencies
+
+The following optional dependencies can be installed to enable additional functionality:
+
+* `srai[all]` - all optional dependencies
+* `srai[osm]` - dependencies required to download OpenStreetMap data
+* `srai[voronoi]` - dependencies to use Voronoi-based regionization method
+* `srai[gtfs]` - dependencies to process GTFS data
+* `srai[plotting]` - dependencies to plot graphs and maps
+* `srai[torch]` - dependencies to use torch-based embedders
+
+## Usage
+
+### Downloading OSM data
+
+To download OSM data for a given area using a set of tags, use one of the `OSMLoader` classes:
+
+* `OSMOnlineLoader` - downloads data from OpenStreetMap API using [osmnx](https://github.com/gboeing/osmnx) - this is faster for smaller areas or tag counts
+* `OSMPbfLoader` - loads data from automatically downloaded PBF file from [protomaps](https://protomaps.com/) - this is faster for larger areas or tag counts
+
+Example with `OSMOnlineLoader`:
+
+```python
+from srai.loaders import OSMOnlineLoader
+from srai.utils import geocode_to_region_gdf
+from srai.plotting import plot_regions
+
+query = {"leisure": "park"}
+area = geocode_to_region_gdf("Wrocław, Poland")
+loader = OSMOnlineLoader()
+
+parks_gdf = loader.load(area, query)
+folium_map = plot_regions(area, colormap=["rgba(0,0,0,0)"], tiles_style="CartoDB positron")
+parks_gdf.explore(m=folium_map, color="forestgreen")
+```
+
+
+
+
+
+### Downloading road network
+
+Road network downloading is a special case of OSM data downloading. To download the road network for a given area, use the `OSMWayLoader` class:
+
+```python
+from srai.loaders import OSMWayLoader
+from srai.loaders.osm_way_loader import NetworkType
+from srai.utils import geocode_to_region_gdf
+from srai.plotting import plot_regions
+
+area = geocode_to_region_gdf("Utrecht, Netherlands")
+loader = OSMWayLoader(NetworkType.BIKE)
+
+nodes, edges = loader.load(area)
+
+folium_map = plot_regions(area, colormap=["rgba(0,0,0,0.1)"], tiles_style="CartoDB positron")
+edges[["geometry"]].explore(m=folium_map, color="seagreen")
+```
+
+
+
+
+
+### Downloading GTFS data
+
+To extract features from a GTFS feed, use `GTFSLoader`. It extracts the trip count and available directions for each stop in 1-hour time windows.
+
+```python
+from pathlib import Path
+
+from srai.loaders import GTFSLoader
+from srai.utils import geocode_to_region_gdf, download_file
+from srai.plotting import plot_regions
+
+area = geocode_to_region_gdf("Vienna, Austria")
+gtfs_file = Path("vienna_gtfs.zip")
+download_file("https://transitfeeds.com/p/stadt-wien/888/latest/download", gtfs_file.as_posix())
+loader = GTFSLoader()
+
+features = loader.load(gtfs_file)
+
+folium_map = plot_regions(area, colormap=["rgba(0,0,0,0.1)"], tiles_style="CartoDB positron")
+features[["trips_at_8", "geometry"]].explore("trips_at_8", m=folium_map)
+```
+
+
+
+
+
+### Regionization
+
+Regionization is a process of dividing a given area into smaller regions. This can be done in a variety of ways:
+
+* `H3Regionizer` - regionization using [Uber's H3 library](https://github.com/uber/h3)
+* `S2Regionizer` - regionization using [Google's S2 library](https://github.com/google/s2geometry)
+* `VoronoiRegionizer` - regionization using Voronoi diagram
+* `AdministrativeBoundaryRegionizer` - regionization using administrative boundaries
+
+Example:
+
+```python
+from srai.plotting import plot_regions
+from srai.regionizers import H3Regionizer
+from srai.utils import geocode_to_region_gdf
+
+area = geocode_to_region_gdf("Berlin, Germany")
+regionizer = H3Regionizer(resolution=7)
+
+regions = regionizer.transform(area)
+
+folium_map = plot_regions(area, colormap=["rgba(0,0,0,0.1)"], tiles_style="CartoDB positron")
+plot_regions(regions_gdf=regions, map=folium_map)
+```
+
+
+
+
+
+### Embedding
+
+Embedding is a process of mapping regions into a vector space. This can be done in a variety of ways:
+
+* `Hex2VecEmbedder` - embedding using hex2vec[1] algorithm
+* `GTFS2VecEmbedder` - embedding using GTFS2Vec[2] algorithm
+* `CountEmbedder` - embedding based on feature counts
+* `ContextualCountEmbedder` - embedding based on feature counts with neighbourhood context (proposed in [3])
+* `Highway2VecEmbedder` - embedding using Highway2Vec[4] algorithm
+
+All of those methods share the same API. All of them require results from `Loader` (load features), `Regionizer` (split area into regions) and `Joiner` (join features to regions) to work. An example using `CountEmbedder`:
+
+```python
+from srai.embedders import CountEmbedder
+from srai.joiners import IntersectionJoiner
+from srai.loaders import OSMOnlineLoader
+from srai.plotting import plot_regions, plot_numeric_data
+from srai.regionizers import H3Regionizer
+from srai.utils import geocode_to_region_gdf
+
+loader = OSMOnlineLoader()
+regionizer = H3Regionizer(resolution=9)
+joiner = IntersectionJoiner()
+
+query = {"amenity": "bicycle_parking"}
+area = geocode_to_region_gdf("Malmö, Sweden")
+features = loader.load(area, query)
+regions = regionizer.transform(area)
+joint = joiner.transform(regions, features)
+
+embedder = CountEmbedder()
+embeddings = embedder.transform(regions, features, joint)
+
+folium_map = plot_regions(area, colormap=["rgba(0,0,0,0.1)"], tiles_style="CartoDB positron")
+plot_numeric_data(regions, embeddings, "amenity_bicycle_parking", map=folium_map)
+```
+
+
+
+
+
+`CountEmbedder` is a simple method that does not require fitting. Other methods, such as `Hex2VecEmbedder` or `GTFS2VecEmbedder`, require fitting and can be used in a similar way to `scikit-learn` estimators:
+
+```python
+from srai.embedders import Hex2VecEmbedder
+from srai.joiners import IntersectionJoiner
+from srai.loaders import OSMPbfLoader
+from srai.loaders.osm_loaders.filters import HEX2VEC_FILTER
+from srai.neighbourhoods.h3_neighbourhood import H3Neighbourhood
+from srai.regionizers import H3Regionizer
+from srai.utils import geocode_to_region_gdf
+from srai.plotting import plot_regions, plot_numeric_data
+
+loader = OSMPbfLoader()
+regionizer = H3Regionizer(resolution=11)
+joiner = IntersectionJoiner()
+
+area = geocode_to_region_gdf("City of London")
+features = loader.load(area, HEX2VEC_FILTER)
+regions = regionizer.transform(area)
+joint = joiner.transform(regions, features)
+
+neighbourhood = H3Neighbourhood(regions_gdf=regions)
+
+# Sizes of the encoder layers
+embedder = Hex2VecEmbedder([15, 10, 3])
+
+# Option 1: fit and transform
+# embedder.fit(regions, features, joint, neighbourhood, batch_size=128)
+# embeddings = embedder.transform(regions, features, joint)
+
+# Option 2: fit_transform
+embeddings = embedder.fit_transform(regions, features, joint, neighbourhood, batch_size=128)
+
+folium_map = plot_regions(area, colormap=["rgba(0,0,0,0.1)"], tiles_style="CartoDB positron")
+plot_numeric_data(regions, embeddings, 0, map=folium_map)
+```
+
+
+
+
+
+### Plotting, utilities and more
+
+We also provide utilities for different spatial operations and plotting functions adapted to the data formats used in `srai`. For a full list of available methods, please refer to the [documentation](https://srai-lab.github.io/srai).
+
+## Contributing
+
+If you are willing to contribute to `srai`, feel free to do so! Visit [our contributing guide](./CONTRIBUTING.md) for more details.
+
+## Publications
+
+Some of the methods implemented in `srai` have been published in scientific journals and conferences.
+
+1. Szymon Woźniak and Piotr Szymański. 2021. Hex2vec: Context-Aware Embedding H3 Hexagons with OpenStreetMap Tags. In Proceedings of the 4th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery (GEOAI '21). Association for Computing Machinery, New York, NY, USA, 61–71. [paper](https://doi.org/10.1145/3486635.3491076), [arXiv](https://arxiv.org/abs/2111.00970)
+2. Piotr Gramacki, Szymon Woźniak, and Piotr Szymański. 2021. Gtfs2vec: Learning GTFS Embeddings for comparing Public Transport Offer in Microregions. In Proceedings of the 1st ACM SIGSPATIAL International Workshop on Searching and Mining Large Collections of Geospatial Data (GeoSearch'21). Association for Computing Machinery, New York, NY, USA, 5–12. [paper](https://doi.org/10.1145/3486640.3491392), [arXiv](https://arxiv.org/abs/2111.00960)
+3. Kamil Raczycki and Piotr Szymański. 2021. Transfer learning approach to bicycle-sharing systems' station location planning using OpenStreetMap data. In Proceedings of the 4th ACM SIGSPATIAL International Workshop on Advances in Resilient and Intelligent Cities (ARIC '21). Association for Computing Machinery, New York, NY, USA, 1–12. [paper](https://doi.org/10.1145/3486626.3493434), [arXiv](https://arxiv.org/abs/2111.00990)
+4. Kacper Leśniara and Piotr Szymański. 2022. Highway2vec: representing OpenStreetMap microregions with respect to their road network characteristics. In Proceedings of the 5th ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery (GeoAI '22). Association for Computing Machinery, New York, NY, USA, 18–29. [paper](https://doi.org/10.1145/3557918.3565865)
+
+## Citation
+
+TBD
+
+## License
+
+This library is licensed under the [Apache License 2.0](https://github.com/srai-lab/srai/blob/main/LICENSE.md).
+
+The free [OpenStreetMap](https://www.openstreetmap.org/) data, which is used for the development of SRAI, is licensed under the [Open Data Commons Open Database License](https://opendatacommons.org/licenses/odbl/) (ODbL) by the [OpenStreetMap Foundation](https://osmfoundation.org/) (OSMF).
diff --git a/docs/README.md b/docs/README.md
index 5a8d4698d..c975ee9b2 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -62,3 +62,9 @@ Some of the methods implemented in `srai` have been published in scientific jour
1. https://h3geo.org/
2. https://doi.org/10.1145/3486635.3491076
+
+## License
+
+This library is licensed under the [Apache License 2.0](https://github.com/srai-lab/srai/blob/main/LICENSE.md).
+
+The free [OpenStreetMap](https://www.openstreetmap.org/) data, which is used for the development of SRAI, is licensed under the [Open Data Commons Open Database License](https://opendatacommons.org/licenses/odbl/) (ODbL) by the [OpenStreetMap Foundation](https://osmfoundation.org/) (OSMF).
diff --git a/docs/assets/images/downloading_gtfs_data.jpg b/docs/assets/images/downloading_gtfs_data.jpg
new file mode 100644
index 000000000..994bced93
Binary files /dev/null and b/docs/assets/images/downloading_gtfs_data.jpg differ
diff --git a/docs/assets/images/downloading_osm_data.jpg b/docs/assets/images/downloading_osm_data.jpg
new file mode 100644
index 000000000..de6d214e9
Binary files /dev/null and b/docs/assets/images/downloading_osm_data.jpg differ
diff --git a/docs/assets/images/downloading_road_network_data.jpg b/docs/assets/images/downloading_road_network_data.jpg
new file mode 100644
index 000000000..6d27dbb2c
Binary files /dev/null and b/docs/assets/images/downloading_road_network_data.jpg differ
diff --git a/docs/assets/images/embedding_count_embedder.jpg b/docs/assets/images/embedding_count_embedder.jpg
new file mode 100644
index 000000000..6b4dd0a90
Binary files /dev/null and b/docs/assets/images/embedding_count_embedder.jpg differ
diff --git a/docs/assets/images/embedding_hex2vec_embedder.jpg b/docs/assets/images/embedding_hex2vec_embedder.jpg
new file mode 100644
index 000000000..3ea8867d1
Binary files /dev/null and b/docs/assets/images/embedding_hex2vec_embedder.jpg differ
diff --git a/docs/assets/images/regionization.jpg b/docs/assets/images/regionization.jpg
new file mode 100644
index 000000000..c19b4758a
Binary files /dev/null and b/docs/assets/images/regionization.jpg differ
diff --git a/examples/regionizers/administrative_boundary_regionizer.ipynb b/examples/regionizers/administrative_boundary_regionizer.ipynb
index 03f8fffb8..6680fb0ef 100644
--- a/examples/regionizers/administrative_boundary_regionizer.ipynb
+++ b/examples/regionizers/administrative_boundary_regionizer.ipynb
@@ -130,7 +130,7 @@
"metadata": {},
"outputs": [],
"source": [
- "eu_bbox = box(minx=-10.478556, miny=34.633284672291, maxx=34.597916, maxy=70.096054)\n",
+ "eu_bbox = box(minx=-10.478556, miny=34.633284672291, maxx=32.097916, maxy=70.096054)\n",
"eu_bbox_gdf = gpd.GeoDataFrame({\"geometry\": [eu_bbox]}, crs=\"EPSG:4326\")"
]
},
@@ -324,7 +324,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.8.10"
+ "version": "3.11.2"
},
"vscode": {
"interpreter": {
diff --git a/mkdocs.yml b/mkdocs.yml
index cade3188c..fc48d2b12 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -116,6 +116,8 @@ plugins:
show_root_heading: false
show_root_toc_entry: false
docstring_section_style: "spacy"
+ separate_signature: true
+ show_signature_annotations: true
- mkdocs-jupyter:
include: ["*.ipynb"]
ignore_h1_titles: true
diff --git a/srai/loaders/gtfs_loader.py b/srai/loaders/gtfs_loader.py
index 2999d106e..8656152af 100644
--- a/srai/loaders/gtfs_loader.py
+++ b/srai/loaders/gtfs_loader.py
@@ -53,7 +53,7 @@ def load(
Args:
gtfs_file (Path): Path to the GTFS feed.
fail_on_validation_errors (bool): Fail if GTFS feed is invalid. Ignored when
- skip_validation is True.
+ skip_validation is True.
skip_validation (bool): Skip GTFS feed validation.
Returns: