Update docs #872

Merged
merged 11 commits into from
Jan 8, 2025
77 changes: 67 additions & 10 deletions docs/getting_started/comparison.rst
Original file line number Diff line number Diff line change
@@ -1,20 +1,77 @@
.. _comparison:

===========================
Comparison with other tools
===========================
***************************
Comparison with Other Tools
***************************

Similar tools
There are many open-source projects for training machine learning models.
DeepForest aims to complement these existing tools by providing a specialized, streamlined approach tailored to ecological and environmental monitoring tasks.
Below, we compare DeepForest with other notable tools in this space, highlighting similarities, differences, and areas of potential collaboration.

-------------
Similar Tools
-------------

There are many open-source projects for training machine learning models. We see DeepForest as a complement to many existing and excellent packages.
`Roboflow <https://roboflow.com>`_ offers a comprehensive ecosystem for computer vision tasks, including tools for:

- **Supervision:** Efficient dataset annotation and augmentation.
- **Inference:** API-driven deployment of machine learning models.

The ecosystem is well-executed and widely used within DeepForest.
However, Roboflow operates as a commercial platform requiring an API key and has a range of licensing structures.
Its broad scope makes it challenging to identify robust models among thousands of projects.

**Key Differences:**

1. Roboflow is designed as an all-encompassing platform for general computer vision applications.
2. DeepForest focuses on a curated set of models tailored to ecological and environmental monitoring, offering simplicity and specificity for existing workflows.

`Torchgeo <https://github.com/microsoft/torchgeo>`_, developed by Microsoft, is a Python library for automating remote sensing machine learning. It emphasizes:

- **Raster-based Remote Sensing:** Primarily focused on earth-facing satellite data.
- **Pretrained Models and Datasets:** Provides curated resources for remote sensing tasks.

Torchgeo caters to an audience with significant machine learning expertise and is particularly suited for satellite and aerial imagery analysis.

**Key Features:**

1. Modular design for flexibility and scalability.
2. Extensive support for raster data processing.

**Collaboration Opportunities:**

DeepForest and Torchgeo share common goals in environmental monitoring. By enhancing interoperability, both tools could enable unified workflows and reduce redundant efforts.

`AIDE <https://github.com/microsoft/aerial_wildlife_detection>`_ is a modular web framework for annotating image datasets and training deep learning models. It integrates manual annotation and machine learning into an active learning loop:

- Humans annotate initial images.
- The system trains a model.
- The model predicts and selects additional images for annotation.

This approach accelerates tasks like wildlife surveys using aerial imagery.

**Key Features:**

- **Dual functionality**: Annotation and AI-assisted training.
- **Configurable** for various tasks, particularly ecological applications.
- **Active learning** loop for iterative model improvement.

Although AIDE has not been updated recently, it remains a powerful tool for ecological monitoring.

------------------------
Vision for Collaboration
------------------------

* Roboflow
DeepForest emphasizes the importance of collaboration in the open-source community. By connecting with tools like Roboflow, Torchgeo, and AIDE, we can:

The `supervision <https://supervision.roboflow.com/latest/>`_, `inference <https://inference.roboflow.com/>`_ and related packages within Roboflow's ecosystem are well executed and used throughout DeepForest. The inference machine underlying Roboflow requires connection to Roboflow, a computer vision software company which requires an API key, and has a range of commercial and license structures. We think of DeepForest as a small set of curated models that are targeted towards the ecological and environmental monitoring community. Finding robust models is challenging amongst the thousands of Roboflow projects. Roboflow is designed to be an all-encompassing ecosystem, whereas DeepForest is intentionally small and aimed at existing pipelines.
- Standardize data formats for seamless integration.
- Share best practices for model training and deployment.
- Minimize duplication of effort and maximize community impact.

* Torchgeo
We invite users and contributors from all packages to share ideas and propose improvements to serve the community better.

`Torchgeo <https://github.com/microsoft/torchgeo>`_ is a Python library written by developers at Microsoft to help automate remote sensing machine learning. Torchgeo has general structures, but the documents and general structure are focused on raster-based remote sensing, especially using earth-facing satellite data. Torchgeo has a number of useful datasets and curates pretrained models for remote sensing applications. The Torchgeo audience is generally more experienced with machine learning than the average DeepForest user.
**Conclusion**

We hope to continue to connect with both Roboflow and Torchgeo to improve interoperability among all model types and training. The future of open-source depends on collaboration, and we welcome users from all packages to submit ideas on how best to serve the community and reduce any duplication and wasted effort. There are many packages that hold useful individual models (e.g., `DetectTree2 <https://github.com/PatBall1/detectree2>`_) related to individual scientific publications. Our hope with DeepForest is to wrap general routines beyond individual research projects to make machine learning applications to environmental monitoring easier.
The future of open-source machine learning in environmental monitoring relies on collaboration and interoperability.
Tools like DeepForest, Torchgeo, Roboflow, and AIDE complement each other, each addressing specific needs within the field.
By fostering connections between these tools, we can build a more cohesive and efficient ecosystem for solving critical environmental challenges.
7 changes: 7 additions & 0 deletions docs/getting_started/install.rst
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,13 @@ DeepForest has Windows, Linux and OSX prebuilt wheels on pypi. We
*strongly* recommend using a conda or virtualenv to create a clean
installation container.

For example

::

conda create -n DeepForest python=3.11
conda activate DeepForest

::

pip install DeepForest
Expand Down
12 changes: 4 additions & 8 deletions docs/getting_started/intro_tutorials/02_model_loader.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,23 +26,19 @@ The `load_model` function loads a pretrained model from Hugging Face using the r

### Example Usage

#### Load a Model
#### Load a Model and Predict an Image

```python
from deepforest import main
from deepforest import get_data
import matplotlib.pyplot as plt

from deepforest.visualize import plot_results
# Initialize the model class
model = main.deepforest()

# Load a pretrained tree detection model from Hugging Face
model.load_model(model_name="weecology/deepforest-tree", revision="main")

sample_image_path = get_data("OSBS_029.png")
img = model.predict_image(path=sample_image_path, return_plot=True)

plt.imshow(img[:,:,::-1])
plt.show()

img = model.predict_image(path=sample_image_path)
plot_results(img)
```
18 changes: 6 additions & 12 deletions docs/getting_started/intro_tutorials/03_use_pretrained_model.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3,24 +3,18 @@ How do I use a pretrained model to predict an image?

.. code-block:: python

from deepforest import main, get_data
import matplotlib.pyplot as plt

# Initialize the model
from deepforest import main
from deepforest import get_data
from deepforest.visualize import plot_results
# Initialize the model class
model = main.deepforest()

# Load a pretrained tree detection model from Hugging Face
model.load_model(model_name="weecology/deepforest-tree", revision="main")

# Get the sample image path and predict image
sample_image_path = get_data("OSBS_029.png")
img = model.predict_image(path=sample_image_path, return_plot=True)

# predict_image returns plot in BlueGreenRed (opencv style), but matplotlib likes RedGreenBlue
# Switch the channel order for correct display
plt.imshow(img[:,:,::-1])
plt.show()

img = model.predict_image(path=sample_image_path)
plot_results(img)

.. image:: ../../../www/getting_started1.png
:align: center
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ How do I predict on large geospatial tiles?
Predict a tile
~~~~~~~~~~~~~~

Large tiles covering wide geographic extents cannot fit into memory during prediction and would yield poor results due to the density of bounding boxes. Often provided as geospatial .tif files, remote sensing data is best suited for the ``predict_tile`` function, which splits the tile into overlapping windows, performs prediction on each of the windows, and then reassembles the resulting annotations.
Large tiles covering wide geographic extents cannot fit into memory during prediction and would yield poor results due to the density of bounding boxes. Often provided as geospatial .tif files, remote sensing data is best suited for the ``predict_tile`` function, which splits the tile into overlapping windows, performs prediction on each of the windows, and then reassembles the resulting annotations. Overlapping detections are removed based on the ``iou_threshold`` parameter.
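
The windowing and overlap-removal idea described above can be sketched in plain Python. This is a conceptual illustration only, not DeepForest's implementation: the helper names (``window_starts``, ``compute_windows``, ``iou``, ``remove_overlaps``) and the sample detections are hypothetical, and the real ``predict_tile`` may differ in its details.

```python
def window_starts(length, patch_size, stride):
    """Start offsets along one axis so that windows cover [0, length)."""
    starts = [0]
    while starts[-1] + patch_size < length:
        starts.append(starts[-1] + stride)
    return starts


def compute_windows(width, height, patch_size, patch_overlap):
    """Return (xmin, ymin, xmax, ymax) windows with fractional overlap."""
    stride = max(1, int(patch_size * (1 - patch_overlap)))
    return [
        (x, y, min(x + patch_size, width), min(y + patch_size, height))
        for y in window_starts(height, patch_size, stride)
        for x in window_starts(width, patch_size, stride)
    ]


def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def remove_overlaps(detections, iou_threshold):
    """Keep the highest-scoring box among any group that overlaps too much."""
    kept = []
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        if all(iou(det["box"], k["box"]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


# Hypothetical tile size and detections, already shifted into tile coordinates.
windows = compute_windows(width=1000, height=1000, patch_size=400, patch_overlap=0.25)
detections = [
    {"box": (10, 10, 50, 50), "score": 0.9},
    {"box": (12, 12, 52, 52), "score": 0.6},  # same tree seen in a neighbouring window
    {"box": (200, 200, 240, 240), "score": 0.8},
]
kept = remove_overlaps(detections, iou_threshold=0.15)
print(len(windows), [d["score"] for d in kept])
```

A lower ``iou_threshold`` removes more aggressively, because boxes are dropped at smaller overlaps. A higher value keeps more near-duplicates along window edges.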

Let’s show an example with a small image. For larger images, patch_size should be increased.

Expand Down
4 changes: 2 additions & 2 deletions docs/getting_started/intro_tutorials/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ Getting started tutorials
.. toctree::
:maxdepth: 1

01_load_sample_data
02_model_loader.md
03_use_pretrained_model
04_predict_large_tile
04_predict_large_tile
load_sample_data
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
How do I use the package Sample data?
How do I use the package sample data?
=====================================

Sample data
Expand Down
22 changes: 3 additions & 19 deletions docs/getting_started/overview.rst
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ Package overview
What is DeepForest?
*******************

DeepForest is a python package for training and predicting ecological objects in airborne imagery. DeepForest comes with prebuilt models for immediate use and fine-tuning by annotating and training custom models on your own data. DeepForest models can also be extended to species classification based on new data. DeepForest is designed for:
DeepForest is a Python package for training models and predicting ecological objects in airborne imagery. DeepForest comes with prebuilt models that can be used immediately or fine-tuned by annotating your own data and training custom models. DeepForest models can also be extended to classification (e.g., species) based on new data. DeepForest is designed for:

1. Applied researchers with limited machine learning experience
2. Applications with limited data that can be supported by prebuilt models
Expand Down Expand Up @@ -38,7 +38,7 @@ Practical Intro to Computer Vision in Ecology Research

Where can I get help, learn from others, and report bugs?
---------------------------------------------------------
Given the enormous array of forest types and image acquisition environments, it is unlikely that your image will be perfectly predicted by a prebuilt model. Below are some tips and general guidelines to improve predictions.
Given the enormous array of taxa, backgrounds, and image acquisition environments, it is unlikely that your image will be perfectly predicted by a prebuilt model. Check out the 'training', 'annotation', and 'predicting' sections of the documentation for more information on how to improve predictions using your own data.

Get suggestions on how to improve a model by using the `discussion board <https://github.com/weecology/DeepForest/discussions>`_. Please be aware that only feature requests or bug reports should be posted on the issues page. The most helpful thing you can do is leave feedback on the DeepForest `issue page`_. No feature, issue, or positive affirmation is too small. Please do it now!

Expand All @@ -64,7 +64,7 @@ DeepForest is an open-source python project that depends on user contributions.

* Making recommendations to the API and workflow. Please open an issue for anything that could help reduce friction and improve user experience.
* Leading implementations of new features. Check out the 'good first issue' tag on the repo and get in touch with the maintainers and tell us about your skills.
* Data contributions! The DeepForest backbone tree and bird models are not perfect. Please consider posting any annotations you make on Zenodo, or sharing them with DeepForest maintainers. Open an `issue <https://github.com/weecology/DeepForest/issues>`_ and tell us about the RGB data and annotations. For example, we are collecting tree annotations to create an `open-source benchmark <https://milliontrees.idtrees.org/>`_. Please consider sharing data to make the models stronger and benefit you and other users.
* Data contributions! The DeepForest backbone models are not perfect. Please consider posting any annotations you make on Zenodo, or sharing them with DeepForest maintainers. Open an `issue <https://github.com/weecology/DeepForest/issues>`_ and tell us about the RGB data and annotations. For example, we are collecting tree annotations to create an `open-source benchmark <https://milliontrees.idtrees.org/>`_. Please consider sharing data to make the models stronger and benefit you and other users.

Citation
--------
Expand All @@ -80,22 +80,6 @@ The second is the paper describing the particular model. See `Prebuilt Setup <..

.. _issue page: https://github.com/weecology/DeepForest/issues

Similar tools
-------------

There are many open-source projects for training machine learning models. We see DeepForest as a complement to many existing and excellent packages.

* Roboflow

The `supervision <https://supervision.roboflow.com/latest/>`_, `inference <https://inference.roboflow.com/>`_ and related packages within Roboflow's ecosystem are well executed and used throughout DeepForest. The inference machine underlying Roboflow requires connection to Roboflow, a computer vision software company which requires an API key, and has a range of commercial and license structures. We think of DeepForest as a small set of curated models that are targeted towards the ecological and environmental monitoring community. Finding robust models is challenging amongst the thousands of Roboflow projects. Roboflow is designed to be an all-encompassing ecosystem, whereas DeepForest is intentionally small and aimed at existing pipelines.

* Torchgeo

`Torchgeo <https://github.com/microsoft/torchgeo>`_ is a Python library written by developers at Microsoft to help automate remote sensing machine learning. Torchgeo has general structures, but the documents and general structure are focused on raster-based remote sensing, especially using earth-facing satellite data. Torchgeo has a number of useful datasets and curates pretrained models for remote sensing applications. The Torchgeo audience is generally more experienced with machine learning than the average DeepForest user.

We hope to continue to connect with both Roboflow and Torchgeo to improve interoperability among all model types and training. The future of open-source depends on collaboration, and we welcome users from all packages to submit ideas on how best to serve the community and reduce any duplication and wasted effort. There are many packages that hold useful individual models (e.g., `DetectTree2 <https://github.com/PatBall1/detectree2>`_) related to individual scientific publications. Our hope with DeepForest is to wrap general routines beyond individual research projects to make machine learning applications to environmental monitoring easier.


License
-------

Expand Down
5 changes: 3 additions & 2 deletions docs/user_guide/01_Reading_data.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,11 +4,12 @@ The most time-consuming part of many open-source projects is getting the data in

## Annotation Geometries and Coordinate Systems

DeepForest was originally designed for bounding box annotations. As of DeepForest 1.4.0, point and polygon annotations are also supported. There are two ways to format annotations, depending on the annotation platform you are using. `read_file` can read points, polygons, and boxes, in both image coordinate systems (relative to image origin at top-left 0,0) as well as projected coordinates on the Earth's surface. The `read_file` method also appends the location of the current image directory as an attribute. To access this attribute use
DeepForest was originally designed for bounding box annotations. As of DeepForest 1.4.0, point and polygon annotations are also supported. There are two ways to format annotations, depending on the annotation platform you are using. `read_file` can read points, polygons, and boxes, both in image coordinates (relative to the image origin at top-left 0,0) and in projected coordinates on the Earth's surface. The `read_file` method also appends the location of the current image directory to the result, available as the `root_dir` attribute.

```
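from deepforest import get_data, utilities

filename = get_data("OSBS_029.csv")

# Read the annotations; the image directory is attached as root_dir
df = utilities.read_file(filename)
df.root_dir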
```

**Note:** For CSV files, coordinates are expected to be in the image coordinate system, not projected coordinates (such as latitude/longitude or UTM).
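
To illustrate the difference between the two coordinate systems, here is a minimal sketch of converting a projected bounding box into image coordinates. It assumes a north-up raster with square pixels and a known top-left origin. The function names and the example origin, resolution, and box values are hypothetical and are not part of the DeepForest API. For real data, use the raster's own affine transform.

```python
def projected_to_image(x, y, left, top, resolution):
    """Convert a projected (x, y) point to (column, row) pixel coordinates.

    Assumes a north-up raster: columns grow eastward from `left`,
    rows grow southward from `top`, and pixels are square.
    """
    col = (x - left) / resolution
    row = (top - y) / resolution
    return col, row


def box_to_image(box, left, top, resolution):
    """Convert a projected (xmin, ymin, xmax, ymax) box to image coordinates."""
    xmin_p, ymin_p, xmax_p, ymax_p = box
    # The projected ymax is the northern edge, which becomes the smallest row.
    col_min, row_min = projected_to_image(xmin_p, ymax_p, left, top, resolution)
    col_max, row_max = projected_to_image(xmax_p, ymin_p, left, top, resolution)
    return (col_min, row_min, col_max, row_max)


# Hypothetical raster: top-left corner at (404000, 3285000) with 0.5 m pixels.
image_box = box_to_image(
    (404010.0, 3284980.0, 404015.0, 3284985.0),
    left=404000.0,
    top=3285000.0,
    resolution=0.5,
)
print(image_box)
```

Note that the y axis flips: larger projected y values (further north) map to smaller row numbers, because the image origin is at the top-left.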
Expand Down
8 changes: 7 additions & 1 deletion docs/user_guide/02_prebuilt.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Prebuilt models

DeepForest has a few prebuilt models.
DeepForest comes with prebuilt models to help you get started. These models are available on Hugging Face and are loaded using the `load_model` function. They are best treated as a starting point for further training, rather than as general-purpose tools for new imagery.

## Tree Crown Detection model

Expand Down Expand Up @@ -38,6 +38,12 @@ We have created a [GPU colab tutorial](https://colab.research.google.com/drive/1

For more information, or specific questions about the bird detection, please create issues on the [BirdDetector repo](https://github.com/weecology/BirdDetector)

## Livestock Detectors model

This model has a single label 'cattle' trained on drone imagery of cows, sheep and other large mammals in agricultural settings. The model was trained on data from [insert countries and other metadata about landscapes].

![image](../../www/livestock-example.png)

## Crop Classifiers model

### Alive/Dead trees model
Expand Down
2 changes: 1 addition & 1 deletion docs/user_guide/06_multi_species.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ m = main.deepforest(config_args={"num_classes":2},label_dict={"Alive":0,"Dead":1
```

It is often, but not always, useful to start with a prebuilt model when trying to identify multiple species. This helps the model focus on learning the multiple classes and not waste data and time re-learning bounding boxes. To do this, load the backbone and box prediction portions of the release model, but create a classification model for more than one species.
Here is an example using the alive/dead tree data stored in the package, but the same logic applies to the bird detector.
Here is an example using the alive/dead tree data stored in the package, but the same logic applies to other detectors.

``` python
# Initialize new Deepforest model ( the model that you will train ) with your classes
Expand Down