Simple library for detecting gibberish tiles from histopathological whole-slide images (WSI).
By gibberish tiles I mean background tiles and tiles with pen marks or similar artifacts: wsi_tile_cleanup detects background tiles (based on the Otsu algorithm), red / green / blue pen marks, and black artifacts.
The typical use case for wsi_tile_cleanup is preprocessing whole-slide images before loading tiles into a neural network (yes, deep learning).
If you are building a deep learning pipeline, the following repositories might be of interest: @lucasrla/wsi-preprocessing and @lucasrla/wsi-preprocessing-sos-workflow.
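To make the Otsu-based background detection mentioned above more concrete, here is a minimal NumPy-only sketch. It is not the code used by wsi_tile_cleanup: the function names `otsu_threshold` and `background_fraction` are hypothetical, and it assumes an 8-bit grayscale tile in which the background is brighter than the tissue.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray, nbins: int = 256) -> float:
    """Return the gray level that maximizes the between-class variance."""
    hist, bin_edges = np.histogram(gray.ravel(), bins=nbins, range=(0, 256))
    hist = hist.astype(float)
    centers = (bin_edges[:-1] + bin_edges[1:]) / 2.0

    # Pixel counts of the "dark" class (bins 0..i) and "bright" class (bins i..end).
    weight_low = np.cumsum(hist)
    weight_high = np.cumsum(hist[::-1])[::-1]

    # Empty classes produce 0/0; silence the warnings and zero them out below.
    with np.errstate(divide="ignore", invalid="ignore"):
        mean_low = np.cumsum(hist * centers) / weight_low
        mean_high = (np.cumsum((hist * centers)[::-1]) / weight_high[::-1])[::-1]
        variance = weight_low[:-1] * weight_high[1:] * (mean_low[:-1] - mean_high[1:]) ** 2

    variance = np.nan_to_num(variance)
    return float(centers[np.argmax(variance)])


def background_fraction(gray: np.ndarray, threshold: float) -> float:
    """Fraction of pixels brighter than the threshold (assumed to be background)."""
    return float((gray > threshold).mean())


if __name__ == "__main__":
    # Synthetic tile: half dark "tissue" (50), half bright "background" (220).
    tile = np.array([[50] * 4] * 2 + [[220] * 4] * 2, dtype=np.uint8)
    t = otsu_threshold(tile)
    print(f"threshold: {t:.1f}, background: {background_fraction(tile, t):.2f}")
```

In a real pipeline the threshold would more sensibly be estimated once per slide (for example, on a low-resolution thumbnail) and then applied to every tile, so that tiles consisting entirely of background are still classified correctly.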
conda create --name YOUR_ENV_NAME --channel conda-forge python=3.6 libvips pyvips numpy
conda activate YOUR_ENV_NAME
python3.6 -m pip install git+https://github.com/lucasrla/wsi-tile-cleanup.git
# note: `python3.6 -m pip` is to make sure we are using pip from python=3.6
# first of all, install libvips
# https://libvips.github.io/libvips/install.html
# (tip: have it installed with openslide support)
# next, create a new virtualenv and activate it using your tool of choice
# (e.g., pyenv or virtualenv)
# then, depending on your dependency manager, run either:
poetry add git+https://github.com/lucasrla/wsi-tile-cleanup.git
# or
pip install git+https://github.com/lucasrla/wsi-tile-cleanup.git
from wsi_tile_cleanup import filters, utils
img = utils.read_image("data/images/tiles/5.jpeg")
bands = utils.split_rgb(img)
colors = ["red", "green", "blue"]
for color in colors:
    perc = filters.pen_percentage(bands, color)
    print(f"{color}: {perc:.5f}")
See also: examples.py.
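For intuition about what a pen-percentage filter computes, the following is a rough NumPy-only sketch of the general idea: build a boolean mask from per-channel thresholds and report the fraction of pixels it covers. The `PEN_RULES` thresholds and the `pen_fraction` helper are hypothetical values chosen for illustration; they are not the rules implemented in wsi_tile_cleanup.

```python
import numpy as np

# Illustrative per-channel rules (assumptions, NOT the library's thresholds).
PEN_RULES = {
    "red": lambda r, g, b: (r > 150) & (g < 80) & (b < 90),
    "green": lambda r, g, b: (r < 150) & (g > 160) & (b < 140),
    "blue": lambda r, g, b: (r < 60) & (g < 120) & (b > 190),
}


def pen_fraction(tile: np.ndarray, color: str) -> float:
    """Fraction of pixels in an (H, W, 3) RGB tile matching the rule for `color`."""
    # Cast to int so comparisons never hit uint8 overflow surprises.
    r, g, b = (tile[..., i].astype(int) for i in range(3))
    mask = PEN_RULES[color](r, g, b)
    return float(mask.mean())


if __name__ == "__main__":
    # Synthetic tile: top half saturated red "ink", bottom half pinkish "tissue".
    tile = np.zeros((4, 4, 3), dtype=np.uint8)
    tile[:2] = (200, 30, 30)
    tile[2:] = (220, 180, 200)
    for color in ["red", "green", "blue"]:
        print(f"{color}: {pen_fraction(tile, color):.5f}")
```

A tile could then be discarded when any of these fractions exceeds a cutoff chosen for the dataset at hand; the right cutoff depends on staining and scanner, so it is best tuned on a handful of known-bad tiles.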
Please note that wsi_tile_cleanup is just a very thin wrapper around libvips, pyvips and numpy. They are the ones doing the heavy lifting (and doing it amazingly well).
- libvips: A fast image processing library with low memory needs. The official Python bindings are called pyvips.
- NumPy: The fundamental package for scientific computing with Python.
- deep-histopath: I ported some of their filters to pyvips. If you are interested in a preprocessing pipeline for deep learning, check out their nice write-up.
- scikit-image: I ported their implementation of the Otsu algorithm to pyvips and numpy.
- Cancer Digital Slide Archive: TCGA slides hosted online by the Winship Cancer Institute at Emory University.
wsi-tile-cleanup is Free Software distributed under the GNU General Public License v3.0.
Dependencies have their own licenses; check them out.