This repository has been archived by the owner on Jan 12, 2024. It is now read-only.

Release v0.1.0

Pre-release
@justusschock released this 22 Feb 22:33
· 636 commits to master since this release
a2025be

This release candidate is the first release of the library. It contains a first set of convenient transforms as well as a brand-new DataLoader class.

New Features

  • Option to Compose single transforms [Commit]
  • First spatial Transforms [Commit]
  • Custom Dataloader as Drop-In replacement for PyTorch [Commit]
  • Custom Dataset Class to extend the PyTorch dataset [Commit]
  • The transform call dispatching algorithm can now be changed by the user [Commit]
  • Utility Transforms [Commit]
  • User-Controllable call dispatch within the Compose class [Commit]
  • Basic Affine Transforms [Commit]
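To illustrate what composing transforms with a user-controllable call dispatch can look like, here is a minimal, self-contained sketch. The class and parameter names (`Compose`, `dispatch`) are hypothetical stand-ins and do not mirror rising's actual API.

```python
# Illustrative sketch only -- names here are hypothetical and
# do NOT reflect rising's real Compose implementation.

class Compose:
    """Chain single transforms; how each transform is called is
    controlled by a user-supplied dispatch function."""

    def __init__(self, transforms, dispatch=None):
        self.transforms = list(transforms)
        # default dispatch: plain sequential application
        self.dispatch = dispatch or (lambda t, data: t(data))

    def __call__(self, data):
        for t in self.transforms:
            data = self.dispatch(t, data)
        return data


# usage: compose two toy transforms with a logging dispatch
log = []

def logging_dispatch(t, data):
    log.append(t.__name__)  # record which transform was called
    return t(data)

def double(x):
    return x * 2

def add_one(x):
    return x + 1

pipeline = Compose([double, add_one], dispatch=logging_dispatch)
result = pipeline(3)  # (3 * 2) + 1 = 7
```

Swapping the dispatch function lets the user change how calls are routed (e.g. logging, filtering, or conditional application) without touching the transforms themselves.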

Bug Fixes/Enhancements

  • Shared memory for progressive resizing [Commit]

Breaking Changes

--None--

Deprecations

--None--

Removals

--None--

Towards the API

The new DataLoader class is designed as an exact drop-in replacement for the loader provided by PyTorch: it merely extends the PyTorch loader and uses the same multiprocessing machinery and structure under the hood.

  • Support for Batch-Transformations
  • Support for GPU Transformations
  • Support to disable automated conversion to torch.Tensor
  • Support to specify how the transform call should be dispatched
  • Automated seed for numpy in all child processes

However, there are some limitations:

  • Using CUDA from different processes is troublesome on most machines
  • GPU transforms will therefore always be executed after the CPU batch_transforms
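The resulting execution order can be sketched in plain Python. The function and parameter names below (`apply_pipeline`, `batch_transforms`, `gpu_transforms`) are illustrative assumptions, not rising's actual internals; the point is only that CPU batch transforms run first and GPU transforms run afterwards.

```python
# Hypothetical sketch of the ordering described above: CPU batch
# transforms run first (safe inside worker processes), GPU transforms
# run afterwards, because sharing CUDA across processes is problematic.

def apply_pipeline(batch, batch_transforms=(), gpu_transforms=()):
    # 1) CPU batch transforms
    for t in batch_transforms:
        batch = t(batch)
    # 2) GPU transforms, applied only after all CPU transforms
    for t in gpu_transforms:
        batch = t(batch)
    return batch


# usage: record the order in which the stages run
order = []

def cpu_t(b):
    order.append("cpu")
    return [x + 1 for x in b]

def gpu_t(b):
    order.append("gpu")
    return [x * 10 for x in b]

batch = apply_pipeline([1, 2, 3],
                       batch_transforms=[cpu_t],
                       gpu_transforms=[gpu_t])
# batch is now [20, 30, 40]; the CPU stage ran before the GPU stage
```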

The usage is quite simple: just replace the import! All your previous keyword arguments still work, and there are new keyword arguments for the previously mentioned features:

Before:

import torchvision
from torch.utils.data import DataLoader

dset = torchvision.datasets.MNIST(root='/tmp', download=True)
# this does not seed numpy per worker
loader = DataLoader(dset, shuffle=True, num_workers=4)

Ours:

import torchvision
from rising.loading import DataLoader

dset = torchvision.datasets.MNIST(root='/tmp', download=True)
# this also seeds numpy!
loader = DataLoader(dset, shuffle=True, num_workers=4)

Note: To seed numpy reproducibly in all child processes, it must also be seeded in the main process, since the base seed for all child processes is computed there!
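The seeding scheme above can be sketched with the standard-library RNG. This is an illustration of the idea only, not rising's actual code: rising seeds numpy per worker, and the helper names (`make_worker_init_fn`, `worker_init_fn`) are hypothetical.

```python
import random

# Hypothetical sketch: the base seed is drawn once in the main
# process; each worker then derives its own deterministic seed.
# rising seeds numpy this way; we use random.Random for illustration.

def make_worker_init_fn(base_seed):
    # base_seed must come from the main process so every worker's
    # seed derives from the same reproducible value
    def worker_init_fn(worker_id):
        # each worker gets its own deterministic RNG
        return random.Random(base_seed + worker_id)
    return worker_init_fn


init_fn = make_worker_init_fn(base_seed=1234)
a = init_fn(0).randint(0, 99)  # worker 0, first run
b = init_fn(0).randint(0, 99)  # worker 0, re-run: same value
c = init_fn(1).randint(0, 99)  # worker 1: its own stream
```

Because the base seed is fixed in the main process, re-running the same worker reproduces the same stream, while different workers get streams derived from distinct seeds.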