WIP: Refactoring #59

Merged
merged 61 commits into from
Dec 14, 2023

@BerndDoser BerndDoser commented Dec 7, 2023

Comprehensive refactoring of the data modules and HiPSter in preparation for supporting different data structures.

Derived class DatasetWithMetadata to provide metadata

The dataset classes return only the data tensor, which is all that training needs. The derived DatasetWithMetadata classes return a tuple of data and metadata, which is currently needed only by HiPSter. Because the metadata travels with each item, it stays correctly paired with its data even when the dataloader shuffles.
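
The pairing of data and metadata can be illustrated with a minimal sketch. It uses plain Python instead of PyTorch so that it runs anywhere; `ToyDataset`, `ToyDatasetWithMetadata`, `shuffled_items` and the metadata keys are hypothetical names chosen for illustration and are not taken from this pull request.

```python
import random


class ToyDataset:
    """Map-style dataset returning only the data item (what training consumes)."""

    def __init__(self, items, filenames):
        self.items = items
        self.filenames = filenames

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return self.items[index]


class ToyDatasetWithMetadata(ToyDataset):
    """Same data, but each item is returned together with its metadata."""

    def __getitem__(self, index):
        data = super().__getitem__(index)
        # Assumed metadata layout; the real classes may carry other fields.
        metadata = {"index": index, "filename": self.filenames[index]}
        return data, metadata


def shuffled_items(dataset, seed=0):
    """Visit the dataset in random order, as a shuffling dataloader would."""
    order = list(range(len(dataset)))
    random.Random(seed).shuffle(order)
    return [dataset[i] for i in order]
```

Since every item carries its own index and filename, a consumer never has to infer the origin of an item from its position in the shuffled stream.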

Move data structure-dependent functionality from HiPSter into DataModule

Move write_catalog, create_images and create_thumbnails from HiPSter into DataModule.

Rotation invariant

The best-rotation search function find_best_rotation is moved to the base class SpherinatorModule to avoid duplicating code between training and HiPSter.

Bugfix: No variational sampling in rotational invariance

Variational sampling should be disabled when searching for the best rotation, so the search calls encode instead of forward.
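
Why sampling matters here can be shown with a toy sketch. Everything in it is an assumption made for illustration: `ToyModel`, `rotate90`, the restriction to 90-degree steps and the signature of `find_best_rotation` are hypothetical and do not reproduce the project's implementation. The point is only that a search built on the deterministic encode path gives a reproducible result, whereas the forward path adds random noise to every loss.

```python
import random


def rotate90(image, k):
    """Rotate a square image (list of rows) counterclockwise by k * 90 degrees."""
    for _ in range(k % 4):
        image = [list(row) for row in zip(*image)][::-1]
    return image


class ToyModel:
    """Hypothetical stand-in for a variational autoencoder."""

    def __init__(self, template, noise=0.5):
        self.template = template
        self.noise = noise

    def encode(self, image):
        # Deterministic latent: here simply the flattened input.
        return [v for row in image for v in row]

    def reparameterize(self, z):
        # Variational sampling: adds random noise to the latent.
        return [v + random.gauss(0.0, self.noise) for v in z]

    def decode(self, z):
        # Toy decoder: blend the latent with a fixed template.
        flat_template = [v for row in self.template for v in row]
        return [0.5 * a + 0.5 * b for a, b in zip(z, flat_template)]

    def forward(self, image):
        # Training path: encode, sample, decode.
        return self.decode(self.reparameterize(self.encode(image)))


def find_best_rotation(model, image, steps=4):
    """Return (k, loss) for the rotation with the lowest reconstruction loss.

    Uses encode (no sampling), so repeated calls give the same answer.
    """
    best = None
    for k in range(steps):
        rotated = rotate90(image, k)
        flat = [v for row in rotated for v in row]
        recon = model.decode(model.encode(rotated))
        loss = sum((a - b) ** 2 for a, b in zip(flat, recon))
        if best is None or loss < best[1]:
            best = (k, loss)
    return best
```

Had the loop called `model.forward(rotated)` instead, two rotations with similar losses could swap rank from one call to the next purely because of the sampled noise.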

Base classes SpherinatorDataset and SpherinatorDataModule

Abstract base classes for datasets and data modules ensure that all methods required by HiPSter are implemented.
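
As a rough sketch of how an abstract base class enforces this, Python's `abc` module refuses to instantiate a subclass that leaves an abstract method unimplemented. The method names write_catalog, create_images and create_thumbnails come from this description; the class names, the `path` parameter and the return values below are assumptions for illustration only.

```python
from abc import ABC, abstractmethod


class DataModuleBase(ABC):
    """Hypothetical stand-in for an abstract data module base class."""

    @abstractmethod
    def write_catalog(self, path):
        """Write the catalog (signature is an assumption)."""

    @abstractmethod
    def create_images(self, path):
        """Create the images (signature is an assumption)."""

    @abstractmethod
    def create_thumbnails(self, path):
        """Create the thumbnails (signature is an assumption)."""


class IncompleteDataModule(DataModuleBase):
    """Implements only one of the three required methods."""

    def write_catalog(self, path):
        return f"catalog:{path}"


class CompleteDataModule(IncompleteDataModule):
    """Implements the remaining methods, so it can be instantiated."""

    def create_images(self, path):
        return f"images:{path}"

    def create_thumbnails(self, path):
        return f"thumbnails:{path}"


def can_instantiate(cls):
    """Return True if cls() succeeds, False if abc rejects it with TypeError."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

The missing methods are reported when the object is created, not later when HiPSter first calls them.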

Miscellaneous

and preprocessing

- no direct tensor support in dataset
- preprocessing: replace directory by images
- transforms.ToTensor
sort listdir to reproduce dataset filenames
+ use torchvision.transforms.v2
+ add dataloaders processing, images, thumbnail_images
+ test jit.script
- split member function using Mixin (https://www.qtrac.eu/pyclassmulti.html)
- set default number of workers to all threads
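
Two of the items above, sorting the directory listing and defaulting the worker count to all threads, can be sketched with the standard library alone. The function names and the extension filter are hypothetical; only `os.listdir`, `sorted` and `os.cpu_count` are real calls.

```python
import os


def list_dataset_files(directory, extensions=(".jpg", ".png")):
    """Return matching filenames in sorted order.

    os.listdir returns entries in arbitrary order, so sorting makes the
    mapping from dataset index to filename reproducible across runs.
    """
    return sorted(
        name
        for name in os.listdir(directory)
        if name.lower().endswith(extensions)
    )


def default_num_workers():
    """Use all available CPU threads, falling back to 1 if the count is unknown."""
    return os.cpu_count() or 1
```

Without the sort, index 0 of the dataset could refer to a different file on another machine or filesystem.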
@BerndDoser BerndDoser linked an issue Dec 7, 2023 that may be closed by this pull request
@BerndDoser BerndDoser merged commit 91636a2 into HITS-AIN:main Dec 14, 2023
@BerndDoser BerndDoser deleted the data branch December 14, 2023 21:56
Development

Successfully merging this pull request may close these issues.

Abstract classes SpherinatorDataModule and SpherinatorDataSet Misuse of val_dataloader