🔥Accepted in TMLR (08/2024) OpenReview

Abstract

Traditional deep learning models are trained and tested on relatively low-resolution images ($<300$ px), and cannot be directly operated on large-scale images due to compute and memory constraints. We propose Patch Gradient Descent (PatchGD), an effective learning strategy that allows us to train existing CNN and transformer architectures (hereafter referred to as deep learning models) on large-scale images in an end-to-end manner. PatchGD is based on the hypothesis that instead of performing gradient-based updates on an entire image at once, it should be possible to achieve a good solution by performing model updates on only small parts of the image at a time, ensuring that the majority of it is covered over the course of iterations. PatchGD is therefore considerably more memory- and compute-efficient when training models on large-scale images. PatchGD is evaluated on the PANDA, UltraMNIST, TCGA, and ImageNet datasets with ResNet50, MobileNetV2, ConvNeXtV2, and DeiT models under different memory constraints. Our evaluation shows that PatchGD is much more stable and efficient than the standard gradient-descent method in handling large images, especially when the compute memory is limited.
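The hypothesis in the abstract (update on a few small parts of the image at a time, while covering most of the image over the iterations) can be illustrated with a small patch sampler. This is only a sketch of the sampling idea, not the PatchGD implementation in the patch_gd directory: the function names, the non-overlapping grid, and the uniform sampling without replacement are all assumptions made for illustration, and no model or gradient step is included.

```python
import random


def patch_grid(height, width, patch_size):
    """Return the top-left (row, col) corner of every non-overlapping patch."""
    return [
        (r, c)
        for r in range(0, height - patch_size + 1, patch_size)
        for c in range(0, width - patch_size + 1, patch_size)
    ]


def sample_patches(corners, k, rng):
    """Pick up to k distinct patches to use for a single model update."""
    return rng.sample(corners, min(k, len(corners)))


def coverage_over_iterations(height, width, patch_size, k, iterations, seed=0):
    """Track the fraction of patches visited at least once after each iteration."""
    rng = random.Random(seed)
    corners = patch_grid(height, width, patch_size)
    seen = set()
    history = []
    for _ in range(iterations):
        batch = sample_patches(corners, k, rng)
        # A real training loop would run the forward/backward pass
        # on the patches in `batch` only at this point.
        seen.update(batch)
        history.append(len(seen) / len(corners))
    return history


if __name__ == "__main__":
    # Hypothetical sizes: a 512x512 image, 128x128 patches, 4 patches per update.
    for step, frac in enumerate(coverage_over_iterations(512, 512, 128, 4, 10), 1):
        print(f"iteration {step}: {frac:.2f} of patches seen")
```

Only `k` patches need to be held in memory per update, which is where the memory saving in this simplified picture comes from; the coverage history shows how quickly the rest of the image is visited.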

Code usage details

Set up the environment

Create a conda environment:

conda create -n pgd python=3.12
conda activate pgd

Install requirements using the following command:

pip install -r requirements.txt

Data

Experiments mentioned in the paper use the following datasets:

  1. Prostate cANcer graDe Assessment (PANDA)
  2. UltraMNIST
  3. ImageNet
  4. TCGA

Dataset processing scripts for PANDA and UltraMNIST are included in the utility_codes directory. They generate the folds for PANDA and the full dataset for UltraMNIST.

ImageNet can be downloaded from Kaggle, and TCGA (LUAD and LUSC) can be downloaded by following the setup instructions listed here. The splits can then be made from the downloaded data.
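For datasets without a provided script, folds or splits can be created with a few lines of Python. The sketch below is a generic, assumed approach rather than the procedure used in the paper: `make_folds` and `train_val_split` are hypothetical helper names, the sample IDs are placeholders, and the split is a plain shuffled k-fold with no stratification by label.

```python
import random


def make_folds(sample_ids, num_folds, seed=0):
    """Shuffle the IDs with a fixed seed and deal them into num_folds folds."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    # Striding by num_folds keeps fold sizes within one sample of each other.
    return [ids[i::num_folds] for i in range(num_folds)]


def train_val_split(folds, val_index):
    """Use one fold for validation and the remaining folds for training."""
    val = list(folds[val_index])
    train = [s for i, fold in enumerate(folds) if i != val_index for s in fold]
    return train, val


if __name__ == "__main__":
    # Placeholder IDs; in practice these would be image or slide identifiers.
    folds = make_folds([f"sample_{i}" for i in range(10)], num_folds=5)
    train, val = train_val_split(folds, val_index=0)
    print(len(train), "training samples,", len(val), "validation samples")
```

Fixing the seed keeps the folds reproducible across runs, which matters when comparing experiments that share the same split.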

File structure

  • baselines directory contains the code to run baseline experiments mentioned in the paper for PANDA and UltraMNIST
  • HAR_1d_example directory contains code to run experiments on the Human Activity Recognition dataset (1-d generalization of PatchGD)
  • patch_gd directory contains the code to run experiments using the PatchGD algorithm for PANDA and UltraMNIST
  • utility_codes directory contains utility code for PANDA and UltraMNIST, including creating the dataset and folds, calculating stats, running multiple experiments on multiple GPUs, etc.

Citation

Please cite using the following citation:

@article{gupta2023patch,
  title={Patch gradient descent: Training neural networks on very large images},
  author={Gupta, Deepak K and Mago, Gowreesh and Chavan, Arnav and Prasad, Dilip K},
  journal={arXiv preprint arXiv:2301.13817},
  year={2023}
}