
[Master Issue] Add more models to torchvision #645

Open · 6 of 7 tasks
fmassa opened this issue Oct 30, 2018 · 46 comments
fmassa (Member) commented Oct 30, 2018

This is a master issue to track requests for adding new pre-trained models to torchvision.

Here is the (potentially incomplete) list I compiled:

@Cadene has already implemented a number of these models in his fantastic https://github.com/Cadene/pretrained-models.pytorch. I'll start from there and try to get models trained using pytorch/examples/imagenet, so that the models are reproducible.


Requirements

  • python implementation to live in vision/models
  • pre-trained weights using the same mean / std normalization as in the imagenet example (see the sketch after this list)
  • the script used to train the models, or the command-line arguments used if the script was exactly the one from examples/imagenet
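For clarity, a minimal sketch of the normalization requirement, using the same statistics as pytorch/examples/imagenet (the surrounding transform pipeline is illustrative):

```python
import torchvision.transforms as transforms

# Standard ImageNet mean / std from pytorch/examples/imagenet; submitted
# pre-trained weights are expected to assume this normalization.
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

# A typical evaluation pipeline built around it.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    normalize,
])
```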
gokriznastic commented Nov 4, 2018

@fmassa Hey, I would like to try adding some of these models. Can you tell me which of these you need help with?

fmassa (Member, Author) commented Nov 6, 2018

@gokriznastic ideally we would want to have not only the model implementation, but also the weights and the training code that was used (if different from pytorch/examples/imagenet).

This way, we have a reproducible way of obtaining the models.

I believe @tonylins will be adding support for MobileNetV2. All the others are open, so if you decide to take one, just let us know :-)

JuanFMontesinos commented:

Hi there, I was wondering if you would be interested in a U-Net model. I've developed a very flexible version that allows variable depth, batch norm, etc. I consider it a very important model nowadays in audio and computer vision.
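As an illustration of that kind of flexibility, a hypothetical building block (not the author's actual implementation):

```python
import torch.nn as nn

class DoubleConv(nn.Module):
    """Two 3x3 convolutions with optional batch norm; stacking these blocks
    at different scales is what gives a U-Net its variable depth."""
    def __init__(self, in_ch, out_ch, batch_norm=True):
        super().__init__()
        layers = []
        for cin, cout in [(in_ch, out_ch), (out_ch, out_ch)]:
            layers.append(nn.Conv2d(cin, cout, kernel_size=3, padding=1))
            if batch_norm:
                layers.append(nn.BatchNorm2d(cout))
            layers.append(nn.ReLU(inplace=True))
        self.block = nn.Sequential(*layers)

    def forward(self, x):
        return self.block(x)
```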

fmassa (Member, Author) commented Dec 5, 2018

Hi @JuanFMontesinos
Definitely! Is this a segmentation model or a classification model?

For now, all of the models are for classification tasks, but we would like to extend the collection to other tasks as well (that will require some thought so that we have the proper training / evaluation scripts).

JuanFMontesinos commented:

@fmassa It was originally proposed as a segmentation architecture for biomedical applications. It is basically an encoder-decoder architecture with skip connections, widely used in blind sound source separation when working with spectrograms of the sound. It is also the core of GANs like pix2pix, an image-to-image translation network, and many others. That's why I suggested including it. As for training it, that really depends on the application. Do you require a training framework and weights?
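To make the encoder-decoder and skip-connection structure concrete, a heavily reduced toy sketch (illustrative only, not the implementation under discussion):

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """One-level encoder-decoder with a single skip connection."""
    def __init__(self, in_ch=1, out_ch=2):
        super().__init__()
        self.enc = nn.Conv2d(in_ch, 16, 3, padding=1)
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Conv2d(16, 32, 3, padding=1)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec = nn.Conv2d(32, out_ch, 3, padding=1)  # 32 = 16 (skip) + 16 (up)

    def forward(self, x):
        e = torch.relu(self.enc(x))
        m = torch.relu(self.mid(self.down(e)))
        u = self.up(m)
        return self.dec(torch.cat([u, e], dim=1))  # skip connection

out = TinyUNet()(torch.randn(1, 1, 64, 64))  # -> [1, 2, 64, 64]
```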

Regards

fmassa (Member, Author) commented Dec 6, 2018

Hi @JuanFMontesinos ,

I see. We currently require all models in torchvision to have pre-trained weights, and ideally a code base where we can train / evaluate them.
This becomes especially important for some complex models, like detection, where the model alone is generally not enough to be usable and requires a number of helper functions.

JuanFMontesinos commented:

@fmassa Hi, sorry for the late reply. Which task/dataset would you be interested in training U-Net for?

fmassa (Member, Author) commented Dec 11, 2018

U-Nets are usually used for segmentation, so I'd say maybe the Pascal VOC segmentation task or Cityscapes? But there might be newer benchmarks out there that I'm not aware of.
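Both benchmarks are now exposed in torchvision.datasets; a quick loading sketch (the roots and flags below are illustrative, and Cityscapes must be downloaded manually):

```python
from torchvision import datasets

# Pascal VOC 2012 segmentation split (downloads the archive if needed).
voc = datasets.VOCSegmentation("data/voc", year="2012",
                               image_set="train", download=True)

# Cityscapes fine-annotation semantic masks (expects data already on disk).
cityscapes = datasets.Cityscapes("data/cityscapes", split="train",
                                 mode="fine", target_type="semantic")

img, mask = voc[0]  # PIL image and its segmentation mask
```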

guanfuchen commented:

@fmassa The VOC and Cityscapes datasets are large; there is a smaller dataset, CamVid, consisting of 701 labeled images. And if you want better performance from U-Net, I think using the original medical dataset is better. There is a good project named pytorch-semseg that reimplements U-Net for semantic segmentation.

fmassa (Member, Author) commented Dec 11, 2018

VOC and Cityscapes might be large, but there have been a number of publications using them, and they are widely used in the scientific literature. That's why I think providing pre-trained models for one of those tasks might be relevant.

JuanFMontesinos commented:

So, let me evaluate it after Christmas to see which dataset would be better.

varunagrawal (Contributor) commented:

Since we are adding MobileNet, it would be a good idea to add ShuffleNet as well given its improved performance over MobileNet.
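For reference, the channel shuffle operation that gives ShuffleNet its name, in a standard formulation (illustrative, not tied to a particular implementation):

```python
import torch

def channel_shuffle(x, groups):
    """Interleave channels across groups so that grouped convolutions
    can mix information between groups."""
    n, c, h, w = x.shape
    x = x.view(n, groups, c // groups, h, w)
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

x = torch.arange(8.0).view(1, 8, 1, 1)
print(channel_shuffle(x, groups=2).flatten())
# tensor([0., 4., 1., 5., 2., 6., 3., 7.])
```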

setuc commented Feb 4, 2019

Is there a priority among this list of models? I was planning to train a couple of models on the ImageNet dataset from scratch and can contribute here.

Or should we refer to the models from @Cadene?

IgorKasianenko commented:

@setuc I would really appreciate you training ShuffleNet. It is a small model, so I assume it will take the least time to get started with.
Sincerely yours,
Igor

fmassa (Member, Author) commented Feb 11, 2019

Hi @setuc
Sorry for the delay in replying.

I'd say that you can pick whichever you'd prefer, but ShuffleNet would indeed probably be easier because it's a small model.

I think Inception V4 might be quite hard to get to the reported accuracies, so maybe just ShuffleNet would be a great start already!

setuc commented Feb 16, 2019

One more question, @fmassa: there are supposedly different versions of ImageNet. I am currently using the one from Kaggle; I hope that should be sufficient. I have downloaded the images and plan to start the runs over the weekend.

hendrycks (Contributor) commented Feb 16, 2019

There are supposedly different versions of ImageNet.

Nearly everyone else is using ImageNet 2012 data, and most papers use that for comparisons.

setuc commented Feb 17, 2019

@hendrycks I guess I was mistaken... the 2015 dataset is the same as that of 2012. I have started the runs and will do some validation before I share the results. Another 24-48 hours to completion.

fmassa (Member, Author) commented Feb 18, 2019

@setuc cool! Let me know how it goes, and which training script / hyperparameters you used to train it

setuc commented Feb 18, 2019

@fmassa I have used the training script from https://github.com/pytorch/examples/tree/master/imagenet, as mentioned in the requirements in the top post. All the hyperparameters remained the same, except the batch size, which was changed to 1024. I wasn't sure whether we are free to play around with the learning rates (cyclical learning rate, etc.).

I am unable to reproduce the results from the paper for 3 groups and no shuffle (paper error 34.5% vs. mine 39.811%). My results are Acc@1 60.189, Acc@5 82.601.
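For readers unfamiliar with the Acc@k numbers used throughout this thread, a simplified version of the top-k accuracy computation from the examples/imagenet script:

```python
import torch

def accuracy(output, target, topk=(1, 5)):
    """Return top-k accuracies (in %) for the given logits and labels."""
    maxk = max(topk)
    _, pred = output.topk(maxk, dim=1)        # [batch, maxk] predicted classes
    correct = pred.eq(target.unsqueeze(1))    # broadcast against true labels
    return [correct[:, :k].any(dim=1).float().mean().item() * 100
            for k in topk]

logits = torch.randn(8, 1000)                 # fake ImageNet logits
labels = torch.randint(0, 1000, (8,))
top1, top5 = accuracy(logits, labels)
```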

fmassa (Member, Author) commented Feb 18, 2019

@setuc thanks for getting back to me with the results.

I believe we might need to adapt the learning rate / etc in order to reproduce the results for many of those papers.

If you change those, let me know which changes you made, so that we can keep track of everything and I can summarize it afterwards.

setuc commented Feb 23, 2019

Restarting the training. I rewrote ShuffleNet v1 and v2 together with the cyclical learning rate. I think I have it right this time around. Started the training, expecting another 72-80 hours before reporting back.

Edit: The cyclical rates worked. At 120 epochs the results are encouraging. For Shufflenet v2, the Top-1 error is 41.31 compared to 39.70 from the paper.

Edit2: At 220 epochs, the Top-1 error for ShuffleNet v2 is 40.51 compared to 39.70 from the paper.

Edit3: At 272 epochs, the Top-1 error for ShuffleNet v2 is 40.22 compared to 39.70 from the paper.

Edit4: At 320 epochs, the Top-1 error for ShuffleNet v2 is 39.96 compared to 39.70 from the paper.

@fmassa Should I be doing all the groups / scales reported in the paper for v1 and v2?
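A sketch of the cyclical learning-rate setup described above, using the scheduler now available in PyTorch (the hyperparameters below are illustrative, not the ones used in these runs):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # stand-in for the ShuffleNet model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=1e-3, max_lr=0.5,
    step_size_up=2000)  # iterations per half-cycle

for step in range(10):  # one optimizer/scheduler step per batch
    loss = model(torch.randn(4, 10)).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```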

setuc commented Feb 26, 2019

@fmassa I have completed about 400 epochs with a Top-1 error of 39.85 compared to 39.70 from the paper. Should this be sufficient?

ppwwyyxx (Contributor) commented Feb 26, 2019

For your reference, I've reproduced ShuffleNet v1 & v2 at https://github.com/tensorpack/tensorpack/blob/master/examples/ImageNetModels/shufflenet.py.
It follows the paper's schedule (240 epochs, without the cyclic LR trick) and gets the same accuracy.

fmassa (Member, Author) commented Mar 1, 2019

@setuc awesome! Could you check what @ppwwyyxx has sent to see if there is something else you could do to close the remaining gap so that we match the accuracies?

setuc commented Mar 2, 2019

@fmassa I am going over the code line by line and comparing it against @ppwwyyxx's. I had written my code from scratch, so I am checking again to see if I missed anything.

fmassa (Member, Author) commented Mar 6, 2019

@setuc thanks! Did you figure out where the difference was?

hendrycks (Contributor) commented:

(I think it is unlikely the community outside FAIR is going to train various ImageNet models in a timely manner, especially big models such as ResNeXt.)

fmassa (Member, Author) commented Mar 18, 2019

@hendrycks I was planning on getting ResNeXt models trained here

1e100 (Contributor) commented Mar 19, 2019

I have an implementation of MNASNet that I could contribute. Any interest from maintainers? It performs pretty well, and I was able to get close to paper numbers with it, at 1.0 depth multiplier, training with SGD+Nesterov. I think it's currently the best "efficient" model out there.

https://arxiv.org/abs/1807.11626
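MNASNet did eventually land in torchvision (see #829 below); a usage sketch against the current torchvision API:

```python
import torch
from torchvision import models

# Depth multiplier 1.0 variant, with the ImageNet weights discussed here.
model = models.mnasnet1_0(pretrained=True)
model.eval()

with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))  # -> [1, 1000] class scores
```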

fmassa (Member, Author) commented Mar 22, 2019

Hi @1e100

Sure, it would be awesome to have it! Could you send a PR with it, and also point to the training code and hyperparameters that you used to obtain the results?

1e100 (Contributor) commented Mar 23, 2019

Will do. My own training pipeline is far too complicated to be suitable for something like this, so I'll just implement a single-file fast.ai trainer instead, train with it to something close to paper numbers, and then send a PR. In the interest of expediency, I plan to only verify reachable accuracy for depth multiplier 1.0 under this experimental setup.

Let me know if you see any flaws in this plan. Conservative ETA is about 1 week, 90% of which will be GPU time.

1e100 (Contributor) commented Mar 23, 2019

In the interest of not duplicating code, though, it'd be good to know how far along #625 is. MNASNet is basically just a hyperparameter tweak over MobileNetV2 with respect to kernel sizes, layer depths, and block depths. In fact, I implemented both using the exact same module.
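A hypothetical sketch of the kind of shared module meant here: both architectures stack inverted residual blocks and differ mainly in kernel sizes, expansion factors, and repeat counts (details are illustrative):

```python
import torch.nn as nn

class InvertedResidual(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, stride, expansion):
        super().__init__()
        mid = in_ch * expansion
        self.use_residual = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, mid, 1, bias=False),    # 1x1 pointwise expand
            nn.BatchNorm2d(mid),
            nn.ReLU6(inplace=True),
            nn.Conv2d(mid, mid, kernel_size, stride,
                      padding=kernel_size // 2, groups=mid,
                      bias=False),                    # depthwise
            nn.BatchNorm2d(mid),
            nn.ReLU6(inplace=True),
            nn.Conv2d(mid, out_ch, 1, bias=False),    # 1x1 pointwise project
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        return x + self.block(x) if self.use_residual else self.block(x)
```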

1e100 (Contributor) commented Apr 2, 2019

OK, after some experimentation I got it to train to the following accuracy numbers: loss=1.076, prec@1=73.512, prec@5=91.544. Still not quite paper numbers, but paper numbers seem achievable with more epochs. I'll be putting together a PR later tonight.

1e100 (Contributor) commented Apr 2, 2019

FYI: the paper number is 74.0% top-1.

1e100 (Contributor) commented Apr 2, 2019

MNASNet: #829
Trainer: https://github.com/1e100/mnasnet_trainer/tree/master

fmassa (Member, Author) commented Apr 2, 2019

Awesome, thanks @1e100 !

I'll check your code and integrate it into references/classification later this week

setuc commented Apr 6, 2019

@setuc thanks! Did you figure out where the difference was?

@fmassa I tried doing the comparison and ran it a couple more times. Unfortunately, I don't quite match the paper numbers. The best Top-1 error for ShuffleNet v2 is 39.89, compared to 39.70 from the paper. Will that be sufficient for the pull request?

soumith (Member) commented Apr 6, 2019

@setuc 39.89 vs. 39.70 sounds close enough. That would be sufficient for sure.

D-X-Y commented May 5, 2019

@setuc Would you mind sharing your training scripts for ShuffleNet-V2? I tried to use the ResNet training scripts but got a very low accuracy.

stigma0617 commented Jul 4, 2019

@fmassa Hi,

May I open a PR for VoVNet?

VoVNet was trained in the same manner as the pytorch/vision models.

To briefly describe it: VoVNet is a backbone network that is more efficient than ResNet and DenseNet in terms of GPU computation and energy use.

I implemented VoVNet classification models and maskrcnn-benchmark models:

classification models: https://github.com/stigma0617/VoVNet.pytorch
maskrcnn-benchmark models: https://github.com/stigma0617/maskrcnn-benchmark-vovnet/tree/vovnet

fmassa (Member, Author) commented Jul 4, 2019

Hi @stigma0617

I think for now it might be better to look into publishing it to torchhub, since it's from a very recent paper.

erichhhhho commented:

@fmassa Hi, may I ask if the ResNet-101 group-norm model pretrained in PyTorch is available now?

fmassa (Member, Author) commented Aug 5, 2019

@erichhhhho not in torchvision, as IIRC it doesn't bring performance improvements over the batch norm version.
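There is no official group-norm checkpoint, but for experimentation the current torchvision ResNet constructor accepts a norm_layer override; a sketch (randomly initialized, since batch-norm weights would not transfer):

```python
import torch.nn as nn
from torchvision import models

def group_norm(num_channels):
    # 32 groups divides every ResNet channel count (64 ... 2048).
    return nn.GroupNorm(num_groups=32, num_channels=num_channels)

# Random init with GroupNorm in place of BatchNorm throughout.
model = models.resnet101(norm_layer=group_norm)
```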

edsgerls commented:

Hi @fmassa

Would it be possible to add the ShuffleNet v2 x1.5 pretrained model, please? I would really appreciate it.

wangg12 (Contributor) commented Feb 22, 2021

Would you like to add ResNeSt models?
