Implementation of MLPMixer #103

Merged: 24 commits from theabhirath:mlpmixer into FluxML:master on Feb 4, 2022

Conversation

theabhirath

This is an implementation of MLPMixer, one of the many models that appeared in the wake of the ViT explosion.

There are two things I wanted to clarify:

  1. I added TensorCast as a dep because einops-like operations are very commonplace in ViT model implementations, and I thought it would be easier to work with (see the sketch after this list). If you think I should change that up somehow or maybe just use standard array operations, lemme know and I'll revert - it's very painful though, I tried already 🥲
  2. I also made some changes to the organisation of the repo (might be a bit of an understatement) because I thought it would be better to separate out the CNN models and the ViT models (I have some others in the works), but lemme know if you think it's overkill and I'll revert to the original repo structure.
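For reference, here's a minimal sketch of the einops-like style in question, using TensorCast's @cast macro (the array and sizes are purely illustrative):

```julia
using TensorCast

# Collapse the two spatial axes of a WHCN array into one token axis,
# einops-style; equivalent to reshape(x, 14 * 14, 512, 8).
x = rand(Float32, 14, 14, 512, 8)
@cast tokens[(i, j), c, n] := x[i, j, c, n]
size(tokens)  # (196, 512, 8)
```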

@darsnack

Re-organizing the repo into CNNs and ViTs makes sense, but I would avoid introducing submodules. I would just have two folders and put all the includes in Metalhead.jl (grouped by type to make it look nice).
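Something like this, for example (a sketch only - the file names are illustrative):

```julia
# in src/Metalhead.jl: plain includes grouped by model type, no submodules
# CNN models
include("cnn/alexnet.jl")
include("cnn/vgg.jl")
include("cnn/resnet.jl")

# ViT-like models
include("vit-like/mlpmixer.jl")
```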

I like TensorCast.jl, but I want to avoid taking on a dep if we can. I'll have to go through and see if we can rewrite using standard ops without being too messy.

Now we've got ViTs, exciting!

@darsnack

I forgot to ask this on the last PR, but can you update the table in the README with the models from the previous PRs + this one?

@CarloLucibello

I don't think MLPMixer can be classified as a vision transformer, but maybe we could argue that this separation is fine since the spirit is the same.

@theabhirath

Re-organizing the repo into CNNs and ViTs makes sense, but I would avoid introducing submodules. I would just have two folders and put all the includes in Metalhead.jl (grouped by type to make it look nice).

Will do.

I like TensorCast.jl, but I want to avoid taking on a dep if we can. I'll have to go through and see if we can rewrite using standard ops without being too messy.

Yeah, the MLPMixer one isn't that bad (it's just two places), but for the standard ViT and other variants like LeViT and Swin it gets progressively worse.

I forgot to ask this on the last PR, but can you update the table in the README with the models from the previous PRs + this one?

Will do. I wasn't sure whether to do it because it wasn't tagged as a release yet xD.

I don't think MLPMixer can be classified as a vision transformer, but maybe we could argue that this separation is fine since the spirit is the same.

Yeah, they share quite a few things with the patches-and-embeddings design, and the original JAX repository released the code together, so I think it's better to keep them together.

@darsnack

Will do. I wasn't sure whether to do it because it wasn't tagged as a release yet xD.

Only the main branch README will change. The docs will still reflect the tagged version.

I don't think MLPMixer can be classified as a vision transformer, but maybe we could argue that this separation is fine since the spirit is the same.

True, but it doesn't really fit any category. Once the submodules are removed, this distinction will only affect our code organization and not the user interfaces.

@theabhirath

That's odd, the CI seems to be failing only on Linux and I don't see why... the error message says the file doesn't exist, but it very much does 🤨

@darsnack darsnack closed this Jan 29, 2022
@darsnack darsnack reopened this Jan 29, 2022
@theabhirath

Yeah, some sort of weird renaming issue - I'd done it locally, but it didn't seem to have been pushed to the remote somehow. Fixed now.

@darsnack

darsnack commented Jan 29, 2022

Can you rebase with the latest changes? It shouldn't affect your work here. Also, I'll fix the README since I need to fix the docs CI too.

@theabhirath

Yep, should be fine now. I checked the tests locally and they worked alright.

@ToucheSir

Bikeshedding the folder structure, I don't think there needs to be a premature assignment of MLP Mixer into a specific category (e.g. where would DeiT fit in under this scheme?). Keeping it at the top level or in an "other" directory would be fine. If/when we get attention-based models, those can get their own dir.

@theabhirath

Bikeshedding the folder structure, I don't think there needs to be a premature assignment of MLP Mixer into a specific category (e.g. where would DeiT fit in under this scheme?). Keeping it at the top level or in an "other" directory would be fine. If/when we get attention-based models, those can get their own dir.

I've made the change to put MLPMixer in an "other" directory, then. I think DeiT would slot into the ViT folder, given that the paper explicitly refers to it as such (FAIR's repo released all their ViT-based models together too), but I get the reason for the MLPMixer contention, so I've taken care of that.

@theabhirath

Playing around with this model, I realised that the show functionality doesn't work as expected because I've defined a custom model instead of writing the layers with, say, Chain or SkipConnection, which are already present in Flux and thus likely implement show on their own. Is there an easier way to get it done than having to write it layer by layer for the custom operations? There's also the use of TensorCast, which I imagine will complicate things slightly.
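(For context, this is the kind of per-layer boilerplate I mean - a minimal sketch, assuming a custom struct with hypothetical patch and depth fields:)

```julia
# Hand-written pretty-printing for a custom model type; Chain-based
# models get a nested printout like this from Flux for free.
Base.show(io::IO, m::MLPMixer) =
    print(io, "MLPMixer(patch = ", m.patch, ", depth = ", m.depth, ")")
```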

@darsnack

darsnack commented Feb 2, 2022

I have some suggestions which should resolve that issue. I've just been busy over the last few days, but I'll review both new PRs later this evening.


@darsnack darsnack left a comment


I think one major change here would be to avoid making MLPMixer a struct with a custom forward pass. What I recommend instead is to define a new patching layer (in a separate file), like:

```julia
using Flux: @functor

struct Patching{T<:Integer}
  patch::T
end

@functor Patching

function (p::Patching)(x)
  h, w, c, n = size(x)
  hp, wp = h ÷ p.patch, w ÷ p.patch
  # split each spatial axis into (within-patch, patch-index) parts
  xpatch = reshape(x, p.patch, hp, p.patch, wp, c, n)

  # group the within-patch axes with the channels, then flatten each
  # patch into a feature vector: (patch^2 * c, patches, batch)
  return reshape(permutedims(xpatch, (1, 3, 5, 2, 4, 6)), p.patch^2 * c, hp * wp, n)
end
```

Then MLPMixer is nothing more than a Chain. Part of the goal with this package's design is to illustrate how Flux's features make it possible to build advanced models with minimal boilerplate. Reimplementing Chain-like forward passes is something we want to avoid when possible.
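Roughly like this (a sketch only - the sizes and the mixer_block helper are illustrative, not a final API):

```julia
using Flux, Statistics

# Illustrative token-mixing + channel-mixing block (LayerNorm omitted for brevity).
mixer_block(planes, npatches) = Chain(
    SkipConnection(Chain(x -> permutedims(x, (2, 1, 3)),
                         Dense(npatches, npatches, gelu),      # mix across patches
                         x -> permutedims(x, (2, 1, 3))), +),
    SkipConnection(Dense(planes, planes, gelu), +))            # mix across channels

mlpmixer = Chain(Patching(16),
                 Dense(16^2 * 3, 512),                         # per-patch embedding
                 [mixer_block(512, 196) for _ in 1:8]...,      # (224 ÷ 16)^2 = 196 patches
                 x -> dropdims(mean(x; dims = 2); dims = 2),   # pool over patches
                 Dense(512, 1000))
```

Everything in there is a Chain, an anonymous function, or a stock Flux layer, so show and functor both work with no extra code.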

Overall, the implementation looks correct; we just need some design iterations. Great job!

Review threads (outdated, resolved): src/Metalhead.jl, src/vit-like/mlpmixer.jl (2 threads)
@theabhirath

I think one major change here would be to avoid making MLPMixer a struct with a custom forward pass. What I recommend instead is to define a new patching layer (in a separate file), like:

```julia
using Flux: @functor

struct Patching{T<:Integer}
  patch::T
end

@functor Patching

function (p::Patching)(x)
  h, w, c, n = size(x)
  hp, wp = h ÷ p.patch, w ÷ p.patch
  # split each spatial axis into (within-patch, patch-index) parts
  xpatch = reshape(x, p.patch, hp, p.patch, wp, c, n)

  # group the within-patch axes with the channels, then flatten each
  # patch into a feature vector: (patch^2 * c, patches, batch)
  return reshape(permutedims(xpatch, (1, 3, 5, 2, 4, 6)), p.patch^2 * c, hp * wp, n)
end
```

Then MLPMixer is nothing more than a Chain. Part of the goal with this package's design is to illustrate how Flux's features make it possible to build advanced models with minimal boilerplate. Reimplementing Chain-like forward passes is something we want to avoid when possible.

Will do this, but if you don't mind, can I keep the TensorCast dep? I'll remove it if there's a clear reason, but einops notation has become so ubiquitous across model implementations in Python frameworks that I thought it would be a lot more intuitive for it to be the same way here.

@darsnack

darsnack commented Feb 3, 2022

Personally, I find the einsum notation more confusing than the plain Julia, but of course, that's just my subjective opinion.

Mainly, I'm hesitant to take on another dependency just to mimic Python. Especially since it seems to only serve indexing operations that Julia does quite well with reshape. If it turns out that ViT can be made faster by using TensorCast, then that seems like a good reason to take on the dep. Maybe other contributors can weigh in (cc @ToucheSir).

@ToucheSir

There are a few einsum-related packages, so if we do adopt one it would ideally be relatively lightweight + performant (flexibility is not an issue since Python frameworks don't have much in their einsum implementations). Because that could take a minute, I vote to defer that discussion to the ViT PR since this one doesn't require any einsum ops. That will let us merge this one asap :)

@darsnack

darsnack commented Feb 3, 2022

Okay @theabhirath, does that sound like a reasonable plan? Use the non-einsum implementation here for now. Since this model is mostly there, this will let us merge without much back and forth.

If we decide to take on an einsum dep in the ViT PR, we can update this model's code as well.

@theabhirath

That makes sense to me. I'll make the changes 👍🏽

@theabhirath theabhirath requested a review from darsnack February 3, 2022 07:08

@darsnack darsnack left a comment


Almost done. A few minor changes and we can merge.

Review threads (outdated, resolved): src/other/mlpmixer.jl (5 threads)
@theabhirath theabhirath requested a review from darsnack February 3, 2022 16:46

@darsnack darsnack left a comment


Looks good! There were a few things I missed on the last pass, sorry.

Review threads (outdated, resolved): src/Metalhead.jl, src/other/mlpmixer.jl (2 threads), test/other.jl
@darsnack

darsnack commented Feb 3, 2022

Also, FYI for the future: use git rebase instead of git merge when pulling in upstream changes to your PR branch. git merge will show all the upstream changes in the diff, making it hard to parse what's actually changed vs upstream and what's just showing as changed even though it's the same.
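For example (assuming the upstream remote is named upstream; mlpmixer is the PR branch):

```sh
git fetch upstream
git rebase upstream/master                 # replay your commits on top of upstream
git push --force-with-lease origin mlpmixer  # update the PR branch safely
```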

Review threads (outdated, resolved): src/Metalhead.jl, src/other/mlpmixer.jl
@darsnack darsnack merged commit 1eb8a51 into FluxML:master Feb 4, 2022
@theabhirath

Thank you so much! The formatting is still a pain 😅 Some sort of auto-formatter really needs to be around to ensure this kind of thing doesn't sneak under the radar.

@theabhirath theabhirath deleted the mlpmixer branch February 4, 2022 03:01
@darsnack

darsnack commented Feb 4, 2022

Yeah, I agree. There's a PR for Flux on this; I'm just waiting for that to settle on a style choice, which I can duplicate here.
