
New issue with midas function section #164

Open
GwendalC opened this issue Dec 24, 2022 · 9 comments
GwendalC commented Dec 24, 2022


Hello, could you help me solve this issue?
I restarted the notebook several times; it was working fine until 1 PM today.
A dependency issue?
Thanks! (You're awesome.)

    in <module>
          1 #@title ### 1.4 Define Midas functions
          2
    ----> 3 from midas.dpt_depth import DPTDepthModel
          4 from midas.midas_net import MidasNet
          5 from midas.midas_net_custom import MidasNet_small

    2 frames

    /content/MiDaS/midas/backbones/next_vit.py in <module>
          6 from .utils import activations, forward_default, get_activation
          7
    ----> 8 file = open("./externals/Next_ViT/classification/nextvit.py", "r")
          9 source_code = file.read().replace(" utils", " externals.Next_ViT.classification.utils")
         10 exec(source_code)

    FileNotFoundError: [Errno 2] No such file or directory: './externals/Next_ViT/classification/nextvit.py'
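For context: next_vit.py unconditionally open()s a file from the external Next_ViT checkout at import time, so a missing checkout crashes the whole midas import. A minimal sketch of a more defensive loading pattern (a hypothetical illustration, not the actual MiDaS code; load_optional_source is an invented helper name):

```python
import os

def load_optional_source(path):
    """Return the source text of an optional external module, or None when
    the external checkout is missing, instead of raising FileNotFoundError."""
    if not os.path.exists(path):
        return None
    with open(path, "r") as f:
        return f.read()

# A missing external checkout now degrades gracefully instead of crashing:
source = load_optional_source("./externals/Next_ViT/classification/nextvit.py")
if source is None:
    print("Next-ViT externals not found; skipping the Next-ViT backbone")
```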

@GwendalC (Author)

It seems there has been an update of MiDaS today.

[Dec 2022] Released MiDaS v3.1:
- New models based on 5 different types of transformers (BEiT, Swin2, Swin, Next-ViT, LeViT)
- Training datasets extended from 10 to 12, now also including KITTI and NYU Depth V2 using the BTS split
- Best model, BEiT Large 512, with resolution 512x512, is on average about 28% more accurate than MiDaS v3.0
- Integrated live depth estimation from camera feed


GwendalC commented Dec 24, 2022

I guess the fix should be something like pinning the gitclone call here in disco.py to version 3.0:

    580 | try:
    581 |     from midas.dpt_depth import DPTDepthModel
    582 | except:
    583 |     if not os.path.exists('MiDaS'):
    584 |         gitclone("https://github.com/isl-org/MiDaS.git")
    585 |     if not os.path.exists('MiDaS/midas_utils.py'):
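A sketch of what such a pin could look like, assuming the clone is done via subprocess (gitclone_cmd is a hypothetical helper, not disco.py's actual implementation, and the "v3" tag name is illustrative):

```python
import subprocess

def gitclone_cmd(url, tag=None):
    # Build the git clone command; pin to a specific tag/branch when given.
    cmd = ["git", "clone"]
    if tag:
        cmd += ["--branch", tag, "--depth", "1"]
    cmd.append(url)
    return cmd

def gitclone(url, tag=None):
    # Run the clone (network access required).
    subprocess.run(gitclone_cmd(url, tag), check=True)

# Pinning MiDaS to a known-good tag instead of the moving default branch:
# gitclone("https://github.com/isl-org/MiDaS.git", tag="v3")
```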


jszgz commented Dec 25, 2022

same here


GwendalC commented Dec 25, 2022

Cause here: isl-org/MiDaS#193


xirtus commented Dec 30, 2022

worst crisis of my life

@StMoelter

As long as the fix is not merged, the branch with the fix can be used:
https://colab.research.google.com/github/StMoelter/disco-diffusion/blob/fix%2Fmidas-checkout-v3-tag/Disco_Diffusion.ipynb

thias15 commented Jan 2, 2023

Hi guys. MiDaS v3.1 is now fixed to make NextViT, which was causing the issue, optional. So you can use tag v3_1 and also use the latest models with even better performance. For instance, you could point to tag v3_1, download the checkpoint from the corresponding release, and then define e.g. the BEiT_L_384 model like so:

    if midas_model_type == "beit_l_384":  # BEiT_L_384
        midas_model = DPTDepthModel(
            path=midas_model_path,
            backbone="beitl16_384",
            non_negative=True,
        )
        net_w, net_h = 384, 384
        resize_mode = "minimal"
        normalization = NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])

aletts (Collaborator) commented Jan 2, 2023

@thias15 Thanks. It could be an interesting thing to try.

However, there's a funny thing about the depth estimation in Disco Diffusion. In many cases, more accurate depth estimation may result in aesthetically worse results.

When I initially tried using MiDaS dpt_large (from v3) alone, I found that, in combination with the flow field technique used for the transformation to the next frame, the dpt_large depth estimation was already too good: sharp, well-defined edges in common content expose undesirable properties of the simple flow field approach. (I also had an experimental, better technique prior to the DD v5 release and didn't initially include it, since it was complicated and I thought people wouldn't understand why I'd done it... and then I lost the code and haven't prioritized redoing it.) I quickly improved the aesthetics by introducing a weighted blend with the AdaBins output. I suspect that with the flow field approach unchanged, most users would get better results by increasing the AdaBins contribution further.
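The weighted blend described above can be sketched as a simple per-pixel mix (a hypothetical NumPy illustration; blend_depth and the 50/50 default are assumptions, not DD's actual code, and both maps are assumed to be aligned and on a common scale):

```python
import numpy as np

def blend_depth(midas_depth, adabins_depth, adabins_weight=0.5):
    # Per-pixel weighted blend of two HxW depth maps: raising adabins_weight
    # softens the sharp MiDaS edges that stress the simple flow field warp.
    w = float(np.clip(adabins_weight, 0.0, 1.0))
    return (1.0 - w) * np.asarray(midas_depth) + w * np.asarray(adabins_depth)
```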

thias15 commented Jan 4, 2023

@aletts interesting! Note that we have introduced several new models in release v3.1, leveraging different backbones with various trade-offs between accuracy and speed, e.g. Swin-L, SwinV2-T, SwinV2-B, SwinV2-L, LeViT, BEiT-L, etc. It might be interesting to try different variants to see how nicely they play with the flow field approach. By the way, what exactly is the flow field approach used for, and how does it work? On a side note, we will also release a new depth estimation model in the near future that essentially combines AdaBins and MiDaS, so stay tuned for that.
