Convert from Depth Pro default 1536x1536 implementation to 1024x1024 float16 tensor CoreML packages #45

Draft: harism wants to merge 1 commit into main

harism commented Oct 16, 2024

*This PR is not meant to be merged currently*

This is an experiment in converting Depth Pro to run on the Neural Engine of a stock Apple MacBook M2. To do so, the model is converted to CoreML packages that can be executed as part of an application.

I can try to provide a more suitable implementation if this is something the project would like to have upstream as well.

Videos where StyleGAN generates images for Depth Pro:
https://youtu.be/0728BHmhXFc and https://youtu.be/pteYTX9oWz0
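
Since the packages described above use float16 tensors, it may help to see how coarse half precision is. The following is a minimal sketch using only Python's standard `struct` module; it is independent of CoreML and of this PR's code, and the helper name `to_float16` is hypothetical.

```python
import struct


def to_float16(value: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    # The "e" format code packs a 16-bit half-precision float.
    return struct.unpack("<e", struct.pack("<e", value))[0]


print(to_float16(0.1))       # 0.0999755859375
print(to_float16(1234.567))  # 1235.0
print(to_float16(65504.0))   # 65504.0, the largest finite float16 value
```

Between 1024 and 2048 the spacing of representable float16 values is 1.0, which is why 1234.567 rounds to 1235.0.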

juntaosun commented Oct 16, 2024

Is there a piece of code missing?

In the forward method of depth_pro.py, the encoder and decoder are not exported:

encodings = self.encoder(x)
features, features_0 = self.decoder(encodings)

convert_to_coreml.py, forward:

class Depth(nn.Module):
    def __init__(self, head: nn.Module, fov: nn.Module):
        super(Depth, self).__init__()
        self.head = head
        self.fov = fov

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        x = inputs[0]

        # How do we get these inputs? (features, features_0)
        features = inputs[1]
        features_0 = inputs[2]

Can you update the 1024 inference code? Thanks.
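
The thread does not confirm how the PR supplies these inputs. One plausible reading, based on the two depth_pro.py lines quoted above, is that the encoder and decoder run as earlier stages and their outputs are passed to the wrapper as a tuple. The sketch below illustrates only that data flow: `TinyEncoder`, `TinyDecoder`, `DepthStage`, and the stand-in head and fov layers are hypothetical and are not the Depth Pro modules or the PR's code.

```python
import torch
import torch.nn.functional as F
from torch import nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the encoder: returns two feature scales."""

    def __init__(self) -> None:
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1)

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        high = self.conv(x)
        low = F.avg_pool2d(high, kernel_size=2)
        return [high, low]


class TinyDecoder(nn.Module):
    """Hypothetical stand-in for the decoder: fuses the two scales."""

    def forward(self, encodings: list[torch.Tensor]):
        high, low = encodings
        features_0 = low  # lowest-resolution features
        features = high + F.interpolate(low, size=high.shape[-2:], mode="nearest")
        return features, features_0


class DepthStage(nn.Module):
    """Modeled on the quoted wrapper: consumes precomputed features."""

    def __init__(self, head: nn.Module, fov: nn.Module) -> None:
        super().__init__()
        self.head = head
        self.fov = fov

    def forward(self, inputs):
        # x is carried along as in the quoted snippet; this toy stage ignores it.
        x, features, features_0 = inputs
        return self.head(features), self.fov(features_0)


torch.manual_seed(0)
encoder, decoder = TinyEncoder().eval(), TinyDecoder().eval()
depth_stage = DepthStage(
    head=nn.Conv2d(8, 1, kernel_size=1),
    fov=nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1)),
).eval()

x = torch.rand(1, 3, 32, 32)
with torch.no_grad():
    encodings = encoder(x)                      # stage 1
    features, features_0 = decoder(encodings)   # stage 2
    depth, fov = depth_stage((x, features, features_0))  # stage 3

print(depth.shape, fov.shape)
```

If the stages are exported separately, each one would be traced and converted with inputs matching the shapes produced by the stage before it.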

harism (Author) commented Oct 16, 2024

> Is there a piece of code missing?

I'm not exactly sure, actually. I've been running this directly from the repository root with python3 convert_to_coreml.py, so far successfully only for generating the CoreML program files for the whole model, after executing the mandatory checkpoint read script and installing the dependencies manually. I haven't done much to actually integrate the changes into the upstream code yet; rather, I've imported the upstream code into this Python script for its own use only.

harism (Author) commented Oct 16, 2024

> Can you update the 1024 inference code? Thanks.

I updated the code so that running convert_to_coreml.py first processes the example image example.jpg, shows the resulting 1024x1024 depth map in a GUI window, and then continues on to create the CoreML packages.

I hope this helps show how to continue from here with different sizing options and so on.

harism changed the title from "Convert from Depth Pro default 1536x1536 size to 1024x1024 float16 tensor CoreML programs" to "Convert from Depth Pro default 1536x1536 implementation to 1024x1024 float16 tensor CoreML programs" on Oct 16, 2024
harism force-pushed the main branch 2 times, most recently from b8dbdc5 to 499c3e8, on October 17, 2024 15:05
harism changed the title from "Convert from Depth Pro default 1536x1536 implementation to 1024x1024 float16 tensor CoreML programs" to "Convert from Depth Pro default 1536x1536 implementation to 1024x1024 float16 tensor CoreML packages" on Oct 17, 2024
harism force-pushed the main branch 2 times, most recently from aa18879 to d32719f, on October 23, 2024 20:35

charlieforward9 commented

I am looking to bind DepthPro into my iOS Flutter application. What is the state of this work toward enabling that?

harism (Author) commented Dec 19, 2024

@charlieforward9 for better optimization I'd recommend looking at DepthAnything, which is very similar work to DepthPro, and there are some ready-made .mlpackage files available for it on HuggingFace. Those DepthAnything packages are trained with a smaller DINOv2 model, which optimizes them much better than what I've achieved here by decreasing the resolution only. I also tried changing the DINOv2 model, but unfortunately wasn't able to get anything working well with it.
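
As a rough illustration of what decreasing the resolution and precision changes for the input tensor alone, the arithmetic below compares a 1536x1536 input with a 1024x1024 float16 input. It assumes a 3-channel image and a float32 baseline for the default model; both are assumptions for illustration, and the helper `tensor_bytes` is hypothetical. It says nothing about the model's weights or intermediate activations.

```python
def tensor_bytes(height: int, width: int, channels: int = 3, bytes_per_value: int = 4) -> int:
    """Size in bytes of one dense tensor of shape (1, channels, height, width)."""
    return channels * height * width * bytes_per_value


# Assumed baseline: 1536x1536 input stored as float32 (4 bytes per value).
default_bytes = tensor_bytes(1536, 1536, bytes_per_value=4)
# Reduced variant from the PR title: 1024x1024 input stored as float16 (2 bytes per value).
reduced_bytes = tensor_bytes(1024, 1024, bytes_per_value=2)

pixel_ratio = (1536 * 1536) / (1024 * 1024)

print(default_bytes)                  # 28311552
print(reduced_bytes)                  # 6291456
print(pixel_ratio)                    # 2.25
print(default_bytes / reduced_bytes)  # 4.5
```

The pixel count drops by a factor of 2.25, and halving the bytes per value brings the input tensor to 4.5 times smaller overall.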

charlieforward9 commented

Since having the model available as a tflite file is sufficient for me, I opened #79 and plan to run the conversion soon.

Are there any considerations or pitfalls you'd like to warn me about? This is my first time doing this.
