ML Kit Selfie segmentation model #2

Closed
alvaroschipper opened this issue Apr 19, 2021 · 4 comments · Fixed by #4

Comments

@alvaroschipper

Hi @Volcomix, I am really impressed with your work here! I have been following the work done by several parties on this topic, but got stuck on the licensing issue surrounding the Google Meet model until I came across this:
https://developers.google.com/ml-kit/vision/selfie-segmentation

At first glance this seems very similar to Google Meet's model and, according to the model card, it is actually licensed under Apache 2.0!

After some further digging, I was able to extract the .tflite file from the Android package without issue:
selfiesegmentation_mlkit-256x256-2021_01_19-v1215.f16.tflite.zip
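Android library and app packages are ZIP archives, so a bundled model can be located with standard tooling. Below is a minimal sketch, assuming the model is stored as an ordinary archive entry whose name ends in `.tflite`. The function names are illustrative, and the actual entry path inside the ML Kit package is not stated in this thread.

```python
import io
import zipfile
from typing import List


def find_tflite_entries(archive) -> List[str]:
    """List archive entries ending in .tflite (accepts a path or a file-like object)."""
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist() if name.lower().endswith(".tflite")]


def read_entry(archive, entry_name: str) -> bytes:
    """Return the raw bytes of one archive entry."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(entry_name)


# Demonstration on a small in-memory archive with made-up contents
# (hypothetical entry names, not the real package layout).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("assets/example_model.tflite", b"not a real model")
    zf.writestr("classes.jar", b"placeholder")

for entry in find_tflite_entries(demo):
    print(entry, len(read_entry(demo, entry)), "bytes")
```

With a real package, the path to the downloaded archive would be passed instead of the in-memory buffer.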

After that I did an inspection of both the Google Meet model you have in the repo and this new one using https://netron.app/:
Google Meet: segm_full_v679.tflite (Netron graph screenshot)

ML Kit Selfie segmentation: selfiesegmentation_mlkit-256x256-2021_01_19-v1215.f16.tflite (Netron graph screenshot)

As you can see, their structures are practically identical. The only difference I see is the input and output sizes: the new model uses 256x256.

I cloned your repo and tried loading this model by putting it in the public folder with the other models and changing the path used to load it. It seems to load correctly, but no output is visible except a very faint, perhaps stretched, outline of the detected person. This leads me to believe that using this model with your implementation should be possible, perhaps with some adjustment for the different input/output resolutions (memory offsets?). Curious to hear your thoughts on this 😄.
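A resolution change alone could plausibly cause this symptom. If the tensors are flat buffers, code that still assumes 144 rows would fill or read only part of a 256-row tensor. Below is a minimal sketch of that arithmetic. It assumes row-major height x width x channels float32 tensors, that "256x144" means 256 wide by 144 high, and a 3-channel RGB input. None of these layout details are confirmed in this thread, and all names are illustrative.

```python
from dataclasses import dataclass

BYTES_PER_FLOAT32 = 4


@dataclass(frozen=True)
class TensorShape:
    height: int
    width: int
    channels: int

    @property
    def element_count(self) -> int:
        return self.height * self.width * self.channels

    @property
    def byte_size(self) -> int:
        return self.element_count * BYTES_PER_FLOAT32


def pixel_offset(shape: TensorShape, row: int, col: int) -> int:
    """Index of the first channel of pixel (row, col) in a row-major HWC buffer."""
    return (row * shape.width + col) * shape.channels


# Assumed input shapes for the two models discussed above.
meet_input = TensorShape(height=144, width=256, channels=3)
mlkit_input = TensorShape(height=256, width=256, channels=3)

# Fraction of the larger input that is written if only 144 rows are filled.
covered = meet_input.element_count / mlkit_input.element_count

print(f"Google Meet input: {meet_input.byte_size} bytes")
print(f"ML Kit input:      {mlkit_input.byte_size} bytes")
print(f"Rows filled when assuming 144 of 256: {covered:.2%}")
```

Under these assumptions, both tensors share the same row stride, so a stale height would leave the lower part of the 256x256 input unwritten rather than scrambling rows. That would be consistent with a faint, distorted mask.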

@Volcomix
Owner

Volcomix commented Apr 22, 2021

Hi @RemarkableGuy, thank you for the kind words and for pointing out the MLKit Selfie segmentation model.
Indeed, when you open the Google Meet segmentation model in a text editor, you can see something like this:

�����(location //research/aimatter/nnets/tools/conversion/keras2tflite:keras2tflite) --name=keras2tflite_selfiesegmentation_web_256x144-2020_10_05-v679.tflite.generated  (location selfiesegmentation_web_256x144-2020_10_05-v679.hdf5) blaze-out/k8-opt/genfiles/research/aimatter/nnets/models/selfiesegmentation_web_256x144_2020_10_05_v679/keras2tflite_selfiesegmentation_web_256x144-2020_10_05-v679.tflite.generated --remove_softmax���.���selfiesegmentation_web_256x144_2020_10_05_v679������n;������

This makes me think that the ML Kit Selfie and Google Meet segmentation models were generated by exporting tflite files from the same Keras model. If by any chance this Keras model is available somewhere and released under the Apache 2.0 licence, we should be able to export it ourselves and use it. Unfortunately, I tried to find it a few weeks (or months) ago without any luck. Maybe it is worth trying again now.
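The same kind of embedded text can be pulled out programmatically instead of reading the binary in a text editor. Below is a minimal sketch that works like the Unix `strings` tool, assuming the metadata is stored as plain ASCII inside the file. The function name and the sample bytes are made up for illustration and are not taken from either model.

```python
import re
from typing import List


def printable_strings(data: bytes, min_length: int = 8) -> List[str]:
    """Return runs of printable ASCII at least min_length characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Synthetic stand-in for a model file: binary noise around two readable runs.
sample = (
    b"\x00\x01TFL3\x00\x00"
    b"--name=keras2tflite_example.tflite.generated"
    b"\xff\xfe"
    b"--remove_softmax"
    b"\x00"
)

for text in printable_strings(sample):
    print(text)
```

With a real file, the bytes would come from `open(path, "rb").read()`, and the result could be filtered for markers such as `keras2tflite`.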

I'm pretty confident that we can use the Selfie segmentation model in this repo with a few adjustments. I will start a new branch to experiment with this (probably this coming weekend).

In any case, the Google Meet model at 256x144 resolution takes around 2x longer to infer the segmentation than the 160x96 one, so I guess the 256x256 model from ML Kit Selfie segmentation will be even slower. This should still be fine for desktop browsers, though.
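A rough way to anticipate the slowdown is to compare pixel counts. Below is a minimal sketch, assuming inference cost grows roughly in proportion to the number of input pixels. That is only a rule of thumb for convolutional models and was not measured in this thread. The names are illustrative.

```python
from typing import Dict, Tuple

Resolution = Tuple[int, int]  # (width, height)

RESOLUTIONS: Dict[str, Resolution] = {
    "Google Meet 160x96": (160, 96),
    "Google Meet 256x144": (256, 144),
    "ML Kit 256x256": (256, 256),
}


def pixel_count(resolution: Resolution) -> int:
    width, height = resolution
    return width * height


def relative_cost(resolution: Resolution, baseline: Resolution) -> float:
    """Pixel-count ratio, used here as a crude proxy for relative inference time."""
    return pixel_count(resolution) / pixel_count(baseline)


baseline = RESOLUTIONS["Google Meet 160x96"]
for name, resolution in RESOLUTIONS.items():
    ratio = relative_cost(resolution, baseline)
    print(f"{name}: {pixel_count(resolution)} pixels, {ratio:.2f}x the baseline")
```

By this proxy the 256x144 model processes 2.4x the pixels of the 160x96 one, in line with the roughly 2x slowdown reported above, and the 256x256 model processes about 1.8x the pixels of the 256x144 one.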

@alvaroschipper
Author

Hi @Volcomix, no problem! I came to the same conclusion about the origin of these models and indeed it would be very interesting if we had access to that.

Very nice to hear you are interested in creating a branch for this! I'm looking forward to comparing the model's performance characteristics with BodyPix, for example. Even performance equal to BodyPix but with improved mask definition would be very valuable to me.

Volcomix added a commit that referenced this issue Apr 24, 2021
@Volcomix
Owner

Volcomix commented Apr 24, 2021

Hi @RemarkableGuy, I added ML Kit Selfie Segmentation to the demo. I also added some documentation to the readme. The PR closed the issue, but please feel free to give any feedback, either in this issue or by creating new ones. Cheers!

@PINTO0309

I don't know if it will be useful for you, but I have converted and quantized it for various frameworks and committed it to my repository.

TFLite Float32/Float16/INT8, TFJS, TF-TRT, ONNX, CoreML, OpenVINO IR FP32/FP16, Myriad Inference Blob

https://github.com/PINTO0309/PINTO_model_zoo
https://github.com/PINTO0309/PINTO_model_zoo/tree/main/109_Selfie_Segmentation
