Usage of GpuBuffer with WebCamTexture / support for iOS Metal API #768

Open
Paxios opened this issue Oct 15, 2022 · 27 comments
Labels
type:support Support issue

Comments

Paxios commented Oct 15, 2022

Plugin Version or Commit ID

v0.10.1

Unity Version

2022.2

Your Host OS

macOS Monterey 12.6

Target Platform

Android, iOS

Description

1. Is it possible to use GpuBuffer with WebCamTexture?

I tried creating a GpuBuffer with Texture.GetNativeTexturePtr. It compiled fine and the app never crashed, but TryGetNext never returns true, as if the result is never produced. No exception is thrown either; I had Logcat attached to the device with a MediaPipe debug build. I used https://github.com/homuler/MediaPipeUnityPlugin/wiki/API-Overview#gpubuffer:

[AOT.MonoPInvokeCallback(typeof(GlCalculatorHelper.NativeGlStatusFunction))]
static IntPtr PushInputInGlContext() {
  try {
    var glContext = GlContext.GetCurrent();
    var glTextureBuffer = new GlTextureBuffer((UInt32)currentTextureName, currentTextureFrame.width, currentTextureFrame.height,
        currentTextureFrame.gpuBufferformat, currentTextureFrame.OnRelease, glContext);
    var gpuBuffer = new GpuBuffer(glTextureBuffer);
    // TODO: ensure the returned status won't be garbage collected prematurely.
    return graph.AddPacketToInputStream(inputStream, new GpuBufferPacket(gpuBuffer, currentTimestamp)).mpPtr;
  } catch (Exception e) {
    return Status.FailedPrecondition(e.ToString()).mpPtr;
  }
}
and

var glTextureName = textureFrame.GetNativeTexturePtr();
return gpuHelper.RunInGlContext(() => {
  var glContext = GlContext.GetCurrent();
  var glTextureBuffer = new GlTextureBuffer((UInt32)glTextureName, textureFrame.width, textureFrame.height,
      textureFrame.gpuBufferformat, textureFrame.OnRelease, glContext);
  var gpuBuffer = new GpuBuffer(glTextureBuffer);
  return graph.AddPacketToInputStream(inputStream, new GpuBufferPacket(gpuBuffer, timestamp));
});
using #13 as a reference.

I saw that there are different texture formats (SRGBA from WebCamTexture and BGRA for GpuBuffer), could this be the problem?

// TODO: When using GpuBuffer, MediaPipe assumes that the input format is BGRA, so the following code must be fixed.

My goal is to skip copying data from GPU to CPU and then passing it back to GPU (MediaPipe). It is a huge bottleneck on older devices. I also tried reading the WebCamTexture from a background thread, but it can only be accessed from the main thread.
Calculator I used:
https://github.com/homuler/MediaPipeUnityPlugin/blob/6b8c6743f23539f7604e74dc260b01e0f58f1707/Assets/MediaPipeUnity/Samples/Scenes/Pose%20Tracking/pose_tracking_opengles.txt

2. Support for Metal API

I looked around the repo, but I couldn't figure out whether there is support for the Metal API.
If not, is it planned to be added in the future?

Code to Reproduce the issue

No response

Additional Context

No response

Paxios added the type:support Support issue label Oct 15, 2022
homuler (Owner) commented Oct 16, 2022

Please don't leave the Code to Reproduce the issue field blank.

My goal is to skip copying data from GPU to CPU and then passing it back to GPU (MediaPipe). It is a huge bottleneck on older devices.

DemoGraph is a very old implementation. See #435 (comment) instead.

I looked around the repo, but I couldn't figure out whether there is support for the Metal API.
If not, is it planned to be added in the future?

What kind of support do you expect?
At least, you can use it as the graphics API.

Paxios (Author) commented Oct 16, 2022

Sorry for the late response; I was trying to implement your suggestion.

GpuBuffer

In the Estimator class, I added two logs, so it's visible which part of the code gets executed and which does not. No exception is thrown. I never receive a result from the graph. Is there something I'm doing wrong?

Code

Below is the relevant code:
CameraManager:

var selectedWebCam = WebCamTexture.devices[0];
WebCamTexture = new WebCamTexture(selectedWebCam.name, requestedHeight: 160, requestedWidth: 160);
WebCamTexture.Play();

EstimationManager (MonoBehaviour):

private void Update() {
  Estimator.MakeEstimation(CameraManager.WebCamTexture.width, CameraManager.WebCamTexture.height, CameraManager.WebCamTexture);
}

Estimator:

public void MakeEstimation(int width, int height, WebCamTexture texture) {
  if (texture == null || width < 100)
    return;

  TextureFramePool.ResizeTexture(width, height, TextureFormat.RGBA32);
  if (!TextureFramePool.TryGetTextureFrame(out var textureFrame))
    return;

  textureFrame.ReadTextureFromOnGPU(texture);
  // I also tried TextureFrame#ReadTextureFromOnCPU(texture)
  var gpuBuffer = textureFrame.BuildGpuBuffer(GpuManager.GlCalculatorHelper.GetGlContext());
  Debug.Log("This is logged");
  Graph.AddPacketToInputStream("input_video", new GpuBufferPacket(gpuBuffer, new Timestamp(currentMicroSeconds))).AssertOk();
  if (_outputLandmarksStream.TryGetNext(out var landmarkList)) {
    Debug.Log("This is NOT logged");
    // [...]
  }
}
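
One thing I wasn't sure about: since the graph processes packets asynchronously, maybe TryGetNext called right after AddPacketToInputStream in the same Update simply finds no output yet? A sketch of a listener-based alternative I'm considering (the stream name pose_world_landmarks is an assumption matching the LandmarkList packet type; an illustration, not tested code):

_outputLandmarksStream = new OutputStream<LandmarkListPacket, LandmarkList>(Graph, "pose_world_landmarks");
_outputLandmarksStream.AddListener(OnPoseWorldLandmarksOutput);
// Start the graph after the listener is registered and before adding packets.
Graph.StartRun().AssertOk();

private void OnPoseWorldLandmarksOutput(object stream, OutputEventArgs<LandmarkList> eventArgs) {
  if (eventArgs.value != null) {
    // NOTE: this callback can fire on a non-main thread; marshal the result
    // back to Unity's main thread before touching scene objects.
    Debug.Log(eventArgs.value);
  }
}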

Graph that I use:

input_stream: "input_video"
output_stream: "pose_landmarks"
output_stream: "pose_world_landmarks"
node {
  calculator: "FlowLimiterCalculator"
  input_stream: "input_video"
  input_stream: "FINISHED:pose_landmarks"
  input_stream_info: {
    tag_index: "FINISHED"
    back_edge: true
  }
  output_stream: "throttled_input_video"
}
node: {
  calculator: "ImageTransformationCalculator"
  input_stream: "IMAGE_GPU:throttled_input_video"
  input_side_packet: "ROTATION_DEGREES:input_rotation"
  input_side_packet: "FLIP_HORIZONTALLY:input_horizontally_flipped"
  input_side_packet: "FLIP_VERTICALLY:input_vertically_flipped"
  output_stream: "IMAGE_GPU:transformed_input_video"
}
node {
  calculator: "PoseLandmarkGpu"
  input_stream: "IMAGE:transformed_input_video"
  input_side_packet: "MODEL_COMPLEXITY:model_complexity"
  input_side_packet: "SMOOTH_LANDMARKS:smooth_landmarks"
  input_side_packet: "ENABLE_SEGMENTATION:enable_segmentation"
  input_side_packet: "SMOOTH_SEGMENTATION:smooth_segmentation"
  output_stream: "LANDMARKS:pose_landmarks"
  output_stream: "WORLD_LANDMARKS:pose_world_landmarks"
}

Working case

If I use TextureFromCamera.SetPixels32(WebCamTexture.GetPixels32()); and create a new ImageFrame from it with:

var imageFrame = new ImageFrame(ImageFormat.Types.Format.Srgba, TextureFromCamera.width, TextureFromCamera.height, TextureFromCamera.width * 4, TextureFromCamera.GetRawTextureData<byte>());

then pass this ImageFrame to the graph, it works (I change the graph to expect an ImageFrame instead of a GpuBuffer, of course).

Metal

By Metal support I meant: is it possible to pass Metal's texture pointer to MediaPipe like we're doing with GLES? The reason for this is that we do not want to pass the texture from GPU to CPU and back to GPU.

homuler (Owner) commented Oct 16, 2022

textureFrame.ReadTextureFromOnGPU(texture);

What is the return value? (If it returned false, it means it failed.)

By Metal support I meant: is it possible to pass Metal's texture pointer to MediaPipe like we're doing with GLES? The reason for this is that we do not want to pass the texture from GPU to CPU and back to GPU.

I am aware that this problem exists, and I'd like to implement the feature if I had unlimited time, but it's not really a high priority because I'm not sure it would actually improve the plugin's performance (if the sample app runs at 60 fps and the inference step takes less than 1/60 sec, it may not make much difference, if any).
If you can demonstrate that there's really a performance hit in that area (e.g. it performs worse than the official iOS sample app), the priority will be higher.

Paxios (Author) commented Oct 16, 2022

textureFrame.ReadTextureFromOnGPU(texture); returns true.

I haven't yet tried the app on newer iOS devices, so it's possible that Metal support for this won't be possible as you said 😄.

Paxios (Author) commented Oct 16, 2022

Do you maybe have/use some community channel, like a Discord group?

homuler (Owner) commented Oct 16, 2022

textureFrame.ReadTextureFromOnGPU(texture); returns true.

Hmm, I don't know. On my Android device, it certainly works when I apply the patch below.

diff --git a/Assets/MediaPipeUnity/Samples/Common/Scripts/Solution.cs b/Assets/MediaPipeUnity/Samples/Common/Scripts/Solution.cs
index 813c66a..a6f4322 100644
--- a/Assets/MediaPipeUnity/Samples/Common/Scripts/Solution.cs
+++ b/Assets/MediaPipeUnity/Samples/Common/Scripts/Solution.cs
@@ -76,7 +76,7 @@ namespace Mediapipe.Unity
 
       if (textureType == typeof(WebCamTexture))
       {
-        textureFrame.ReadTextureFromOnCPU((WebCamTexture)sourceTexture);
+        textureFrame.ReadTextureFromOnGPU((WebCamTexture)sourceTexture);
       }
       else if (textureType == typeof(Texture2D))
       {

The possible reasons I can come up with are (a quick way to check the first two is sketched after the list):

  • You've not set OpenGL ES as the graphics API.
  • Your device's WebCamTexture format is not ARGB32 and the channels of the converted image are not aligned as MediaPipe expects.
  • OpenGL ES contexts are not shared with MediaPipe.
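
A quick way to check the first two at runtime is to log the relevant values, e.g. with a sketch like this (standard Unity APIs; purely illustrative):

using UnityEngine;

static void LogGpuDiagnostics(WebCamTexture webCamTexture)
{
  Debug.Log($"Graphics API: {SystemInfo.graphicsDeviceType}");        // should be OpenGLES3 for context sharing
  Debug.Log($"WebCamTexture format: {webCamTexture.graphicsFormat}"); // channel layout must match what MediaPipe expects
  Debug.Log($"CopyTexture support: {SystemInfo.copyTextureSupport}"); // Graphics.CopyTexture capability flags
}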

so it's possible that Metal support for this won't be possible as you said

To be precise, supporting Metal itself is possible, but it's not a high priority for now.

Do you maybe have/use some community channel like discord group?

No, I don't.

Paxios (Author) commented Oct 16, 2022

Settings are shown below; should I maybe set one of the ES versions as required?
[screenshot of Player Settings omitted]

I will experiment with the WebCamTexture format and the GLES context.

Paxios (Author) commented Oct 16, 2022

By the way, the camera's TextureFormat is "R8G8B8A8_UNorm", so I guess this could be the problem, since it's RGBA32 instead of ARGB32?

Also, note that ReadTextureFromOnCPU doesn't work either, so I guess there's something wrong with my implementation 😄

homuler (Owner) commented Oct 16, 2022

Settings are shown below; should I maybe set one of the ES versions as required?

OpenGL ES 3.2 is required to share the context with MediaPipe (that's why even ReadTextureFromOnCPU doesn't work).
I strongly recommend you first modify and test the sample app before writing your own code.

Paxios (Author) commented Oct 18, 2022

Hey there once more,
I looked into the code more deeply than before, and I can't find any usage of ReadTextureFromOnGPU in the sample app. Is that because of the high latency you mentioned in your comment?
If I switch from ReadTextureFromOnCPU to ReadTextureFromOnGPU in the Solution class, no estimations are returned from the graph (just as in my app).

I would very much appreciate it if you could please verify that ReadTextureFromOnGPU works on your side.

Usage of ReadTextureFromOnCPU

https://github.com/homuler/MediaPipeUnityPlugin/blob/master/Assets/MediaPipeUnity/Samples/Common/Scripts/Solution.cs#L69-L89

I tested it on two GPUs, an ARM Mali-G71 MP20 and an Xclipse 920, both of which support OpenGL ES 3.2.

homuler (Owner) commented Oct 18, 2022

I would very much appreciate it if you could please verify that ReadTextureFromOnGPU works on your side.

I confirmed (see #768 (comment)).

homuler (Owner) commented Oct 18, 2022

I think you should display the target texture (after calling ReadTextureFromOnGPU) on the screen first.

Paxios (Author) commented Oct 18, 2022

I would very much appreciate it if you could please verify that ReadTextureFromOnGPU works on your side.

I confirmed (see #768 (comment)).

Mind if I ask which device you tried it on, or which GPU it uses?

homuler (Owner) commented Oct 18, 2022

Pixel 6.
At any rate, I think you should check if the pixel data is actually copied on GPU (cf. #768 (comment)).

Paxios (Author) commented Oct 18, 2022

Sorry for the late response.

textureFrame.ReadTextureFromOnGPU(texture);
texture2DToDisplay.SetPixels32(textureFrame.GetPixels32());
texture2DToDisplay.Apply();
RawImage.texture = texture2DToDisplay;

texture is WebCamTexture. This does actually display the correct image in the RawImage on the screen.

The following is the process of applying TextureFrame to the graph.

var gpuBuffer = textureFrame.BuildGpuBuffer(GpuManager.GlCalculatorHelper.GetGlContext());
_graph.AddPacketToInputStream("input_video", new GpuBufferPacket(gpuBuffer, new Timestamp(currentMicroSeconds))).AssertOk();

Output processing:

_outputLandmarksStream = new OutputStream<LandmarkListPacket, LandmarkList>(_graph, OutputNodeName);
_outputLandmarksStream.AddListener(OnPoseWorldLandmarksOutput);

private void OnPoseWorldLandmarksOutput(object stream, OutputEventArgs<LandmarkList> eventArgs) {
  if (eventArgs.value != null) {
    // ...
    // I use eventArgs.value in here
    // ...
  }
}

This code works if I change ReadTextureFromOnGPU to ReadTextureFromOnCPU.

homuler (Owner) commented Oct 18, 2022

Which log is output on your device? (maybe you need to build your apk with Development Build checked).

#if UNITY_ANDROID
if (_CurrentContext == IntPtr.Zero)
{
  Logger.LogWarning(_TAG, "EGL context is not found, so MediaPipe won't share their EGL contexts with Unity");
}
else
{
  Logger.LogVerbose(_TAG, $"EGL context is found: {_CurrentContext}");
}
#endif

This does actually display the correct image in the RawImage on the screen.

Hmm, interesting. If anything, I'd guess this code wouldn't work on my Pixel 6 (I need to do RawImage.texture = textureFrame._texture instead).
In general, Graphics.CopyTexture only works on the GPU. When I used Unity 2020.3.x, the data on the CPU was invalidated after calling Graphics.CopyTexture (cf. https://forum.unity.com/threads/graphics-copytexture-then-getpixels.482601/); I've not tested it yet with Unity 2021.3.x.
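
To check whether the pixel data was actually copied on the GPU without relying on the (possibly stale) CPU-side copy, one option is an asynchronous readback. A sketch, assuming AsyncGPUReadback is supported on the device (it is not on every GLES device):

using UnityEngine;
using UnityEngine.Rendering;

static void VerifyGpuCopy(Texture2D texture)
{
  if (!SystemInfo.supportsAsyncGPUReadback)
  {
    Debug.LogWarning("AsyncGPUReadback is not supported on this device");
    return;
  }
  // Read the GPU-side contents back and inspect a pixel in the callback.
  AsyncGPUReadback.Request(texture, 0, TextureFormat.RGBA32, request =>
  {
    if (request.hasError) { Debug.LogError("GPU readback failed"); return; }
    var pixels = request.GetData<Color32>();
    Debug.Log($"First pixel after GPU copy: {pixels[0]}");
  });
}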

Paxios (Author) commented Oct 19, 2022

Hey,
I just tried modifying your code as follows:

      if (textureType == typeof(WebCamTexture))
      {
        textureFrame.ReadTextureFromOnCPU((WebCamTexture)sourceTexture);
        if(textureFrame._texture != null)
          rawImage.texture = textureFrame._texture;
      }

This does show the camera preview on the screen. But if I use ReadTextureFromOnGPU it doesn't. So I guess there's some problem setting _texture in TextureFrame. ReadTextureFromOnGPU returns true, so I don't know what would cause this.

Which log is output on your device? (maybe you need to build your apk with Development Build checked).

Output is the following: Unity GpuManager: EGL context is found: 511835274624

Paxios (Author) commented Oct 19, 2022

Some additional information:
I'm using Unity 2022.2.0 (as per your advice in #760).
ReadTextureFromOnGPU uses Graphics.CopyTexture.

I checked additional data in ReadTextureFromOnGPU:
srcFormat (WebCamTexture): R8G8B8A8_UNorm
thisFormat (Texture2D): RGBA32
Width & Height match on both
SystemInfo.copyTextureSupport returns: Basic, Copy3D, DifferentTypes, TextureToRT, RTToTexture

I have also tested your sample app on Pixel 6 with ReadTextureFromOnGPU set and it's the same outcome.

Paxios (Author) commented Oct 19, 2022

I managed to find out the cause of this issue 😅

// Use RGBA32 as the input format.
// TODO: When using GpuBuffer, MediaPipe assumes that the input format is BGRA, so the following code must be fixed.
textureFramePool.ResizeTexture(imageSource.textureWidth, imageSource.textureHeight, TextureFormat.RGBA32);

I had to change the format of the pool from TextureFormat.RGBA32 to TextureFormat.ARGB32. I think there's a typo in the comment :)
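
In other words, the change is a single line in the snippet above (a sketch):

// Before: TextureFormat.RGBA32 (the channel layout didn't line up with what the GpuBuffer path expects here).
textureFramePool.ResizeTexture(imageSource.textureWidth, imageSource.textureHeight, TextureFormat.ARGB32);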

Also this works on GLES 3.1, so 3.2 is not mandatory

homuler (Owner) commented Oct 19, 2022

I had to change the format of the pool from TextureFormat.RGBA32 to TextureFormat.ARGB32. I think there's a typo in the comment :)

So it seems that the cause was:

Your device's WebCamTexture format is not ARGB32 and the channels of the converted image are not aligned as MediaPipe expects.

The following comment is not a typo (cf. https://github.com/google/mediapipe/blob/7a6ae97a0ef298014aaf5e1370cb6f8237f2ac21/mediapipe/gpu/gpu_buffer_format.cc#L64-L78).

// When using GpuBuffer, MediaPipe assumes that the input format is BGRA, so the following code must be fixed.

However, at least in Unity, this assumption does not always hold (the input format can be RGBA or ARGB, etc...).
Currently, this issue can be avoided by changing the texture format as you did (but it's not intuitive which format should be used).

Also this works on GLES 3.1, so 3.2 is not mandatory

Indeed, I was wrong about this, and it seems that OpenGL ES 3.2 is not required to create a context.
https://github.com/google/mediapipe/blob/7a6ae97a0ef298014aaf5e1370cb6f8237f2ac21/mediapipe/gpu/gl_context_egl.cc#L110-L171

Paxios (Author) commented Oct 20, 2022

So it seems that the cause was:

Yes, that was the cause :)

The following comment is not a typo

Ah okay 👍🏼, wasn't sure.

Is there any way to not block the CPU while TextureFramePool executes TextureFrame#WaitUntilReleased?


/// <summary>
/// Waits until the GPU has executed all commands up to the sync point.
/// This blocks the CPU, and ensures the commands are complete from the point of view of all threads and contexts.
/// </summary>
public void WaitUntilReleased()
{
  if (_glSyncToken == null)
  {
    return;
  }
  _glSyncToken.Wait();
  _glSyncToken.Dispose();
  _glSyncToken = null;
}

Or will this desynchronize the GPU & CPU and result in uncontrollable crashes?
I'd like to achieve relatively smooth performance on old devices (e.g. Samsung Galaxy J7); currently this causes 90-100 ms of lag on average.
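
For example, would offloading the wait to a worker thread be safe? A sketch of what I mean (untested; it assumes _glSyncToken.Wait() may be called off the main thread, which is exactly what I'm unsure about):

using System.Threading.Tasks;

// Hypothetical variant: block a thread-pool thread instead of the main thread.
public Task WaitUntilReleasedAsync()
{
  return Task.Run(() => WaitUntilReleased());
}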

homuler (Owner) commented Oct 26, 2022

Currently this causes 90-100 ms of lag on average.

Do you mean _glSyncToken.Wait() takes 90-100ms?
If so, how did you measure it?

Paxios (Author) commented Oct 26, 2022

Currently this causes 90-100 ms of lag on average.

Do you mean _glSyncToken.Wait() takes 90-100ms? If so, how did you measure it?

Yes, that's correct, measured with Unity deep profiling.

[profiler screenshot omitted]

It's far less on modern devices (10-25ms)

homuler (Owner) commented Oct 26, 2022

Does changing TextureFramePool._poolSize (e.g. to 20) make any difference?

Paxios (Author) commented Oct 26, 2022

No, not at all.
It just delays it a bit. I once set it to 10000 (and changed the GlobalInstanceTable size) and the delay didn't happen, but the game crashed; I guess not enough resources were available for 10k textures.

tealm commented Nov 3, 2022

I am also seeing very slow results on my device (an Android Galaxy 04, built with SDK 28 and minVersion set to OpenGL ES 3.1).
Do you have any idea why the latency is high when the image is copied on GPU, as stated in your comment?
// For some reason, when the image is copied on GPU, latency tends to be high.

A profiler screenshot from running the Hands tracking sample on Android shows that it does take time to read the image from the GPU.
[profiler screenshots omitted]

dayowoo commented Aug 16, 2023

@tealm
I have the same question. Did you get an answer?
Thank you for checking my message.
