face_landmark.py example: incorrect color conversion with H.264 streams

Question

Rajiv91 opened this issue 6 months ago

Hi,

I'm not very interested in the MediaPipe part or the detection; I just need to convert the frames from a room's video track into something I can process with OpenCV and NumPy. The closest thing I could find for what I need is the face_landmark.py example in this repo.

I have a repo where I push an H.264-encoded RTSP camera feed using the WebRTC libraries of the pion-lk Go SDK. To confirm that the stream is being published correctly, I join that room from the React lk test page: https://meet.livekit.io/?tab=custom. So far so good. The problem begins when I use the face_landmark.py Python example to join the same room and display the frames: the color space conversion does not seem to fit my case. I already tried lk 6.0, where there seemed to be an offset in the color, and I also tried 7.0 and 8.0, where the problem persists; now the video looks very reddish.

Looking at the example code, it always converts from I420 to ARGB and then does RGB -> BGR. Is it okay to always convert to ARGB regardless of the source format? I can't find much documentation on why these conversions are done or how I should do them.
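For reference, here is a minimal NumPy sketch of what an I420 -> BGR conversion involves. It assumes a tightly packed buffer (no row padding), even width and height, and BT.601 limited-range coefficients. None of it is taken from the LiveKit SDK, and `i420_to_bgr` is a hypothetical helper name.

```python
import numpy as np


def i420_to_bgr(data: bytes, width: int, height: int) -> np.ndarray:
    """Convert a tightly packed I420 buffer to an (H, W, 3) BGR uint8 array.

    Assumptions (not from the LiveKit SDK): no stride padding, even
    dimensions, BT.601 limited-range ("video range") coefficients.
    """
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even width and height")

    y_size = width * height
    c_size = (width // 2) * (height // 2)
    buf = np.frombuffer(data, dtype=np.uint8)
    if buf.size != y_size + 2 * c_size:
        raise ValueError("unexpected buffer size for a packed I420 frame")

    # I420 is planar: full-resolution Y plane, then quarter-size U, then V.
    y = buf[:y_size].reshape(height, width).astype(np.float32)
    u = buf[y_size:y_size + c_size].reshape(height // 2, width // 2)
    v = buf[y_size + c_size:].reshape(height // 2, width // 2)

    # Upsample chroma to full resolution (nearest neighbor) and center it.
    u = u.repeat(2, axis=0).repeat(2, axis=1).astype(np.float32) - 128.0
    v = v.repeat(2, axis=0).repeat(2, axis=1).astype(np.float32) - 128.0

    # BT.601 limited-range YUV -> RGB.
    c = 1.164 * (y - 16.0)
    r = c + 1.596 * v
    g = c - 0.392 * u - 0.813 * v
    b = c + 2.017 * u

    # OpenCV expects channels in B, G, R order.
    bgr = np.stack([b, g, r], axis=-1)
    return np.clip(np.rint(bgr), 0, 255).astype(np.uint8)
```

If the real buffer has per-row padding (a stride larger than the width), or the U and V planes are read in the wrong order, the planes no longer line up with this layout, which can show up as tinted or shifted channels.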

This is the piece of code that I believe is causing issues with my stream. I can't find any explanation of why the conversion from the stream to NumPy and the color conversion are done this way, given that the frame is reported to arrive as I420:

    async for frame in video_stream:
        buffer = frame.buffer

        if (
            argb_frame is None
            or argb_frame.width != buffer.width
            or argb_frame.height != buffer.height
        ):
            argb_frame = rtc.ArgbFrame.create(
                rtc.VideoFormatType.FORMAT_ABGR, buffer.width, buffer.height
            )
        buffer.to_argb(argb_frame)
        arr = np.frombuffer(argb_frame.data, dtype=np.uint8)
        arr = arr.reshape((argb_frame.height, argb_frame.width, 4))
        arr = cv2.cvtColor(arr, cv2.COLOR_RGBA2RGB)
        # mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=arr)
        # detection_result = landmarker.detect_for_video(mp_image, frame.timestamp_us)

        # draw_landmarks_on_image(arr, detection_result)

        arr = cv2.cvtColor(arr, cv2.COLOR_RGB2BGR)

If I change the format to FORMAT_RGBA in:

            argb_frame = rtc.ArgbFrame.create(
                rtc.VideoFormatType.FORMAT_RGBA, buffer.width, buffer.height
            )

The conversion looks better (not reddish); however, there seems to be an offset along the y axis in some of the channels.
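One way to take the guesswork out of the channel order is to state the in-memory byte order explicitly instead of chaining cvtColor calls. Below is a minimal sketch, assuming a tightly packed buffer with 4 bytes per pixel. `packed_to_bgr` and its `memory_order` parameter are hypothetical and not part of LiveKit or OpenCV. Some native libraries name packed formats by little-endian word order rather than byte order, so a format called ABGR may be laid out as R, G, B, A in memory. Whether that applies to rtc.VideoFormatType is an assumption worth checking against a known test image.

```python
import numpy as np


def packed_to_bgr(data, width: int, height: int, memory_order: str) -> np.ndarray:
    """Reorder a packed 4-channel buffer into an (H, W, 3) BGR array.

    memory_order is the byte order as the pixels actually sit in memory,
    for example "RGBA" or "ABGR" (hypothetical parameter for this sketch).
    """
    order = memory_order.upper()
    if sorted(order) != sorted("RGBA"):
        raise ValueError("memory_order must be a permutation of 'RGBA'")

    arr = np.frombuffer(data, dtype=np.uint8)
    if arr.size != width * height * 4:
        raise ValueError("buffer size does not match width * height * 4")
    arr = arr.reshape(height, width, 4)

    # Pick the byte positions of B, G and R; the alpha byte is dropped.
    idx = [order.index(channel) for channel in "BGR"]
    return np.ascontiguousarray(arr[..., idx])
```

Feeding it a frame with a known solid color (for example pure red) and trying each candidate `memory_order` shows quickly which layout the buffer really has.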

Regards

dguerizec · Answer 1 · Mon Jan 22 2024 19:21:21 GMT+0800 (China Standard Time)

+1
I'm experiencing the same effect with H.264 test videos.

David Zhao · Answer 2 · Thu Jan 25 2024 13:13:27 GMT+0800 (China Standard Time)

We are looking at this one and hope to resolve it sometime this week.

Théo Monnom · Answer 3 · Thu Feb 15 2024 04:56:03 GMT+0800 (China Standard Time)

Hey, this should now be fixed with our new VideoFrame API in rtc-v0.9.0. Please reopen the issue if this isn't the case.