google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.

Home Page: https://ai.google.dev/edge/mediapipe

Jitter occurring when running MediaPipe on 2 video camera streams

mamoonik opened this issue · comments

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

OS Name: Microsoft Windows 11 Pro | OS Version: 10.0.22631 N/A Build 22631

Mobile device if the issue happens on mobile device

No response

Browser and version if the issue happens on browser

No response

Programming Language and version

Python 3.10.0

MediaPipe version

0.10.14

Bazel version

bazel 7.2.1

Solution

Pose

Android Studio, NDK, SDK versions (if issue is related to building in Android environment)

Visual Studio

Xcode & Tulsi version (if issue is related to building for iOS)

No response

Describe the actual behavior

Intense jitter when running pose estimation on 2 video streams. It works very well when running on just one stream!

Describe the expected behaviour

Smooth pose landmarks on both video streams.

Standalone code/steps you may have used to try to get what you need

I was using MediaPipe for pose estimation on two camera video streams that were processed in the same while loop. It worked fine and the pose output was quite good for both streams. I had had mediapipe 0.8.14 installed on my computer for the past two years. I accidentally updated the MediaPipe solution, and since then my pose-estimation output on both streams is full of jitter. I am not getting good output on either stream. When I test the pose output on one video stream, it works very well!

However, when I run it on 2 video streams, it is a disaster.

I am using exactly the same cameras for both feeds. Previously I used mediapipe==0.8.14 to process up to 6 streams simultaneously within one while loop and they worked amazingly well. So it is a weird and distressing change that 0.10.14 does not work with even 2 streams.

Can anyone help with this?

How can I solve this issue using mediapipe==0.10.14?

If not possible, how can I restore mediapipe==0.8.14?

Other info / Complete Logs

WORKS VERY WELL ON 1 VideoStream:
import cv2
import mediapipe as mp

# Initialize MediaPipe Pose
mp_pose = mp.solutions.pose
pose = mp_pose.Pose(static_image_mode=False, min_detection_confidence=0.8, min_tracking_confidence=0.8, model_complexity=2, smooth_landmarks=True)

# Initialize MediaPipe drawing utilities
mp_drawing = mp.solutions.drawing_utils

# Open video capture object for the camera
cap = cv2.VideoCapture(0)  # Camera index (0 for the default camera)

# Set the desired FPS (e.g., 30 FPS)
desired_fps = 90
cap.set(cv2.CAP_PROP_FPS, desired_fps)

# # Set the desired resolution (e.g., 640x480)
# desired_width = 920
# desired_height = 720
# cap.set(cv2.CAP_PROP_FRAME_WIDTH, desired_width)
# cap.set(cv2.CAP_PROP_FRAME_HEIGHT, desired_height)

while True:
    # Capture frame from the camera
    ret, frame = cap.read()

    if not ret:
        break

    # Convert frame to RGB (MediaPipe expects RGB images)
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Process the frame with MediaPipe Pose
    results = pose.process(rgb_frame)

    # Draw pose landmarks on the frame
    if results.pose_landmarks:
        mp_drawing.draw_landmarks(frame, results.pose_landmarks, mp_pose.POSE_CONNECTIONS)

    # Display the resulting frame
    cv2.imshow('Pose Estimation', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
pose.close()






DISASTER ON THIS CODE (2 Video Streams):
import cv2
import numpy as np
import mediapipe as mp

# Initialize MediaPipe Pose
mp_pose = mp.solutions.pose
pose = mp_pose.Pose()

# Initialize MediaPipe drawing utilities
mp_drawing = mp.solutions.drawing_utils

# Open two video capture objects for the two cameras
cap1 = cv2.VideoCapture(0)  # First camera
cap2 = cv2.VideoCapture(1)  # Second camera

while True:
    # Capture frames from both cameras
    ret1, frame1 = cap1.read()
    ret2, frame2 = cap2.read()

    if not ret1 or not ret2:
        break

    # Process the first camera frame
    rgb_frame1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2RGB)
    results1 = pose.process(rgb_frame1)
    if results1.pose_landmarks:
        mp_drawing.draw_landmarks(frame1, results1.pose_landmarks, mp_pose.POSE_CONNECTIONS)

    # Process the second camera frame
    rgb_frame2 = cv2.cvtColor(frame2, cv2.COLOR_BGR2RGB)
    results2 = pose.process(rgb_frame2)
    if results2.pose_landmarks:
        mp_drawing.draw_landmarks(frame2, results2.pose_landmarks, mp_pose.POSE_CONNECTIONS)

    # Concatenate frames horizontally for display
    combined_frame = np.hstack((frame1, frame2))

    # Display the resulting frame
    cv2.imshow('Pose Estimation (Camera 1 & 2)', combined_frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap1.release()
cap2.release()
cv2.destroyAllWindows()
pose.close()

Also, you could previously feed video files into MediaPipe using this syntax:

camera_video_0 = cv2.VideoCapture('output_30FPS_XVID.avi')

camera_video_1 = cv2.VideoCapture('output.avi')

You can NOT do that with mediapipe==0.10.14.

I need some help with this! It is holding up a project, so I would truly appreciate any help anyone can provide. I tried version 0.10.7 and that is a little better, but nowhere near the 0.8 series.

Hi @mamoonik,

After reviewing the provided standalone code, it appears that you are currently using the outdated pose solution, which is no longer maintained, and we have ceased support for it. This functionality has been integrated into the new Pose Landmarker Task API, detailed here.

We encourage you to explore the features of our updated Pose Landmarker Task API and suggest replacing the legacy Pose with the new Pose Landmarker. The new solution offers improved performance and additional functionality compared to the legacy pose solution. You can find the guide for the new Pose Landmarker here, along with specific instructions for implementation in the Python platform provided here. Additionally, a corresponding example Colab notebook is available for reference here.

Please report any observed behavior, ensuring to check if similar issues persist in the upgraded task API. Unfortunately, beyond this, there is limited action we can take to address the specific issue you are facing.

Thank you!!

Hi

Thank you so much!!!

I am using left_ears_1 = landmarks_0[mp.solutions.pose.PoseLandmark.LEFT_EAR.value] to access each landmark.

However, this gives me the pose landmarks in a normalized coordinate system.

How can I access the coordinates in the image pixel coordinate system?

Please help with this as soon as possible! I would be very grateful.
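MediaPipe's documented convention is that landmark x and y are normalized to [0.0, 1.0] by image width and height, so they can be mapped back to pixels by scaling with the frame size. A minimal sketch, using a stand-in landmark object so it runs without a camera:

```python
from types import SimpleNamespace

def to_pixel_coords(landmark, image_width, image_height):
    """Convert a normalized MediaPipe landmark to integer pixel coordinates."""
    return (int(landmark.x * image_width), int(landmark.y * image_height))

# Stand-in for a real landmark such as
# results.pose_landmarks.landmark[mp.solutions.pose.PoseLandmark.LEFT_EAR]
left_ear = SimpleNamespace(x=0.25, y=0.5)

px, py = to_pixel_coords(left_ear, image_width=640, image_height=480)
print(px, py)  # -> 160 240
```

With a real frame, image_width and image_height come from frame.shape[1] and frame.shape[0]. Note that in the legacy solution the enum is PoseLandmark, not Landmark.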

Hi @mamoonik,

Support for the legacy solution has ended. Please upgrade to our new Pose Task API using the links provided in the comment above. If you still experience the same behavior, let us know. Unfortunately, we can no longer assist with the legacy pose solution. Hope you can understand this.

Thank you!!

This issue has been marked stale because it has had no recent activity for 7 days. It will be closed if no further activity occurs. Thank you.

This issue was closed due to lack of activity after being marked stale for past 7 days.
