In this tutorial, we will learn how to use Python and MediaPipe to perform real-time face, body, and hand pose detection using a webcam feed. MediaPipe provides pre-trained machine learning models for various tasks like facial landmark detection, hand tracking, and full-body pose estimation.
Prerequisites
Before we begin, ensure you have the following installed:
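Python (3.6 or above when this was written; recent MediaPipe releases need a newer Python, so check the supported versions on PyPI)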
pip package manager
First, let's install the required dependencies (the leading ! is notebook syntax; drop it when running the command in a terminal):
!pip install mediapipe opencv-python
Now, let's import the necessary libraries:
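import cv2
import mediapipe as mp
mp_drawing = mp.solutions.drawing_utils  # helpers for drawing detections and landmarks
mp_holistic = mp.solutions.holistic  # holistic model and its connection constants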
1. Get Realtime Webcam Feed
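cap = cv2.VideoCapture(0)  # 0 selects the default camera
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:  # stop if no frame could be read
        break
    cv2.imshow('Raw Webcam Feed', frame)
    # Exit when 'q' is pressed
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()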
This code captures video from your webcam and displays it in real-time. Press 'q' to quit the window.
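A read can fail if the camera is busy or unplugged, in which case cap.read() returns False together with an empty frame. One way to keep that check out of the display loop is a small generator. The sketch below is only an illustration: iter_frames and max_frames are hypothetical names, not part of OpenCV, and the helper accepts any object that offers isOpened() and read().

```python
def iter_frames(cap, max_frames=None):
    """Yield frames from a capture object until a read fails.

    `cap` is anything with isOpened() and read(), e.g. cv2.VideoCapture.
    `iter_frames` and `max_frames` are hypothetical names for this sketch.
    """
    count = 0
    while cap.isOpened():
        if max_frames is not None and count >= max_frames:
            break
        ret, frame = cap.read()
        if not ret:  # camera unplugged, busy, or stream ended
            break
        yield frame
        count += 1


# Usage with a real webcam (requires opencv-python):
#   import cv2
#   cap = cv2.VideoCapture(0)
#   for frame in iter_frames(cap):
#       cv2.imshow('Raw Webcam Feed', frame)
#       if cv2.waitKey(10) & 0xFF == ord('q'):
#           break
#   cap.release()
#   cv2.destroyAllWindows()
```

Because the helper only relies on two methods, it can be exercised with a stand-in object when no camera is attached.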
2. Make Detections from Feed
Detect Faces
cap = cv2.VideoCapture(0)
with mp.solutions.face_detection.FaceDetection(min_detection_confidence=0.5) as face_detection:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:  # stop if no frame could be read
            break
        # MediaPipe expects RGB input; OpenCV delivers BGR
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = face_detection.process(image)
        # results.detections is None when no face is found
        if results.detections:
            for detection in results.detections:
                mp_drawing.draw_detection(frame, detection)
        cv2.imshow('Raw Webcam Feed', frame)
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()
This code detects faces in the webcam feed and draws a bounding box and key points for each one.
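Each entry in results.detections stores its bounding box relative to the frame size, under detection.location_data.relative_bounding_box with the fields xmin, ymin, width, and height. To crop or annotate a face yourself, those values have to be scaled to pixels. The following is a sketch of that conversion; relative_box_to_pixels is a hypothetical helper written for this example, not a MediaPipe function.

```python
def relative_box_to_pixels(xmin, ymin, width, height, frame_w, frame_h):
    """Convert a relative box (values roughly in 0..1) to pixel corners.

    Returns (x1, y1, x2, y2), clamped to the frame because detections
    near the edge can extend slightly outside it.
    """
    def clamp(value, upper):
        return max(0, min(value, upper))

    x1 = int(round(xmin * frame_w))
    y1 = int(round(ymin * frame_h))
    x2 = int(round((xmin + width) * frame_w))
    y2 = int(round((ymin + height) * frame_h))
    return clamp(x1, frame_w), clamp(y1, frame_h), clamp(x2, frame_w), clamp(y2, frame_h)


# Usage inside the detection loop (assumes `frame` and `detection` exist):
#   box = detection.location_data.relative_bounding_box
#   h, w = frame.shape[:2]
#   x1, y1, x2, y2 = relative_box_to_pixels(box.xmin, box.ymin, box.width, box.height, w, h)
#   face_crop = frame[y1:y2, x1:x2]
```

Clamping is a design choice for this sketch: it keeps the slice indices valid when a face is partly outside the frame.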
Detect Face, Hand, and Body Landmarks
cap = cv2.VideoCapture(0)
with mp.solutions.holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:  # stop if no frame could be read
            break
        # MediaPipe expects RGB input; OpenCV delivers BGR
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = holistic.process(image)
        # Each landmark set is None when that part is not visible.
        # FACE_CONNECTIONS was removed in newer MediaPipe releases; FACEMESH_CONTOURS replaces it.
        if results.face_landmarks:
            mp_drawing.draw_landmarks(frame, results.face_landmarks, mp_holistic.FACEMESH_CONTOURS)
        if results.right_hand_landmarks:
            mp_drawing.draw_landmarks(frame, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
        if results.left_hand_landmarks:
            mp_drawing.draw_landmarks(frame, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
        if results.pose_landmarks:
            mp_drawing.draw_landmarks(frame, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
        cv2.imshow('Raw Webcam Feed', frame)
        if cv2.waitKey(10) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()
This code detects face, hand, and body landmarks in the webcam feed and draws them, with their connections, on each frame.
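The results object returned by holistic.process exposes face_landmarks, pose_landmarks, left_hand_landmarks, and right_hand_landmarks, and each one is None when that part was not found in the frame. As a rough illustration, a small helper could report which parts were found, which is handy for a status line or for debugging. detected_parts and PART_ATTRIBUTES are hypothetical names for this sketch, not part of MediaPipe, and a SimpleNamespace stands in for a real results object so the example runs without a camera.

```python
from types import SimpleNamespace

# Attribute names exposed by the holistic results object.
PART_ATTRIBUTES = (
    "face_landmarks",
    "pose_landmarks",
    "left_hand_landmarks",
    "right_hand_landmarks",
)


def detected_parts(results):
    """Return the names of the landmark sets that are present (not None)."""
    return [name for name in PART_ATTRIBUTES if getattr(results, name, None) is not None]


# Stand-in for a real results object: face and right hand found, the rest missing.
fake_results = SimpleNamespace(
    face_landmarks=object(),
    pose_landmarks=None,
    left_hand_landmarks=None,
    right_hand_landmarks=object(),
)
print(detected_parts(fake_results))  # ['face_landmarks', 'right_hand_landmarks']
```

In the webcam loop, the same call would be detected_parts(results) right after holistic.process(image).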
Conclusion
Congratulations! You've learned how to perform real-time face, body, and hand pose detection using Python and MediaPipe. You can further explore these concepts and integrate them into your own projects for various applications like gesture recognition, augmented reality, and more. Feel free to experiment and enhance this tutorial to suit your specific needs. Happy coding!