AI Face, Body, and Hand Pose Detection with Python and MediaPipe

#ai #machinelearning #python #mediapipe

In this tutorial, we will learn how to use Python and MediaPipe to perform real-time face, body, and hand pose detection using a webcam feed. MediaPipe provides pre-trained machine learning models for various tasks like facial landmark detection, hand tracking, and full-body pose estimation.

Prerequisites

Before we begin, ensure you have the following installed:

Python 3 (recent MediaPipe releases no longer support 3.6; check the mediapipe page on PyPI for the supported versions)
pip package manager

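First, let's install the required dependencies. The leading ! runs the command from a Jupyter notebook cell; drop it when installing from a regular terminal: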

!pip install mediapipe opencv-python

Now, let's import the necessary libraries:

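import mediapipe as mp
import cv2
mp_drawing = mp.solutions.drawing_utils  # helpers for drawing detections and landmarks
mp_holistic = mp.solutions.holistic  # holistic model and its landmark connection sets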

1. Get Realtime Webcam Feed

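cap = cv2.VideoCapture(0)  # 0 selects the default camera

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:  # stop if no frame could be read
        break
    cv2.imshow('Raw Webcam Feed', frame)

    # Wait 10 ms for a key press; quit on 'q'
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()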

This code captures video from your webcam and displays it in real time. With the video window focused, press 'q' to quit.
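cap.read() returns a success flag alongside the frame, and that flag turns False when the camera is disconnected or a video source runs out. As a rough sketch, the read loop can be wrapped in a small generator that stops on the first failed read. Both iter_frames and FakeCapture are hypothetical names introduced here for illustration; FakeCapture only mimics the isOpened() and read() methods of cv2.VideoCapture so the example runs without a camera.

```python
def iter_frames(cap):
    """Yield frames from a capture-like object until a read fails.

    `cap` is anything with isOpened() and read() methods that behave like
    cv2.VideoCapture. iter_frames is a hypothetical helper, not part of OpenCV.
    """
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:  # camera unplugged or stream ended
            break
        yield frame


class FakeCapture:
    """Stand-in for cv2.VideoCapture so the sketch runs without a camera."""

    def __init__(self, frames):
        self._frames = list(frames)

    def isOpened(self):
        return True

    def read(self):
        # Mirror the (success_flag, frame) tuple that VideoCapture.read() returns.
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames = list(iter_frames(FakeCapture(["frame-0", "frame-1"])))
print(frames)
```

With a real webcam, the same generator could presumably be used as `for frame in iter_frames(cv2.VideoCapture(0)):`, keeping the display and key-press handling inside the loop body.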

2. Make Detections from Feed

Detect Faces

cap = cv2.VideoCapture(0)

with mp.solutions.face_detection.FaceDetection(min_detection_confidence=0.5) as face_detection:

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:  # stop if no frame could be read
            break

        # OpenCV delivers BGR frames; MediaPipe expects RGB
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = face_detection.process(image)

        # results.detections is falsy when no face is found
        if results.detections:
            for detection in results.detections:
                mp_drawing.draw_detection(frame, detection)

        cv2.imshow('Raw Webcam Feed', frame)

        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

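This code detects faces in the webcam feed and draws a bounding box and a few key points (such as the eyes and nose tip) on each one. The frame is converted from BGR to RGB before processing because OpenCV reads frames in BGR order while MediaPipe expects RGB.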
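To do something with a detected face beyond drawing it (cropping it, for example), the bounding box has to be converted to pixels. To the best of my knowledge, each detection exposes its box as detection.location_data.relative_bounding_box with xmin, ymin, width, and height given as fractions of the frame size. The sketch below assumes that layout and takes the four numbers as plain floats; bbox_to_pixels is a hypothetical helper name, not a MediaPipe function.

```python
def bbox_to_pixels(xmin, ymin, width, height, frame_width, frame_height):
    """Convert a normalized bounding box to clamped pixel corners.

    Inputs are fractions of the frame size. Returns (left, top, right, bottom)
    in pixel coordinates. Hypothetical helper, not part of MediaPipe.
    """
    left = int(xmin * frame_width)
    top = int(ymin * frame_height)
    right = int((xmin + width) * frame_width)
    bottom = int((ymin + height) * frame_height)

    # A box can extend slightly past the frame edges, so clamp to valid indices.
    left = max(0, min(left, frame_width - 1))
    right = max(0, min(right, frame_width - 1))
    top = max(0, min(top, frame_height - 1))
    bottom = max(0, min(bottom, frame_height - 1))
    return left, top, right, bottom


# Example values chosen for illustration: a box in the lower middle of a 640x480 frame.
box = bbox_to_pixels(0.25, 0.5, 0.5, 0.25, 640, 480)
print(box)  # (160, 240, 480, 360)
```

Inside the detection loop, the frame size would come from frame.shape, and the resulting corners could be used to slice the face region out of the frame.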

Detect Hand Poses and Body Poses

cap = cv2.VideoCapture(0)

with mp.solutions.holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:  # stop if no frame could be read
            break

        # OpenCV delivers BGR frames; MediaPipe expects RGB
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        results = holistic.process(image)

        # Each *_landmarks attribute is None when that part is not detected.
        # Newer MediaPipe releases no longer provide FACE_CONNECTIONS;
        # FACEMESH_CONTOURS is used here instead (on older versions, use FACE_CONNECTIONS).
        if results.face_landmarks:
            mp_drawing.draw_landmarks(frame, results.face_landmarks, mp_holistic.FACEMESH_CONTOURS)

        if results.right_hand_landmarks:
            mp_drawing.draw_landmarks(frame, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)

        if results.left_hand_landmarks:
            mp_drawing.draw_landmarks(frame, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)

        if results.pose_landmarks:
            mp_drawing.draw_landmarks(frame, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)

        cv2.imshow('Raw Webcam Feed', frame)

        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

cap.release()
cv2.destroyAllWindows()

This code runs the holistic model on each frame and draws the face, hand, and body pose landmarks it finds, along with the connections between them.
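Drawing is only the first use of these landmarks. Each landmark carries x and y values normalized to the frame size, so simple geometry on them can already describe a pose. As a minimal sketch, the code below computes the angle at a joint from three points, such as shoulder, elbow, and wrist. The helper names to_pixels and joint_angle and the sample coordinates are made up for illustration; in the loop above, the points would presumably come from entries of results.pose_landmarks.landmark.

```python
import math


def to_pixels(point, frame_width, frame_height):
    """Scale a normalized (x, y) pair to pixel coordinates.

    Scaling first matters because x and y are normalized by different
    frame dimensions, which would otherwise distort the angle.
    """
    x, y = point
    return (x * frame_width, y * frame_height)


def joint_angle(a, b, c):
    """Return the angle at point b, in degrees (0 to 180), between b->a and b->c."""
    angle = math.degrees(
        math.atan2(c[1] - b[1], c[0] - b[0]) - math.atan2(a[1] - b[1], a[0] - b[0])
    )
    angle = abs(angle)
    # Fold reflex angles back into the 0..180 range.
    return 360 - angle if angle > 180 else angle


# Made-up normalized coordinates for a bent arm in a 640x480 frame.
shoulder = to_pixels((0.5, 0.25), 640, 480)
elbow = to_pixels((0.5, 0.5), 640, 480)
wrist = to_pixels((0.75, 0.5), 640, 480)

elbow_angle = joint_angle(shoulder, elbow, wrist)
print(round(elbow_angle, 1))
```

Tracking how such an angle changes from frame to frame is one simple route toward counting exercise repetitions or recognizing basic gestures.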

Conclusion

Congratulations! You've learned how to perform real-time face, body, and hand pose detection using Python and MediaPipe. You can further explore these concepts and integrate them into your own projects for various applications like gesture recognition, augmented reality, and more. Feel free to experiment and enhance this tutorial to suit your specific needs. Happy coding!
