Building on the success of the Segment Anything Model (SAM), Meta has released an upgraded version called the Segment Anything Model 2 (SAM 2).
SAM 2 is a computer vision model designed to quickly identify and separate objects in images and videos. It operates in real time and can be "prompted" to focus on specific objects, making it highly effective at recognizing and isolating objects from their backgrounds.
In this article, we’ll explore how to deploy the SAM 2 model to a REST API using Modelbit.
Prerequisites
To get the most out of this article, follow these steps:
1. Access the SAM 2 Model
Start by downloading the SAM 2 model from the official Meta AI repository. Open your command line interface and run the following commands:
git clone https://github.com/facebookresearch/segment-anything-2.git
cd segment-anything-2
pip install -e .
Next, download the model checkpoints by navigating to the checkpoints directory and running the script:
cd checkpoints
./download_ckpts.sh
For more detailed installation instructions, refer to the SAM 2 GitHub repository.
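As an optional sanity check, you can confirm the package is importable from your Python environment; if the imports below fail, revisit the installation steps above:

# These imports should succeed if SAM 2 was installed correctly
import sam2
from sam2.build_sam import build_sam2
print("SAM 2 installed successfully")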
2. Set Up Modelbit
To deploy the SAM 2 model, you'll need a Modelbit account. Head over to the Modelbit website and sign up.
Once registered, install the Modelbit Python library by running:
pip install --upgrade modelbit
This will allow you to interact with Modelbit and deploy your SAM 2 model as a REST API endpoint.
Overview of SAM 2
The SAM 2 model, an advanced iteration of Meta AI's Segment Anything Model, significantly enhances image and video segmentation. It is engineered to deliver rapid and precise segmentation and is six times faster than its predecessor at image segmentation.
SAM 2's core features include its ability to handle real-time video segmentation and its superior accuracy across complex and diverse scenarios.
Built on an extensive dataset of over 50,000 videos and millions of segmentation masks, SAM 2 can segment objects in both images and videos with exceptional detail. These capabilities make it ideal for applications in augmented reality, autonomous driving, environmental monitoring, and more.
Key Features and Enhancements of SAM 2
Memory Mechanism: Incorporates a memory encoder, memory bank, and memory attention module to store and use object information, enhancing user interaction throughout the video.
Streaming Architecture: Processes video frames sequentially, enabling real-time segmentation of long videos.
Enhanced Image Segmentation: Offers superior performance in image segmentation compared to the original SAM, with exceptional capabilities in video tasks.
Multiple Mask Prediction: Provides several potential segmentation masks when faced with uncertain image or video data.
Occlusion Prediction: Enhances the model’s ability to handle objects that are temporarily obscured or leave the frame.
Video Segmentation: Tracks objects across all video frames, effectively managing occlusion (see the sketch after this list).
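To make the streaming architecture and memory mechanism concrete, here is a minimal sketch of SAM 2's video prediction workflow, adapted from the usage examples in the SAM 2 repository. It assumes a CUDA GPU, and the checkpoint path, video path, and prompt coordinates are placeholders; exact function names and signatures may differ slightly between repository versions.

import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

# Build the video predictor (config and checkpoint paths are placeholders)
predictor = build_sam2_video_predictor("sam2_hiera_b+.yaml", "<path to checkpoint>", device="cuda")

with torch.inference_mode():
    # init_state loads the video frames and sets up the memory bank
    state = predictor.init_state("<path to video or directory of frames>")

    # Prompt the model with a single foreground click on frame 0 for object id 1
    points = np.array([[300, 250]], dtype=np.float32)  # (x, y) pixel coordinates
    labels = np.array([1], dtype=np.int32)             # 1 = foreground click
    predictor.add_new_points(state, frame_idx=0, obj_id=1, points=points, labels=labels)

    # Propagate the prompt through the rest of the video, frame by frame,
    # using the memory bank to keep tracking the object (even through occlusion)
    for frame_idx, object_ids, mask_logits in predictor.propagate_in_video(state):
        pass  # e.g. threshold mask_logits > 0 and save or visualize each frame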
SAM 2 in Action
You can easily test and use SAM 2 through the Web UI provided by Meta. To get started, visit SAM 2 Web UI.
Working With SAM 2
Now that you have an understanding of SAM 2's capabilities, it's time to put it into action programmatically. Getting started with SAM 2 is straightforward. In this tutorial, we'll use SAM 2 to generate segmentation masks for an image.
In the context of image segmentation, a mask is typically a binary or multi-class image that matches the size of the input image. Each pixel in the mask is labeled or assigned a value indicating whether it belongs to a specific object or region of interest.
When you feed an image into the SAM 2 model, the mask generator will output an image where different objects—such as cars, people, or animals—are highlighted with distinct colours or binary values.
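For intuition, here is a tiny, made-up example (not SAM 2 output) of what a binary mask looks like for a 4x4 image:

import numpy as np

# 4x4 binary mask: True marks pixels belonging to the object, False is background.
# Real SAM 2 masks are boolean arrays with the same height and width as the input image.
mask = np.array([
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
], dtype=bool)

print(mask.sum())  # the object's area in pixels -> 4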
This capability is important for various real-world applications, including:
Autonomous driving: Helping vehicles recognize and differentiate between roads, pedestrians, other vehicles, and more.
Medical imaging: Allowing for the segmentation of different tissues, organs, or abnormalities within an image.
Image editing: Facilitating the isolation of specific objects from their background for easier manipulation.
Let's dive into how to get SAM 2 up and running.
Make sure you’ve downloaded SAM 2 in your development environment. If not, refer back to the prerequisites.
Next, check if a CUDA-compatible GPU is available on the system and optimize the execution of the PyTorch model accordingly by running the following code:
import numpy as np
import torch
import matplotlib.pyplot as plt
import cv2  # used later for drawing mask borders in the visualization function
from PIL import Image

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Use autocast only if CUDA is available
if torch.cuda.is_available():
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        # Your GPU-specific code here
        if torch.cuda.get_device_properties(0).major >= 8:
            # Enable TF32 on Ampere (and newer) GPUs for faster matrix multiplications
            torch.backends.cuda.matmul.allow_tf32 = True
            torch.backends.cudnn.allow_tf32 = True
else:
    # Your CPU-specific code here
    print("CUDA is not available. Running on CPU.")
Next, create a function called display_image_with_annotations to visually represent segmentation masks on an image. This function overlays the masks with random colours and can optionally draw borders around the segmented regions, enhancing visibility and differentiation between the various segments in the image. Below is the code:
def display_image_with_annotations(image, annotations, show_borders=True):
    """
    Display an image with annotations overlaid.

    Parameters:
        image (numpy array): The image to display.
        annotations (list): Segmentation masks produced by the mask generator.
        show_borders (bool): If True, borders around each annotation will be drawn.

    Returns:
        None
    """
    def display_annotations(annotations, show_borders=True):
        """
        Helper function to display annotations on an image.

        Parameters:
            annotations (list): Segmentation masks.
            show_borders (bool): If True, borders around each annotation will be drawn.

        Returns:
            None
        """
        # Return immediately if there are no annotations to display
        if len(annotations) == 0:
            return

        # Sort annotations by area in descending order
        sorted_annotations = sorted(annotations, key=lambda x: x['area'], reverse=True)

        # Get the current axis for plotting
        axis = plt.gca()
        axis.set_autoscale_on(False)

        # Create an empty image with an alpha channel (RGBA) to hold the annotations
        overlay_img = np.ones((sorted_annotations[0]['segmentation'].shape[0],
                               sorted_annotations[0]['segmentation'].shape[1], 4))
        overlay_img[:, :, 3] = 0  # Set alpha channel to 0 (transparent)

        # Iterate through each annotation and overlay it on the image
        for annotation in sorted_annotations:
            mask = annotation['segmentation']  # Get the segmentation mask

            # Generate a random color for the mask with 50% opacity
            mask_color = np.concatenate([np.random.random(3), [0.5]])
            overlay_img[mask] = mask_color  # Apply the mask color to the overlay image

            # If borders are enabled, draw borders around each mask
            if show_borders:
                # Find contours of the mask
                contours, _ = cv2.findContours(mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
                # Smooth the contours slightly
                contours = [cv2.approxPolyDP(contour, epsilon=0.01, closed=True) for contour in contours]
                # Draw the contours with a specified color and thickness
                cv2.drawContours(overlay_img, contours, -1, (0, 0, 1, 0.4), thickness=1)

        # Display the annotated image
        axis.imshow(overlay_img)

    # Set up the plot with a large figure size to ensure detailed visualization.
    plt.figure(figsize=(20, 20))

    # Display the image that you want to annotate.
    plt.imshow(image)

    # Call the helper function to display annotations on the image.
    display_annotations(annotations, show_borders)

    # Remove the axis labels and ticks for a cleaner display.
    plt.axis('off')

    # Render and display the final image with the annotations.
    plt.show()
To test this model, we need an image. For this tutorial, let's use a free image from Unsplash. You can download the image using the following link: Download the image from Unsplash
Make sure to download the image to your local environment for the demonstration.
Load the image in your notebook:
# Replace the placeholder with the path to the downloaded image
image = Image.open("<path to your image>")
image = np.array(image.convert("RGB"))
Next, initialize the SAM 2 model for the image segmentation task by running this code:
from sam2.build_sam import build_sam2
from sam2.automatic_mask_generator import SAM2AutomaticMaskGenerator

# Specify the path to the model checkpoint.
# This checkpoint contains the pre-trained weights for the SAM 2 model.
checkpoint_path = "<path to the model checkpoint>"  # e.g. the checkpoint downloaded earlier

# Specify the configuration file for the model.
# This YAML file contains the architecture and hyperparameters used to define the SAM 2 model.
model_config = "sam2_hiera_b+.yaml"

# Build the SAM 2 model using the configuration file and checkpoint.
# The model is loaded onto the device detected earlier (GPU if available, otherwise CPU).
# Post-processing is disabled (apply_postprocessing=False) to keep raw outputs.
sam2_model = build_sam2(model_config, checkpoint_path, device=device, apply_postprocessing=False)

# Initialize the automatic mask generator using the SAM 2 model.
# This will generate segmentation masks automatically based on the input data.
mask_generator = SAM2AutomaticMaskGenerator(sam2_model)

# Generate segmentation masks for the loaded image.
masks = mask_generator.generate(image)
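Each entry returned by the mask generator is a dictionary; the fields this tutorial relies on are 'segmentation' (a boolean mask) and 'area' (its size in pixels), and the generator typically attaches extra metadata as well. A quick way to inspect the output, assuming masks was produced as above:

# Inspect the structure of the generated masks
print(f"Number of masks: {len(masks)}")
first_mask = masks[0]
print(first_mask.keys())                  # available metadata fields
print(first_mask["segmentation"].shape)   # same height/width as the input image
print(first_mask["area"])                 # number of pixels covered by this mask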
Finally, call the display_image_with_annotations function to show the segmentation masks on the image:
display_image_with_annotations(image, masks)
Here is the result:
You can see that the model accurately segments each region of the image, highlighting different sections with precision. You can repeat this for different images and see how powerful SAM 2 is.
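For example, to try it on a few more images, you can reuse the same mask generator and display function in a loop (the file names below are placeholders):

# Segment and visualize several images with the same mask generator
for path in ["photo1.jpg", "photo2.jpg"]:
    img = np.array(Image.open(path).convert("RGB"))
    display_image_with_annotations(img, mask_generator.generate(img))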
Deploying SAM 2 Model With Modelbit
The true value of an AI model is only realized when it is made available to end users, typically through deployment in a production environment. One effective method for achieving this is by deploying the model as a REST API. Modelbit offers a straightforward approach for rapidly deploying your AI models. You can learn more about this solution at Modelbit.
To begin deploying the SAM 2 model, import Modelbit and activate it with the following code:
import modelbit
mb = modelbit.login()
Let's deploy using Modelbit's Python method. Remember, we already have a function, display_image_with_annotations, that masks an image with SAM 2. Modelbit will manage all dependencies for you, including any other Python functions and variables that the function depends on. Here's how to do it:
mb.deploy(display_image_with_annotations)
Result:
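One note: display_image_with_annotations renders a matplotlib figure rather than returning data, so for a production REST API you may prefer to deploy a thin wrapper that returns the masks in a JSON-friendly form instead. Below is a minimal sketch; the segment_image function and its return format are our own illustration, not part of the deployment above:

def segment_image(image_pixels):
    """Run SAM 2 on an image passed as a nested list of RGB values and
    return lightweight, JSON-serializable mask metadata."""
    image_array = np.array(image_pixels, dtype=np.uint8)
    masks = mask_generator.generate(image_array)
    # Return only small, serializable fields; full boolean masks can be very large.
    return [{"area": int(m["area"])} for m in masks]

# Modelbit captures mask_generator and np from the environment, as with the earlier deployment.
mb.deploy(segment_image)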
Accessing the Model
The model has been successfully deployed as a REST API endpoint using Modelbit. You can call it with curl, Python, or any other HTTP client, and integrate the endpoint into your applications to run inferences.
Here is an example using Python:
modelbit.get_inference(
    workspace="victorkingoshimua",
    deployment="display_image_with_annotations",
    data=[image, masks]
)
Final Thoughts
Deploying a model as a REST API endpoint using Modelbit simplifies the process of integrating advanced functionality into your applications. With easy access through tools like curl or Python, you can incorporate the model into your workflows, enabling efficient and scalable inferences.
In this article, you’ve learned how to effortlessly deploy one of the latest and most advanced AI models as a REST API. Whether you're working with image recognition, natural language processing, or any other AI domain, the ease of integration provided by Modelbit can help you bring sophisticated AI features to your projects with minimal effort.