Scrapfly

Posted on Mar 12 • Originally published at scrapfly.io on Mar 12

Guide To Google Image Search API and Alternatives

A Google Image Search API would let developers integrate Google Image Search into their applications, giving programmatic access to the vast collection of images Google indexes and letting users search by criteria such as keywords and image type.

Whether you're building an image search feature, creating a visual recognition tool, or developing content analysis software, this guide will help you understand your options for programmatically accessing image search functionality.

Is There an Official Google Image Search API?

Google previously provided a dedicated Image Search API as part of its AJAX Search API suite, but this service was deprecated in 2011. Since then, developers looking for official Google-supported methods to access image search results have had limited options.

However, Google does offer a partial solution through its Custom Search JSON API, which can be configured to include image search results. This requires setting up a Custom Search Engine (CSE) and limiting it to image search, but it comes with significant limitations:

Quota restrictions: The free tier is limited to 100 queries per day
Commercial use fees: Usage beyond the free tier requires payment
Limited results: Each request returns at most 10 images
Restricted customization: Fewer filtering options compared to the original Image Search API
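As a rough sketch of what an image query against the Custom Search JSON API could look like, the snippet below builds the request URL and pages through results 10 at a time. The helper names (`build_cse_image_url`, `search_images`) are illustrative, and the API key and search engine ID are placeholders you would replace with your own values.

```python
import json
import urllib.parse
import urllib.request

CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_cse_image_url(query, api_key, cx, page=0, per_page=10):
    """Build a Custom Search JSON API URL for one page of image results."""
    per_page = min(per_page, 10)  # the API returns at most 10 items per request
    params = {
        "key": api_key,                # placeholder API key
        "cx": cx,                      # placeholder search engine ID
        "q": query,
        "searchType": "image",         # restrict results to images
        "num": per_page,
        "start": page * per_page + 1,  # result indices are 1-based
    }
    return f"{CSE_ENDPOINT}?{urllib.parse.urlencode(params)}"


def search_images(query, api_key, cx, page=0):
    """Fetch one page of results and return (title, image URL) pairs."""
    url = build_cse_image_url(query, api_key, cx, page)
    with urllib.request.urlopen(url, timeout=10) as response:
        data = json.load(response)
    return [(item["title"], item["link"]) for item in data.get("items", [])]
```

Calling `search_images("mountain landscape", "YOUR_API_KEY", "YOUR_CX_ID", page=1)` would request results 11 through 20, and every request counts against the daily quota.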

For developers needing more robust image search capabilities, exploring alternative services is often necessary.

Google Image Search Alternatives

While Google does not provide a dedicated Image Search API, there are several alternatives available:

Bing Image Search API

Microsoft's Bing Image Search API provides a comprehensive solution for integrating image search capabilities into applications. Part of the Azure Cognitive Services suite, this API offers advanced search features and returns detailed metadata about images.

import requests

subscription_key = "YOUR_SUBSCRIPTION_KEY"
search_url = "https://api.bing.microsoft.com/v7.0/images/search"
search_term = "mountain landscape"

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
params = {"q": search_term, "count": 10, "offset": 0, "mkt": "en-US", "safeSearch": "Moderate"}

response = requests.get(search_url, headers=headers, params=params)
response.raise_for_status()
search_results = response.json()

# Process the results
for image in search_results["value"]:
    print(f"URL: {image['contentUrl']}")
    print(f"Name: {image['name']}")
    print(f"Size: {image['width']}x{image['height']}")
    print("---")

In the above code, we're sending a request to the Bing Image Search API with our search term and additional parameters. The API returns a JSON response containing image URLs, names, and dimensions, which we can then process according to our application's needs.

The Bing API offers competitive pricing with a free tier that includes 1,000 transactions per month, making it accessible for small projects and testing before scaling.
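To collect more than one page of results, the `count` and `offset` parameters can be combined with the `nextOffset` and `totalEstimatedMatches` fields of the response. The sketch below is one way to do that; `iter_bing_images` and the `fetch_page` callback are hypothetical names, and `fetch_page(offset, count)` is assumed to wrap the request shown above and return the decoded JSON.

```python
def iter_bing_images(fetch_page, max_images=50, page_size=10):
    """Yield image entries page by page until max_images or results run out.

    fetch_page(offset, count) is a caller-supplied (hypothetical) function
    that performs the HTTP request and returns the decoded JSON response.
    """
    offset = 0
    yielded = 0
    while yielded < max_images:
        data = fetch_page(offset, page_size)
        images = data.get("value", [])
        if not images:
            break
        for image in images:
            yield image
            yielded += 1
            if yielded >= max_images:
                return
        # Prefer the API-provided nextOffset; fall back to a manual step.
        next_offset = data.get("nextOffset", offset + len(images))
        total = data.get("totalEstimatedMatches", float("inf"))
        if next_offset <= offset or next_offset >= total:
            break
        offset = next_offset
```

Keeping the HTTP call behind a callback makes the paging logic easy to test without spending transactions from the monthly quota.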

DuckDuckGo Image Search

DuckDuckGo doesn't offer an official API for image search, but it's worth noting that their image search results are primarily powered by Bing's search engine. For developers looking for a more privacy-focused approach, some have created unofficial wrappers around DuckDuckGo's search functionality.

Since this method relies on web scraping, some familiarity with scraping is assumed. If you'd like to learn more about web scraping and its best practices, check out our article.

[Everything to Know to Start Web Scraping in Python Today](https://scrapfly.io/blog/everything-to-know-about-web-scraping-python/): Ultimate modern intro to web scraping using Python. How to scrape data using HTTP or headless browsers, parse it using AI and scale and deploy.

Now, let's move on to the example.

from playwright.sync_api import sync_playwright
from bs4 import BeautifulSoup

def scrape_duckduckgo_images():
    # Start Playwright in a context manager to ensure clean-up
    with sync_playwright() as p:
        # Launch the Chromium browser in non-headless mode for visual debugging
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()

        # Navigate to DuckDuckGo image search for 'python'
        page.goto("https://duckduckgo.com/?q=python&iax=images&ia=images")

        # Wait until the images load by waiting for the image selector to appear
        page.wait_for_selector(".tile--img__img")

        # Get the fully rendered page content including dynamically loaded elements
        content = page.content()

        # Parse the page content using BeautifulSoup for easier HTML traversal
        soup = BeautifulSoup(content, "html.parser")
        images = soup.find_all("img")

        # Loop through the first three images only
        for image in images[:3]:
            # Safely extract the 'src' attribute with a default message if not found
            src = image.get("src", "No src found")
            # Safely extract the 'alt' attribute with a default message if not found
            alt = image.get("alt", "No alt text")
            print(src) # Print the image source URL
            print(alt) # Print the image alt text
            print("---------------------------------")

        # Close the browser after the scraping is complete
        browser.close()

scrape_duckduckgo_images()

Example Output


//external-content.duckduckgo.com/iu/?u=https%3A%2F%2Ftse3.mm.bing.net%2Fth%3Fid%3DOIP.jrcuppJ7JfrVrpa9iKnnnAHaHa%26pid%3DApi&f=1&ipt=a11d9de5b863682e82564114f090c443350005fe945cfdfdba2ca1a05a43fa2b&ipo=images
Advanced Python Tutorials - Real Python
---------------------------------
//external-content.duckduckgo.com/iu/?u=https%3A%2F%2Ftse2.mm.bing.net%2Fth%3Fid%3DOIP.Po6Ot_fcf7ya7xkrOL27hQHaES%26pid%3DApi&f=1&ipt=156829965359c98ab2bbc69fb73e2a4963284ff665c83887d6278d6cecc08841&ipo=images
¿Para qué sirve Python?
---------------------------------
//external-content.duckduckgo.com/iu/?u=https%3A%2F%2Ftse4.mm.bing.net%2Fth%3Fid%3DOIP._zLHmRNYHt-KYwYC8cC3RwHaHa%26pid%3DApi&f=1&ipt=04bdcfc11eee3ef4e96bf7d1b47230633b7c936363cf0c9f86c5dfa2e6fb4f32&ipo=images
¿Qué es Python y por qué debes aprender

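In the above code, we're using Playwright to open DuckDuckGo's search page with parameters that trigger the image search interface, waiting for the image tiles to render, and then parsing the rendered HTML with BeautifulSoup. Because there is no official API behind it, this approach depends entirely on web scraping and on DuckDuckGo's current page structure.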
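The `src` values in the example output are protocol-relative links to DuckDuckGo's image proxy, with the upstream thumbnail URL carried in the `u` query parameter. A minimal sketch for recovering that upstream URL, assuming the link format shown in the output stays the same (`original_image_url` is an illustrative helper name):

```python
from urllib.parse import parse_qs, urlparse


def original_image_url(proxy_url):
    """Return the upstream image URL wrapped by a DuckDuckGo proxy link."""
    query = parse_qs(urlparse(proxy_url).query)
    upstream = query.get("u")
    if upstream:
        return upstream[0]  # parse_qs has already percent-decoded the value
    # Not a proxy link: only make protocol-relative URLs absolute.
    return "https:" + proxy_url if proxy_url.startswith("//") else proxy_url


proxy = (
    "//external-content.duckduckgo.com/iu/"
    "?u=https%3A%2F%2Ftse3.mm.bing.net%2Fth%3Fid%3DOIP.jrcuppJ7JfrVrpa9iKnnnAHaHa%26pid%3DApi"
    "&f=1&ipo=images"
)
print(original_image_url(proxy))
```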

Can Google Images be Scraped?

Scraping Google Images is technically possible and can be a good approach when API options don't meet your specific requirements. But there are several technical obstacles that make it a complex and often unreliable approach:

Google Blocks Bots Aggressively: Google actively detects and blocks automated scraping, requiring constant evasion tactics.
Headless Browsers Required: Running Selenium or Puppeteer in headless mode is usually necessary to mimic real users.
Page Structure Changes Frequently: Google updates its layout and elements, breaking scrapers that rely on fixed XPath or CSS selectors.
High Resource Consumption: Running Selenium-based automation in a full browser environment significantly increases CPU and memory usage compared to API-based solutions.
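To make the first obstacle concrete, here is a minimal sketch of detecting a blocked response and retrying with exponential backoff. The status codes and text markers are assumptions about what a block looks like and may not cover every case; `fetch` is a hypothetical callable that returns a `(status_code, body)` tuple.

```python
import random
import time

# Assumed signs of a block page; adjust to what you actually observe.
BLOCK_MARKERS = ("unusual traffic", "/sorry/")


def looks_blocked(status_code, body):
    """Heuristically decide whether a response is a block page."""
    return status_code in (403, 429) or any(m in body for m in BLOCK_MARKERS)


def backoff_delays(retries=4, base=2.0, cap=60.0):
    """Exponential delays (2s, 4s, 8s, ...) with up to 1s of random jitter."""
    return [min(cap, base * 2 ** attempt) + random.uniform(0, 1)
            for attempt in range(retries)]


def fetch_with_backoff(fetch, retries=4, sleep=time.sleep):
    """Call fetch() until it is not blocked; return the body or None."""
    for delay in backoff_delays(retries):
        status_code, body = fetch()
        if not looks_blocked(status_code, body):
            return body
        sleep(delay)
    return None
```

Backing off reduces the request rate but does not by itself get around detection, which is why the remaining obstacles in the list still apply.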

For many applications, using an official API from Bing or another provider is a more sustainable approach. However, for specific use cases or when other options aren't viable, let's explore some effective scraping techniques.

Scrapfly Web Scraping API

Scrapfly provides web scraping, screenshot, and extraction APIs for data collection at scale.

Anti-bot protection bypass - scrape web pages without blocking!
Rotating residential proxies - prevent IP address and geographic blocks.
JavaScript rendering - scrape dynamic web pages through cloud browsers.
Full browser automation - control browsers to scroll, input and click on objects.
Format conversion - scrape as HTML, JSON, Text, or Markdown.
Python and Typescript SDKs, as well as Scrapy and no-code tool integrations.

Here's an example of how to scrape Google Images with the Scrapfly web scraping API:

from scrapfly import ScrapflyClient, ScrapeConfig, ScrapeApiResponse

scrapfly = ScrapflyClient(key="YOUR_SCRAPFLY_KEY")

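result: ScrapeApiResponse = scrapfly.scrape(ScrapeConfig(
    tags=["player", "project:default"],
    format="json",
    # Parse the page into structured search results
    extraction_model="search_engine_results",
    country="us",
    lang=["en"],
    # Enable anti-scraping protection bypass
    asp=True,
    # Render the page in a cloud browser so dynamic content loads
    render_js=True,
    url="https://www.google.com/search?q=python&tbm=isch"
))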

Example Output


{
    "query": "python - Google Search",
    "results": [
        {
            "displayUrl": null,
            "publishDate": null,
            "richSnippet": null,
            "snippet": null,
            "title": "Wikipedia Python (programming language) - Wikipedia",
            "url": "https://en.wikipedia.org/wiki/Python_(programming_language)"
        },
        {
            "displayUrl": null,
            "publishDate": null,
            "richSnippet": null,
            "snippet": null,
            "title": "Juni Learning What is Python Coding? | Juni Learning",
            "url": "https://junilearning.com/blog/guide/what-is-python-101-for-students/"
        },
        {
            "displayUrl": null,
            "publishDate": null,
            "richSnippet": null,
            "snippet": null,
            "title": "Wikiversity Python - Wikiversity",
            "url": "https://en.wikiversity.org/wiki/Python"
        },
        ...
    ]
}
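Once a payload shaped like the example output is available as a Python dictionary, the source pages can be pulled out with a few lines. This is a sketch that assumes only the `results`, `title`, and `url` keys shown above; `image_sources` is an illustrative helper name and the sample data is made up.

```python
def image_sources(extracted):
    """Return de-duplicated (title, url) pairs from a results payload."""
    seen = set()
    pairs = []
    for entry in extracted.get("results", []):
        url = entry.get("url")
        if not url or url in seen:
            continue  # skip entries without a URL and repeated pages
        seen.add(url)
        pairs.append((entry.get("title") or "", url))
    return pairs


# Made-up sample in the same shape as the output above
sample = {
    "query": "python - Google Search",
    "results": [
        {"title": "Example A", "url": "https://example.com/a"},
        {"title": "Example A again", "url": "https://example.com/a"},
        {"title": None, "url": "https://example.com/b"},
    ],
}
print(image_sources(sample))
```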

Scrape Google Image Search using Python

For a direct approach to scraping Google Images using Python, the following code demonstrates how to extract image data using Requests, BeautifulSoup, and lxml:

import random
from urllib.parse import quote_plus

import requests
from bs4 import BeautifulSoup
from lxml import etree # For XPath support

def scrape_google_images_bs4(query, num_results=20):
    # URL-encode the search query (spaces, '&', non-ASCII characters, etc.)
    encoded_query = quote_plus(query)
    # Set up headers to mimic a browser
    user_agents = [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36",
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.107 Safari/537.36"
    ]
    headers = {
        "User-Agent": random.choice(user_agents),
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5",
        "Referer": "https://www.google.com/"
    }

    # Make the request (the timeout keeps a stalled connection from hanging the script)
    url = f"https://www.google.com/search?q={encoded_query}&tbm=isch"
    response = requests.get(url, headers=headers, timeout=10)

    if response.status_code != 200:
        print(f"Failed to retrieve the page: {response.status_code}")
        return []

    # Parse the HTML using both BeautifulSoup and lxml for XPath
    soup = BeautifulSoup(response.text, 'html.parser')
    dom = etree.HTML(str(soup)) # Convert to lxml object for XPath

    # Process the response
    image_data = []

    # Select result containers by absolute XPath instead of class names.
    # The path reflects Google's markup at the time of writing and will need
    # updating whenever the page layout changes.
    base_xpath = "/html/body/div[3]/div/div[14]/div/div[2]/div[2]/div/div/div/div/div[1]/div/div/div"

    # Get all div indices to match the pattern
    div_indices = range(1, num_results + 1) # Start with 1 through num_results

    for i in div_indices:
        try:
            # Create XPath for the current div
            current_xpath = f"{base_xpath}[{i}]"
            div_element = dom.xpath(current_xpath)

            if not div_element:
                continue

            item = {}

            # Get the data-lpage attribute (page URL) from the div
            page_url_xpath = f"{current_xpath}/@data-lpage"
            page_url = dom.xpath(page_url_xpath)
            if page_url:
                item["page_url"] = page_url[0]

            # Get the alt text of the image
            alt_xpath = f"{current_xpath}//img/@alt"
            alt_text = dom.xpath(alt_xpath)
            if alt_text:
                item["alt_text"] = alt_text[0]

            if item:
                image_data.append(item)

            # Stop if we've reached the requested number of results
            if len(image_data) >= num_results:
                break

        except Exception as e:
            print(f"Error processing element {i}: {e}")

    return image_data

# Example usage
image_data = scrape_google_images_bs4("python", num_results=5)
print(image_data)

Example Output


[{'page_url': 'https://en.wikipedia.org/wiki/Python_(programming_language)', 'alt_text': '\u202aPython (programming language) - Wikipedia\u202c\u200f'},
{'page_url': 'https://beecrowd.com/blog-posts/best-python-courses/', 'alt_text': '\u202aPython: find out the best courses - beecrowd\u202c\u200f'},
{'page_url': 'https://junilearning.com/blog/guide/what-is-python-101-for-students/', 'alt_text': '\u202aWhat is Python Coding? | Juni Learning\u202c\u200f'},
{'page_url': 'https://medium.com/towards-data-science/what-is-a-python-environment-for-beginners-7f06911cf01a', 'alt_text': "\u202aWhat Is a 'Python Environment'? (For Beginners) | by Mark Jamison | TDS Archive | Medium\u202c\u200f"},
{'page_url': 'https://quantumzeitgeist.com/why-is-the-python-programming-language-so-popular/', 'alt_text': '\u202aWhy Is The Python Programming Language So Popular?\u202c\u200f'}]

In the above code, we created a Google Images scraper that locates results by XPath rather than by class names. Keep in mind that an absolute XPath like this one is tied to the current page layout and will need updating when that layout changes. The script mimics browser behavior by sending browser-like headers with a randomly chosen user agent, fetches search results for a given query, and extracts both the source page URL (data-lpage attribute) and image alt text from the search results.
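A less layout-dependent variation is to look for the `data-lpage` attribute itself rather than a fixed position in the page. The sketch below does that with the standard library's `HTMLParser` and also strips the invisible directional marks (`\u202a`, `\u202c`, `\u200f`) that wrap the alt text in the example output. It assumes, as in the code above, that each result container carries `data-lpage` and that the image follows inside it; `LpageParser` and `extract_results` are illustrative names.

```python
from html.parser import HTMLParser

# Invisible directional marks seen around the alt text in the output above
BIDI_MARKS = dict.fromkeys(map(ord, "\u202a\u202c\u200f"))


class LpageParser(HTMLParser):
    """Collect page URL and alt text from elements carrying data-lpage."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._current = None  # latest result, still waiting for its <img> alt

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "data-lpage" in attrs:
            self._current = {"page_url": attrs["data-lpage"], "alt_text": ""}
            self.results.append(self._current)
        elif tag == "img" and self._current is not None and attrs.get("alt"):
            # Simplification: the first <img> after a container is taken as its image
            if not self._current["alt_text"]:
                self._current["alt_text"] = attrs["alt"].translate(BIDI_MARKS).strip()


def extract_results(html):
    parser = LpageParser()
    parser.feed(html)
    return parser.results


# Made-up markup in the assumed shape
sample_html = (
    '<div data-lpage="https://example.com/a">'
    '<img alt="\u202aExample A\u202c\u200f"></div>'
)
print(extract_results(sample_html))
```

This still breaks if Google renames or drops the attribute, but it does not depend on how deeply the result containers are nested.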

Scrape Google Reverse Image Search using Python

Reverse image search allows you to find similar images and their sources using an image as the query instead of text. Implementing this requires a slightly different approach, often involving browser automation with tools like Selenium.

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager
import time

def google_reverse_image_search(image_url, max_results=5):
    # Set up Chrome options
    chrome_options = Options()
    # chrome_options.add_argument("--headless") # Run in headless mode
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_options.add_argument("--disable-gpu")
    chrome_options.add_argument("--window-size=1920,1080")
    chrome_options.add_argument("--lang=en-US,en")
    chrome_options.add_experimental_option('prefs', {'intl.accept_languages': 'en-US,en'})
    chrome_options.add_experimental_option("excludeSwitches", ["enable-automation"])
    chrome_options.add_experimental_option('useAutomationExtension', False)
    chrome_options.add_argument("--disable-blink-features=AutomationControlled")
    chrome_options.add_argument("user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36")

    # Initialize the driver
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=chrome_options)

    try:
        # Navigate to Google Images
        driver.get("https://www.google.com/imghp?hl=en&gl=us")

        # Find and click the camera icon for reverse search
        camera_button = WebDriverWait(driver, 10).until(
            EC.element_to_be_clickable((By.XPATH, "//div[@aria-label='Search by image']"))
        )
        camera_button.click()

        # Wait for the URL input field and enter the image URL
        url_input = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.XPATH, "//input[@placeholder='Paste image link']"))
        )
        url_input.send_keys(image_url)

        # Click search button
        search_button = WebDriverWait(driver, 10).until(
            EC.element_to_be_clickable((By.XPATH, "//div[text()='Search']"))
        )
        search_button.click()

        # Wait for results page to load
        WebDriverWait(driver, 15).until(
            EC.presence_of_element_located((By.XPATH, "//div[contains(text(), 'All')]"))
        )

        # Extract similar image results
        similar_images = []

        # Walk through the first max_results thumbnails on the results page
        try:
            # Extract image data
            for i in range(max_results):
                try:
                    # Get image element using index in XPath
                    img_xpath = f"/html/body/div[3]/div/div[12]/div/div/div[2]/div[2]/div/div/div[1]/div/div/div/div/div/div/div[{i+1}]/div/div/div[1]/div/div/div/div/img"
                    img = WebDriverWait(driver, 5).until(
                        EC.presence_of_element_located((By.XPATH, img_xpath))
                    )

                    # Get image URL by clicking and extracting from larger preview
                    img.click()
                    time.sleep(1) # Wait for larger preview

                    # Find the large image
                    img_container = WebDriverWait(driver, 5).until(
                        EC.presence_of_element_located((By.XPATH, "//*[@id='Sva75c']/div[2]/div[2]/div/div[2]/c-wiz/div/div[2]/div/a[1]"))
                    )

                    img_url = driver.find_element(By.XPATH, "//*[@id='Sva75c']/div[2]/div[2]/div/div[2]/c-wiz/div/div[2]/div/a[1]/img").get_attribute("src")

                    # Get source website
                    source_url = img_container.get_attribute("href")

                    similar_images.append({
                        "url": img_url,
                        "source_url": source_url,
                    })
                except Exception as e:
                    print(f"Error extracting image {i+1}: {e}")
        except Exception as e:
            print(f"Could not extract similar images: {e}")

        return similar_images

    finally:
        # Clean up
        driver.quit()

# Example usage
sample_image_url = "https://avatars.githubusercontent.com/u/54183743?s=280&v=4"
similar_images = google_reverse_image_search(sample_image_url)

print("Similar Images:")
for idx, img in enumerate(similar_images, 1):
    print(f"Image {idx}:")
    print(f" URL: {img['url']}")
    print(f" Source: {img['source_url']}")
    print()

In the above code, we're using Selenium to automate the process of performing a reverse image search. This approach simulates a user visiting Google Images, clicking the camera icon, entering an image URL, and initiating the search. It then clicks each result thumbnail in turn and reads the full-size image URL and the source page link from the preview panel. A fuller implementation could also extract the websites containing the image and other relevant information.

This method requires more resources than simple HTTP requests but provides access to functionality that isn't easily available through direct scraping. For production use, you would need to add more robust error handling, selectors that survive layout changes, and potentially proxy rotation to avoid detection.
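As one possible shape for the proxy rotation mentioned above, the sketch below cycles through a list of proxies and produces Chrome's `--proxy-server` argument for each new browser session. The proxy addresses are placeholders and `proxy_argument_cycle` is an illustrative helper name.

```python
from itertools import cycle


def proxy_argument_cycle(proxies):
    """Yield a Chrome --proxy-server argument for each new browser session."""
    for proxy in cycle(proxies):
        yield f"--proxy-server={proxy}"


# Placeholder proxy endpoints; replace with your own
proxy_args = proxy_argument_cycle(["http://127.0.0.1:8001", "http://127.0.0.1:8002"])
first = next(proxy_args)
second = next(proxy_args)
third = next(proxy_args)  # wraps around to the first proxy again
print(first, second, third)
```

Each value could be passed to `chrome_options.add_argument(...)` before the driver is created, so that every call to the search function starts its browser behind a different proxy.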

FAQ

Is there an official Google Image Search API?

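No, Google does not offer a dedicated Image Search API. The previously available Google Image Search API was deprecated and is no longer supported. The closest official option is the Custom Search JSON API configured for image search, which comes with quota and result limits.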

What are the alternatives to Google Image Search API?

Alternatives to Google Image Search API include the Bing Image Search API, unofficial scraping of DuckDuckGo image search, and image search APIs from other search engines like Yahoo and Yandex.

Can I scrape Google Images?

Scraping Google Images is possible, but it comes with challenges and legal considerations. It's important to use ethical scraping practices and consider using APIs provided by other search engines as alternatives.

Summary

In this article, we explored the Google Image Search API, its alternatives, and how to scrape Google Image Search results using Python. While Google does not offer an official Image Search API, developers can use the Google Custom Search JSON API or alternatives like Bing Image Search API and DuckDuckGo Image Search. Additionally, we discussed the challenges of scraping Google Images and provided example code snippets for scraping image search results.
