2captcha

Posted on Oct 29

Bypassing CAPTCHA with Python: What’s the Challenge? Certain Nuances Exist

I am not a developer by profession, but I work close to that world (call it the Python-adjacent crowd). I know a few developers, and I have even more colleagues who specialize in the field.

As it happens, I needed to set up a CAPTCHA-solving mechanism in Python for an important project: an Amazon parser. I tried countless approaches and burned through tons of GPT prompts, but the infamous Amazon CAPTCHA was unyielding. Eventually, through a bit of crowd-sourcing (well, I asked for help, and a colleague provided the solution), a Python script for CAPTCHA bypass was created. I’m sharing it publicly out of goodwill, with a bit of personal interest: I need advice on making the script more reliable, because even after extensive testing it remains unstable. We’ve tried multiple approaches, broken several libraries, and still haven’t achieved a reliable script.

Let’s go over the steps:

General Overview of the Python CAPTCHA Bypass Script

The CAPTCHA bypass script in Python works as follows:

1. Imports the necessary libraries.
2. Configures proxy settings.
3. Opens Amazon’s registration page.
4. Solves the first CAPTCHA (skips if none appears).
5. Completes the registration form.
6. Solves the second CAPTCHA using a coordinate-based method.
7. Verifies the CAPTCHA solution.
8. Closes the browser.

Now, let’s take a closer look at each step.

Required Python Libraries for CAPTCHA Solving

The script relies on several libraries:

os, base64, io.BytesIO: standard-library modules used for file system operations, base64-encoding images, and working with byte streams (they support solving CAPTCHAs that arrive as images).
seleniumbase.Driver, selenium.webdriver.common.by.By, selenium.webdriver.common.action_chains.ActionChains: these provide Selenium-driven browser control, element lookup on web pages, and complex actions such as clicking at specific coordinates. ActionChains is what the coordinate-based CAPTCHA step needs, but every CAPTCHA interaction here goes through Selenium, which makes this the key set.
TwoCaptcha (from the 2captcha-python package): connects to the 2Captcha service for automated CAPTCHA recognition (the actual solving is done by an external service, in our case 2Captcha).

Proxy Configuration for Script Functionality

Initially, the script was designed to retrieve proxies from a file. But as I’m somewhat lazy (What? Create a proxy file, figure out the right format, and prepare the proxies? You’ve got to be kidding!), I added the option to use proxies directly from the code.

The script checks first if a proxy file exists, and if it doesn’t, it loads the proxy from the code. If neither contains proxy data, the script simply gives up and shuts down (just like some people — “I’m a simple person, if no tasks are assigned, I just scroll through videos”).
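The fallback order described above can be sketched as a small helper. This is only an illustration: the name load_proxy and the choice to return None when nothing is configured are my assumptions, not part of the script below.

```python
import os
from typing import Optional


def load_proxy(file_path: str, fallback: Optional[str]) -> Optional[str]:
    """Return the proxy from file_path if it exists and is non-empty,
    otherwise the hardcoded fallback, otherwise None."""
    if os.path.exists(file_path):
        with open(file_path, "r", encoding="utf-8") as fh:
            proxy = fh.read().strip()
        if proxy:
            return proxy
    # An empty string counts as "no fallback configured".
    return fallback or None
```

A caller would then stop early when the result is None, for example with sys.exit("No proxy configured"), instead of starting a browser that has no proxy.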

In modern web scraping, bypassing CAPTCHA without proxies is almost unheard of (especially for high-volume scraping), regardless of whether you’re using Python or another language.

Opening the Amazon Registration Page

Next, the script opens Amazon’s registration page (handled by the function driver.uc_open_with_reconnect). After all, the goal here is to complete the registration on the site, not just solve CAPTCHA (though it currently isn’t doing much beyond that, to be honest).

Solving the First CAPTCHA in Python

Finally, we arrive at the core function (or rather, part of it) — CAPTCHA bypass or skip if no CAPTCHA appears. We’re talking about Amazon’s simple text CAPTCHA, which sometimes shows up, sometimes doesn’t (Amazon seems to have its off days).

Here’s how it works: the script uses Selenium to locate the CAPTCHA image on the page, takes a screenshot, converts it to base64, and sends the encoded data to 2Captcha. Once the response is received, the text is entered into a designated field, and the “Continue” button is clicked.
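The encoding step on its own is tiny. Here is a minimal sketch that uses the 8-byte PNG signature as a stand-in for the bytes Selenium returns from element.screenshot_as_png; the helper name png_to_base64 is made up for the example.

```python
import base64


def png_to_base64(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as an ASCII base64 string."""
    return base64.b64encode(png_bytes).decode("ascii")


# Stand-in for element.screenshot_as_png: just the PNG file signature.
sample = b"\x89PNG\r\n\x1a\n"
encoded = png_to_base64(sample)
print(encoded)  # iVBORw0KGgo=
```

Selenium elements also expose screenshot_as_base64, which returns the encoded string directly, so the manual encoding is optional.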

If no CAPTCHA is present, this step is skipped, and the script proceeds to the next section.

Filling Out the Registration Form

As you remember, the script opened the registration page before attempting to solve the CAPTCHA. Now, it returns to that page to fill out the registration form.

The registration form data is hardcoded in the script. This keeps the focus on the CAPTCHA recognition part; ideally, the data would be read from a file so that several forms could be filled out automatically.

Python CAPTCHA Bypass via Coordinate-Based Method

Throughout the testing phase, we encountered no issues with the first CAPTCHA (Amazon seemed to be in a good mood and didn’t issue the text CAPTCHA). However, the second CAPTCHA raised questions, both for the script and the community.

Here’s the breakdown:

The second CAPTCHA is more complex and requires clicking on specific coordinates to solve it. The script’s workflow is as follows: it takes a screenshot of the CAPTCHA frame, sends it to the service to get coordinates, uses ActionChains to move the cursor to those coordinates and click, then switches into the frame and clicks the confirm button.
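Pulling the numbers out of the service’s answer is the fiddly part, so here is a hedged sketch of a stricter parser. It assumes the answer is a string shaped like "coordinates:x=120,y=85", which is what the string splitting in the script below implies; the function name parse_points is hypothetical.

```python
import re
from typing import List, Tuple

# Matches one "x=<int>,y=<int>" pair anywhere in the answer string.
_POINT = re.compile(r"x=(\d+),y=(\d+)")


def parse_points(code: str) -> List[Tuple[int, int]]:
    """Return every (x, y) pair found in the answer, as integers."""
    points = [(int(x), int(y)) for x, y in _POINT.findall(code)]
    if not points:
        # Fail loudly instead of raising a confusing IndexError later.
        raise ValueError(f"no coordinates found in {code!r}")
    return points


print(parse_points("coordinates:x=120,y=85"))  # [(120, 85)]
```

Raising ValueError on an unexpected answer makes a bad response visible immediately, which helps when hunting for the cause of a stall.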

But here comes the snag. Occasionally, either during recognition or verification, the process stalls. The CAPTCHA might not resolve, or the allotted time for solving expires before the correct result is returned. While most cases proceed smoothly, this rare exception is puzzling and persistent.
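One generic way to keep a stall from hanging the whole run is to cap both the number of attempts and the total time. The helper below is a sketch under that assumption; retry_until and its parameters are hypothetical names, and step stands for "try the CAPTCHA once and report whether the next page appeared".

```python
import time
from typing import Callable


def retry_until(step: Callable[[], bool], max_attempts: int = 5,
                deadline_s: float = 120.0, pause_s: float = 0.0) -> bool:
    """Call step() until it returns True, the attempts run out,
    or the overall deadline passes. Exceptions count as failures."""
    start = time.monotonic()
    for attempt in range(1, max_attempts + 1):
        if time.monotonic() - start > deadline_s:
            break
        try:
            if step():
                return True
        except Exception as exc:
            print(f"attempt {attempt} failed: {exc}")
        time.sleep(pause_s)
    return False
```

With this shape, a CAPTCHA that never resolves ends in a clean False after a known amount of time rather than an endless loop.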

Below is the script itself. It may be useful to someone, or perhaps someone in the comments can suggest improvements to address this issue?

Or maybe it can’t be fixed, because, after all — “This is Amazon”...

import os
import base64
from io import BytesIO
from seleniumbase import Driver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from twocaptcha import TwoCaptcha  # pip3 install 2captcha-python

# Manual proxy input
manual_proxy = "http://login:password@ip:port"  # Replace with your proxy

# Function to read proxy from file
def get_proxy_from_file(file_path):
    if os.path.exists(file_path):
        with open(file_path, 'r') as file:
            proxy = file.read().strip()
            return proxy
    return None

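# Use the proxy from the file if present, otherwise the manual one
proxy_file_path = "proxy.txt"  # Proxy file name
proxy = get_proxy_from_file(proxy_file_path) or manual_proxy
if not proxy:
    # Neither source provided a proxy: stop before launching the browser
    raise SystemExit("No proxy configured: fill proxy.txt or manual_proxy")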

my_key = "2Captcha API Key"
solver = TwoCaptcha(my_key, defaultTimeout=70)
agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.0.0 Safari/537.36"

# Initialize driver with proxy
driver = Driver(uc=True, headless=False, proxy=proxy, agent=agent)  # headless=True for invisible mode

try:
    url = "https://www.amazon.com/ap/register?openid.pape.max_auth_age=0&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&pageId=usflex&ignoreAuthState=1&openid.assoc_handle=usflex&openid.mode=checkid_setup&openid.ns.pape=http%3A%2F%2Fspecs.openid.net%2Fextensions%2Fpape%2F1.0&prepopulatedLoginId=&failedSignInCount=0&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&disableLoginPrepopulate=1&switch_account=signin&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0"
    driver.uc_open_with_reconnect(url, 5)

    # Solving first CAPTCHA (skipped if the image is not on the page)
    try:
        my_img = driver.find_element(By.CSS_SELECTOR, "body > div > div.a-row.a-spacing-double-large > div.a-section > div > div > form > div.a-row.a-spacing-large > div > div > div.a-row.a-text-center > img")
        print("SOLVE FIRST CAPTCHA...")
        screenshot = my_img.screenshot_as_png
        screenshot_bytes = BytesIO(screenshot)
        base64_screenshot = base64.b64encode(screenshot_bytes.getvalue()).decode('utf-8')
        result = solver.normal(base64_screenshot)  # Send request to 2Captcha
        print('result: ' + str(result))
        res = result['code']
        driver.find_element(By.ID, "captchacharacters").send_keys(f"{res}")
        driver.find_element(By.CLASS_NAME, "a-button-inner").click()
    except Exception as e:
        # No CAPTCHA on the page (or solving failed): log it and move on
        print("First CAPTCHA skipped:", e)

    # Waiting and form filling
    driver.sleep(1)
    print("Fill out the form")
    driver.find_element(By.ID, "ap_customer_name").send_keys("Alex0053")  # Fill form
    driver.find_element(By.ID, "ap_email").send_keys("some_post43120@gmail.com")
    driver.find_element(By.ID, "ap_password").send_keys("password40000A#")
    driver.find_element(By.ID, "ap_password_check").send_keys("password40000A#")
    driver.find_element(By.ID, "continue").click()
    driver.sleep(10)

    # Second CAPTCHA - coordinate-based (bounded, so a failure cannot loop forever)
    for attempt in range(5):
        try:
            cap_img = driver.find_element(By.ID, "cvf-aamation-challenge-iframe")  # Frame element
            print("SOLVE SECOND-COORD CAPTCHA...")
            screenshot = cap_img.screenshot_as_png
            screenshot_bytes = BytesIO(screenshot)
            base64_screenshot = base64.b64encode(screenshot_bytes.getvalue()).decode('utf-8')
            size = cap_img.size
            result = solver.coordinates(base64_screenshot, lang='en', min_clicks=1, max_clicks=1)
            x = str(result['code']).split(":")[1].split(",")[0].replace("x=", "")
            y = str(result['code']).split(":")[1].split(",")[1].replace("y=", "")
            print('result: ' + str(result))
            # move_by_offset is relative to the current pointer position, so it drifts
            # on retries. Click relative to the frame instead. Selenium 4.3+ measures
            # this offset from the element's center, hence the shift from top-left.
            x_off = int(x) - int(size["width"]) // 2
            y_off = int(y) - int(size["height"]) // 2
            actions = ActionChains(driver)
            actions.move_to_element_with_offset(cap_img, x_off, y_off).click().perform()
            driver.sleep(2)
            driver.switch_to.frame(cap_img)
            driver.find_element(By.ID, "amzn-btn-verify-internal").click()
            driver.sleep(7)
        except Exception as e:
            print(e)
        finally:
            # Always leave the frame, otherwise the next lookup searches inside it
            driver.switch_to.default_content()
        try:
            driver.find_element(By.CSS_SELECTOR, 'form[id="verification-code-form"]')
            print("CAPTCHA PASSED!!!")
            break
        except Exception:
            pass
    else:
        print("CAPTCHA NOT PASSED after 5 attempts")

    driver.sleep(3)
    # Last block, if needed

except Exception as e:
    print(e)
finally:
    driver.quit()  # quit() closes every window and ends the session

Python CAPTCHA Bypass – Missing Part of the Code

Amazon also has a third type of CAPTCHA, FunCaptcha, which I couldn’t crack in this context. So, I just removed it from this code, just in case. I haven’t encountered FunCaptcha throughout testing (but from the tales of wiser folk, I know it exists somewhere in Amazon’s depths). There’s a legend about a specially trained man, we’ll call him the Overseer, who manually changes CAPTCHA conditions or the page design.

No one has ever seen this Overseer, but in the evenings, when IT folks gather around a fire, they use this tale to scare the juniors.

So, the script doesn’t solve FunCaptcha, but I’m open to suggestions on how to enable it to do so.

Conclusion

The script works, CAPTCHA is bypassed, but occasional stalls remain – it would be helpful if you could point out any improvements (preferably without harsh critique).

The script doesn’t handle FunCaptcha. At first glance we don’t really need it, but if it turns out to be required, I’m open to community input on that as well.

Top comments (1)

Muzakir Shah • Nov 6

The more challenging part is solving the captcha v3 enterprise.
Finding the action parameter of captcha v3 is challenging. There is a github repo which can tell you what captcha, sitekey and pageurl of website is but that didn't give you the action parameters. And that is where you got confuse.
