Using rotating proxies is an effective way to scrape the web, especially when you need to access a website frequently or bypass anti-crawler mechanisms. Rotating proxies automatically change IP addresses, which reduces the risk of being blocked.
The following are examples of using rotating proxies for web scraping with Python's requests library and with Selenium.
Using the requests library
1. Install necessary libraries:
First, install the requests library (for example, with pip install requests).
2. Configure the rotating proxy:
Get an API key or a proxy list from your rotating proxy service provider and configure it in requests (if you only have a raw proxy list, see the minimal stand-in sketched after this list).
3. Send requests:
Use the requests library to send HTTP requests and forward them through the proxy.
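The sample below assumes your provider ships a get_proxy helper. If your service only hands you a plain list of proxy addresses, a minimal stand-in could look like this (the addresses and the helper are hypothetical placeholders):

import itertools

# Hypothetical static proxy list in host:port form; replace with your provider's addresses
PROXY_LIST = [
    '203.0.113.10:8080',
    '203.0.113.11:8080',
    '203.0.113.12:8080',
]
_proxy_cycle = itertools.cycle(PROXY_LIST)

def get_proxy():
    # Return the next proxy in the list, cycling back to the start when exhausted
    return next(_proxy_cycle)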
Sample code:
import requests
from some_rotating_proxy_service import get_proxy  # Assuming this is the function provided by your rotating proxy service

# Get a new proxy
proxy = get_proxy()
# Map HTTP and HTTPS traffic to the proxy (the exact format may vary depending on the proxy service's requirements)
proxies = {
    'http': f'http://{proxy}',
    'https': f'https://{proxy}'
}
# Send a GET request through the proxy
url = 'http://example.com'
try:
    response = requests.get(url, proxies=proxies, timeout=10)
    # Process the response data
    print(response.text)
except requests.exceptions.ProxyError:
    print('Proxy error occurred')
except Exception as e:
    print(f'An error occurred: {e}')
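The sample above uses a single proxy for a single request. Since the point of rotation is to spread traffic across many IPs, a slightly fuller sketch (reusing the imports above and still assuming the hypothetical get_proxy helper) fetches a fresh proxy for each attempt and retries when a proxy fails:

def fetch_with_rotation(url, max_attempts=3):
    # Try up to max_attempts proxies, switching to a new one after each failure
    for attempt in range(max_attempts):
        proxy = get_proxy()  # fresh IP for every attempt
        proxies = {
            'http': f'http://{proxy}',
            'https': f'https://{proxy}'
        }
        try:
            response = requests.get(url, proxies=proxies, timeout=10)
            response.raise_for_status()
            return response
        except requests.exceptions.RequestException as e:
            print(f'Attempt {attempt + 1} with proxy {proxy} failed: {e}')
    raise RuntimeError(f'All {max_attempts} attempts failed for {url}')

page = fetch_with_rotation('http://example.com')
print(page.text[:200])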
Using Selenium
1. Install necessary libraries and drivers:
Install the Selenium library (for example, with pip install selenium) and the WebDriver for your browser (such as ChromeDriver).
2. Configure rotating proxies:
As with requests, get the proxy information from your rotating proxy service provider and configure it in Selenium.
3. Launch a browser and set the proxy:
Launch a browser using Selenium and set the proxy through the browser options.
Sample code:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from some_rotating_proxy_service import get_proxy # Assuming this is the function provided by your rotating proxy service
# Get a new proxy
proxy = get_proxy()
# Set Chrome options to use a proxy
chrome_options = Options()
chrome_options.add_argument(f'--proxy-server=http://{proxy}')
# Launch Chrome browser
driver = webdriver.Chrome(options=chrome_options)
# Visit the website
url = 'http://example.com'
driver.get(url)
# Processing web data
# ...(For example, use driver.page_source to get the source code of a web page, or use driver to find a specific element.)
# Close the browser
driver.quit()
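Because Chrome's --proxy-server flag is fixed for the lifetime of the browser process, the usual way to rotate IPs with plain Selenium is to start a fresh driver per proxy. A minimal sketch, reusing the imports above and assuming the same hypothetical get_proxy helper and a placeholder list of target URLs:

urls = ['http://example.com/page1', 'http://example.com/page2']  # placeholder URLs

for url in urls:
    proxy = get_proxy()  # new IP for this browser session
    chrome_options = Options()
    chrome_options.add_argument(f'--proxy-server=http://{proxy}')
    driver = webdriver.Chrome(options=chrome_options)
    try:
        driver.get(url)
        print(url, len(driver.page_source))
    finally:
        driver.quit()  # always release the browser, even if the page load fails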
Things to note
Make sure the rotating proxy service is reliable and offers a large enough proxy pool, so that individual IPs are not reused so often that they get blocked.
Plan your scraping tasks around the pricing and usage limits of the rotating proxy service.
When using Selenium, make sure browser windows are closed and resources released (for example, call driver.quit() in a finally block, as in the sketch above) to avoid memory leaks or other problems.
Comply with the target website's robots.txt file and terms of service to avoid legal disputes; a quick way to check robots.txt programmatically is sketched below.
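For the robots.txt check, Python's standard library already covers the basics; a minimal sketch using urllib.robotparser (the URL and path are placeholders):

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url('http://example.com/robots.txt')
rp.read()

target = 'http://example.com/some/page'
if rp.can_fetch('*', target):
    print(f'robots.txt allows fetching {target}')
else:
    print(f'robots.txt disallows fetching {target}')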