FeleciawSpradleyf

Selenium Proxy 101: How to Setup Proxies on Selenium

Are you planning to use Selenium for automated testing or web scraping? Depending on your project requirements, you might need proxies. Read on to discover our top Selenium proxy picks.

Selenium is widely used. When it is not being used for automated testing, web scrapers use it to pull data from JavaScript-heavy websites. In both areas, proxies are often required.

In some instances, you can get away without proxies; in others, proxies are a must unless you are ready to pay for more expensive alternatives. This article discusses the proxies you can use with Selenium so that it works effectively.

Before discussing the proxies, we will give an overview of Selenium and explain why you might need proxies for it. You will also learn how to set up Selenium to work with proxies.


What is Selenium?

https://www.youtube.com/watch?v=Jdkrj2lDAEY

Selenium is a browser automation tool. With it, a browser can be driven automatically to fill in forms, visit websites, and carry out just about any other task you can do with a browser by hand. It is used mainly for automated testing.

It is also used for web scraping, since it can load and render web pages and extract data from them. Selenium supports a good number of browsers, including Chrome, Internet Explorer, and Firefox. Older versions of Selenium also supported headless browsers such as PhantomJS.

Its language support is another reason it is popular among developers: it provides bindings for Python, Java, JavaScript, C#, and Ruby.


Why Do You Need Proxies for Selenium?

Proxies are not a must; whether you need them depends on the project. As stated earlier, Selenium is used for automated testing and web scraping. For automated testing, you do not need proxies unless you are testing localization.

Say, for instance, you are developing sites for different regions and want to check that each region is served its own language. Routing the browser through a proxy located in a region lets you see the site as a visitor there would. Aside from localization, there is little reason to use proxies for automated testing.

In web scraping, proxies are likewise required when you need localized web content. They are also required when you will be sending many requests to a website in a short period of time, since a site may block an IP address that sends too many.
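To make the localization case concrete, here is a minimal sketch of how a test could pick a proxy per region and check the language a page declares. The region codes, the proxy addresses (documentation-range placeholders), and the helper names are all assumptions made for illustration; nothing here comes from a particular provider.

```python
# Hypothetical region -> proxy mapping. The addresses are placeholders from
# the documentation range (192.0.2.0/24), not real proxies.
REGION_PROXIES = {
    "de": "192.0.2.10:8080",
    "fr": "192.0.2.20:8080",
    "jp": "192.0.2.30:8080",
}

# Primary language each region's page is expected to declare, for example
# in the lang attribute of its <html> element.
EXPECTED_LANG = {"de": "de", "fr": "fr", "jp": "ja"}


def chrome_proxy_flag(region):
    """Return the Chrome flag that routes traffic through the region's proxy."""
    return "--proxy-server=http://%s" % REGION_PROXIES[region]


def language_matches(region, html_lang):
    """Compare a declared language such as 'de-DE' with the expected one."""
    primary = html_lang.strip().lower().split("-")[0]
    return primary == EXPECTED_LANG[region]


if __name__ == "__main__":
    for region in REGION_PROXIES:
        print(region, chrome_proxy_flag(region))
```

In a real test you would pass the flag to the browser options before starting the driver, load the page, read the declared language from it, and hand that value to language_matches.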


Where to Find the Best Selenium Proxies

There is really no such thing as the best proxies for Selenium, because Selenium itself does not require proxies. The site you intend to use Selenium on determines the proxies you should use. Because of this, we recommend proxies from both the datacenter and residential categories.


Residential Proxies for Selenium

Residential proxies are the proxy of choice for Selenium WebDriver because, unlike datacenter proxies, they are not easily detected. They route clients' requests through residential IP addresses, which websites trust more than datacenter IPs. Residential proxies are good for accessing heavily protected sites such as Instagram, Google, and YouTube. Some residential proxy providers for Selenium are discussed below.


Luminati

  • IP Pool Size: Over 40 million
  • Locations: All countries in the world
  • Concurrency Allowed: Unlimited
  • Bandwidth Allowed: Starts at 40GB
  • Cost: Starts at $500 monthly for 40GB

Luminati is arguably the best residential proxy provider on the market. It is the largest proxy network in the world, with over 40 million residential IP addresses in its pool. Two things make Luminati residential proxies a good fit for Selenium. The most important is that Luminati has proxies in every country and in most cities around the world.

This means you can target specific locations when using its proxies, which is ideal for testing content localization with Selenium. The second is that Luminati offers high-rotating proxies, which assign you a different IP address after every web request; this makes you harder to block and, as such, suits web scraping.
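Rotation like this happens on the provider's side. If you only have a fixed list of proxies, a rough client-side approximation is to cycle through the list yourself. The sketch below is not how Luminati does it; the pool addresses are placeholders and the helper name is made up for illustration.

```python
from itertools import cycle

# Hypothetical pool of host:port proxies (documentation-range placeholders).
PROXY_POOL = ["192.0.2.10:8080", "192.0.2.11:8080", "192.0.2.12:8080"]


def proxy_rotation(pool):
    """Return an iterator that hands out the pool's proxies round-robin, forever."""
    if not pool:
        raise ValueError("proxy pool is empty")
    return cycle(pool)


rotation = proxy_rotation(PROXY_POOL)
# The fourth pick wraps around to the first proxy again.
first_four = [next(rotation) for _ in range(4)]
print(first_four)
```

Keep in mind that a browser started with a proxy flag keeps that proxy for its whole session, so rotating on the client side means starting a fresh driver for each proxy you take from the iterator.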


Smartproxy

  • IP Pool Size: Over 10 million
  • Locations: 195 locations across the globe
  • Concurrency Allowed: Unlimited
  • Bandwidth Allowed: Starts at 5GB
  • Cost: Starts at $75 monthly for 5GB

Smartproxy is another residential proxy service with premium proxies suited to accessing websites with smart anti-spam systems and to content localization testing with Selenium. Like Luminati, Smartproxy has good location coverage, with proxies in about 195 locations and in 8 major cities around the world. It offers high-rotating proxies as well. Smartproxy is the provider of choice for those who want premium proxies on a small budget: $75 buys 5GB, whereas Luminati's entry plan listed above costs several times as much.


Stormproxies

  • IP Pool Size: 40,000
  • Locations: the US and EU region only
  • Concurrency Allowed: only one device per port
  • Cost: Starts at $50 monthly for 10 ports

Luminati and Smartproxy have one problem in common: their proxies are metered. Once you have consumed the bandwidth allocated to you, you cannot use the proxies again until you pay for more. Stormproxies residential proxies come with unlimited bandwidth. However, for the sake of performance, the number of threads you can create is limited. Stormproxies residential proxies work well for web scraping and can be used with Selenium to access a good number of sites.


Datacenter Proxies for Selenium

Datacenter proxies are the cheapest proxies on the market. They use IP addresses owned by data centers, which makes them easy to detect and ban. Some, however, have proven able to evade detection and bans. A few of these are discussed below.


Myprivateproxy

  • Locations: US and EU region only
  • Concurrency Allowed: Up to 100 threads
  • Bandwidth Allowed: Unlimited
  • Cost: $1.49 per proxy for a month

MyPrivateProxy is arguably the best datacenter proxy provider on the market. Its proxies are some of the fastest, and they are also secure and reliable. With MyPrivateProxy datacenter proxies, you can use Selenium to scrape non-localized web content. Because MyPrivateProxy supports only a few locations, it is not a good provider for automated localization testing, but it works well for web scraping. Some of MyPrivateProxy's data centers are powered by green energy sources. Its proxies are quite cheap.


Highproxies

  • Locations: 10 countries
  • Concurrency Allowed: Unlimited
  • Bandwidth Allowed: Unlimited
  • Cost: $1.40 per proxy for a month

Highproxies datacenter proxies can be a good choice for both web scraping and automating localization testing. This is because unlike MyPrivateProxy, Highproxies has proxies in a good number of countries, including the United States, Canada, Italy, Israel, Spain, Germany, France, the Netherlands, Japan, and Australia. Highproxies perform well in terms of speed, reliability, and security. Highproxies datacenter proxies do not easily get blocked by websites as they are not easily detected.

Highproxies, just like MyPrivateProxy, can be used on some complex websites such as Facebook and Twitter without any problem. They are, however, more expensive than other datacenter proxies on the list.


InstantProxies

  • Locations: Worldwide
  • Concurrency Allowed: Unlimited
  • Bandwidth Allowed: Unlimited
  • Cost: $1.00 per proxy for a month

As stated above, MyPrivateProxy datacenter proxies are cheap. InstantProxies is cheaper still: $10 gets you 10 proxies. InstantProxies supports a good number of locations but does not let you choose the location yourself. Before selling you proxies, InstantProxies tests them to make sure they work, so you do not waste your time. As with MyPrivateProxy, its proxies are best for web scraping rather than Selenium automated testing.


How to Setup Proxies on Selenium

One of the problems developers run into is how to set up proxies on Selenium. Because Selenium supports a variety of browsers and programming languages, the answer varies with the setup.


Selenium proxy setting for Chrome browser

In this section, we will look at how to set up Selenium to work with proxies while driving the popular Chrome browser from Python. The code below sets a proxy for Chrome.

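from selenium import webdriver

# Proxy address in host:port form
PROXY = "21.65.32.65:3124"

# ChromeOptions lives on the webdriver module
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=%s' % PROXY)

# Recent Selenium releases take the options object via the `options` keyword
chrome = webdriver.Chrome(options=chrome_options)
chrome.get("https://whatismyipaddress.com")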

Looking at the last line of the code, you can see that it opens the WhatIsMyIPAddress website, so you can confirm that Chrome is using your preferred proxy. You can also add further options:

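from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

# Placeholder values; replace them with your own proxy and user agent
proxy = "21.65.32.65:3124"
ua = "Mozilla/5.0 (X11; Linux x86_64)"

ops = Options()
# Optional flags, typically useful on a headless Linux server
# ops.add_argument('--headless')
# ops.add_argument('--no-sandbox')
# ops.add_argument('--disable-dev-shm-usage')
# ops.add_argument('--disable-gpu')
print('--proxy-server=http://%s' % proxy)
ops.add_argument('--user-agent=%s' % ua)
ops.add_argument('--proxy-server=http://%s' % proxy)
# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(r"/root/chromedriver"), options=ops)
driver.delete_all_cookies()
driver.maximize_window()

driver.get("https://whatismyipaddress.com")
print(driver.page_source)
driver.quit()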
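Rather than reading the printed page source by eye, the check can be scripted. The sketch below assumes two things that may not hold for you: that the page prints the visitor's IPv4 address as plain text, and that the exit IP is the same as the proxy host (gateway and rotating proxies usually exit through a different address). Both helper names are made up for illustration.

```python
import re


def proxy_host(proxy):
    """Strip an optional scheme and the port from 'scheme://host:port'."""
    without_scheme = proxy.split("://", 1)[-1]
    return without_scheme.rsplit(":", 1)[0]


def page_shows_ip(page_source, ip):
    """Return True if the IPv4 address appears in the page as a whole token.

    The lookarounds stop '21.65.32.65' from matching inside a longer
    address such as '121.65.32.65' or '21.65.32.651'.
    """
    pattern = r"(?<![\d.])" + re.escape(ip) + r"(?!\d)"
    return re.search(pattern, page_source) is not None


if __name__ == "__main__":
    sample_page = "<p>Your IP address is 21.65.32.65.</p>"
    print(page_shows_ip(sample_page, proxy_host("21.65.32.65:3124")))
```

With a live driver you would call page_shows_ip(driver.page_source, proxy_host(proxy)) before quitting the browser.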

The key lines for the proxy are:

opt.add_argument("--proxy-server=http://ip:port")
browser = webdriver.Chrome(options=opt)
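Since the whole proxy setup for Chrome comes down to command-line flags, it can help to build them in one place. The helper below is a sketch with a made-up name; it relies on Chrome accepting a scheme in --proxy-server and a semicolon-separated host list in --proxy-bypass-list. The addresses are placeholders.

```python
def chrome_proxy_args(host, port, scheme="http", bypass=()):
    """Build the Chrome flags for a proxy, plus an optional bypass list."""
    if scheme not in ("http", "https", "socks4", "socks5"):
        raise ValueError("unsupported proxy scheme: %r" % scheme)
    args = ["--proxy-server=%s://%s:%d" % (scheme, host, int(port))]
    if bypass:
        # Hosts in this list are contacted directly, without the proxy.
        args.append("--proxy-bypass-list=%s" % ";".join(bypass))
    return args


if __name__ == "__main__":
    for flag in chrome_proxy_args("192.0.2.10", 1080, scheme="socks5",
                                  bypass=["localhost", "*.example.com"]):
        print(flag)
```

Each returned string can be passed to the options object with add_argument, exactly as in the snippets above.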

Selenium proxy setting for Firefox

The code below sets a proxy for Firefox. Here, too, you can add options:

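from selenium import webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.service import Service

# Placeholder proxy in host:port form; replace it with your own
my_proxy = "21.65.32.65:3124"

proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': my_proxy,
    'sslProxy': my_proxy,  # needed so that https:// pages also use the proxy
    'noProxy': ''
})

# Selenium 4 takes the proxy through the options and the driver path through a Service
ops = Options()
ops.proxy = proxy
driver = webdriver.Firefox(service=Service(r"/root/geckodriver"), options=ops)
driver.delete_all_cookies()
driver.maximize_window()

driver.get("https://whatismyipaddress.com")
print(driver.page_source)
driver.quit()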
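Another route some people take with Firefox is to set the browser's own proxy preferences instead of passing a Proxy object. The sketch below only builds the preference dictionary; the function name and the address are assumptions, while the network.proxy.* keys are Firefox's standard manual-proxy preferences.

```python
def firefox_proxy_prefs(host, port):
    """Return Firefox preferences for a manual HTTP and HTTPS proxy."""
    port = int(port)
    return {
        "network.proxy.type": 1,  # 1 means manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": port,
        "network.proxy.ssl": host,
        "network.proxy.ssl_port": port,
    }


if __name__ == "__main__":
    for name, value in firefox_proxy_prefs("192.0.2.10", "8080").items():
        print(name, value)
```

To apply them, you would call set_preference(name, value) on the Firefox options object for each entry before creating the driver.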

Selenium private proxy setting

If the proxy requires authentication with a username and password, you can supply the credentials through a small Chrome extension:

from selenium import webdriver
def create_proxyauth_extension(proxy_host, proxy_port,
                               proxy_username, proxy_password,
                               scheme='http', plugin_path=None):
    """Proxy Auth Extension
 
    args:
        proxy_host (str): domain or ip address, ie proxy.domain.com
        proxy_port (int): port
        proxy_username (str): auth username
        proxy_password (str): auth password
    kwargs:
        scheme (str): proxy scheme, default http
        plugin_path (str): absolute path of the extension      
 
    return str -> plugin_path
    """
    import string
    import zipfile
 
    if plugin_path is None:
        plugin_path = 'd:/webdriver/vimm_chrome_proxyauth_plugin.zip'
 
    manifest_json = """
    {
        "version": "1.0.0",
        "manifest_version": 2,
        "name": "Chrome Proxy",
        "permissions": [
            "proxy",
            "tabs",
            "unlimitedStorage",
            "storage",
            "<all_urls>",
            "webRequest",
            "webRequestBlocking"
        ],
        "background": {
            "scripts": ["background.js"]
        },
        "minimum_chrome_version":"22.0.0"
    }
    """
 
    background_js = string.Template(
    """
    var config = {
            mode: "fixed_servers",
            rules: {
              singleProxy: {
                scheme: "${scheme}",
                host: "${host}",
                port: parseInt(${port})
              },
              bypassList: ["foobar.com"]
            }
          };
 
    chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
 
    function callbackFn(details) {
        return {
            authCredentials: {
                username: "${username}",
                password: "${password}"
            }
        };
    }
 
    chrome.webRequest.onAuthRequired.addListener(
                callbackFn,
                {urls: ["<all_urls>"]},
                ['blocking']
    );
    """
    ).substitute(
        host=proxy_host,
        port=proxy_port,
        username=proxy_username,
        password=proxy_password,
        scheme=scheme,
    )
    with zipfile.ZipFile(plugin_path, 'w') as zp:
        zp.writestr("manifest.json", manifest_json)
        zp.writestr("background.js", background_js)
 
    return plugin_path
 
proxyauth_plugin_path = create_proxyauth_extension(
    proxy_host="proxy.crawlera.com",
    proxy_port=8010,
    proxy_username="77409f72fe0c4a3e8413654411de0380",
    proxy_password=""
)
 
 
co = webdriver.ChromeOptions()
co.add_argument("--start-maximized")
co.add_extension(proxyauth_plugin_path)
 
 
driver = webdriver.Chrome(options=co)
driver.get("http://httpbin.org/get")

The example above uses Crawlera as the sample proxy service.
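Providers often hand out private proxies as a single URL of the form scheme://user:password@host:port. As a convenience, that URL can be split into the keyword arguments create_proxyauth_extension expects. The helper name and the example URL below are assumptions made for illustration.

```python
from urllib.parse import unquote, urlsplit


def split_proxy_url(proxy_url):
    """Split 'scheme://user:password@host:port' into keyword arguments."""
    parts = urlsplit(proxy_url)
    if not parts.hostname or parts.port is None:
        raise ValueError("expected scheme://user:password@host:port")
    return {
        "proxy_host": parts.hostname,
        "proxy_port": parts.port,
        # Credentials may be percent-encoded inside a URL, so decode them.
        "proxy_username": unquote(parts.username or ""),
        "proxy_password": unquote(parts.password or ""),
        "scheme": parts.scheme or "http",
    }


if __name__ == "__main__":
    print(split_proxy_url("http://user:p%40ss@proxy.example.com:8010"))
```

The result can then be unpacked straight into the function, as in create_proxyauth_extension(**split_proxy_url(url)).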


Conclusion

Selenium is one of the tools available for automated testing and for scraping JavaScript-heavy websites. Depending on what you need Selenium for, you might have to use proxies. The proxies discussed above are some of the best options available to you.
