Introduction
Based on my experience, I have created multiple utilities to simplify Selenium for bot developers. First, let me describe these utilities to you, and then I will explain how to use them effectively.
Methods
1) get_by_current_page_referrer(link, wait=None):
The get_by_current_page_referrer(link, wait=None) utility is an alternative to driver.get. When you use driver.get, the document.referrer property remains empty, as if you had typed the page URL directly into the address bar, which can raise suspicion in bot detection systems.
With get_by_current_page_referrer(link, wait=None), the visit appears to have come from clicking a link on the current page, which looks more natural and is less likely to be flagged.
In general, when navigating to an internal page of a website, replace driver.get with get_by_current_page_referrer. The optional wait parameter sets how long to wait before navigating.
driver.get_by_current_page_referrer("https://example.com")
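If you are curious how a helper like this can work, here is a minimal sketch built on plain Selenium's execute_script. It is my assumption about the technique, not necessarily the exact Bose implementation: the name navigate_with_referrer is hypothetical, and the wait value is assumed to be in seconds. It relies on the fact that a navigation started from page script normally sends the current page as the referrer.

```python
import time


def navigate_with_referrer(driver, link, wait=None):
    """Navigate by assigning window.location from page script.

    `driver` is any object exposing Selenium's execute_script(script, *args).
    `wait` is an optional delay before navigating (assumed to be seconds).
    """
    if wait is not None:
        time.sleep(wait)
    # arguments[0] is how Selenium passes Python values into the script,
    # so the URL never has to be spliced into the JavaScript string.
    driver.execute_script("window.location.href = arguments[0];", link)
```

Note that the site's referrer policy can still shorten or hide the referrer, so this is a best-effort technique.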
2) js_click(element):
When you click elements with Selenium, the click can be intercepted by pop-ups, alerts, or other elements, which raises an ElementClickInterceptedException.
To handle such situations, use the js_click method. It clicks the element using JavaScript, so the click goes through even when something else is covering the element.
button = driver.get_element_or_none_by_selector(".button")
driver.js_click(button)
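For reference, a JavaScript click can be sketched in plain Selenium as below. This is an illustration of the general technique under my own assumptions, not a copy of the Bose source, and the standalone function form is only for demonstration.

```python
def js_click(driver, element):
    """Click `element` through JavaScript instead of a simulated mouse click."""
    # Calling click() on the DOM node dispatches the event directly on it,
    # so Selenium's check for an overlapping element is never involved.
    driver.execute_script("arguments[0].click();", element)
```

Keep in mind that a JavaScript click also succeeds on elements a real user could not reach, so it is best used as a fallback rather than the default.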
3) get_cookies_and_local_storage_dict():
This method returns a dictionary with two keys, "cookies" and "local_storage", each holding a dictionary of the current site's cookies or local storage items. You can persist a session by storing this dictionary in a JSON file.
site_data = driver.get_cookies_and_local_storage_dict()
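Storing the dictionary in a JSON file needs only the standard library. The sketch below is one way to do it; save_site_data and load_site_data are hypothetical helper names of mine, not part of Bose.

```python
import json
from pathlib import Path


def save_site_data(site_data, path):
    """Write the cookies/local-storage dictionary to a JSON file."""
    Path(path).write_text(json.dumps(site_data, indent=2), encoding="utf-8")


def load_site_data(path):
    """Read the dictionary back, or return None if nothing was saved yet."""
    file = Path(path)
    if not file.exists():
        return None
    return json.loads(file.read_text(encoding="utf-8"))
```

You could call save_site_data(driver.get_cookies_and_local_storage_dict(), "session.json") at the end of a run and load_site_data("session.json") at the start of the next one.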
4) add_cookies_and_local_storage_dict(site_data):
This method adds both cookies and local storage data to the current website.
site_data = {
    "cookies": {"cookie1": "value1", "cookie2": "value2"},
    "local_storage": {"name": "John", "age": 30}
}
driver.add_cookies_and_local_storage_dict(site_data)
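To give an idea of what happens underneath, here is a rough equivalent written against plain Selenium. It is a sketch under my own assumptions (the function name add_site_data is hypothetical, and Bose may do this differently). Selenium only accepts cookies for the domain that is currently loaded, so the target site must be open first.

```python
def add_site_data(driver, site_data):
    """Apply cookies and local storage items to the page currently loaded."""
    for name, value in site_data.get("cookies", {}).items():
        # Selenium's add_cookie expects a dict with at least name and value.
        driver.add_cookie({"name": name, "value": str(value)})
    for key, value in site_data.get("local_storage", {}).items():
        # localStorage only stores strings, so 30 is stored as "30".
        driver.execute_script(
            "window.localStorage.setItem(arguments[0], arguments[1]);",
            key,
            str(value),
        )
```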
5) organic_get(link, wait=None):
This method follows a two-step process: it first loads the Google homepage and then navigates to the specified link. This resembles the way people typically reach websites, so the behavior looks more human and the bot is less likely to be detected.
driver.organic_get("https://example.com")
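The two steps can be sketched with plain Selenium as follows. This is only my illustration of the idea: the standalone function form, the placement of the wait between the two steps, and seconds as its unit are all assumptions.

```python
import time


def organic_get(driver, link, wait=None):
    """Open Google first, then move on to `link` from that page."""
    # Step 1: land on the Google homepage with a normal page load.
    driver.get("https://www.google.com/")
    if wait is not None:
        time.sleep(wait)  # assumed: pause on Google before moving on
    # Step 2: navigate from page script so Google is sent as the referrer.
    driver.execute_script("window.location.href = arguments[0];", link)
```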
6) local_storage
This property returns an instance of the LocalStorage class from the bose.drivers.local_storage module. This class is used for interacting with the browser's local storage in an easy-to-use manner.
local_storage = driver.local_storage
# Set an item in the Browser's Local Storage
local_storage.set_item('username', 'johndoe')
# Retrieve an item from the Browser's Local Storage
username = local_storage.get_item('username')
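A wrapper like this can be quite small. The class below is a simplified sketch of the idea and not the actual bose.drivers.local_storage code; the name LocalStorageSketch is hypothetical and it covers only the two methods shown above.

```python
class LocalStorageSketch:
    """Thin wrapper that forwards calls to window.localStorage."""

    def __init__(self, driver):
        self.driver = driver

    def set_item(self, key, value):
        self.driver.execute_script(
            "window.localStorage.setItem(arguments[0], arguments[1]);", key, value
        )

    def get_item(self, key):
        # The script must `return` the value for Selenium to hand it back.
        return self.driver.execute_script(
            "return window.localStorage.getItem(arguments[0]);", key
        )
```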
7) save_screenshot(filename=None)
Use this method to save a screenshot of the current web page to a file in the tasks/ directory. The filename of the screenshot is generated based on the current date and time, unless a custom filename is provided.
driver.save_screenshot()
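To illustrate the naming rule, here is a small sketch of how a timestamp-based path could be built. The helper name screenshot_path, the exact timestamp format, and the .png extension are my assumptions, so the real filenames may look different.

```python
from datetime import datetime
from pathlib import Path


def screenshot_path(filename=None, directory="tasks", now=None):
    """Return the path a screenshot would be saved to."""
    if filename is None:
        now = now or datetime.now()
        # Assumed format, e.g. 2024-01-31_14-05-09.png
        filename = now.strftime("%Y-%m-%d_%H-%M-%S") + ".png"
    return Path(directory) / filename
```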
8) short_random_sleep() and long_random_sleep():
These methods sleep for a random amount of time, either between 2 and 4 seconds (short) or between 6 and 9 seconds (long). You can use them like this:
driver.short_random_sleep()
driver.long_random_sleep()
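If you want the same behavior outside the driver, it can be approximated with the standard library. This is a sketch and not the Bose source: the uniform distribution and the injectable sleep parameter (handy for testing) are my assumptions.

```python
import random
import time


def random_sleep(low, high, sleep=time.sleep):
    """Sleep for a random duration between low and high seconds."""
    duration = random.uniform(low, high)
    sleep(duration)
    return duration


def short_random_sleep(sleep=time.sleep):
    return random_sleep(2, 4, sleep)


def long_random_sleep(sleep=time.sleep):
    return random_sleep(6, 9, sleep)
```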
9) get_element_or_none(xpath, wait=None), get_element_or_none_by_selector(selector, wait=None), get_element_by_id(id, wait=None), get_element_or_none_by_text_contains(text, wait=None), get_element_or_none_by_text(text, wait=None), get_element_or_none_by_name(selector, wait=None):
These methods find web elements on the page based on different criteria. They return the web element if it exists, or None if it doesn't. You can also pass the number of seconds to wait for the element to appear. You can use them like this:
# find an element by xpath
element = driver.get_element_or_none("//div[@class='example']", 4)
# find an element by CSS selector
element = driver.get_element_or_none_by_selector(".example-class", 4)
# find an element by ID
element = driver.get_element_by_id("example-id", 4)
# find an element by text
element = driver.get_element_or_none_by_text("Example text", 4)
# find an element by partial text
element = driver.get_element_or_none_by_text_contains("Example", 4)
# find an element with attribute name = "email"
element = driver.get_element_or_none_by_name("email", 4)
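Since these methods return None instead of raising, the calling code should check the result before using it. A small sketch of that pattern follows; click_if_present is a hypothetical helper of mine, built only on get_element_or_none_by_selector as described above.

```python
def click_if_present(driver, selector, wait=None):
    """Click the element matching `selector`; report whether it was found."""
    element = driver.get_element_or_none_by_selector(selector, wait)
    if element is None:
        # Nothing matched within the wait time, so there is nothing to click.
        return False
    element.click()
    return True
```

This is convenient for optional elements such as cookie banners, which may or may not appear on a given visit.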
10) is_in_page(target, wait=None, raise_exception=False)
This method checks whether the browser is on the specified page. If wait is provided, it keeps checking the URL for that long, and it returns False if the target is not found. If raise_exception is True, it raises an exception instead when the page is not found.
is_in_page = driver.is_in_page('example.com', wait=10, raise_exception=True)
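The polling behavior described above could look roughly like this in plain Python. Treat it as a sketch under assumptions: substring matching against driver.current_url, the poll interval parameter, seconds as the unit of wait, and RuntimeError as the exception type are all my choices, not confirmed details of Bose.

```python
import time


def is_in_page(driver, target, wait=None, raise_exception=False, poll=0.2):
    """Return True once `target` appears in driver.current_url."""
    deadline = time.monotonic() + (wait or 0)
    while True:
        if target in driver.current_url:
            return True
        if time.monotonic() >= deadline:
            break  # waited long enough, give up
        time.sleep(poll)
    if raise_exception:
        raise RuntimeError(f"Page {target!r} not found in {driver.current_url!r}")
    return False
```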
How to use these utilities?
The easiest way to use these utilities is through the Bose Framework, the first framework designed to simplify bot development for developers.
Bose Framework automatically incorporates these utility methods into the Selenium Driver. You can learn how to use the Bose Framework by following the tutorial at https://www.omkar.cloud/bose/docs/tutorial/.