DEV Community

Creating an API that runs Selenium via AWS Lambda

Jaira Encio on June 11, 2021

Being an automation tester, my job is to automate everything. As I was running my test script via terminal I realised that I'm the only one who can exe...

prenit-wankhede

Thanks a ton, brother, for the simple yet elegant walkthrough.
I have been trying so many tutorials and ways to get it to work, but had no luck.

With the selenium, chromedriver and headless-chrome versions mentioned in the post, I finally got it working. Thanks a bunch!

Awolad Hossain

@jairaencio It's working great. But I can't use the selenium-stealth plugin. Getting an error. Message: unknown error: Chrome failed to start: exited abnormally\n (Driver info: .....

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium_stealth import stealth

def lambda_handler(event, context):
    options = Options()
    options.binary_location = '/opt/headless-chromium'    
    options.add_argument("start-maximized")
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)

    driver = webdriver.Chrome('/opt/chromedriver',chrome_options=options)

    stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        )

    driver.get('https://quizlet.com/446134722/it-management-flash-cards/')

    driver.close()
    driver.quit()

    response = {
        "statusCode": 200,
        "body": "Selenium Headless Chrome Initialized"
    }

    return response

Jaira Encio

Hi @awolad, I think you need to include the stealth library package in your Lambda layer. Notice that in my tutorial I have 2 different Lambda layers, one for my selenium package and one for my chromedriver package. You can create another Lambda layer or simply include it in the existing 2 layers.
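
One hedged way to confirm that a layer really exposes a package to the function is to ask Python's import machinery before importing it. The sketch below assumes the top-level module names `selenium` and `selenium_stealth` used in the handler above; the helper name `missing_modules` is hypothetical and not part of the original post.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that Python cannot locate on sys.path."""
    # find_spec returns None when a top-level module cannot be found.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the handler above; adjust them to your own layers.
    print(missing_modules(["selenium", "selenium_stealth"]))
```

Calling this at the top of the handler and logging the result would show whether the problem is a missing package or, as in this thread, something that happens after the import succeeds.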

Awolad Hossain

@jairaencio Yes, I've added the stealth library package in the selenium lambda layer. There is no import error.

Jaira Encio

Great! Always happy to help :)

Awolad Hossain

@jairaencio Sorry, it's not solved yet. I mean the error is not related to importing the stealth package, because the package is already in my Lambda layer. The driver fails to load when the stealth package is used.

Jaira Encio • Edited

Does the error only occur when you add the selenium-stealth library? Upon checking, I noticed that others are experiencing issues on their local machines just by using stealth. You could try adding options.add_argument("--disable-blink-features=AutomationControlled"), then check whether it works both locally and on Lambda.
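
As a rough sketch of that suggestion, the switches can be collected in one place so the same list is tried locally and on Lambda. The helper name `build_chrome_args` is hypothetical; the first four switches are the ones from the post's configuration as quoted later in this thread, and the last one is the switch suggested in the reply above. Whether it resolves the stealth error is not confirmed here.

```python
def build_chrome_args(hide_automation=True):
    """Return Chrome command-line switches for a headless Lambda run."""
    # Switches from the post's configuration, as quoted in this thread.
    args = [
        "--headless",
        "--no-sandbox",
        "--single-process",
        "--disable-dev-shm-usage",
    ]
    if hide_automation:
        # Switch suggested in the reply above; its effect is untested here.
        args.append("--disable-blink-features=AutomationControlled")
    return args


if __name__ == "__main__":
    for arg in build_chrome_args():
        print(arg)
```

Each entry would then be passed to `options.add_argument(arg)` before the driver is created.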

Awolad Hossain

Yes.

With the selenium-stealth default options, like the following:

options = Options()
options.binary_location = '/opt/headless-chromium'
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
# options.add_argument("--headless")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)

driver = webdriver.Chrome('/opt/chromedriver', chrome_options=options)

stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        )

I'm getting the error: Message: unknown error: Chrome failed to start: exited abnormally\n (Driver info: .....

Using this post's options, like the following:

options = Options()
options.binary_location = '/opt/headless-chromium'
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--single-process')
options.add_argument('--disable-dev-shm-usage')

driver = webdriver.Chrome('/opt/chromedriver', chrome_options=options)

stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        )

I'm getting error: "'WebDriver' object has no attribute 'execute_cdp_cmd'"

Jaira Encio

I found this article related to the "execute_cdp_cmd" error. Apparently they used pip install --pre selenium to be able to execute CDP commands: github.com/SeleniumHQ/selenium/iss...
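
Before calling stealth, a small capability check can turn the attribute error into a clearer message. This is only a sketch: `supports_cdp` is a hypothetical helper, and it checks nothing more than whether the driver object exposes a callable `execute_cdp_cmd`, which is the attribute named in the error above.

```python
def supports_cdp(driver):
    """Report whether `driver` exposes a callable execute_cdp_cmd method."""
    return callable(getattr(driver, "execute_cdp_cmd", None))


class FakeDriver:
    """Stand-in used only for this demonstration; not a Selenium class."""


if __name__ == "__main__":
    # A real handler would pass the webdriver.Chrome instance instead.
    if not supports_cdp(FakeDriver()):
        print("driver has no execute_cdp_cmd; check the installed selenium version")
```

Logging `selenium.__version__` next to this check would show which Selenium build the layer actually ships.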

Awolad Hossain

I also tried that, but it's not working. I forgot to mention that. It would be helpful for us if you could try the selenium-stealth package and update this post, because some websites can't be scraped without it. Thanks!

Da Shen

Tested and working. Good article. Note that the Python runtime has to be 3.6; it won't work otherwise.

Achim Grolimund • Edited

Hey @jairaencio
Hello everyone, thanks for the guide, but it seems to me that this method no longer works without a Docker image.

I keep getting the error:

{
  "errorMessage": "Message: Service /opt/chromedriver unexpectedly exited. Status code was: 127\n",
  "errorType": "WebDriverException",
  "requestId": "ef0d3b0d-ee3f-4822-ba11-9dd40920680b",
  "stackTrace": [
    "  File \"/var/task/app.py\", line 29, in main\n    driver = webdriver.Chrome(service=s, options=op)\n",
    "  File \"/opt/python/lib/python3.9/site-packages/selenium/webdriver/chrome/webdriver.py\", line 69, in __init__\n    super().__init__(DesiredCapabilities.CHROME['browserName'], \"goog\",\n",
    "  File \"/opt/python/lib/python3.9/site-packages/selenium/webdriver/chromium/webdriver.py\", line 89, in __init__\n    self.service.start()\n",
    "  File \"/opt/python/lib/python3.9/site-packages/selenium/webdriver/common/service.py\", line 98, in start\n    self.assert_process_still_running()\n",
    "  File \"/opt/python/lib/python3.9/site-packages/selenium/webdriver/common/service.py\", line 110, in assert_process_still_running\n    raise WebDriverException(\n"
  ]
}

Python 3.9
Selenium 4.5.0
chromedriver 106.0.5249.61
headless-chromium v1.0.0-57

This is my Makefile to create everything I need. At the end I upload everything inside the lambda folder (code + 2 layers).

BOLD := \033[1m
NORMAL := \033[0m
GREEN := \033[1;32m

.DEFAULT_GOAL := help
HELP_TARGET_DEPTH ?= \#
.PHONY: help
help: # Show how to get started & what targets are available
    @printf "This is a list of all the make targets that you can run, e.g. $(BOLD)make dagger$(NORMAL) - or $(BOLD)m dagger$(NORMAL)\n\n"
    @awk -F':+ |$(HELP_TARGET_DEPTH)' '/^[0-9a-zA-Z._%-]+:+.+$(HELP_TARGET_DEPTH).+$$/ { printf "$(GREEN)%-20s\033[0m %s\n", $$1, $$3 }' $(MAKEFILE_LIST) | sort
    @echo


install: clean # Download Selenium, chromedriver and headless-chromium, then zip the two layers
    python3.9 -m pip install -t python/lib/python3.9/site-packages selenium==4.5.0 --upgrade
    python3.9 -m pip install -t python/lib/python3.9/site-packages wget --upgrade
    curl -SL https://chromedriver.storage.googleapis.com/106.0.5249.61/chromedriver_linux64.zip > chromedriver.zip
    curl -SL https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-57/stable-headless-chromium-amazonlinux-2.zip > headless-chromium.zip
    mkdir -p lambda
    zip -9 -r lambda/python.zip python
    unzip chromedriver.zip
    unzip headless-chromium.zip
    zip -9 -r lambda/chromedriver.zip chromedriver headless-chromium
    rm -f chromedriver.zip headless-chromium.zip chromedriver headless-chromium

build: # Zip the function code (app.py) for upload to the Lambda function
    zip -9 -r lambda/app.zip app.py

clean:
    rm -rf lambda python

And here is an overview of my Python script:

import re
import wget

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait


def main(event, context):
    regex = r"[ \w-]+?(?=\.txt\.gz)"

    s = Service('/opt/chromedriver')
    op = webdriver.ChromeOptions()
    op.binary_location = '/opt/headless-chromium'
    op.add_argument('--headless')
    op.add_argument('--no-sandbox')
    op.add_argument('--disable-dev-shm-usage')
    op.add_argument('--disable-gpu')
    op.add_argument('--disable-dev-tools')
    op.add_argument("--disable-extensions")
    op.add_argument('--no-zygote')
    op.add_argument('--single-process')
    op.add_argument('--enable-logging')
    op.add_argument('--log-level=0')
    op.add_argument("--disable-notifications")
    op.add_argument('--v=99')
    driver = webdriver.Chrome(service=s, options=op)

    driver.get(
        "https://xxxxxx")

    (.....)

    driver.close()

Is there another way to run it directly in a lambda without a docker image?

Best Regards
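
Status code 127 is conventionally what a shell or the dynamic loader reports when a program, or a shared library it needs, cannot be found. One hedged way to see the underlying message is to start the binary directly and capture its output. The sketch below assumes the binary accepts `--version`; the helper name `probe_binary` is hypothetical, and the `/opt/...` paths mentioned afterwards are the ones used in this thread.

```python
import subprocess


def probe_binary(path, args=("--version",), timeout=10):
    """Run `path` with `args` and return (exit_code, combined_output).

    exit_code is None when the process could not be started or timed out.
    """
    try:
        result = subprocess.run(
            [path, *args],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # loader errors are written to stderr
            text=True,
            timeout=timeout,
        )
    except OSError as exc:
        # Covers a missing file, missing execute permission, or a wrong format.
        return None, str(exc)
    except subprocess.TimeoutExpired:
        return None, f"timed out after {timeout}s"
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    import sys

    # Locally this probes the Python interpreter; in Lambda you would pass
    # '/opt/chromedriver' or '/opt/headless-chromium' instead.
    print(probe_binary(sys.executable))
```

Logging `probe_binary('/opt/chromedriver')` from inside the handler would show the loader's own error text, if any, instead of only the wrapped WebDriverException.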

Jaira Encio

Great take on this, Achim 💪 I still haven't gotten back to this, but your method of using a Docker image is very useful for everyone as well 👍🏽

Rocco

I have the configuration proposed in the edited version of the article, but I get the same error. I built everything on Amazon Linux using the python3.9 command as in the example, but I keep getting the same error.

Did you find any solution?

Achim Grolimund

Yes, it needs a special tool inside the Docker image. I will post it here today as soon as I'm at my computer.

Nadia-Ou

Did you find any solution, please?

rajans163

Hi, were you able to resolve this issue? I am getting the same issue in my Lambda.
Please share the resolution, as I am completely stuck.

quibski

Surely many Devs and QA will benefit from this. Hopefully a demo can be made/shown

tchua

+1 to this, a demo would be great!

Jaira Encio

uhm hahaha

Christine John

I'm getting the following error -
selenium.common.exceptions.WebDriverException: Message: unknown error: cannot find Chrome binary
(Driver info: chromedriver=2.37.544315(730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7),platform=Linux 4.14.255-276-224.499.amzn2.x86_64 x86_64)

Could someone help me please? :(

awspipe

Hi, I am new to AWS Lambda, so apologies for any obvious questions...
I was able to create the 2 layers you mentioned, for Python and Chromium.
But I have no idea how I can run serverless.yaml...
You've mentioned "sls deploy", but I don't have Linux to run this.

Is there any other alternative to run this? Thanks

Jaira Encio

I'm not a Windows user, but here is what I found: codegrepper.com/code-examples/shel...

Christine John

Jaira Encio

Hi, this error would likely happen if the location of your chromedriver (in the Lambda layer) is incorrect.
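
Lambda extracts layers under /opt, so one hedged way to check the locations is to test each expected file before starting the driver. The helper name `check_executables` is hypothetical; the paths in the usage note are the ones used elsewhere in this thread and may differ in your layer.

```python
import os


def check_executables(paths):
    """Map each path to 'ok', 'missing', or 'not executable'."""
    report = {}
    for path in paths:
        if not os.path.isfile(path):
            report[path] = "missing"
        elif not os.access(path, os.X_OK):
            report[path] = "not executable"
        else:
            report[path] = "ok"
    return report


if __name__ == "__main__":
    import sys

    # Locally this checks the Python interpreter and a path that does not exist.
    print(check_executables([sys.executable, "/no/such/file"]))
```

In the handler, `check_executables(['/opt/chromedriver', '/opt/headless-chromium'])` could be logged first; a "missing" entry would point at the layer contents or at `binary_location` rather than at Selenium itself.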

sebifc

Hi! I tried to run it in a Lambda function, and the execution keeps running until it hits the 600-second timeout.
I downloaded the Chromium 93 driver and headless Chrome 93 as well.

I'm using the following code:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from webdriver_manager.chrome import ChromeDriverManager
from selenium_stealth import stealth

def lambda_handler(event, context):
    options = Options()
    options.binary_location = '/opt/headless-chromium'
    options.add_argument("start-maximized")
    #options.add_experimental_option("excludeSwitches", ["enable-automation"])
    #options.add_experimental_option('useAutomationExtension', False)
    #options.add_argument("--disable-blink-features=AutomationControlled")
    #options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    options.add_argument('--single-process')
    options.add_argument('--disable-dev-shm-usage')
    options.add_argument('--disable-notifications')
    options.add_argument("--enable-javascript")

    driver = webdriver.Chrome('/opt/chromedriver',chrome_options=options)

    stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        )

    driver.get('https://trends.google.com/trends/trendingsearches/daily?geo=US')

    driver.close()
    driver.quit()

    response = {
        "statusCode": 200,
        "body": "Selenium Headless Chrome Initialized"
    }

    return response

Jaira Encio

I'm guessing that the timeout issue is caused by the deployment package size. You would have to use container images to solve it.
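
Whatever the cause turns out to be, a hung page load can be made to fail earlier than the function timeout by deriving a budget from the Lambda context. This is a minimal sketch: `page_load_budget_seconds` and `FakeContext` are hypothetical names, and the 5-second reserve is an arbitrary assumption. `get_remaining_time_in_millis()` is the method the Lambda context object provides.

```python
def page_load_budget_seconds(context, reserve_seconds=5.0):
    """Seconds that can be spent loading a page before the invocation ends."""
    remaining = context.get_remaining_time_in_millis() / 1000.0
    # Keep a small reserve so the handler can still clean up and respond.
    return max(0.0, remaining - reserve_seconds)


class FakeContext:
    """Hypothetical stand-in for the Lambda context object, for local runs."""

    def __init__(self, millis):
        self._millis = millis

    def get_remaining_time_in_millis(self):
        return self._millis


if __name__ == "__main__":
    print(page_load_budget_seconds(FakeContext(60_000)))
```

In the handler, the result could be passed to `driver.set_page_load_timeout(...)` so that `driver.get` raises instead of running until Lambda stops the invocation.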

chrisjeriel

Great work! A well-thought-out article, straightforward and concise. Looking forward to more advanced implementations.

cesarcastmore • Edited

Hello!

I just have one question. When I was uploading my image to Lambda, I noticed that it required a lot of memory, and I think that could significantly increase costs. What are the differences between layers and Docker in Lambda?

I was following this documentation and managed to do it for version 3.7, but I was unsuccessful with version 3.9. Some of the comments below suggest using Docker, but I realized that using Docker requires increasing the Lambda's memory. Do you know of any alternatives to Docker that won't consume a lot of memory and increase costs?

Jaira Encio

As of now, there is nothing other than Docker that works for this. I don't recommend setting up an instance either. We can only hope that AWS improves its Lambda pricing and memory.

rajans163

@jairaencio

I tried the same steps for Python 3.8 (same chromedriver as you used for 3.9) but got the error below. Please help.

START RequestId: ce98c274-1b51-46a5-968e-cdabe1e08a2a Version: $LATEST
[ERROR] WebDriverException: Message: Service /opt/chromedriver unexpectedly exited. Status code was: 127

Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 13, in lambda_handler
    driver = webdriver.Chrome('/opt/chromedriver')
  File "/opt/python/selenium/webdriver/chrome/webdriver.py", line 69, in __init__
    super().__init__(DesiredCapabilities.CHROME['browserName'], "goog",
  File "/opt/python/selenium/webdriver/chromium/webdriver.py", line 89, in __init__
    self.service.start()
  File "/opt/python/selenium/webdriver/common/service.py", line 98, in start
    self.assert_process_still_running()
  File "/opt/python/selenium/webdriver/common/service.py", line 110, in assert_process_still_running
    raise WebDriverException(
END RequestId: ce98c274-1b51-46a5-968e-cdabe1e08a2a
REPORT RequestId: ce98c274-1b51-46a5-968e-cdabe1e08a2a Duration: 610.08 ms Billed Duration: 611 ms Memory Size: 128 MB Max Memory Used: 47 MB Init Duration: 253.40 ms

Jaira Encio

Hi, you might want to follow this comment, which uses a Docker image: dev.to/achimgrolimund/comment/22d99

shitalChinte

@jairaencio Great blog for helping QA engineers. Can you please advise how we can make it compatible with other Python versions? I tried with 3.9 and it fails with the error below.
"errorMessage": "Unable to import module 'lambda_function': No module named 'selenium'",

Thanks in advance.
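
A "No module named" error after switching runtimes can happen when the layer's packages sit in a directory tied to a different Python version. Lambda looks for Python layer content under `python/` and `python/lib/python3.x/site-packages` inside the layer zip, and the Makefile earlier in this thread installs into the versioned form. The sketch below computes that directory for a given runtime; the helper name `layer_site_packages` is hypothetical.

```python
import sys


def layer_site_packages(version_info=None):
    """Return the versioned directory a Python layer zip can use for packages."""
    # Default to the interpreter running this code when no version is given.
    major, minor = (version_info or sys.version_info)[:2]
    return f"python/lib/python{major}.{minor}/site-packages"


if __name__ == "__main__":
    # Example for the Python 3.9 runtime mentioned in this comment.
    print(layer_site_packages((3, 9)))
```

Comparing this path with the top-level folders in the uploaded layer zip, and checking that the layer lists the function's runtime as compatible, may narrow down why the import fails.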

Jaira Encio

Hi, I also just recently noticed the deprecation update on Lambda. I might have to change everything, starting from the drivers. I will try my best to update the post.

youngcto

Live Demo here: youtu.be/qIcVGDEjtt4?t=3482 (part of the June Meetup)

e

Naks! Great job! Screenshots will be helpful too. And a milk-tea will do. Thanks! Ahahahhahaha. :)

ranzey

Huge help! :)

Ivy ☕

Great help! Will definitely need more articles like this in the future.

rolinj

Kudos Jai! Great tutorial indeed!
For reference, could you also add some screenshots of the created CloudFormation stack and S3 bucket to the output section? Thanks :D

Jaira Encio

Uploaded screenshots of cloudformation and s3 bucket. Thanks for the feedback :)

jltuts

Good job jai! Very helpful!

LunchCodes

Exactly what I needed! Thank you!

Raphael Jambalos

The article is very helpful! It brings automation to the next level. By running automated tests in a more automated way, developers will be empowered to make sure their code runs optimally.

Maynard Prepotente • Edited

Great Job! Will definitely help a lot of Devs and QA! Adding screenshots of the output will make the job easier tho =)

Hyde Wyvern

I'm really new to Lambda. Does anyone know how I could adapt this process to run with Node 16.x instead of Python?

Jaira Encio

You can select Node under compatible runtimes; however, you would need the Node.js language bindings installed in your layer directory. My selenium layer was compatible with Python, so you would have to find an equivalent one compatible with Node.

silencer017

Well done, this is very informative!

Badjo Badiola

This is AWSome! thank you!