Taking screenshots using Selenium can be an invaluable tool for debugging tests, capturing results, and documenting automated browser interactions. In this comprehensive guide, we'll explore the various methods available in Selenium to take screenshots with Python.
Overview of Taking Screenshots with Selenium
The Selenium WebDriver API provides several options for capturing screenshots:
- save_screenshot(filename) – Saves a screenshot of the current window to a PNG file and returns True on success
- get_screenshot_as_file(filename) – Does the same thing; save_screenshot() simply calls it
- get_screenshot_as_png() – Returns the screenshot of the current window as PNG-encoded bytes
- get_screenshot_as_base64() – Returns the screenshot of the current window as a base64-encoded string
These methods let us save a screenshot straight to a file or keep it in memory as bytes or as a base64 string. We can also take screenshots of specific elements on a page by first finding the element and then calling screenshot() on the WebElement.
Let's explore some examples of using these methods for taking full page and element screenshots.
Taking Full Page Screenshots
Taking a screenshot of the page as it appears in the browser window is straightforward with the save_screenshot() method.
from selenium import webdriver

driver = webdriver.Chrome()
# Navigate to page
driver.get("http://www.python.org")
# Save screenshot to file
driver.save_screenshot('python_home.png')
This saves a screenshot of the current window to the specified file path. The filename argument is required and can be a relative or an absolute path. It should end in .png, since Selenium always writes PNG data.
# Save screenshot to an absolute path
driver.save_screenshot('/tmp/home.png')
get_screenshot_as_file() behaves the same way: it takes a file path, writes the PNG there, and returns True on success or False if the file could not be written.
# Save screenshot and check the result
saved = driver.get_screenshot_as_file('python_home.png')
To keep the screenshot in memory instead of writing it to disk, we can use get_screenshot_as_png(), which returns the PNG data as a bytes object:
# Get screenshot as PNG bytes
screenshot_png = driver.get_screenshot_as_png()
To get a base64-encoded string representation of the screenshot, we can use get_screenshot_as_base64():
# Get screenshot as base64
screenshot_b64 = driver.get_screenshot_as_base64()
get_screenshot_as_png() and get_screenshot_as_base64() return the screenshot as in-memory data, which we can then save to a file manually if required.
# Save screenshot to file from memory
with open("screenshot.png", "wb") as fd:
    fd.write(driver.get_screenshot_as_png())
This allows more flexibility in processing the screenshot before saving.
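As a minimal sketch of that flexibility, the helper below decodes the string returned by get_screenshot_as_base64() and checks the PNG signature before writing it to disk. The name save_base64_png and the validation step are illustrative assumptions, not part of Selenium.

```python
import base64
from pathlib import Path

# First eight bytes of every valid PNG file
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def save_base64_png(b64_data: str, path: str) -> int:
    """Decode a base64 screenshot and write it to `path`.

    Hypothetical helper: returns the number of bytes written and raises
    ValueError if the decoded data does not look like a PNG.
    """
    raw = base64.b64decode(b64_data)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG image")
    return Path(path).write_bytes(raw)
```

With a live driver this could be called as save_base64_png(driver.get_screenshot_as_base64(), "screenshot.png").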
Taking Element Screenshots
In addition to full-page screenshots, we can also take screenshots of specific elements on the page. To do this, we first find the target element using any of the element location strategies, such as a CSS selector or XPath. We then call screenshot() on the WebElement to capture only that element.
For example:
from selenium.webdriver.common.by import By

# Find element
element = driver.find_element(By.CSS_SELECTOR, "#logo")
# Take screenshot of element; returns True if the file was written
saved = element.screenshot("element.png")
This will save the screenshot of just the element matched by the CSS selector #logo. WebElements also expose the screenshot_as_png and screenshot_as_base64 properties, which return the element screenshot as in-memory data.
# Get element screenshot in memory
element_png = element.screenshot_as_png
Saving element screenshots allows us to capture parts of a page instead of the entire viewport. This is useful for capturing specific components or sections of a page.
Setting Screenshot File Type
Selenium always captures screenshots as PNG, and WebDriver has no option or capability for choosing another format. To get a JPEG or another format, we capture the PNG in memory and convert it with an image library such as Pillow (a third-party package, installed with pip install Pillow):
import io
from PIL import Image

png_bytes = driver.get_screenshot_as_png()
image = Image.open(io.BytesIO(png_bytes))
# JPEG has no alpha channel, so convert to RGB first
image.convert("RGB").save("screenshot.jpg", "JPEG")
This saves the screenshot as a JPEG file instead of PNG. Formats Pillow can write include:
- PNG (what Selenium produces)
- JPEG
- BMP
JPEG quality can be controlled through the quality argument of Pillow's save():
image.convert("RGB").save("screenshot.jpg", "JPEG", quality=90)
Managing Screenshot File Name
Selenium does not generate screenshot filenames on its own: every call to save_screenshot() needs an explicit path, and calling it again with the same path overwrites the file. To number screenshots, we build the filename ourselves from a prefix and a counter:
prefix = "my_screenshot_"
for index in range(1, 3):
    driver.save_screenshot(f"{prefix}{index}.png")
Now screenshots will be saved as my_screenshot_1.png, my_screenshot_2.png etc. This can help identify and manage screenshots, especially when taking multiple screenshots in a test run.
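File names can also be built in plain Python and passed to save_screenshot(). The sketch below assumes a hypothetical helper, screenshot_filename, that combines a test name, a step number and a timestamp. The naming scheme is only an illustration.

```python
import re
from datetime import datetime
from typing import Optional


def screenshot_filename(test_name: str, step: int,
                        when: Optional[datetime] = None) -> str:
    """Build a file name from a test name, step number and timestamp."""
    when = when or datetime.now()
    # Replace anything that is not a letter, digit, dash or underscore
    safe_name = re.sub(r"[^A-Za-z0-9_-]+", "_", test_name).strip("_")
    return f"{safe_name}_{step:03d}_{when:%Y%m%d-%H%M%S}.png"
```

A call such as driver.save_screenshot(screenshot_filename("login test", 3)) would then produce a distinct, sortable file for each step.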
Delaying Screenshots
A key consideration when taking screenshots is timing. Since Selenium executes faster than a normal user, screenshots can sometimes be taken before all elements are fully rendered on the page. This results in incomplete or unstable screenshots.
To account for this, we need to build in delays before taking screenshots to allow the page to load completely. Some ways to do this:
Implicit Wait
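Set an implicit wait so that find_element() calls keep retrying for up to the given number of seconds before raising NoSuchElementException. This only affects element lookups and does not delay save_screenshot() itself, so it helps mainly when we locate an element right before capturing: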
driver.implicitly_wait(10)
time.sleep()
Insert a fixed sleep delay before taking the screenshot:
from time import sleep

sleep(5)
driver.save_screenshot('screenshot.png')
WebDriverWait
Use explicit waits to wait for elements to be present before taking the screenshot:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "myElement")))
driver.save_screenshot('screenshot.png')
This will wait up to 10 seconds for the element to be present before taking the screenshot.
Page Load Timeout
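Set a page load timeout to cap how long driver.get() waits for a page to finish loading:
driver.set_page_load_timeout(10)
If the page has not finished loading within 10 seconds, Selenium raises a TimeoutException. The timeout only limits the initial page load. It does not wait for AJAX content that arrives afterwards, so combine it with an explicit wait when the page updates dynamically.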
Using waits and timeouts helps ensure all page elements are loaded properly before capturing screenshots.
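Another common readiness check is to poll document.readyState before capturing. The sketch below is a plain-Python polling loop that works with any object exposing execute_script(); the name wait_for_page_ready and its defaults are assumptions. Note that a "complete" ready state still says nothing about AJAX content loaded later.

```python
import time


def wait_for_page_ready(driver, timeout: float = 10.0, poll: float = 0.25) -> bool:
    """Poll document.readyState until it is 'complete' or `timeout` elapses.

    Returns True when the page reports it is ready, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script("return document.readyState") == "complete":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

The same condition can be expressed with Selenium's own WebDriverWait by passing a lambda that runs the script and compares the result to "complete".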
Scrolling Before Taking Screenshots
For long web pages, we may need to scroll to the portion of the page we want to screenshot. Selenium lets us scroll to an element before taking its screenshot using execute_script():
element = driver.find_element(By.ID, "bottom_element")
# Scroll element into view
driver.execute_script("arguments[0].scrollIntoView();", element)
# Screenshot
saved = element.screenshot("element.png")
This scrolls the element into the current viewport before taking the screenshot. We can also scroll by a specific amount:
# Scroll down 500 px
driver.execute_script("window.scrollBy(0, 500)")
Scrolling to the bottom of the page first is also useful for triggering lazy-loaded content, although save_screenshot() still captures only the part of the page currently in the viewport:
# Scroll to bottom
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
driver.save_screenshot('page_bottom.png')
Scrolling an element or a region of the page into view ensures the content we care about actually appears in the screenshot.
Setting Viewport Size
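The viewport size determines how much of a page is captured in a screenshot. We may need to explicitly set the browser window size before taking screenshots to get the content we want. Note that set_window_size() sets the outer window size, so the viewport itself is slightly smaller once toolbars and borders are subtracted. To set the window size in Selenium: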
driver.set_window_size(1920, 1080)
This will resize the browser to the specified dimensions before taking screenshots. For responsive sites, we can iterate through different viewport sizes to generate screenshots for different devices:
mobile_size = (360, 640)
desktop_size = (1024, 768)

driver.set_window_size(*mobile_size)
driver.save_screenshot('mobile.png')

driver.set_window_size(*desktop_size)
driver.save_screenshot('desktop.png')
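The same idea can be generalized into a loop over named sizes. This is a minimal sketch, assuming a hypothetical helper capture_at_sizes that takes any driver-like object with set_window_size() and save_screenshot(); the labels and output layout are illustrative.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def capture_at_sizes(driver, sizes: Dict[str, Tuple[int, int]],
                     out_dir: str) -> List[str]:
    """Resize the window to each size and save '<label>.png' in out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for label, (width, height) in sizes.items():
        driver.set_window_size(width, height)
        path = target / f"{label}.png"
        # save_screenshot returns False when the file could not be written
        if driver.save_screenshot(str(path)):
            saved.append(str(path))
    return saved
```

For example, capture_at_sizes(driver, {"mobile": (360, 640), "desktop": (1024, 768)}, "shots") would write shots/mobile.png and shots/desktop.png.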
Headless Mode
When running Selenium in headless mode, we should explicitly set a window size, since there is no visible browser window and the default size may be small. For example:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
# Recent Chrome versions use --headless=new; older ones use plain --headless
options.add_argument("--headless=new")
options.add_argument("--window-size=1920,1080")
driver = webdriver.Chrome(options=options)
This ensures the headless browser rendering matches the specified viewport size.
Capturing Full Page Screenshots
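By default, Selenium will only capture the current viewport when taking screenshots. To capture the entire page length, including content outside the viewport, we need to stitch together screenshots from different scroll positions. Here is an example function to do full-page screenshots. It uses the third-party Pillow library for the stitching step:
import io
import time
from PIL import Image  # Pillow, installed with: pip install Pillow

def fullpage_screenshot(driver, file):
    """
    Takes a screenshot of the entire page by scrolling and stitching it together.
    """
    total_height = driver.execute_script("return document.body.scrollHeight")
    viewport_height = driver.execute_script("return window.innerHeight")

    # Scroll the page one viewport at a time and take a screenshot at each stop
    shots = []
    position = 0
    while position < total_height:
        driver.execute_script(f"window.scrollTo(0, {position});")
        time.sleep(0.5)  # give the page time to render
        # The browser clamps the last scroll, so record where it really ended up
        offset = driver.execute_script("return window.pageYOffset")
        shots.append((offset, driver.get_screenshot_as_png()))
        position += viewport_height

    # Stitch images together and save the result to file
    with open(file, 'wb') as fd:
        fd.write(stitch_images(shots, total_height, viewport_height))

def stitch_images(shots, total_height, viewport_height):
    """Pastes (offset, png_bytes) pairs onto one tall canvas and returns PNG bytes."""
    tiles = [(offset, Image.open(io.BytesIO(png))) for offset, png in shots]
    # Screenshot pixels per CSS pixel (greater than 1 on high-DPI displays)
    scale = tiles[0][1].height / viewport_height
    canvas = Image.new("RGB", (tiles[0][1].width, round(total_height * scale)))
    for offset, tile in tiles:
        canvas.paste(tile, (0, round(offset * scale)))
    buffer = io.BytesIO()
    canvas.save(buffer, format="PNG")
    return buffer.getvalue()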
This scrolls the page in steps, taking screenshots at each position, and then stitches them together into a single tall screenshot, capturing the full page height. The key considerations are correctly calculating the scrollable height and incrementing scroll positions to fill the entire page length.
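The scroll-position arithmetic can be isolated into a small pure function, which makes it easy to check without a browser. This is a sketch under the assumption that the browser cannot scroll past total_height minus viewport_height, so the final position is clamped to sit flush with the bottom of the page. The name scroll_positions is hypothetical.

```python
from typing import List


def scroll_positions(total_height: int, viewport_height: int) -> List[int]:
    """Return the scroll offsets needed to cover a page of total_height."""
    if viewport_height <= 0:
        raise ValueError("viewport_height must be positive")
    # Furthest offset the browser can actually scroll to
    max_offset = max(total_height - viewport_height, 0)
    positions = list(range(0, max_offset, viewport_height))
    # Always finish flush with the bottom of the page
    positions.append(max_offset)
    return positions
```

For a 2500 px page and a 1000 px viewport this yields offsets 0, 1000 and 1500, so the last capture overlaps the previous one by 500 px instead of leaving a gap.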
Debugging with Selenium Screenshots
One of the most useful applications of screenshots is for debugging Selenium scripts. We can take screenshots at strategic points and log them along with other metadata to help diagnose flaky tests or strange runtime behaviors.
For example:
try:
    ...  # Test steps
except Exception as e:
    # Take screenshot
    driver.save_screenshot('error_screenshot.png')
    # Log screenshot
    print("Screenshot saved with error:", e)
    # Re-raise so the test is still reported as failed
    raise
This takes and logs a screenshot when our test hits an error.
We can take screenshots at the start and end of critical test steps, interactions with important elements, or when we want to visually verify the application state at a given point. Screenshots provide a snapshot of exactly what the browser sees at that moment. This is invaluable for troubleshooting and inspecting test failures.
Some best practices for using screenshots to debug tests:
- Take screenshots before and after interacting with key elements
- Capture screenshots after critical test steps like sign in, checkout etc.
- Log screenshots along with detailed debugging information like exceptions, timestamps, browser logs etc.
- Screenshot element state like disabled buttons, empty inputs, overlay modals etc. which help diagnose issues.
- Compare screenshots before and after an action to identify visual regressions.
- Review screenshots in a visual regression testing tool like Applitools, Percy, Wraith etc.
- Store screenshots with descriptive names indicating test name, step number, timestamp etc. for easy lookup.
Debugging with strategic screenshots helps understand and pinpoint test failures faster.
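One way to apply these practices consistently is to wrap test steps in a context manager that captures a screenshot only when something goes wrong. The sketch below assumes a hypothetical helper named screenshot_on_failure; the timestamped file naming is an illustrative choice.

```python
from contextlib import contextmanager
from datetime import datetime


@contextmanager
def screenshot_on_failure(driver, name: str):
    """Save '<name>_<timestamp>.png' if the wrapped block raises, then re-raise."""
    try:
        yield
    except Exception:
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
        driver.save_screenshot(f"{name}_{stamp}.png")
        raise
```

A test would then wrap a critical step as "with screenshot_on_failure(driver, 'checkout'):" followed by the interactions to run. Because the exception is re-raised, the test runner still records the failure.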
Handling Cross-Origin Screenshot Restrictions
Selenium's own screenshot methods are implemented by the browser driver, so they are not subject to the same-origin policy and will capture cross-origin content such as embedded iframes. Cross-origin restrictions come into play when a screenshot is built inside the page with JavaScript instead, for example by calling toDataURL() on a canvas through execute_script(). If the canvas contains images from another origin that were loaded without CORS approval, we may see errors like:
DOMException: Failed to execute 'toDataURL' on 'HTMLCanvasElement': Tainted canvases may not be exported.
The simplest workaround is to use the WebDriver screenshot methods (save_screenshot(), get_screenshot_as_png() or an element's screenshot()) rather than canvas export. Headless-browser tools such as browserless or Chromeless also capture at the browser level and so avoid the problem in the same way.
Finally, headers like X-Frame-Options only control whether a page may be embedded in a frame on another site. They do not stop Selenium from capturing a page that is loaded directly in the browser window.
Automating and Managing Browser Screenshots
Taking one-off screenshots for debugging is useful, but we often need a more managed approach for things like:
- Generating screenshots across multiple tests
- Capturing screenshots on test failures
- Storing and organizing screenshots
- Comparing screenshots across runs
- Integrating screenshots with CI/CD systems
Some tips for effectively managing screenshots in test automation:
- Create utility wrapper methods for common screenshot operations to standardize screenshot code.
- Generate automatic filenames with details like test name, timestamp, browser etc.
- Configure screenshots as artifacts in your CI/CD system. Many tools like Jenkins allow archiving test screenshots.
- Use a screenshot testing tool like Applitools to organize, compare and manage screenshots.
- Store screenshots in cloud storage like S3 buckets to access across different environments.
- Track screenshots along with metadata like test results, environment details etc. in your automation reporting dashboard.
- Implement screenshot diffing to compare current screenshots against baseline approved images to identify regressions.
Automating screenshot management ensures they provide ongoing value across the entire test process.
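The baseline comparison mentioned in the tips above can start out very simply. The sketch below assumes a hypothetical helper, matches_baseline, that treats a screenshot as unchanged only if it is byte-identical to the stored baseline. This is much stricter than the tolerance-based comparison that dedicated visual testing tools perform, so it is best seen as a starting point.

```python
import hashlib
from pathlib import Path


def matches_baseline(current_png: bytes, baseline_path: str) -> bool:
    """Return True if current_png is byte-identical to the stored baseline.

    If no baseline exists yet, the current image is saved as the new baseline.
    """
    baseline = Path(baseline_path)
    if not baseline.exists():
        baseline.parent.mkdir(parents=True, exist_ok=True)
        baseline.write_bytes(current_png)
        return True
    current_digest = hashlib.sha256(current_png).hexdigest()
    baseline_digest = hashlib.sha256(baseline.read_bytes()).hexdigest()
    return current_digest == baseline_digest
```

With a live driver, matches_baseline(driver.get_screenshot_as_png(), "baselines/home.png") would create the baseline on the first run and flag any byte-level change on later runs.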
Advanced Uses for Selenium Screenshots
Beyond debugging, screenshots open up some additional useful possibilities:
- Visual Testing – Screenshots can be used for visual validation to check for broken UI, shifts in layout, styling issues etc. Tools like Applitools allow automating visual UI tests.
- Page Layout Testing – Multi-browser screenshots can help validate responsiveness across viewports and identify layout bugs.
- Documentation – Screenshots can auto-generate documentation showing application states and workflows.
- Monitoring – Screenshots can assist in monitoring and alerting by capturing screenshots of errors, application outages, or performance issues.
- A/B Testing – Screenshots allow comparing UI variations in A/B tests to evaluate design changes.
- Computer Vision – Screenshots enable applying computer vision for tasks like OCR, text extraction, and image analysis.
- Presentation Testing – For highly visual apps, pixel-level screenshot diffs can verify precise rendering of colors, images, graphs etc.
- Accessibility Testing – Screenshots can assist in auditing contrast ratios, color palettes, and other accessibility criteria.
- Animated GIFs – Taking a rapid sequence of screenshots allows building animated GIFs to document workflows and test cases.
So screenshots are not just limited to debugging – they can also provide rich visual data for many other test automation use cases.
Conclusion
Taking a screenshot with Selenium is a valuable technique for capturing the state of a webpage during automated testing. By leveraging the WebDriver's built-in screenshot functionality, testers can programmatically capture and save visual evidence of their tests, which is crucial for debugging and verifying the UI of web applications.
This process can be implemented with just a few lines of code across various programming languages that support Selenium, making it an accessible and essential tool in the quality assurance process. Whether it's for capturing the entire page or just a specific element, screenshots can provide a fast and accurate visual confirmation that the web application functions as expected across different environments and scenarios.