How to Take a Screenshot with Selenium?

Taking screenshots using Selenium can be an invaluable tool for debugging tests, capturing results, and documenting automated browser interactions. In this comprehensive guide, we'll explore the various methods available in Selenium to take screenshots with Python.

Overview of Taking Screenshots with Selenium

The Selenium WebDriver API provides several options for capturing screenshots:

save_screenshot(filename) – Saves a PNG screenshot of the current window to a file and returns True on success
get_screenshot_as_file(filename) – Does the same thing; save_screenshot() is a thin wrapper around it
get_screenshot_as_png() – Returns the screenshot of the current window as PNG-encoded bytes
get_screenshot_as_base64() – Returns the screenshot of the current window as a base64-encoded string

These methods let us either write the screenshot straight to disk or keep it in memory as bytes or a base64 string. In addition, we can take screenshots of specific elements on a page by first finding the element and then calling screenshot() on the WebElement.

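Let's explore some examples of using these methods for taking window and element screenshots.

Taking Window Screenshots

Taking a screenshot of the current browser window is straightforward with the save_screenshot() method. In most browsers this captures only the visible viewport; capturing the entire page length is covered later.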

from selenium import webdriver

driver = webdriver.Chrome()

# Navigate to page
driver.get("http://www.python.org")

# Save screenshot to file 
driver.save_screenshot('python_home.png')

This saves a screenshot of the current window to the given path, relative to the working directory. The filename argument is required and should end in .png, otherwise Selenium emits a warning. It can also be an absolute path:

# Save screenshot to an absolute path
driver.save_screenshot('/tmp/home.png')

get_screenshot_as_file() does the same job. It also requires a filename and returns True if the file was written, or False on an I/O error.

# Save screenshot and check the result
saved = driver.get_screenshot_as_file('/tmp/home.png')

The return value is a boolean, not the image data. To get the screenshot as PNG-encoded bytes in memory, we can use get_screenshot_as_png():

# Get screenshot as PNG
screenshot_png = driver.get_screenshot_as_png()

To get a base64 encoded string representation of the screenshot, we can use get_screenshot_as_base64():

# Get screenshot as base64 
screenshot_b64 = driver.get_screenshot_as_base64()

The get_screenshot_as_png() and get_screenshot_as_base64() methods keep the screenshot in memory, so we can write it to a file ourselves if required.

# Save screenshot to file from memory
with open("screenshot.png", "wb") as fd:
    fd.write(driver.get_screenshot_as_png())

This allows more flexibility in processing the screenshot before saving.
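
As one example of processing a screenshot in memory, the sketch below decodes the base64 form back to bytes and reads the image dimensions from the PNG header using only the standard library. The helper names decode_screenshot and png_size are made up for illustration; the byte layout (an 8-byte signature followed by the IHDR chunk) comes from the PNG format itself.

```python
import base64
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(b64_text):
    """Turn the string from get_screenshot_as_base64() into PNG bytes."""
    return base64.b64decode(b64_text)


def png_size(png_bytes):
    """Return (width, height) read from the PNG IHDR chunk."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG image")
    # Bytes 8-15 hold the IHDR chunk length and type; width and height
    # follow as two big-endian 32-bit unsigned integers.
    width, height = struct.unpack(">II", png_bytes[16:24])
    return width, height
```

With a live driver this could be called as png_size(decode_screenshot(driver.get_screenshot_as_base64())), for instance to check that a capture has the expected size before writing it to disk.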

Taking Element Screenshots

In addition to full-page screenshots, we can also take screenshots of specific elements on the page. To do this, we first need to find the target element using any of the element location strategies like CSS selector, XPath etc. We then call screenshot() on the WebElement to capture only that element.

For example:

from selenium.webdriver.common.by import By

# Find element
element = driver.find_element(By.CSS_SELECTOR, "#logo")

# Take screenshot of element (returns True if the file was written)
element.screenshot("element.png")

This will save the screenshot of just the element matched by the CSS selector #logo. WebElement also provides the screenshot_as_png and screenshot_as_base64 properties for getting the element screenshot in memory instead of writing a file.

# Get element screenshot in memory as PNG bytes
element_png = element.screenshot_as_png

Saving element screenshots allows us to capture parts of a page instead of the entire viewport. This is useful for capturing specific components or sections of a page.
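
When several components need capturing, the same call can be applied to every match of a locator. The following is a minimal sketch; save_element_screenshots is a hypothetical helper, and it only assumes a driver with find_elements() and elements with screenshot(), as in Selenium.

```python
def save_element_screenshots(driver, locator, prefix):
    """Save one PNG per element matching locator; return the paths written.

    locator is a (strategy, value) pair such as ("css selector", "img"),
    which is what (By.CSS_SELECTOR, "img") evaluates to.
    """
    saved = []
    for index, element in enumerate(driver.find_elements(*locator), start=1):
        path = f"{prefix}_{index}.png"
        # WebElement.screenshot() returns False if the file could not be written
        if element.screenshot(path):
            saved.append(path)
    return saved
```

Calling save_element_screenshots(driver, (By.CSS_SELECTOR, ".card"), "card") would write card_1.png, card_2.png and so on, where .card is a placeholder selector.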

Setting Screenshot File Type

Selenium always produces screenshots in the PNG format; the WebDriver screenshot commands have no option for choosing another format or quality. To get a different format, we convert the PNG bytes afterwards with an imaging library such as Pillow (an extra dependency, installed with pip install pillow):

import io
from PIL import Image

png_bytes = driver.get_screenshot_as_png()
image = Image.open(io.BytesIO(png_bytes)).convert("RGB")  # JPEG has no alpha channel
image.save("screenshot.jpg", "JPEG")

This writes the screenshot as a JPEG file instead of PNG. Pillow can write many formats this way, including:

PNG
JPEG
BMP

JPEG quality is controlled with the quality argument of save():

image.save("screenshot.jpg", "JPEG", quality=90)

Managing Screenshot File Name

Selenium has no default screenshot filename and no naming option; the name is whatever path we pass to save_screenshot(). To get a consistent pattern, we build the filename in our own code, for example from a prefix and a step counter:

for step in (1, 2):
    driver.save_screenshot(f"my_screenshot_{step}.png")

This saves my_screenshot_1.png and my_screenshot_2.png. A naming pattern like this helps identify and manage screenshots, especially when taking multiple screenshots in a test run.
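
One way to generate such names from our own code is sketched below with the standard library. The function name and the pattern (prefix, test name, step number, timestamp) are illustrative choices, not something Selenium defines.

```python
import re
from datetime import datetime


def screenshot_filename(test_name, step, when=None, prefix="my_screenshot_"):
    """Build a name like my_screenshot_login_test_03_20240131-093000.png."""
    when = when or datetime.now()
    # Replace anything that is awkward in a filename with an underscore
    safe_name = re.sub(r"[^A-Za-z0-9_-]+", "_", test_name).strip("_")
    return f"{prefix}{safe_name}_{step:02d}_{when:%Y%m%d-%H%M%S}.png"
```

The result can be passed straight to driver.save_screenshot(). Including the timestamp keeps files from different runs from overwriting each other.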

Delaying Screenshots

A key consideration when taking screenshots is timing. Since Selenium executes faster than a normal user, screenshots can sometimes be taken before all elements are fully rendered on the page. This results in incomplete or unstable screenshots.

To account for this, we need to build in delays before taking screenshots to allow the page to load completely. Some ways to do this:

Implicit Wait

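Set an implicit wait on the driver so that element lookups keep retrying for up to the given number of seconds before raising an error. This only affects find_element() calls; it does not delay save_screenshot() itself: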

driver.implicitly_wait(10)

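time.sleep()

Insert a fixed delay before taking the screenshot. This is simple, but it wastes time when the page is ready sooner and still fails when the page is slower: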

from time import sleep

sleep(5)
driver.save_screenshot('screenshot.png')

WebDriverWait

Use explicit waits to wait for a specific element to be present before taking the screenshot:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait 
from selenium.webdriver.support import expected_conditions as EC

WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "myElement")))

driver.save_screenshot('screenshot.png')

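This will wait up to 10 seconds for the element to be present in the DOM before taking the screenshot. For screenshots, EC.visibility_of_element_located is often the better condition, since an element can be present but not yet visible.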

Page Load Timeout

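Set a page load timeout on the driver to cap how long driver.get() waits for a page to load:

driver.set_page_load_timeout(10)

By default driver.get() blocks until the page's load event fires; if that takes longer than the timeout, a TimeoutException is raised. The timeout does not wait for AJAX content that loads afterwards, so explicit waits are still needed for dynamic content.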

Using waits and timeouts helps ensure all page elements are loaded properly before capturing screenshots.
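
A common readiness check is to poll document.readyState until the browser reports "complete". The sketch below does this with a plain loop so that it has no dependencies; wait_for_page_ready is a hypothetical helper, and the same condition could instead be passed to WebDriverWait as a lambda. Note that "complete" only means the load event has fired, so content fetched later by scripts still needs an explicit wait.

```python
import time


def wait_for_page_ready(driver, timeout=10.0, poll=0.2):
    """Poll document.readyState until it is 'complete' or time runs out.

    Returns True when the page reports it is ready, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script("return document.readyState") == "complete":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

A test could then guard the capture with if wait_for_page_ready(driver): driver.save_screenshot('screenshot.png').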

Scrolling Before Taking Screenshots

For long web pages, we may need to scroll to the portion of the page we want to screenshot. Selenium provides a way to scroll to an element before taking its screenshot using execute_script():

element = driver.find_element(By.ID, "bottom_element")

# Scroll element into view
driver.execute_script("arguments[0].scrollIntoView();", element)

# Screenshot (returns True if the file was written)
element.screenshot("element.png")

This scrolls the element into the current viewport before taking the screenshot. We can also scroll by a specific amount:

# Scroll down 500 px  
driver.execute_script("window.scrollBy(0, 500)")

Scrolling to the bottom of the page first is also useful for pages that lazy-load content, although the screenshot itself still covers only the viewport:

# Scroll to bottom
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

driver.save_screenshot('page_bottom.png')

Scrolling brings the target element or region into the viewport so that it appears in the screenshot.
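
On pages that keep loading content as the user scrolls, a single jump to the bottom may not be enough. The following sketch repeats the scroll until the page height stops changing; scroll_until_stable, the pause length and the round limit are all assumptions chosen for illustration.

```python
import time


def scroll_until_stable(driver, pause=0.5, max_rounds=20):
    """Scroll to the bottom repeatedly until the page stops growing.

    Returns the final scroll height in CSS pixels.
    """
    height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # let newly requested content arrive
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == height:
            break
        height = new_height
    return height
```

The max_rounds limit keeps the loop from running forever on pages with endless feeds.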

Setting Viewport Size

The window size determines how much of a page is visible, and therefore how much is captured in a viewport screenshot. Setting it explicitly before taking screenshots keeps results consistent between runs. To set the window size in Selenium:

driver.set_window_size(1920, 1080)

This resizes the browser window to the specified dimensions. Note that set_window_size() sets the outer window size, so the page viewport is slightly smaller once borders and toolbars are subtracted. For responsive sites, we can iterate through different window sizes to generate screenshots for different devices:

mobile_size = (360, 640)
desktop_size = (1024, 768)

driver.set_window_size(*mobile_size)
driver.save_screenshot('mobile.png')

driver.set_window_size(*desktop_size)  
driver.save_screenshot('desktop.png')
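
If the page viewport itself has to match an exact size, one approach is to measure how much of the window is taken up by browser chrome and compensate for it. This is a sketch under that assumption; set_viewport_size is a hypothetical helper built on get_window_size(), execute_script() and set_window_size().

```python
def set_viewport_size(driver, width, height):
    """Resize the window so that the page viewport is width x height CSS pixels."""
    window = driver.get_window_size()  # outer size: {"width": ..., "height": ...}
    inner_width, inner_height = driver.execute_script(
        "return [window.innerWidth, window.innerHeight];"
    )
    # The difference is the space taken by borders, toolbars and similar chrome
    extra_width = window["width"] - inner_width
    extra_height = window["height"] - inner_height
    driver.set_window_size(width + extra_width, height + extra_height)
```

Desktop browsers may enforce a minimum window size, so very narrow viewports such as 360 pixels might only be reachable in headless mode or with device emulation.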

Headless Mode

When running Selenium in headless mode, we may need to explicitly set a viewport size since there is no actual browser window. For example:

from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")  # Chrome 109+; the options.headless setter is deprecated
options.add_argument("--window-size=1920,1080")

driver = webdriver.Chrome(options=options)

This ensures the headless browser rendering matches the specified viewport size.

Capturing Full Page Screenshots

By default, Selenium will only capture the current viewport when taking screenshots. To capture the entire page length, including content outside the viewport, we need to stitch together screenshots from different scroll positions. Here is an example function to do full-page screenshots:

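import io
import time

from PIL import Image  # Pillow, used here for stitching (pip install pillow)


def fullpage_screenshot(driver, file):
    """Take a screenshot of the entire page by scrolling
    and stitching the viewport captures together."""
    total_height = driver.execute_script("return document.body.scrollHeight")
    viewport_height = driver.execute_script("return window.innerHeight")

    # Scroll one viewport at a time and record where each scroll really landed
    captures = []
    position = 0
    while position < total_height:
        driver.execute_script(f"window.scrollTo(0, {position});")
        time.sleep(0.5)  # give the page time to render
        actual_y = driver.execute_script("return window.scrollY")
        captures.append((actual_y, driver.get_screenshot_as_png()))
        position += viewport_height

    # Stitch images together and save the result as PNG
    stitch_images(captures, total_height, viewport_height).save(file, "PNG")


def stitch_images(captures, total_height, viewport_height):
    """Paste each capture at its scroll offset on one tall canvas."""
    images = [(y, Image.open(io.BytesIO(png))) for y, png in captures]
    # Screenshots are in device pixels, so scale the CSS offsets accordingly
    scale = images[0][1].height / viewport_height
    width = images[0][1].width
    stitched = Image.new("RGB", (width, round(total_height * scale)))
    for y, image in images:
        stitched.paste(image, (0, round(y * scale)))
    return stitched

This scrolls the page one viewport at a time, takes a screenshot at each position, and pastes the captures onto a single tall image at their scroll offsets. The key considerations are correctly calculating the scrollable height, recording where each scroll actually landed (the last one is clamped to the bottom of the page), and scaling the offsets on high-DPI displays. Elements with fixed positioning, such as sticky headers, will appear once per capture.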
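
As an alternative to scrolling, Chromium-based drivers can ask the browser for the whole page in one call through the Chrome DevTools Protocol. The sketch below assumes a Chrome or Edge driver, which exposes execute_cdp_cmd(); fullpage_png_via_cdp is a hypothetical helper name, and the captureBeyondViewport parameter is marked experimental in the protocol, so behavior may vary between browser versions.

```python
import base64


def fullpage_png_via_cdp(driver):
    """Return PNG bytes for the whole page using Chrome's DevTools Protocol.

    Assumes a Chromium-based driver, which provides execute_cdp_cmd().
    """
    metrics = driver.execute_cdp_cmd("Page.getLayoutMetrics", {})
    # Newer Chrome reports CSS pixels under cssContentSize; fall back otherwise
    size = metrics.get("cssContentSize") or metrics["contentSize"]
    result = driver.execute_cdp_cmd(
        "Page.captureScreenshot",
        {
            "format": "png",
            "captureBeyondViewport": True,
            "clip": {
                "x": 0,
                "y": 0,
                "width": size["width"],
                "height": size["height"],
                "scale": 1,
            },
        },
    )
    # The protocol returns the image as a base64 string under "data"
    return base64.b64decode(result["data"])
```

The returned bytes can be written to a file opened in "wb" mode. Firefox drivers offer a comparable built-in, save_full_page_screenshot(), so no protocol call is needed there.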

Debugging with Selenium Screenshots

One of the most useful applications of screenshots is for debugging Selenium scripts. We can take screenshots at strategic points and log them along with other metadata to help diagnose flaky tests or strange runtime behaviors.

For example:

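try:
    # Test steps go here ("submit" is a placeholder element ID)
    driver.find_element(By.ID, "submit").click()
except Exception as e:
    # Take screenshot
    driver.save_screenshot('error_screenshot.png')

    # Log the error alongside the screenshot
    print("Screenshot saved with error:", e)
    raise  # re-raise so the test still fails

This takes a screenshot and logs the error when our test hits an exception, then re-raises it so the failure is not hidden.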
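
To avoid repeating the same try/except in every test, the pattern can be wrapped in a context manager. This is a minimal sketch; screenshot_on_failure is a hypothetical helper that only relies on the driver's save_screenshot() method.

```python
from contextlib import contextmanager


@contextmanager
def screenshot_on_failure(driver, path):
    """Save a screenshot to path if the wrapped block raises, then re-raise."""
    try:
        yield
    except Exception:
        driver.save_screenshot(path)
        raise
```

Test steps then go inside a with screenshot_on_failure(driver, 'error_screenshot.png'): block, and a screenshot is written only when one of them fails.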

We can take screenshots at the start and end of critical test steps, interactions with important elements, or when we want to visually verify the application state at a given point. Screenshots provide a snapshot of exactly what the browser sees at that moment. This is invaluable for troubleshooting and inspecting test failures.

Some best practices for using screenshots to debug tests:

Take screenshots before and after interacting with key elements
Capture screenshots after critical test steps like sign in, checkout etc.
Log screenshots along with detailed debugging information like exceptions, timestamps, browser logs etc.
Screenshot element state like disabled buttons, empty inputs, overlay modals etc. which help diagnose issues.
Compare screenshots before and after an action to identify visual regressions.
Review screenshots in a visual regression testing tool like Applitools, Percy, Wraith etc.
Store screenshots with descriptive names indicating test name, step number, timestamp etc. for easy lookup.

Debugging with strategic screenshots helps understand and pinpoint test failures faster.
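
The first practice in the list, capturing before and after an interaction, can be sketched as a small wrapper. capture_around and the _before/_after naming are illustrative assumptions rather than anything Selenium provides.

```python
def capture_around(driver, action, name):
    """Run action() with a screenshot taken before and after it.

    Returns whatever action() returns. The 'after' screenshot is taken
    even if the action raises, so the failure state is recorded too.
    """
    driver.save_screenshot(f"{name}_before.png")
    try:
        return action()
    finally:
        driver.save_screenshot(f"{name}_after.png")
```

For example, capture_around(driver, lambda: button.click(), "checkout") would write checkout_before.png and checkout_after.png around the click, where button is an element found earlier.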

Handling Cross-Origin Screenshot Restrictions

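Selenium's own screenshot commands are executed by the browser driver, outside the page, so they are not subject to the browser's cross-origin protections. Those protections come into play when a screenshot is produced by JavaScript running inside the page, for example by drawing content onto a canvas and exporting it. If the canvas contains images from another origin that were loaded without CORS approval, the export fails with an error like:

DOMException: Failed to execute 'toDataURL' on 'HTMLCanvasElement': Tainted canvases may not be exported.

To work around this, the simplest option is to avoid in-page capture and use the WebDriver screenshot methods instead. If in-page capture is required, the images have to come from the same origin as the page or be served with CORS headers. Another option is to use hosted browser tools like Browserless or Chromeless, which run headless Chrome in the cloud and take the screenshot at the browser level.

Finally, headers like X-Frame-Options do not block screenshots; they only control whether a page may be embedded in a frame on another site. A page that refuses framing has to be loaded directly with driver.get() rather than inside an iframe.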

Automating and Managing Browser Screenshots

Taking one-off screenshots for debugging is useful, but we often need a more managed approach for things like:

Generating screenshots across multiple tests
Capturing screenshots on test failures
Storing and organizing screenshots
Comparing screenshots across runs
Integrating screenshots with CI/CD systems

Some tips for effectively managing screenshots in test automation:

Create utility wrapper methods for common screenshot operations to standardize screenshot code.
Generate automatic filenames with details like test name, timestamp, browser etc.
Configure screenshots as artifacts in your CI/CD system. Many tools like Jenkins allow archiving test screenshots.
Use a screenshot testing tool like Applitools to organize, compare and manage screenshots.
Store screenshots in cloud storage like S3 buckets to access across different environments.
Track screenshots along with metadata like test results, environment details etc. in your automation reporting dashboard.
Implement screenshot diffing to compare current screenshots against baseline approved images to identify regressions.

Automating screenshot management ensures they provide ongoing value across the entire test process.
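
The baseline comparison mentioned in the tips can be sketched in its strictest form as a byte-for-byte check against a stored file. matches_baseline is a hypothetical helper; dedicated visual testing tools compare pixels with a tolerance instead, since tiny rendering differences such as anti-aliasing would fail an exact comparison.

```python
from pathlib import Path


def matches_baseline(png_bytes, baseline_path):
    """Compare a screenshot with a stored baseline, byte for byte.

    If no baseline exists yet, the screenshot is stored as the new
    baseline and the function returns True.
    """
    baseline = Path(baseline_path)
    if not baseline.exists():
        baseline.parent.mkdir(parents=True, exist_ok=True)
        baseline.write_bytes(png_bytes)
        return True
    return baseline.read_bytes() == png_bytes
```

A test could call matches_baseline(driver.get_screenshot_as_png(), "baselines/home.png") and fail when it returns False, where the baselines directory is an assumed location.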

Advanced Uses for Selenium Screenshots

Beyond debugging, screenshots open up some additional useful possibilities:

Visual Testing – Screenshots can be used for visual validation to check for broken UI, shifts in layout, styling issues etc. Tools like Applitools allow automating visual UI tests.
Page Layout Testing – Multi-browser screenshots can help validate responsiveness across viewports and identify layout bugs.
Documentation – Screenshots can auto-generate documentation showing application states and workflows.
Monitoring – Screenshots can assist in monitoring and alerting by capturing screenshots of errors, application outages, or performance issues.
A/B Testing – Screenshots allow comparing UI variations in A/B tests to evaluate design changes.
Computer Vision – Screenshots enable applying computer vision for tasks like OCR, text extraction, and image analysis.
Presentation Testing – For highly visual apps, pixel-level screenshot diffs can verify precise rendering of colors, images, graphs etc.
Accessibility Testing – Screenshots can assist in auditing contrast ratios, color palettes, and other accessibility criteria.
Animated GIFs – Taking a rapid sequence of screenshots allows building animated GIFs to document workflows and test cases.

So screenshots are not just limited to debugging – they can also provide rich visual data for many other test automation use cases.

Conclusion

Taking a screenshot with Selenium is a valuable technique for capturing the state of a webpage during automated testing. By leveraging the WebDriver's built-in screenshot functionality, testers can programmatically capture and save visual evidence of their tests, which is crucial for debugging and verifying the UI of web applications.

This process can be implemented with just a few lines of code across various programming languages that support Selenium, making it an accessible and essential tool in the quality assurance process. Whether it's for capturing the entire page or just a specific element, screenshots can provide a fast and accurate visual confirmation that the web application functions as expected across different environments and scenarios.