5 Best Ways to Capture a Screenshot of a Page Element in Selenium with Python

💡 Problem Formulation: When using Selenium with Python for web automation or testing, developers often need to capture screenshots of specific elements on a webpage rather than the entire page. For example, they may want to take a snapshot of a login form or a popup notification to validate UI changes or for reporting purposes. This article will walk through several methods to achieve this, focusing on capturing just the desired element and not the entire browser window.

Method 1: Using the element.screenshot() Method

This method uses the screenshot() method that Selenium provides on every WebElement to capture a screenshot of that specific element. It takes a file name as a parameter, saves the element’s screenshot as a PNG file at that location, and returns True on success or False if the file could not be written.

Here’s an example:

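from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://example.com')
# find_element_by_id() was removed in Selenium 4.3; use find_element(By.ID, ...)
element = driver.find_element(By.ID, 'element-id')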
element.screenshot('element_screenshot.png')
driver.quit()

The output of this code snippet will be a PNG file named ‘element_screenshot.png’ containing the screenshot of the element with the specified ID on ‘https://example.com’.

This code snippet starts by importing the necessary modules and launching a new browser session. It then navigates to a sample webpage, locates the desired element based on its ID, and captures a screenshot of it. Finally, it quits the browser session.
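One detail the snippet does not show: in Selenium's Python bindings, element.screenshot() returns False instead of raising when the file cannot be written, for example when the target directory does not exist. The following is a minimal sketch of a wrapper that surfaces that failure. The name save_element_screenshot is hypothetical, and element is assumed to be any object that exposes Selenium's screenshot(filename) method.

```python
from pathlib import Path


def save_element_screenshot(element, path):
    """Save a screenshot of `element` to `path` and return the target Path.

    Hypothetical helper: `element` is assumed to behave like a Selenium
    WebElement, whose screenshot(filename) returns True on success and
    False if the file could not be written.
    """
    target = Path(path)
    if target.suffix.lower() != ".png":
        raise ValueError(f"expected a .png file name, got {target.name!r}")
    # Create missing parent directories so the write does not fail quietly.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Pass a plain string, since screenshot() expects a file name string.
    if not element.screenshot(str(target)):
        raise OSError(f"could not write screenshot to {target}")
    return target
```

With a live session this could be called as save_element_screenshot(element, 'shots/login_form.png'), where the path is only an illustrative name.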

Method 2: Cropping the Screenshot Using PIL

If you need to adjust or crop the screenshot further, Python’s PIL (Pillow) library can be used in conjunction with Selenium to capture and modify the screenshot as needed.

Here’s an example:

from selenium import webdriver
from selenium.webdriver.common.by import By
from PIL import Image

driver = webdriver.Chrome()
driver.get('https://example.com')
element = driver.find_element(By.ID, 'element-id')

# Taking a screenshot of the visible viewport (not the whole page)
driver.save_screenshot('full_screenshot.png')

# Getting element location and size
location = element.location
size = element.size

# Opening the screenshot and cropping it to the element
# (assumes an unscrolled page and a device pixel ratio of 1)
x, y = location['x'], location['y']
width, height = size['width'], size['height']
full_screenshot = Image.open('full_screenshot.png')
element_screenshot = full_screenshot.crop((x, y, x+width, y+height))
element_screenshot.save('cropped_element_screenshot.png')

driver.quit()

The output of this code snippet is a cropped PNG file named ‘cropped_element_screenshot.png’ that contains only the screenshot of the desired element.

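This snippet takes a screenshot of the browser viewport with save_screenshot(), retrieves the specified element’s location and size, and then uses Pillow (the maintained fork of the Python Imaging Library, PIL) to crop the screenshot to just the region occupied by the element. The crop coordinates line up only if the page has not been scrolled and the screenshot is not scaled by a high-DPI display; otherwise the crop box must be adjusted. The resulting image is saved.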
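The cropping step relies on the element's coordinates matching the screenshot's pixels. As far as the WebDriver specification describes it, element.location is measured in CSS pixels relative to the document, while the screenshot covers the viewport in device pixels, so scrolled pages and high-DPI displays can shift or scale the box. Here is a minimal sketch of the adjustment, assuming the scroll offsets and pixel ratio are supplied by the caller; the function name crop_box and its parameters are hypothetical.

```python
def crop_box(location, size, scroll_x=0, scroll_y=0, pixel_ratio=1.0):
    """Return a (left, top, right, bottom) box in screenshot pixels.

    Assumes `location` and `size` are the dicts returned by Selenium's
    element.location and element.size (CSS pixels, document-relative),
    and that the screenshot shows the viewport at `pixel_ratio` device
    pixels per CSS pixel.
    """
    left = (location["x"] - scroll_x) * pixel_ratio
    top = (location["y"] - scroll_y) * pixel_ratio
    right = left + size["width"] * pixel_ratio
    bottom = top + size["height"] * pixel_ratio
    return tuple(round(value) for value in (left, top, right, bottom))


# Made-up numbers: a 100x50 element at (10, 120) on a page scrolled down
# by 100 CSS pixels, captured on a 2x display.
print(crop_box({"x": 10, "y": 120}, {"width": 100, "height": 50},
               scroll_y=100, pixel_ratio=2.0))  # → (20, 40, 220, 140)
```

In a live session, the three extra values could be read with driver.execute_script("return [window.scrollX, window.scrollY, window.devicePixelRatio]") and the resulting tuple passed straight to Image.crop().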

Method 3: Using a Custom Screenshot Function

Developers might opt for creating a reusable function to encapsulate the screenshot behavior, making the code cleaner and more maintainable. This function would take a WebDriver and WebElement instance and save the screenshot to a specific path.

Here’s an example:

from selenium import webdriver
from selenium.webdriver.common.by import By

def take_element_screenshot(driver, element, file_name):
    element.screenshot(file_name)

driver = webdriver.Chrome()
driver.get('https://example.com')
element = driver.find_element(By.ID, 'element-id')
take_element_screenshot(driver, element, 'element_screenshot.png')
driver.quit()

The output is a PNG file named ‘element_screenshot.png’ just like in Method 1, containing the screenshot of the specified element.

By creating a standalone function to take the screenshot, the code becomes cleaner, and the screenshot logic can be reused across multiple tests or scripts without repeating the same lines of code.
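When one helper is reused across many tests, a fixed file name means each run overwrites the previous image. One possible way around that is to derive the file name from a label and a timestamp. This is a sketch of a hypothetical naming scheme, not something the helper above requires; the function name screenshot_path and the name format are assumptions.

```python
import re
from datetime import datetime
from pathlib import Path


def screenshot_path(directory, label, when=None):
    """Build a timestamped .png path for a screenshot.

    Hypothetical naming scheme: runs of characters in `label` other than
    letters, digits, hyphens and underscores become a single underscore.
    """
    when = when or datetime.now()
    safe_label = re.sub(r"[^A-Za-z0-9_-]+", "_", label).strip("_")
    return Path(directory) / f"{safe_label}_{when:%Y%m%d_%H%M%S}.png"
```

The result can be passed to the reusable function as str(screenshot_path('shots', 'login form')). The directory has to exist before the screenshot is written.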

Method 4: Capturing the Screenshot via JavaScript Injection

In some complex scenarios, direct screenshot methods might not work due to overlays or other page elements. In such cases, injecting JavaScript to capture the element’s canvas can be a viable alternative.

Here’s an example:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://example.com')
element = driver.find_element(By.ID, 'element-id')

# JavaScript that prepares an off-screen canvas sized to the element.
# Generic DOM elements have no width/height properties, so use the bounding box.
script = """
var rect = arguments[0].getBoundingClientRect();
var canvas = document.createElement('canvas');
canvas.width = Math.ceil(rect.width);
canvas.height = Math.ceil(rect.height);
return [canvas.width, canvas.height];
"""
width, height = driver.execute_script(script, element)

driver.quit()

This method does not produce an output file directly; it only prepares the groundwork for capturing an element’s image using JavaScript within the browser context.

The code snippet injects JavaScript into the current page to create a canvas with the same dimensions as the target element. The standard canvas API cannot rasterize arbitrary DOM nodes (drawImage() accepts only image sources such as <img>, <canvas>, or <video> elements), so additional steps are required to actually fill the canvas and save a screenshot. When the target is itself one of those element types, this approach can bypass certain limitations of traditional screenshot methods.
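One case where JavaScript alone can finish the job is when the target element is itself a <canvas>: its toDataURL() method returns the pixels as a base64 data URL that Python can decode. The sketch below assumes exactly that situation. The helper names decode_data_url and save_canvas are hypothetical, and the canvas content is assumed to be same-origin, since a tainted canvas makes toDataURL() throw.

```python
import base64


def decode_data_url(data_url):
    """Return the raw bytes carried by a base64 data URL."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    return base64.b64decode(payload)


def save_canvas(driver, canvas_element, file_name):
    """Write the pixels of a <canvas> element to `file_name` as PNG.

    Hypothetical helper: `driver` is assumed to be a Selenium WebDriver
    and `canvas_element` a WebElement that points at a <canvas>.
    """
    data_url = driver.execute_script(
        "return arguments[0].toDataURL('image/png');", canvas_element
    )
    with open(file_name, "wb") as fh:
        fh.write(decode_data_url(data_url))
```

For elements that are not canvases, this route does not apply, and element.screenshot() from Method 1 remains the simpler option.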

Bonus One-Liner Method 5: Using screenshot_as_png and BytesIO

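For a quick capture without saving to disk, the element’s screenshot_as_png property (the element-level counterpart of the driver’s get_screenshot_as_png() method) returns the PNG data as bytes, which can be wrapped in a BytesIO object and handled entirely in memory.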

Here’s an example:

from selenium import webdriver
from selenium.webdriver.common.by import By
from io import BytesIO
from PIL import Image

driver = webdriver.Chrome()
driver.get('https://example.com')
element = driver.find_element(By.ID, 'element-id')
png = element.screenshot_as_png  # Capture the element screenshot as binary data (a property on WebElement, not a method)
screenshot = Image.open(BytesIO(png))  # Create an Image object from the binary data
screenshot.show()  # Display the image for demonstration purposes

driver.quit()

Instead of a file, you get a Pillow (PIL) Image object displayed on the screen, containing the screenshot of the element.

This succinct code captures the screenshot of the element to memory, converts it to an image object using PIL, and then displays it. It’s useful for instances where you don’t need to save the image to disk and just need to manipulate or view it in memory.
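In-memory bytes can also be inspected without an imaging library. As one illustration, a test might check that the captured element has the expected pixel size by reading the width and height from the PNG header. This is a minimal sketch using only the standard library; the function name png_dimensions is hypothetical, and the input is assumed to be PNG bytes such as those a Selenium element screenshot provides.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(png_bytes):
    """Return (width, height) read from the IHDR chunk of PNG data."""
    # A PNG starts with an 8-byte signature, then the IHDR chunk:
    # 4-byte length, the 4-byte type "IHDR", then width and height as
    # big-endian unsigned 32-bit integers.
    if png_bytes[:8] != PNG_SIGNATURE or png_bytes[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    width, height = struct.unpack(">II", png_bytes[16:24])
    return width, height
```

On a high-DPI display the reported size may be a multiple of the element's CSS size, so any comparison should allow for the device pixel ratio.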

Summary/Discussion

  • Method 1: element.screenshot() Method. Straightforward and direct. Limited to Selenium’s built-in capabilities. May not work with complex page layouts.
  • Method 2: Cropping with PIL. Flexible and powerful. Allows fine control over the screenshot but requires additional PIL library and more code.
  • Method 3: Reusable Custom Function. Makes code reusable and maintainable. Relies on the capabilities of Selenium’s base method.
  • Method 4: JavaScript Injection. Advanced method that can overcome some of the limitations of other methods. Requires deeper understanding of JavaScript and browser APIs.
  • Bonus Method 5: get_screenshot_as_png() and BytesIO. Quick and to memory. Useful for on-the-fly image captures without needing permanent storage. Does not actually save the image unless additional steps are taken.