5 Best Ways to Get the Title and URL of the Page in Selenium with Python

💡 Problem Formulation: In web automation and testing using Selenium with Python, retrieving the title and URL of a web page is a common requirement. This article addresses the challenge by demonstrating various methods to obtain the current page title and URL. The input is a Selenium WebDriver instance pointed at a specific page, and the desired outputs are the title and URL of that page.

Method 1: Using WebDriver Properties

This first method uses the built-in properties of the Selenium WebDriver in Python. The title and current_url attributes are part of the WebDriver class and return the current page’s title and URL, respectively.

Here’s an example:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://www.example.com")

print("Title:", driver.title)
print("URL:", driver.current_url)

driver.quit()

Output:

Title: Example Domain
URL: http://www.example.com/

This code initializes a Selenium WebDriver for Chrome, navigates to “http://www.example.com”, and prints the title and current URL of the page. driver.title returns the page title, and driver.current_url returns the full URL as the browser reports it. The driver.quit() call at the end closes the browser.
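In longer scripts it can help to wrap this lookup in a function that always closes the browser, even when navigation fails. The following is a minimal sketch: fetch_title_and_url is a hypothetical helper name, not part of Selenium, and it assumes only that the object passed in offers get(), title, current_url, and quit(), as Selenium's WebDriver does.

```python
def fetch_title_and_url(driver, address):
    """Load `address`, return (title, url), and always quit the driver.

    `fetch_title_and_url` is a hypothetical helper name, not a Selenium API.
    """
    try:
        driver.get(address)
        # Both properties ask the browser for its current state when read.
        return driver.title, driver.current_url
    finally:
        # Runs on success and on error, so no browser window is left behind.
        driver.quit()
```

With a real browser, this could be called as fetch_title_and_url(webdriver.Chrome(), "http://www.example.com").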

Method 2: Executing JavaScript

Another approach is to execute JavaScript within the page context using the execute_script() method. This can be particularly useful if additional JavaScript operations are needed to determine the title or URL.

Here’s an example:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://www.example.com")

title = driver.execute_script("return document.title;")
url = driver.execute_script("return window.location.href;")

print("Title:", title)
print("URL:", url)

driver.quit()

Output:

Title: Example Domain
URL: http://www.example.com/

After navigating to the website, the script runs JavaScript to read document.title and window.location.href. Because execute_script() can run arbitrary browser-side code, the same call can also compute or collect other values alongside the title and URL.
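Both values can also be fetched in a single round trip by returning a JavaScript object, which Selenium converts into a Python dict. This is a sketch under that assumption; title_and_url_via_js is a hypothetical helper name introduced here for illustration.

```python
def title_and_url_via_js(driver):
    """Return (title, url) from a single execute_script() call.

    `title_and_url_via_js` is a hypothetical helper, not a Selenium API.
    """
    # The returned JavaScript object arrives in Python as a dict.
    info = driver.execute_script(
        "return {title: document.title, url: window.location.href};"
    )
    return info["title"], info["url"]
```

One call instead of two roughly halves the number of commands sent to the browser, which may matter when the lookup runs many times.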

Method 3: Using WebDriver Methods

Selenium’s Java bindings expose getTitle() and getCurrentUrl() methods, so it is tempting to look for get_title() and get_current_url() in Python. The Python WebDriver class has no such methods: calling driver.get_title() raises an AttributeError. The Python equivalents are the title and current_url properties from Method 1.

Here’s an example:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://www.example.com")

# driver.get_title() and driver.get_current_url() do not exist in the
# Python bindings; the properties below are the supported equivalents.
print("Title:", driver.title)
print("URL:", driver.current_url)

driver.quit()

Output:

Title: Example Domain
URL: http://www.example.com/

This snippet produces the same result as Method 1 because it uses the same properties. There is no method-style alternative in the Python API, so code ported from Java should replace getTitle() and getCurrentUrl() with driver.title and driver.current_url.
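If method-style calls are still preferred, for example to keep ported code readable, one option is a pair of small functions that delegate to the properties. The names get_title and get_current_url below are hypothetical helpers defined in your own code, not part of Selenium's API.

```python
def get_title(driver):
    """Method-style wrapper around the `title` property (hypothetical helper)."""
    return driver.title


def get_current_url(driver):
    """Method-style wrapper around `current_url` (hypothetical helper)."""
    return driver.current_url
```

These are called as get_title(driver) rather than driver.get_title(), so the WebDriver class itself is left unmodified.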

Method 4: Accessing Browser History

For more advanced scenarios, such as single-page applications, you can inspect the browser’s history state object through JavaScript. The History API does not expose the URL itself: window.history.state only holds whatever object the application passed to pushState() or replaceState(), and it is null on a page that never called them.

Here’s an example:

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("http://www.example.com")

# history.state is null unless the page called pushState()/replaceState(),
# so guard the lookup instead of reading .url directly.
url_from_history = driver.execute_script(
    "var s = window.history.state; return s ? s.url : null;"
)

print("Current URL from history state:", url_from_history)

driver.quit()

Output:

Current URL from history state: None

This snippet extends the JavaScript execution method by checking the history state object for a url entry. On a static page such as example.com the state is null, so the result is None. A URL appears only when the application itself stores one under that key when it pushes state into the browser history.
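Because the history state is often empty, a defensive version can fall back to the driver's own URL. The sketch below assumes the application stores a url key in its state object, which many applications do not; url_from_history_or_driver and HISTORY_URL_SCRIPT are hypothetical names introduced for illustration.

```python
# Assumes the application stores a `url` key in history.state; many do not.
HISTORY_URL_SCRIPT = (
    "var s = window.history.state;"
    "return (s && s.url) ? s.url : null;"
)


def url_from_history_or_driver(driver):
    """Return history.state.url when present, otherwise driver.current_url."""
    url = driver.execute_script(HISTORY_URL_SCRIPT)
    # JavaScript null arrives in Python as None.
    return url if url is not None else driver.current_url
```

The guard in the script avoids a JavaScript error when history.state is null, and the fallback keeps the function usable on pages that never touch the History API.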

Bonus One-Liner Method 5: Using Python Properties

For a more concise call site, Python’s built-in property() can wrap the lookups in lambda functions on a small class that holds the WebDriver instance.

Here’s an example:

from selenium import webdriver


class Page:
    """Illustrative wrapper exposing the driver's title and URL as properties."""

    def __init__(self, driver):
        self.driver = driver

    title = property(lambda self: self.driver.title)
    url = property(lambda self: self.driver.current_url)


driver = webdriver.Chrome()
driver.get("http://www.example.com")

page = Page(driver)
print("Title:", page.title)
print("URL:", page.url)

driver.quit()

Output:

Title: Example Domain
URL: http://www.example.com/

The Page class stores the WebDriver instance, and its title and url properties wrap the driver’s attributes in lambda functions. Each access to page.title or page.url reads the current value from the driver, so the properties stay in sync with the browser.

Summary/Discussion

  • Method 1: Direct Attributes. Quick and straightforward. This is the most common and recommended method due to its simplicity. Each access returns the value at the moment of the call, so if JavaScript changes the title or URL after the initial page load, read the property again once the change has happened.
  • Method 2: JavaScript Execution. Flexible and powerful. Best suited for complex scenarios that require additional JavaScript execution. It depends on JavaScript being enabled in the browser.
  • Method 3: WebDriver Methods. get_title() and get_current_url() do not exist in Selenium’s Python bindings; they mirror the Java methods getTitle() and getCurrentUrl(). In Python, use the direct properties from Method 1.
  • Method 4: Browser History Access. Useful only for specific edge cases. window.history.state is null unless the application calls pushState() or replaceState(), so results depend on how the application manages the browser history.
  • Method 5: Python Properties. A Pythonic wrapper that gives a concise call site. Rarely needed, but it shows the extensibility of Python. The extra abstraction may confuse newcomers.
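When a page changes its title or URL after the initial load, a common approach is to poll until the expected value appears. Selenium ships WebDriverWait together with expected_conditions.title_is for this purpose; the hand-rolled loop below is only a sketch of the idea, and wait_for_title is a hypothetical helper name.

```python
import time


def wait_for_title(driver, expected, timeout=10.0, poll=0.5):
    """Poll driver.title until it equals `expected` or `timeout` seconds pass.

    Returns (title, url) on success and raises TimeoutError otherwise.
    `wait_for_title` is a hypothetical helper, not a Selenium API.
    """
    deadline = time.monotonic() + timeout
    while True:
        title = driver.title  # each read queries the live page
        if title == expected:
            return title, driver.current_url
        if time.monotonic() >= deadline:
            raise TimeoutError(f"title was {title!r}, expected {expected!r}")
        time.sleep(poll)
```

In production code, the built-in WebDriverWait is usually the better choice because it also handles ignored exceptions and configurable polling.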