Learn Python and Selenium test

A quick introduction to using Python with Selenium WebDriver

By Jochen D.

Why Developers Prefer Python for Selenium Test Scripts

Selenium supports multiple languages (Java, C#, Ruby, JavaScript and more), but Python has become a favorite for many test automation developers. There are several reasons why Python is often preferred for writing Selenium scripts:

  • Simple and readable syntax:

    Python's syntax is concise and easy to understand, using common English keywords and minimal boilerplate. This makes Selenium tests in Python easier to write and maintain compared to more verbose languages like Java. Test scripts in Python tend to be shorter and more readable, allowing testers to focus on test logic rather than complex syntax.

  • Rapid development cycle:

    Python is interpreted (no compilation step), which means you can write and run tests faster. The quick edit-run feedback loop speeds up automation development and debugging. Because you don't need to compile, you can iterate on test scripts quickly, making changes and re-running tests in seconds.

  • Large ecosystem and community:

    Python has a vast community and a rich set of libraries. Many testers are already familiar with Python, and newcomers face a gentle learning curve. You can leverage numerous packages for things like data handling, XML/JSON processing and database connections in your tests. The active community also offers plenty of tutorials, documentation and help when writing Selenium tests in Python.

  • Robust Selenium bindings:

    The Selenium WebDriver API in Python is intuitive and well-designed. The Python Selenium bindings offer a simple and convenient API to drive local or remote browsers. For example, starting a browser and locating elements in Python often requires less code than equivalent tasks in some other languages. This simplicity does not come at the cost of functionality: the Python bindings support all core Selenium features.

  • Multiple testing frameworks:

    Python supports a variety of testing frameworks and tools. The built-in unittest (PyUnit) is modeled after JUnit, and powerful third-party frameworks like pytest and behave (for BDD) are available. This flexibility allows teams to choose a framework that fits their style. pytest is extremely popular for Selenium testing because of its simple syntax, fixtures and plugin ecosystem. Few languages offer the breadth of test frameworks that Python does, ranging from unit test frameworks to BDD and data-driven testing frameworks.

  • Easy integration and extensibility:

    Python's dynamic nature and rich library support make it easy to integrate Selenium tests with other tools. You can easily parse CSV/Excel files for test data, send HTTP requests or integrate with APIs within your test code. Many CI/CD systems (Jenkins, GitHub Actions, ...) have strong support for Python, making it straightforward to run Selenium Python tests in pipelines.

In summary, Python's ease of use, readability and strong ecosystem allow developers to create Selenium test scripts efficiently. These factors make it a compelling choice for fast, maintainable test automation projects. Whether you are a seasoned developer or a tester with minimal programming background, Python lowers the barrier to writing effective Selenium tests.

Getting Started with Selenium in Python

Before writing Selenium tests in Python, you need to set up your environment. Follow these steps to get started:

  • Install Python (if not already installed):

    Ensure you have Python 3 installed on your system (Python 2 has reached end of life and should not be used). You can verify by running python --version (or python3 --version on some systems) in a terminal. Many Linux distributions include Python 3 by default; otherwise, download it from the official Python site.

  • Install the Selenium Python library:

    Selenium is available as a Python package. Use pip to install it:

    pip install -U selenium

    This command will download and install the latest Selenium bindings for Python. It's recommended to do this inside a virtual environment (created with venv, for example) to avoid package conflicts. Once installed, you can verify by importing Selenium in a Python REPL.

  • Install a web driver (if needed):

    Selenium works by controlling a browser through a browser-specific driver. For example, to automate Chrome you need the ChromeDriver executable; for Firefox, GeckoDriver; and so on. In the past, you had to manually download these drivers and make sure they were in your system's PATH. However, with modern Selenium (version 4.6 and above), Selenium Manager will automatically download the appropriate driver the first time you instantiate a browser if it's not found. This simplifies setup: you usually don't need to manually manage driver binaries now. (If you're using an older Selenium version or prefer manual setup, download the driver for your browser and place it in your PATH or specify its path in code.)

  • Choose a target browser:

    Selenium supports all major browsers: Chrome, Firefox, Edge, Safari, etc. You should have the browser installed that you want to automate. For Chrome and Edge, ensure they are up-to-date to match the driver (Selenium Manager will fetch a matching driver). Safari comes with a built-in driver on macOS (enabled via Safari's developer settings). For this example, we will use Chrome or Firefox, but you can use any supported browser.
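If you do manage the driver binary yourself, as the manual-setup note above allows, Selenium 4 expects the path to be passed through a Service object. The following is a minimal sketch for Chrome under that assumption. The function name build_chrome, its driver_path parameter and the example path are placeholders invented for illustration, and the Selenium imports sit inside the function only so that the snippet can be loaded without a browser installed.

```python
def build_chrome(driver_path=None):
    """Start Chrome, optionally with a manually downloaded ChromeDriver.

    driver_path is a hypothetical parameter: the location of the
    chromedriver executable. When it is None, locating the driver is
    left to Selenium itself.
    """
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # Pass the executable path only when one was supplied.
    service = Service(executable_path=driver_path) if driver_path else Service()
    return webdriver.Chrome(service=service)


# Usage (placeholder path, adjust to your system):
# driver = build_chrome("/opt/drivers/chromedriver")
# driver.quit()
```

In most setups calling build_chrome() without arguments should behave like plain webdriver.Chrome().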

Once the above is in place, you can start writing Selenium scripts. A quick smoke test to confirm everything is set up: open a Python shell and try the following:

>>> from selenium import webdriver
>>> driver = webdriver.Chrome()  # or webdriver.Firefox()
>>> driver.get("https://www.python.org")
>>> print(driver.title)
>>> driver.quit()

This should launch a browser, navigate to the Python homepage, print the page title (e.g., "Welcome to Python.org") and then close the browser. The call to driver.get() will wait until the page is fully loaded before returning. If this works without errors, your environment is ready for automation. In the next section, we'll create a proper test script using Selenium in Python.

Writing and Running Your First Selenium Python Test

Let's walk through creating and executing a simple Selenium test script in Python. This example will open a browser, navigate to a website, perform a search and verify the results. Finally, we will close the browser. Here's a step-by-step breakdown:

  1. Import the necessary modules:

    At minimum, you need the Selenium webdriver module. You may also import some convenience classes like By for locating elements and Keys for simulating keyboard input.

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    The webdriver module provides the browser driver classes (Chrome, Firefox, etc.), By contains location strategies, and Keys lets you send special keystrokes (like pressing Enter).

  2. Create a WebDriver instance (launch the browser):

    Instantiate a browser driver. For example, to open Chrome:

    driver = webdriver.Chrome()

    This line starts a new Chrome browser window under Selenium's control. If this is the first time running and ChromeDriver isn't found, Selenium Manager will download the appropriate driver automatically. You can similarly use webdriver.Firefox() for Firefox, webdriver.Edge() for Edge, etc. Upon execution, an empty browser window should pop up.

  3. Navigate to a webpage:

    Use the driver.get(url) method to load a webpage. For our test, we'll use Python's official site:

    driver.get("https://www.python.org")

    This will command the browser to load the given URL. Selenium will wait until the page's HTML is fully loaded (the onload event fires) before moving to the next step. After this line, you should see the Python homepage in the browser.

  4. Locate a web element on the page:

    Most tests involve interacting with page elements (like links, buttons, text fields). To do so, we must find the element in the DOM. The Python homepage has a search bar element. We can locate it by its name attribute, which is "q". Selenium provides find_element() for this:

    search_box = driver.find_element(By.NAME, "q")

    Here we used the By.NAME locator strategy to find the first element with name="q". (This is equivalent to the older find_element_by_name("q"), which was deprecated in Selenium 4 and removed in version 4.3 in favor of the By approach.) If the element is found, a WebElement object is returned and stored in search_box. If not, Selenium raises a NoSuchElementException. (We could add an explicit wait here in case the element takes time to appear, but on this page it's immediate.)

  5. Interact with the element:

    Now that we have the search box, we want to perform a search. Common interactions include typing text into input fields and clicking buttons. First, it's good practice to clear any text already present in the input (for example, a pre-filled value or input left over from an earlier step):

    search_box.clear()  # clear any existing text

    Then send keystrokes to it:

    search_box.send_keys("getting started with python")

    This simulates typing the query into the search box. Finally, we need to submit the search. We can either find and click the search button or simply press Enter in the text box. We'll do the latter using Keys.RETURN:

    search_box.send_keys(Keys.RETURN)

    This sends an "Enter" key to the field, causing the form to submit. After this step, the browser will navigate to the search results page.

  6. Verify the result:

    Once the results page loads, we should check that our search worked. One simple validation is to check the title of the results page. For instance:

    assert "Search" in driver.title

    (On Python.org, after searching, the title changes to something containing "Search".) In a real test, you might want to verify that certain expected results appear on the page. You could locate a result element and check its text. For brevity, we'll stick with a title check. Using an assert will raise an AssertionError if the condition is false, which in testing frameworks marks the test as failed. (On Python.org's search, you could also assert that "No results found." is not present in the page source as an indicator of success.)

  7. Close the browser:

    After the test actions and validations, close the browser to clean up. You can call:

    driver.quit()

    This will shut down the entire browser session, closing all tabs and windows opened by the test. Always calling quit() ensures no browser processes are left running in the background. In this simple script, we call it at the end of the script. In a larger test suite, it's often done in a teardown method or fixture (more on that later). Note: Selenium also has driver.close(), which closes only the current window; with a single window open, that usually closes the browser as well. We will use quit() to be sure everything is closed.

Now let's put it all together. Below is the complete Python script for our first test. You can save this to a file (e.g., first_test.py) and run it with python first_test.py:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# 1. Launch browser (Chrome in this example)
driver = webdriver.Chrome()

# 2. Navigate to Python.org
driver.get("https://www.python.org")
assert "Python" in driver.title  # simple check that we're on the right page

# 3. Perform a search for "getting started with python"
search_box = driver.find_element(By.NAME, "q")
search_box.clear()
search_box.send_keys("getting started with python")
search_box.send_keys(Keys.RETURN)

# 4. Verify search results page loaded
assert "Search" in driver.title  # same title check as in step 6

# 5. Print the current URL (just as an example of getting info)
print("Current page URL:", driver.current_url)

# 6. Make sure to close the browser
driver.quit()

When you run this script you should see the Chrome browser open and automate the search on python.org. The script will print the current URL (which will include your query parameters) to the console. All assertions should pass if the page loaded correctly. Finally, the browser will close. You have successfully written and executed your first Selenium test script in Python!

Interacting with Common Web Elements (Buttons, Input Fields, etc.)

Web pages consist of various elements like buttons, links, text fields, checkboxes, dropdown menus, and so on. Selenium WebDriver provides a straightforward interface to interact with these elements once you have located them. Here are some common element interactions and how to perform them in Selenium Python:

  • Clicking buttons and links:

    To click a button or link (or any clickable element), first locate the element (e.g., via By.ID, By.XPATH, etc.), then call its click() method. For example:

    login_button = driver.find_element(By.ID, "login-btn")
    login_button.click()  # simulate a mouse click

    This will click the element just as a user would. Use this for buttons, links (anchor tags), radio buttons, checkboxes (they toggle on click) and other clickable controls.

  • Typing into input fields:

    As seen in the earlier example, you use send_keys() to simulate typing. This works for text fields, text areas, or any element that accepts keyboard input. Consider calling .clear() before send_keys() if the field might contain pre-existing text, such as a default value or old input (placeholder text is not part of the field's value and does not need clearing). Example:

    username_field = driver.find_element(By.NAME, "username")
    username_field.clear()
    username_field.send_keys("my_username")

    You can send not only regular text but also special keys like TAB, ENTER, etc., using the Keys class (e.g., username_field.send_keys(Keys.TAB) to move focus).

  • Handling checkboxes and radio buttons:

    Checkboxes and radios are typically toggled by clicking. Use click() to select or unselect them. It's often useful to check state first:

    agree_checkbox = driver.find_element(By.ID, "agree")
    if not agree_checkbox.is_selected():
        agree_checkbox.click()

    The is_selected() method returns a boolean indicating if the checkbox or radio button is currently checked. The above ensures we only click it if it's not already selected. For radio buttons, usually you just click the one you want (which automatically unselects any other in the same group).

  • Selecting options from dropdowns:

    HTML <select> dropdowns can be manipulated by a special helper class in Selenium. Import Select from selenium.webdriver.support.ui, and initialize it with the dropdown element:

    from selenium.webdriver.support.ui import Select
    country_select = Select(driver.find_element(By.ID, "country"))

    Now you can select by visible text, value, or index:

    country_select.select_by_visible_text("Canada")  # by the text the user sees
    country_select.select_by_value("CA")             # by the option's value attribute
    country_select.select_by_index(2)                # by zero-based position

    Each call selects an option in a different way; in practice you would use only one of them. The first, for instance, chooses the option with the text "Canada" from the dropdown. Selenium's Select class also provides methods to deselect (for multi-select lists) and to get all options. If a dropdown isn't a standard <select> (some web UIs use custom styled divs for dropdowns), you may need to click the element to open the list and then click the desired option like a regular element.

  • Reading element text and attributes:

    To verify outcomes, you often need to retrieve content from the page. Use the text property of a WebElement to get the visible text inside an element. For example:

    welcome_msg = driver.find_element(By.ID, "welcome-banner").text
    print("Banner says:", welcome_msg)

    If the element is something like <div id="welcome-banner">Hello User</div>, the above will print "Hello User". For input fields, element.text may be empty (since inputs don't have inner text). In such cases, retrieve the value attribute: element.get_attribute("value"). Similarly, get_attribute() lets you read any attribute value (e.g., href of a link, src of an image, etc.). This is useful for validations (for example, confirming a certain checkbox's "checked" attribute or a form field's value).

These are the basics of interacting with web elements. In practice, you will chain these actions to navigate through workflows: clicking navigation links, filling forms, submitting them, etc. Selenium will faithfully execute these actions in the browser. Always ensure that the element is present and visible before interacting; if not, you may get exceptions (for example, trying to click a button that is not yet in the DOM or is hidden will cause an error). This is where waits become important.

Finally, remember that all interactions are subject to the browser's state. If an element is disabled (e.g., a button grayed out), click() might not have effect; your script may need to meet whatever conditions enable the element. The Selenium API also provides advanced user interactions (like double-click, drag-and-drop, hover, etc.) via the ActionChains class, but those are beyond the scope of common element interactions.
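The checkbox and value-reading patterns from this section can be wrapped in small helpers. The sketch below is one possible way to do that. The helper names set_checkbox and field_value are made up for this example, and the functions rely only on WebElement members used above (is_selected(), click(), text, get_attribute()) plus the tag_name property.

```python
def set_checkbox(element, checked=True):
    """Click a checkbox only if its state differs from the desired one."""
    if element.is_selected() != checked:
        element.click()


def field_value(element):
    """Return the content a user would see in an element.

    Inputs and text areas keep their content in the value attribute,
    while other elements expose it as visible text.
    """
    if element.tag_name.lower() in ("input", "textarea"):
        return element.get_attribute("value")
    return element.text


# Usage, assuming a live driver and the hypothetical IDs from above:
# set_checkbox(driver.find_element(By.ID, "agree"), checked=True)
# print(field_value(driver.find_element(By.NAME, "username")))
```

Because set_checkbox compares the current state with the desired one first, calling it twice in a row does not toggle the box back.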

Navigating HTML DOM Elements

Finding elements is a critical part of using Selenium. The library offers multiple locator strategies to navigate the DOM (Document Object Model) and pinpoint elements. Here's a quick overview of locator strategies and DOM navigation techniques:

  • By ID:

    If an element has a unique id attribute, this is often the best way to find it. IDs are supposed to be unique on a page, making them a reliable locator. Example: driver.find_element(By.ID, "submit-btn").

  • By Name:

    Uses the name attribute of the element. Common for form fields. Example: driver.find_element(By.NAME, "q"). (Note: if multiple elements share the same name, this will return the first match.)

  • By Class Name:

    Uses the CSS class attribute. Example: driver.find_element(By.CLASS_NAME, "product-title"). This is useful if the element's class is unique, but be cautious: many elements might share the same class, so you might get the first one unexpectedly.

  • By Tag Name:

    Locates elements by their HTML tag (e.g., "h1", "p", "a"). This is not commonly used alone unless you want, say, the first <h1> on the page. More often, it's used in combination (like find a specific container, then find elements by tag within it).

  • By Link Text / Partial Link Text:

    These strategies are specifically for anchor (<a>) tags. By.LINK_TEXT finds a link with exact visible text, and By.PARTIAL_LINK_TEXT finds links that contain the given text. Example: driver.find_element(By.LINK_TEXT, "Sign up") would find <a>Sign up</a> links.

  • By CSS Selector:

    This is a powerful locator using CSS selectors (the same syntax used in stylesheets or JavaScript querySelector). Example: driver.find_element(By.CSS_SELECTOR, "div.content > ul#items li.item"). CSS selectors can target elements by class, ID, attributes, hierarchy, etc. They are often more concise than XPath and can be faster. If you know CSS, this is a very handy way to locate elements.

  • By XPath:

    XPath is a query language for XML/HTML structure. It can locate elements via complex tree relationships. Example: driver.find_element(By.XPATH, "//div[@id='items']/ul/li[1]") to get the first <li> under a <div id="items">. XPath can do things like navigate to parent (..), find by text content, and more. It's extremely powerful but can be verbose, and XPaths that are too tied to page structure might break if the UI changes. Use it when other strategies aren't sufficient.

Often you will use a combination of these. A good practice is to start with the simplest locator that uniquely identifies your element. For instance, prefer an ID or a specific CSS selector if available, as those tend to be stable. Avoid very brittle locators like absolute XPath that depend on specific DOM structure. The more stable your locators, the less often your tests break due to minor UI changes. Choosing the right locator minimizes test maintenance when the application UI changes.

Selenium also provides find_elements(...) (notice the plural) which returns a list of all matching elements, instead of just the first. Use this when you expect multiple matches and need to work with all of them. For example:

rows = driver.find_elements(By.CSS_SELECTOR, "table#results tr")
print("Result count:", len(rows))

This might find multiple table rows. You can iterate over rows list to inspect each, or access specific ones by index. If find_elements finds nothing, it returns an empty list (whereas find_element would throw an exception if nothing is found).
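Because find_elements returns an empty list instead of raising, it is a convenient basis for presence checks. A minimal sketch follows; the helper names is_present and texts_of are invented for illustration, and context can be the driver or any WebElement, since both offer find_elements.

```python
def is_present(context, by, value):
    """Return True if at least one matching element exists right now."""
    return len(context.find_elements(by, value)) > 0


def texts_of(context, by, value):
    """Collect the visible text of every match (empty list if none)."""
    return [element.text for element in context.find_elements(by, value)]


# Usage, assuming a live driver and a hypothetical results table:
# from selenium.webdriver.common.by import By
# if is_present(driver, By.CSS_SELECTOR, "table#results tr"):
#     print(texts_of(driver, By.CSS_SELECTOR, "table#results tr"))
```

Note that is_present checks the page only at the moment of the call; for content that appears with a delay, a wait is still needed.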

Another technique for DOM navigation is to first find a parent (container) element, then search within it. Any WebElement can itself serve as the context for a find. For example:

form = driver.find_element(By.TAG_NAME, "form")
submit_btn = form.find_element(By.TAG_NAME, "button")

Here we found a form element, then searched for a <button> inside that form. This scoping can make locators more specific and avoid picking up elements from elsewhere on the page.

In complex pages, you might have to navigate through parent-child-sibling relationships. While Selenium doesn't provide explicit parent/sibling navigation methods, you can always use a suitable XPath if needed (e.g., element.find_element(By.XPATH, "..") to get the parent of a given element, or using axes in XPath to get siblings).
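The XPath approach to parents and siblings could be sketched as follows. The helper names are hypothetical, and the constant XPATH simply holds the string "xpath", which is the value behind By.XPATH, so the snippet has no Selenium import of its own.

```python
XPATH = "xpath"  # same string as selenium's By.XPATH


def parent_of(element):
    """Return the direct parent of a WebElement."""
    return element.find_element(XPATH, "..")


def next_sibling_of(element):
    """Return the nearest following sibling element."""
    return element.find_element(XPATH, "following-sibling::*[1]")


def previous_sibling_of(element):
    """Return the nearest preceding sibling element."""
    return element.find_element(XPATH, "preceding-sibling::*[1]")


# Usage, assuming a live driver and a hypothetical label element:
# label = driver.find_element(By.CSS_SELECTOR, "label[for='email']")
# field = next_sibling_of(label)
```

Each expression is evaluated relative to the element it is called on, which is why none of them starts with "//".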

Selenium offers flexible ways to locate elements in the DOM. Mastering CSS selectors and XPaths can be very helpful for tricky situations. Start simple, and only use complex locators when necessary.

Handling Waits in Selenium (Implicit and Explicit)

Web pages do not always load instantly or behave predictably. Elements might appear after a delay (due to network latency or asynchronous JavaScript/AJAX loading content). If your Selenium script runs too fast, it may try to interact with an element that isn't yet present or ready, leading to errors. To handle this, Selenium provides wait mechanisms. There are two main types of waits in Selenium Python: implicit waits and explicit waits.

Implicit Wait: An implicit wait tells the WebDriver to poll the DOM for a certain amount of time when trying to find elements, before throwing an exception. You set it once, and it applies to all element searches. For example:

driver = webdriver.Chrome()
driver.implicitly_wait(10)  # seconds

After this, any find_element call will wait up to 10 seconds for the element to appear if it's not immediately available. The driver keeps re-checking the DOM until the element is found or time runs out. If the element is found sooner, the script continues immediately, so you're not always paying the full wait time. Implicit waits are a simple way to ensure your script doesn't fail just because something wasn't instantly available.

Explicit Wait: An explicit wait is a more powerful, condition-based wait. You define a specific condition to wait for, and WebDriver will wait until that condition is satisfied (or the timeout expires). Explicit waits are implemented with the WebDriverWait class in combination with expected conditions (in Python, these live in selenium.webdriver.support.expected_conditions, conventionally imported as EC). WebDriverWait re-checks the condition every 0.5 seconds by default. For example, to wait up to 15 seconds for an element with ID "result" to become visible:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

WebDriverWait(driver, 15).until(
    EC.visibility_of_element_located((By.ID, "result"))
)

This will pause execution until the condition is met; in this case, until the element is found in the DOM and is visible on the page. If the condition is met before 15 seconds, the wait unblocks and the script continues. If 15 seconds pass without the condition being true, a TimeoutException is thrown.

Selenium comes with a set of common expected conditions you can use, such as:

  • presence_of_element_located: the element is in the DOM (may or may not be visible).
  • visibility_of_element_located: the element is in the DOM and visible (width and height > 0).
  • element_to_be_clickable: the element is visible and enabled such that you can click it.
  • text_to_be_present_in_element: the given text appears in an element's text.
  • alert_is_present: an alert dialog is present (useful before switching to the alert).

Using explicit waits thus allows you to wait for specific states or conditions, which is very powerful for handling dynamic content.
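Beyond the built-in conditions, until() accepts any callable that takes the driver and returns a truthy value once the wait should end. The sketch below builds such a custom condition; the name element_count_at_least and the table selector in the usage comment are assumptions made for this example.

```python
def element_count_at_least(by, value, minimum):
    """Build a wait condition that holds once `minimum` matches exist."""

    def condition(driver):
        elements = driver.find_elements(by, value)
        # A falsy result makes the wait poll again; a truthy result
        # ends the wait and is handed back to the caller of until().
        return elements if len(elements) >= minimum else False

    return condition


# Usage, assuming a live driver and a hypothetical results table:
# from selenium.webdriver.common.by import By
# from selenium.webdriver.support.ui import WebDriverWait
# rows = WebDriverWait(driver, 10).until(
#     element_count_at_least(By.CSS_SELECTOR, "table#results tr", 3)
# )
```

Returning the list rather than True lets the caller use the found elements directly after the wait.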

Example of Explicit Wait: Suppose after clicking a "Submit" button, a result message appears on the page but could take a few seconds. We can do:

submit_btn = driver.find_element(By.ID, "submit")
submit_btn.click()

# Wait for the result message to be visible
message = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.ID, "result-msg"))
)
print("Result message text:", message.text)

In this snippet, after clicking, we wait up to 10 seconds for an element with id "result-msg" to become visible. Only then do we print its text. This ensures we don't attempt to read text from an element that isn't there yet.

Avoiding sleep: A bad practice is to use hard-coded delays like time.sleep(5). This will always wait 5 seconds whether or not it was necessary, slowing down tests unnecessarily and still potentially being too short or too long in some cases. Implicit and explicit waits are smarter: waiting only as long as needed up to the timeout, and no longer. They make tests more efficient and reliable. In general, prefer using waits over fixed sleeps for synchronization in Selenium tests.
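To make the difference with a fixed sleep concrete, here is a conceptual polling loop in plain Python. It is only an illustration of the idea behind condition-based waiting, not Selenium's actual implementation, and the names wait_until, predicate and poll_interval are chosen for this sketch.

```python
import time


def wait_until(predicate, timeout=10.0, poll_interval=0.5):
    """Call predicate() repeatedly until it is truthy or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            # Condition met: return at once instead of sleeping further.
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulated condition that becomes true on the third check.
checks = {"count": 0}


def ready():
    checks["count"] += 1
    return checks["count"] >= 3


wait_until(ready, timeout=5, poll_interval=0.01)
print("checks needed:", checks["count"])
```

A time.sleep(5) in the same situation would always cost five seconds, whereas the loop above ends after the third check.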

A note on mixing waits: You can technically use both implicit and explicit waits in the same script, but Selenium's documentation warns that mixing them can cause unpredictable wait times. For example, with an implicit wait of 10 seconds and an explicit wait of 15 seconds, a timeout could occur only after 20 seconds. In practice, it's often best to use explicit waits for fine-grained conditions and keep implicit waits either off or at a modest baseline for all finds. Many experienced testers set the implicit wait to 0 and rely solely on explicit waits where needed, to avoid any ambiguity. The approach depends on your testing needs.

In summary, use implicit waits to globally set a fallback wait for element searches, and explicit waits to wait for specific things to happen (an element to be clickable, a title to change, etc.). Proper use of waits will make your Selenium tests resilient to minor performance variations and ensure they don't fail just because something was a second late to load. It's an essential part of reliable test automation.

Handling Alerts and Pop-ups

Web applications may present modal dialog boxes: for example, an alert saying "Are you sure you want to delete?" or a prompt asking for input. These are browser native dialogs triggered by JavaScript functions like alert(), confirm(), or prompt(). In Selenium, such dialogs are handled via the Alert interface.

When an alert pops up, it blocks interaction with the page until dismissed. Selenium provides driver.switch_to.alert to access the alert. This does not switch windows (alerts aren't windows); it gives you an alert object representing the open dialog.

Typical usage:

alert = driver.switch_to.alert

Once you have this alert object, you can:

  • Read the alert text: alert.text will contain the message shown in the alert.
  • Accept the alert: alert.accept() corresponds to clicking "OK" or "Yes".
  • Dismiss the alert: alert.dismiss() corresponds to clicking "Cancel" or closing the alert via [X]. (For simple alerts, accept and dismiss do the same, but for confirmation dialogs, accept vs. dismiss matters.)
  • Send input to a prompt: If the alert is a prompt (an input dialog), you can use alert.send_keys("some text") to enter text into the prompt, then usually call accept() to submit it.

Here's an example workflow:

# Assume clicking delete triggers a confirmation dialog
delete_button = driver.find_element(By.ID, "delete")
delete_button.click()

# Switch to the alert
alert = driver.switch_to.alert
print("Alert says:", alert.text)  # print the alert message
alert.accept()  # click OK on the alert

In the case of a confirmation, accept() typically means "Yes, proceed" whereas dismiss() means "Cancel". For a simple information alert, accept() just closes it (since there is usually only an OK button). For a prompt, you might do:

alert = driver.switch_to.alert
alert.send_keys("John Doe")
alert.accept()

This enters text into the prompt and presses OK.

Once an alert is handled (accepted/dismissed), Selenium will resume normal operation on the page. If an alert was expected to appear but didn't, calling driver.switch_to.alert will throw an exception (NoAlertPresentException). Conversely, if an alert is open and you try to do any other driver.find_element without handling the alert, Selenium will raise an UnexpectedAlertPresentException. Thus, your script should handle alerts promptly when they are expected.

Pop-up windows vs alerts: Note that here we're talking about browser dialogs. If by "pop-up" we mean a new browser window or an HTML-based modal (like a bootstrap modal dialog), those are handled differently:

  • A new browser window is handled via switch_to.window (as discussed in the windows section).
  • An HTML modal that's part of the DOM is not an alert; it's just a regular element (perhaps with a backdrop). You would interact with it like any other element (maybe it's a <div> that becomes visible). For such modals, you might have to wait until they are visible and then click buttons within them.

But for true browser alerts/confirmation dialogs, switch_to.alert is the way. Always call either .accept() or .dismiss() on the alert to close it (or .send_keys() first if needed). For test verification, you might also assert the alert.text to ensure the message is what you expect.

Example use case: After submitting a form, the site might show "Thank you!" in an alert box. Your test should handle it like:

alert = driver.switch_to.alert
assert "Thank you" in alert.text
alert.accept()

This verifies the alert content and closes it.

Selenium's alert handling works uniformly across browsers. Just be mindful that alerts are modal. By managing them as shown, you can automate tests that involve confirmation steps or alert messages without issues. Once handled, you can continue with the rest of your test script seamlessly.
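The read, verify and close steps from this section can be combined into one helper. This is a sketch under assumptions: the function name respond_to_dialog and its parameters are invented for illustration, and it uses only the alert members described above (text, send_keys(), accept(), dismiss()).

```python
def respond_to_dialog(driver, expected_text=None, prompt_text=None, accept=True):
    """Check an open JavaScript dialog, answer it and return its message."""
    alert = driver.switch_to.alert
    message = alert.text
    if expected_text is not None and expected_text not in message:
        # Close the dialog first so it does not block later steps.
        alert.dismiss()
        raise AssertionError(f"unexpected dialog text: {message!r}")
    if prompt_text is not None:
        alert.send_keys(prompt_text)  # only meaningful for prompt() dialogs
    if accept:
        alert.accept()
    else:
        alert.dismiss()
    return message


# Usage, assuming a live driver with a dialog currently open:
# respond_to_dialog(driver, expected_text="Thank you")
# respond_to_dialog(driver, prompt_text="John Doe")
# respond_to_dialog(driver, accept=False)  # choose "Cancel"
```

Dismissing the dialog before raising is a design choice of this sketch, so that a failed check does not leave the page blocked for whatever cleanup runs next.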

Cleanup and Teardown in Selenium Scripts

Proper cleanup of resources is an important part of any test script. For Selenium tests, this primarily means making sure the browser is closed at the end of each test run. Failing to close the browser can lead to many orphan browser instances consuming memory and affecting subsequent tests or system performance.

Closing vs Quitting: Selenium provides two methods: close() and quit().

  • driver.quit(): Quits the entire browser session, closing all windows/tabs opened by the WebDriver. This ends the WebDriver session gracefully.
  • driver.close(): Closes the current window. If it's the last window, some browsers may end the session, but not always reliably if multiple windows were open. Essentially, close() is like clicking the "X" on the current window, whereas quit() will terminate the browser process entirely.

Best practice is to use quit() in test teardown to ensure no stray processes remain. If you only ever open one window per test, close() and quit() have similar effect, but using quit() explicitly communicates that you want the browser gone.

Teardown in scripts: In a simple script, you can just call driver.quit() at the end as we did in the first test example. If your script might raise exceptions or errors before the end, it's wise to ensure quit() still gets called. This is often done with a try/finally block:

driver = webdriver.Chrome()
try:
    # ... do test steps
    pass
finally:
    driver.quit()

By placing driver.quit() in the finally clause, you guarantee it runs regardless of whether the test steps succeeded or an error occurred.
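The same guarantee can be packaged as a context manager so that each script does not have to repeat the try/finally. This is a minimal sketch: managed_driver is a hypothetical helper name, not part of Selenium, and it only assumes that the factory returns an object with a quit() method.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always quit it afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs whether the with-block finished normally or raised
        driver.quit()


# Usage (assumes Selenium and a local Chrome are installed):
#   from selenium import webdriver
#   with managed_driver(webdriver.Chrome) as driver:
#       driver.get("https://example.com")
```

Passing the factory rather than a ready-made driver keeps the creation step inside the helper, so nothing is left open between creating the browser and entering the try block.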

Teardown in unittest/pytest frameworks: Test frameworks provide hooks for setup and teardown.

  • In unittest, you can implement the tearDown(self) method in your TestCase class. This method is run after each test method. Typically, you would put self.driver.quit() here to close the browser no matter if the test passed or failed.
  • In pytest, if you use a fixture for the WebDriver, you would quit in the teardown part of the fixture (as shown in the pytest section below). If you use the pytest-selenium plugin's built-in fixture, it will handle the teardown for you automatically.

Other cleanup: Aside from the browser itself, consider any other cleanup. For example, if your test created or downloaded files, you may want to delete them. Or if your test added some data into the system (via the UI), you might need to reset the state (though ideally test environment should be isolated or automatically cleaned between runs). Those aspects depend on the application under test. From Selenium's perspective, the main resource to manage is the WebDriver instance (the browser).

Also, if your test starts any supporting services (a mock server, for example), make sure those are torn down too. Many UI tests don't need this, but it's something to keep in mind.
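For files created during a test, one option is to keep them in a temporary directory that Python removes for you. The sketch below uses only the standard library. The helper names and the file report.csv are made up for illustration, and pointing the browser's download folder at this directory is left out because it depends on the browser options you use.

```python
import tempfile
from pathlib import Path


def run_with_scratch_dir(test_steps):
    """Run test_steps(scratch_dir); the directory is deleted when the
    with-block exits, whether or not test_steps raised."""
    with tempfile.TemporaryDirectory() as tmp:
        scratch_dir = Path(tmp)
        test_steps(scratch_dir)
    return scratch_dir


def fake_download(scratch_dir):
    # Stand-in for a real download triggered through the browser
    (scratch_dir / "report.csv").write_text("id,name\n1,example\n")


leftover = run_with_scratch_dir(fake_download)
print(leftover.exists())  # False
```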

Session timeouts: If you forget to quit drivers, they might linger until the session times out on the server side. For local browsers, that means the browser stays open indefinitely. For cloud/grid providers, there might be a default timeout that eventually closes it. Relying on that is not a good idea; always quit explicitly to free resources immediately.

Example (unittest teardown):

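import unittest
from selenium import webdriver

class MyTest(unittest.TestCase):
    def setUp(self):
        # Start a fresh browser before each test
        self.driver = webdriver.Chrome()

    def test_something(self):
        self.driver.get("http://example.com")
        # ... test steps ...

    def tearDown(self):
        # Runs after each test, whether it passed or failed
        self.driver.quit()

if __name__ == "__main__":
    unittest.main()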

This ensures each test opens a fresh browser and closes it on completion or error.

In summary, always clean up your Selenium sessions. It makes your tests more reliable (no interference between tests from leftover windows) and conserves system resources. The overhead of launching a new browser for each test might seem high, but it's the safest approach for test isolation. And if you have a lot of tests, you can mitigate the time cost by running tests in parallel (discussed later). Cleanup and teardown are an integral part of robust test automation.

Integrating Selenium with pytest

pytest is a very popular testing framework in Python, known for its simple syntax and powerful features like fixtures and plugins. Selenium tests can be written in pytest in a more compact style than unittest. Let's explore how to use Selenium with pytest.

One of the most useful features of pytest is fixtures. Fixtures are functions that set up some state and then yield or return it for use in tests; they can include teardown code as well. We can use a fixture to manage the WebDriver setup and teardown.

Here's an example of a Selenium test using pytest with a fixture:

import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By

@pytest.fixture(scope="function")
def driver():
    # Setup: start browser
    driver = webdriver.Chrome()
    driver.implicitly_wait(5)
    yield driver
    # Teardown: quit browser
    driver.quit()

def test_google_search(driver):
    driver.get("https://www.google.com")
    assert "Google" in driver.title
    
    search_box = driver.find_element(By.NAME, "q")
    search_box.send_keys("pytest selenium")
    search_box.submit()
    
    # Example assertion: the results page title contains our query
    assert "pytest selenium" in driver.title.lower()

Explanation:

  • We defined a fixture driver() with @pytest.fixture. The scope="function" means it will run for each test function. In the fixture, we instantiate the Chrome WebDriver and set an implicit wait. The yield driver line returns the driver to the test, and then after the test function finishes, execution resumes after the yield, where we call driver.quit() to clean up. This is a typical pattern for managing resources in pytest fixtures.
  • The test function test_google_search has an argument driver. This tells pytest to use the driver fixture and pass its result into the test. Pytest will automatically call the fixture before the test, give the test the WebDriver, and then do the teardown after the test.
  • Inside the test, we use the driver just as we did before. We assert that "Google" is in the title for the homepage, then perform a search and assert the title of the result page contains our search terms.
  • We used plain assert statements. In pytest, a failing assert will show an introspection of the values, making it easy to see what went wrong.

To run this test, execute pytest in the terminal (or pytest -v for verbose output). By default, pytest collects functions whose names start with test_ from files named test_*.py or *_test.py. It will set up the driver fixture for the test_google_search function, run the test, then tear down the fixture.

Pytest's approach can be more concise than unittest. We don't need to create classes or write setUp/tearDown boilerplate. (Pytest can also run unittest TestCase classes, but here we're using its function style.) This often results in less code and clearly separated setup logic (in fixtures) versus test logic.

Advantages of pytest for Selenium:

  • Fixtures: Reusable setup/teardown code. You can define a fixture for driver as above, and potentially have variants (e.g., a fixture that starts Chrome vs one for Firefox, or a fixture that logs in before yielding the driver, etc.). Fixtures can also be parameterized (to easily run the same test on multiple browsers, for instance).
  • Assertion introspection: As mentioned, assert in pytest is very informative on failure. No need to call specific self.assert* methods.
  • Test discovery and organization: Pytest will find any test functions (or methods in classes named Test*) automatically. You can simply write tests without needing a main block. Pytest also has rich options for selecting tests, skipping, etc.
  • Plugins: There are many plugins, including pytest-html for reports, pytest-xdist for running tests in parallel, and pytest-selenium which offers Selenium-specific features like a built-in driver fixture.

Parallel execution: Pytest can execute tests in parallel (with the help of the pytest-xdist plugin). For example, pytest -n 4 will run tests across 4 processes. If your tests are written to be independent (which they should be), this can speed up your suite significantly. When using Selenium against a local browser, parallel execution might require multiple browser instances or setting up Selenium Grid. But if you use a cloud service like TestingBot or others, they handle the parallelism in their cloud. Pytest will simply launch multiple tests which each invoke the driver fixture – if that fixture is set to use a remote WebDriver, tests will run concurrently in the cloud.

pytest-selenium plugin: It's worth noting there's a plugin pytest-selenium that can simplify Selenium usage. For example, it provides a pre-defined selenium fixture so you don't have to write one, and it allows specifying browser via command-line options (e.g., --driver Chrome or --driver Firefox) and remote servers like TestingBot or Sauce via options. Using this plugin, your test could just have def test_example(selenium): and use the provided driver. However, using the plugin is optional – many people write their own fixtures as we did above for full control.

Parameterizing tests: Pytest allows parameterizing tests, which can be useful for Selenium, e.g., running the same test function for multiple user roles or multiple browsers. A simple example:

@pytest.mark.parametrize("query", ["Selenium", "pytest", "TestingBot"])
def test_google_search_multiple(driver, query):
    driver.get("https://www.google.com")
    search_box = driver.find_element(By.NAME, "q")
    search_box.send_keys(query)
    search_box.submit()
    assert query.lower() in driver.title.lower()

This would run the test three times, with query values "Selenium", "pytest", and "TestingBot". Parameterization combined with a fixture that can choose browsers could let you cover a matrix of scenarios easily.

In summary, integrating Selenium with pytest can lead to more concise and powerful test code. The use of fixtures for WebDriver management is a clean way to ensure setup/teardown, and pytest's features (assertions, plugins, parameterization) provide a lot of flexibility for scaling your test suite. Both unittest and pytest are viable; choosing between them often comes down to team preference and specific needs. Pytest tends to be favored for larger projects due to its flexibility and wealth of plugins (for example, integrating with TestingBot as we'll see later, or generating reports).

Best Practices for Using Selenium with Python

After understanding the basics, it's important to follow best practices to make your Selenium test automation robust, maintainable, and efficient. Here are some key best practices for Selenium with Python:

  • Avoid hard-coded sleeps; use waits effectively:

    Instead of using time.sleep() for arbitrary delays (which can slow tests or be insufficient), use implicit or explicit waits to synchronize with the page's behavior. This ensures your test waits just the right amount of time. Fixed sleeps are brittle and should generally be avoided in favor of Selenium's wait mechanisms.

  • Choose stable locators (use IDs or reliable attributes):

    When locating elements, prefer unique IDs or stable selectors. Avoid overly fragile locators like absolute XPath that depend on specific DOM structure. If IDs aren't available, consider using data-* attributes that developers add specifically for testing hooks. The more stable your locators, the less often your tests break due to minor UI changes.

  • Use the Page Object Model (POM) design pattern:

    As your test suite grows, consider structuring your code using Page Objects. In POM, you create classes representing pages or components, encapsulating locator definitions and interactions in those classes. Your test then calls methods on page objects (like LoginPage.login(username, password)) instead of directly using find_element in every test. This reduces duplication and makes tests more readable and maintainable. POM is a widely recommended best practice for Selenium automation for larger projects.

  • Keep tests independent and idempotent:

    Each test should be able to run on its own and not depend on the state left by another test. This means if Test A needs a user to be logged in, it should handle that itself (or via a setup fixture), not rely on Test B having logged in beforehand. Isolation enables you to run tests in any order or in parallel without flakiness. Reset the application state between tests if necessary (e.g., by using test accounts or resetting test data in the backend).

  • Use fixtures/setup wisely for repetitive actions:

    If every test needs to do certain things (like launch the browser, log into the application), factor that into a setup method or pytest fixture. This avoids code duplication and makes maintenance easier. However, avoid doing too much in a shared setup if not all tests need it, as that can unnecessarily slow down tests that don't require that setup.

  • Maximize test speed with parallel execution and selective testing:

    As you accumulate many tests, execution time can grow. Use parallel execution to run tests concurrently (for example, pytest with -n option, or multiple CI jobs). Design tests to run in parallel by avoiding shared state. Additionally, organize tests and use markers/tags so you can run a subset easily (e.g., smoke tests vs full regression) as needed.

  • Run tests on multiple browsers/environments:

    A big benefit of Selenium is cross-browser testing. Take advantage of this by running your test suite against different browser types and versions (Chrome, Firefox, Edge, Safari) and, if needed, different screen resolutions or mobile emulations. You can integrate with cloud services (like TestingBot, BrowserStack, Sauce Labs, etc.) to access many browser/OS combinations. This helps catch compatibility issues early.

  • Leverage headless mode for CI/CD:

    For running tests in continuous integration (where no GUI is present), use headless mode for browsers (Chrome and Firefox support headless mode via options). Headless browsers run without a visible UI and are typically faster and use fewer resources. Use them for CI runs to speed things up and avoid needing a display. But also periodically run with real browsers to ensure the headless behavior matches real user experience.

  • Capture logs and screenshots on failure:

    When a test fails, it helps to have information for debugging. Implement hooks to take a screenshot of the browser page on failure (most frameworks allow hooking into test failures). Selenium's driver.save_screenshot("name.png") can save the current view. Also consider capturing browser console logs or page HTML for analysis. Some cloud grids like TestingBot automatically capture screenshots, videos, and logs for you. In any case, having these artifacts can dramatically speed up diagnosing test failures.

  • Regularly update Selenium and drivers:

    Keep your Selenium library and browser drivers up to date. Browsers frequently update, and newer Selenium versions contain fixes and improvements (like the Selenium Manager for drivers). Using outdated drivers can cause unexpected failures, especially if the browser auto-updated. If using Selenium Manager, it will handle driver updates; otherwise, have a process to update driver binaries. Also keep your browser versions in sync with what you test (don't test on a very old browser unless that's a requirement).

  • Use test data and secrets securely:

    If your tests require credentials or API keys, don't hardcode them in the code. Use configuration files (that aren't committed to VCS) or environment variables to supply them. For example, if testing login, read the username/password from a secure source. This is especially important for CI pipelines (most CI systems allow storing secrets). In the context of Selenium, one example is not hardcoding TestingBot API keys in your script; instead, read them from environment variables or a config file (which we'll discuss in the TestingBot section).

Incorporating these best practices will make your test suite more reliable and easier to maintain. For instance, using explicit waits and stable locators reduces flakiness (tests failing due to timing or minor DOM tweaks). The Page Object Model will pay off as your tests grow, by centralizing page-specific logic. Running tests in parallel and in a realistic variety of environments (browsers/devices) will give faster feedback and more confidence in your web application's quality.
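As an illustration of the Page Object Model mentioned above, here is a minimal sketch of a page object for a hypothetical login page. The element IDs and the CSS selector are assumptions, not taken from a real site. To keep the snippet free of imports, the locators use the plain strings behind Selenium's By.ID and By.CSS_SELECTOR; in a real suite you would normally write By.ID and By.CSS_SELECTOR instead.

```python
class LoginPage:
    """Page object for a hypothetical login page."""

    # Locators are (strategy, value) tuples. "id" and "css selector"
    # are the string values of By.ID and By.CSS_SELECTOR.
    USERNAME = ("id", "username")
    PASSWORD = ("id", "password")
    SUBMIT = ("css selector", "button[type='submit']")

    def __init__(self, driver):
        self.driver = driver

    def login(self, username, password):
        """Fill in both fields and submit the form."""
        self.driver.find_element(*self.USERNAME).send_keys(username)
        self.driver.find_element(*self.PASSWORD).send_keys(password)
        self.driver.find_element(*self.SUBMIT).click()


# Usage in a test (driver is a Selenium WebDriver):
#   LoginPage(driver).login("alice", "s3cret")
```

If the login form changes, only the three locator constants need updating; the tests that call login() stay the same.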

Always review failing tests carefully. Sometimes failures indicate a real bug in the application (which is good, your tests caught it!). Other times, failures might indicate a test issue (timing, bad locator, etc.). Continuously refine your tests to eliminate false failures. A stable, fast, and comprehensive Selenium test suite is a huge asset for any web development team.
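To make failures easier to review, a small helper can save a screenshot named after the failing test. This is a sketch under assumptions: save_failure_screenshot and the screenshots folder are made-up names, and the only Selenium call it relies on is driver.save_screenshot(), which was mentioned above.

```python
import re
from pathlib import Path


def save_failure_screenshot(driver, test_name, directory="screenshots"):
    """Save a screenshot named after the test and return its path."""
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)
    # Replace characters that are awkward in file names
    safe_name = re.sub(r"[^A-Za-z0-9_.-]", "_", test_name)
    path = folder / f"{safe_name}.png"
    driver.save_screenshot(str(path))
    return path


# Usage inside a test (driver is a Selenium WebDriver):
#   try:
#       ...  # test steps
#   except Exception:
#       save_failure_screenshot(driver, "test_login")
#       raise
```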

Using Selenium Python with TestingBot

TestingBot is a cloud-based Selenium Grid that allows you to run your tests on a wide range of browsers and devices. Instead of running tests on your local machine's browser, you can use TestingBot to execute them on different operating systems, various browser versions (including legacy ones), and mobile devices – all in the cloud. This is especially useful for cross-browser testing and for offloading the heavy lifting from your machine. Let's go through how to integrate Selenium Python tests with TestingBot.

Setup with TestingBot WebDriver Grid

First, you need a TestingBot account (sign up on their website). Once you have an account, you'll be provided with an API Key and Secret. These credentials are used to authenticate your Selenium sessions with their cloud.

On the coding side, you'll use Selenium's Remote WebDriver to connect to TestingBot's grid. No special Selenium library is needed beyond the standard selenium package, but TestingBot provides a helper client library (testingbotclient) that can be installed via pip if you want to use their REST API (for marking tests as passed/failed, etc.). This is optional for basic usage. At minimum, ensure you have:

  • selenium installed (as we did earlier).
  • Your TestingBot API key and secret on hand.

TestingBot's hub (grid) URL is: https://hub.testingbot.com/wd/hub. You will connect to this using your key and secret for authentication.

Authenticating with API key/secret

There are a couple of ways to supply your credentials:

  • In the URL:

    The simplest (though not the most secure) way is to embed them in the URL, for example https://KEY:SECRET@hub.testingbot.com/wd/hub, replacing KEY and SECRET with your values. This format passes basic auth to the Selenium hub.

  • Using environment variables or config:

    TestingBot lets you set TESTINGBOT_KEY and TESTINGBOT_SECRET as environment variables instead of putting them in the URL. If you use their testingbotclient or certain integrations, these env vars will be picked up. You can also use a .testingbot config file to store credentials. The benefit of this approach is that the secret does not appear in your code or repository. For instance, you might export these variables in your shell or CI environment and have your test code read them at runtime to build the hub URL or to pass them as capabilities. The variables live on the machine running the tests; the TestingBot server cannot read them, so your code (or an integration) still has to send the credentials.

For initial simplicity, we'll show the URL method, but be mindful of security. In a real project, prefer using environment variables or a secure storage for the credentials, especially if your tests are in a public repository.

Running tests on real browsers/devices

To use TestingBot, you create a Remote WebDriver instead of a local one. You also specify desired capabilities indicating what browser/OS you want.

Here's a code example that opens a Chrome browser on Windows 10 via TestingBot:

from selenium import webdriver

# Browser options carry the capabilities for the remote session
options = webdriver.ChromeOptions()
options.platform_name = "WIN10"      # sent as the platformName capability
options.browser_version = "latest"   # sent as the browserVersion capability

driver = webdriver.Remote(
    command_executor="https://YOUR_KEY:YOUR_SECRET@hub.testingbot.com/wd/hub",
    options=options
)

try:
    driver.get("https://www.google.com")
    print(driver.title)  # Should print "Google"
finally:
    driver.quit()

This snippet sends a request to TestingBot's cloud to start a new session with the specified capabilities. In Selenium 4, capabilities travel in a browser options object; the desired_capabilities argument used by older scripts was deprecated and then removed in Selenium 4.10. platform_name and browser_version are sent as the W3C capabilities platformName and browserVersion (older examples call them platform and version), and ChromeOptions supplies browserName "chrome". Here, platform "WIN10" with version "latest" picks the latest Chrome on a Windows 10 machine in their cloud. You could request a specific version (like "90" to get Chrome 90), another OS such as macOS, or specific mobile devices (with additional Appium capabilities for iOS/Android tests, though that ventures into mobile automation).

When this runs, your code will block until the remote browser is up and has navigated to Google. It's just like a local driver, except the actions happen on a remote browser. From your perspective, you still interact with driver the same way. The results (page title in this case) are returned to you. In a sense, it feels like a slightly slower local browser.

You can specify many different environments. For example:

  • Firefox on Windows: platform "WINDOWS", browser name "firefox", version "latest-1" (latest-1 requests the release just before the latest one).
  • Safari on macOS: platform "MAC", browser name "safari", version "latest".
  • Edge on Windows 10: platform "WIN10", browser name "edge", version "latest".

TestingBot's documentation will have a full list of options for platform names and browser versions. They often support legacy versions too, which is useful for ensuring compatibility.

One handy trick: you can name your tests and mark their success in the TestingBot dashboard. In our simple example, the test will show up with an ID. If you install testingbotclient, you can use it in tearDown to label the test and whether it passed. For example, in unittest, you could do:

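from testingbotclient import TestingBotClient

# in tearDown, before self.driver.quit():
test_passed = True  # replace with the real outcome of the test
tb = TestingBotClient('YOUR_KEY', 'YOUR_SECRET')
tb.tests.update_test(self.driver.session_id, name=self._testMethodName, passed=test_passed)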

This would set the test name and status on their platform. This is optional, but helpful for tracking. If you skip it, you can still see video and logs of the session, but the dashboard will not know whether your test passed or failed.

Running tests in parallel

One of the advantages of a cloud grid like TestingBot is easy parallelization. Instead of running tests sequentially on one machine, you can run multiple tests at the same time on different cloud machines. TestingBot allows a certain number of parallel sessions (depending on your subscription). For example, if your plan allows 4 parallel sessions, you could run 4 browsers simultaneously.

To run tests in parallel with TestingBot, you can simply launch multiple WebDriver sessions concurrently. How to do this depends on your test runner:

  • With pytest, you can use pytest -n (via xdist plugin) to distribute tests across processes. Each test that runs will invoke a separate webdriver.Remote and TestingBot will handle them in parallel on their side. For instance, pytest -n 3 could start 3 tests at once, each opening a browser in the cloud. Just ensure each test uses its own driver (which is the case if you use the fixture pattern or the pytest-selenium plugin with properly isolated contexts).
  • With unittest, parallelization isn't built in, but you could use unittest-parallel, run multiple processes via CI, or run the same tests under pytest (which can run unittest tests too). Alternatively, you could manage threads or multiprocessing in code to start different tests. However, using a framework feature (like pytest-xdist) is simpler.
  • You might also run tests in parallel by dividing them across multiple CI jobs, each running a subset on TestingBot (this is a coarse-grained parallelism, but effective if you have CI resources).

TestingBot's cloud can handle multiple sessions, so as long as your tests create separate WebDriver instances, they can run in parallel. Each WebDriver instance communicates with the TestingBot hub independently. If you start two drivers at the same time (with correct credentials), you'll get two browser instances running concurrently in the cloud.
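If you are not using a test runner's parallel mode, the same idea can be sketched with the standard library's thread pool. The helper names below are hypothetical. make_driver stands for any function that returns a new WebDriver (for example one that calls webdriver.Remote with your hub URL), and each task creates and quits its own driver so that no session is shared between threads.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_title(make_driver, url):
    """Open url in a fresh driver, return the page title, always quit."""
    driver = make_driver()
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()


def fetch_titles_in_parallel(make_driver, urls, max_workers=2):
    """Run fetch_title for every URL, each in its own session."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as urls
        return list(pool.map(lambda url: fetch_title(make_driver, url), urls))
```

Keep max_workers at or below the number of parallel sessions your plan allows, otherwise the extra sessions will queue or be rejected.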

This allows you to drastically cut down test suite execution time, especially for large suites. For example, 10 tests that each take 1 minute could run in ~2 minutes if you run 5 in parallel at a time (assuming you have enough parallel slots).
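The estimate above is the number of batches multiplied by the duration of one test. Here is a small sketch of that arithmetic, assuming every test takes the same time and ignoring session start-up overhead (estimated_minutes is a made-up helper name):

```python
import math


def estimated_minutes(num_tests, minutes_per_test, parallel_slots):
    """Idealized wall-clock time when tests run in full batches."""
    batches = math.ceil(num_tests / parallel_slots)
    return batches * minutes_per_test


print(estimated_minutes(10, 1, 1))  # 10 (one at a time)
print(estimated_minutes(10, 1, 5))  # 2 (five at a time)
```

Real runs will take somewhat longer, since starting each remote browser adds its own delay.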

Make sure your tests do not interfere with each other (they usually won't if they operate on different user accounts or data sets). Also, watch out for hitting any application rate limits or quotas if you slam the app with many simultaneous sessions.

TestingBot provides features like video recording, screenshots and logs for each session. When running in parallel, it's a good practice to give each test a name (as mentioned above) so you can easily identify which session corresponds to which test in their dashboard. This is invaluable for debugging if something fails on a specific browser.

An important best practice when using a service like TestingBot: clean up your sessions. Always call driver.quit() in a finally block or teardown. If you abort tests and leave sessions hanging, they will eventually time out on TestingBot, but until then they occupy slots in your parallel quota. Proper teardown frees up the slot for another test quickly.

Using Selenium with TestingBot is mostly about switching from local webdriver.Chrome() to webdriver.Remote(...) with the right capabilities and credentials. Once that's done, everything else in your test logic can remain the same. You gain the ability to run on many browser/OS combinations and to scale out your tests in parallel, which can significantly enhance your test coverage and speed. It's a powerful addition to your automation toolbox, especially for projects where cross-browser compatibility is important.
