In this article, we'll focus on detecting and fixing Flaky Tests. A flaky test, also known as a "flaky test case" or "unstable test," is a test that produces inconsistent or unpredictable results across multiple test executions, even when the tested code hasn't changed.
Let's say you have a Selenium, Appium, Puppeteer or Playwright test that runs successfully most of the time. But 10% of the time, it fails for one or more specific reasons.
Flaky tests are bad for your CI/CD pipeline, as they might prevent you from deploying new code. Below we'll provide some help in fixing and avoiding flaky tests.
What can cause flaky tests?
Flaky tests can arise for various reasons; the most common ones are listed below.
- Timing Issues: Tests that depend on specific timing conditions, such as network latency or UI element loading, can be sensitive to small timing variations, causing them to pass or fail intermittently.
- Concurrency and Parallelism: Tests that are not properly isolated or synchronised can interfere with each other when executed in parallel, leading to flaky/inconsistent outcomes.
- External Dependencies: Tests that rely on external services or resources, such as databases or APIs, can fail if those dependencies are unavailable or behave unexpectedly. For example, a HTTP call might time out, there may be TCP issues/delays or a database might be slower than usual.
- Race Conditions: If multiple threads or processes interact with shared resources, or multiple tests run at the same time, race conditions might occur. A race condition happens when more than one executor wants to access the same resource at the same time. This might lead to incorrect results, deadlocks or other problems in your tests.
- Data Fluctuations: Tests that use dynamically generated data, such as timestamps or random numbers, might produce different results in different runs, which might cause tests or programs that expect a certain format to fail.
In the following section, we'll take a look at how to fix the causes for flaky tests. Addressing flaky tests requires a combination of technical solutions and best practices.
How can I fix a flaky Selenium test?
Below are some steps on how to fix flaky Selenium tests. These are both best practices for creating and maintaining Selenium tests, as well as tips to increase performance and reliability.
Timing and Waits in your Selenium tests
We've seen some tests where a sleep function was used instead of an implicit or explicit wait. Using a fixed sleep is a bad idea:
- Your test might wait too long, causing your total test duration to increase for no good reason.
- Your test might not wait long enough, causing your test to fail.
A better way to wait for an element to appear, or an action to finish, is to use implicit or explicit waits. These are functions you can use to wait for a specific thing to happen. They do not require a fixed sleep duration; instead, you specify a condition that needs to be met before the test continues.
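For example, here's a minimal sketch using the JavaScript selenium-webdriver bindings; the URL and the #checkout selector are placeholders for your own page and element:

const { Builder, By, until } = require('selenium-webdriver');

const driver = await new Builder().forBrowser('chrome').build();
await driver.get('https://example.com');
// Explicit wait: poll up to 10 seconds for the element to be present, instead of sleeping.
const checkoutButton = await driver.wait(
  until.elementLocated(By.css('#checkout')),
  10000
);
await checkoutButton.click();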
Use Stable Locators
Make sure to locate DOM elements in your test by using the most robust locator strategy. For example, if you want to interact with a specific element on a page, use an identifier that is the least likely to change.
Some websites use dynamically/randomly generated ids for DOM elements. Using these in your tests is a recipe for a flaky test, as the test is bound to fail once the id changes. Instead, use a more stable locator, such as an xpath locator or a css locator.
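To illustrate the difference, here's a short sketch (the ids and attributes below are made up):

// Brittle: this id is auto-generated and may change on every build.
const brittleButton = await driver.findElement(By.id('btn-x8f3a2c'));
// More stable: a structural css locator or a dedicated test attribute.
const stableButton = await driver.findElement(By.css('[data-test="checkout-button"]'));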
Add retry mechanisms
Some Selenium test frameworks include retry mechanisms out of the box. If you use a test framework that does not support this, consider adding a retry mechanism in your code. In case of an error or incorrect result, you might want to retry the logic that failed.
This is more of a band-aid approach, as it does not fix the underlying problem. We suggest using it as a last resort.
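As an illustration, a generic retry helper (not a built-in Selenium feature; the helper name is our own) could look like this:

// Retry an async action a few times before giving up.
async function withRetries(action, retries = 3) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await action();
    } catch (err) {
      lastError = err;
      console.warn(`Attempt ${attempt} failed: ${err.message}`);
    }
  }
  throw lastError;
}

// Usage: retry a flaky interaction up to 3 times.
await withRetries(() => driver.findElement(By.css('#save')).click());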
Use Selenium Grid and Parallelism
Using a Selenium Grid locally, or a Cloud-Based Selenium Grid helps to run multiple tests in parallel, on a variety of browsers. This will avoid tests that fail because the machine running the tests is under too much load. By distributing multiple tests on multiple machines, you will reduce the risk of flaky tests.
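Pointing your test at a grid only requires changing how the driver is built. Here's a sketch with the JavaScript bindings, using the default URL of a local Selenium Grid hub (a cloud-based grid will have its own URL and credentials):

const { Builder } = require('selenium-webdriver');

const driver = await new Builder()
  .usingServer('http://localhost:4444/wd/hub')
  .forBrowser('chrome')
  .build();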
How can I fix a flaky Appium test?
Much like flaky Selenium tests, flaky Appium tests can occur for various reasons, the biggest ones being:
- Using sleep instead of an implicit or explicit wait
- Bad usage of locators in your mobile app or website
To address these issues, we've compiled a list of things to look out for, with possible fixes.
Appium and Stable Locators
Rely on stable locators that are less prone to change across different test runs or app versions. We recommend locating elements with the strategy that is least likely to change (a short example follows the list below):
- class name: find an element by its class name
- id: find an element by its id (which is unique)
- accessibility id: the accessibility id can be used to find an element. It is good practice to use accessibility IDs for all your UI elements.
- -ios predicate string: Predicate Format Strings allow for basic matching of elements according to multiple element attributes, including name, value, label, type, visible and more.
- -ios class chain: this is like a hybrid between XPath and iOS predicate strings, implemented by Appium to make hierarchical queries more performant. More information is available on the Class Chain Queries documentation page.
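As a sketch, this is how some of these strategies look when using WebdriverIO as the Appium client (the element names are made up):

const loginButton = await $('~login-button'); // accessibility id
const saveButton = await $(
  '-ios predicate string:type == "XCUIElementTypeButton" AND name == "Save"'
);
const firstCell = await $('-ios class chain:**/XCUIElementTypeCell[1]');
await loginButton.click();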
Implicit and Explicit Waits
Adding implicit and explicit waits in your Appium tests is a good idea if your test is required to wait on certain conditions to be met.
Remember that implicit waits are set globally for the entire session, affecting all interactions. Explicit waits offer more fine-grained control by allowing you to wait for specific elements and conditions.
Here's how you can use both types of waits with Appium:
Implicit Waits with Appium
Implicit waits set a default waiting time for Appium to wait for a certain period before interacting with elements. If an element is not immediately available, Appium will wait for the specified time before throwing an exception. Implicit waits apply to all elements and actions in your test script.
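For example, with WebdriverIO as the Appium client, an implicit wait could be set like this (the element name is hypothetical):

// Poll up to 5 seconds for any element lookup before failing.
await driver.setTimeout({ implicit: 5000 });
const menuButton = await $('~menu-button'); // waits up to 5 seconds if not yet present
await menuButton.click();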
Explicit Waits with Appium
Explicit waits allow you to wait for a specific condition or element to be available before proceeding with the next step in your test script. Explicit waits give you more control over the wait conditions and the elements you're waiting for.
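For example, with WebdriverIO you could wait for one specific element to become visible before interacting with it (the element name is hypothetical):

const checkoutButton = await $('~checkout-button');
// Wait up to 10 seconds for this element to be displayed before tapping it.
await checkoutButton.waitForDisplayed({ timeout: 10000 });
await checkoutButton.click();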
How can I fix a flaky Playwright test?
Playwright offers various built-in locators and comes with auto-waiting and retry-ability.
To make tests less flaky, Playwright recommends prioritising user-facing attributes and explicit contracts, such as page.getByRole().
Matching one of two alternative locators with Playwright
If you want to select one element out of two or more options, but you're not sure which one will be available on the page, you can use the locator.or() method to generate a locator that matches any of the possible choices.
For instance, imagine a scenario where you intend to click on a "New product" button, but sometimes a confirmation dialog appears first, preventing the click from succeeding. In this situation, you can wait for either the "New product" button or the dialog to appear, and take the necessary action accordingly.
const newProduct = page.getByRole('button', { name: 'New' });
const dialog = page.getByText('Confirm your purchase');

// Wait until either the button or the dialog becomes visible.
await expect(newProduct.or(dialog)).toBeVisible();

// If the dialog showed up, dismiss it first, then click the button.
if (await dialog.isVisible())
  await page.getByRole('button', { name: 'Ok' }).click();
await newProduct.click();
Matching two locators simultaneously with Playwright
Playwright provides a locator.and()
method, which narrows down an existing locator by matching an additional locator. For example, you can combine page.getByAltText()
and page.getByTitle()
to match by both alt text and title.
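A short sketch of what that could look like (the alt text and title values are made up):

// Matches only the element that has both the alt text and the title.
const logo = page.getByAltText('Company logo').and(page.getByTitle('Back to homepage'));
await expect(logo).toBeVisible();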
Chaining locators with Playwright
You can chain (combine) multiple locators, to narrow down the specific element you want to interact with.
For example, let's say you want to press a list item, but only when its title contains "Product-3":
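A sketch of what that chain could look like:

const product = page
  .getByRole('listitem')
  .filter({ hasText: 'Product-3' });
await product.click();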
This example will find all elements with a listitem role, and select the one item that contains the required text.
How can I fix a flaky Puppeteer test?
We'll go over the most common causes for flakiness with Puppeteer tests, and how to prevent and fix these issues, to make your Puppeteer tests more reliable.
Use Proper Waits and Synchronization with Puppeteer
Flakiness often occurs when tests don't wait long enough for elements or events to appear or complete. Make sure that you use appropriate await
statements in your code. Wait for the right conditions before interacting with elements or making assertions in your tests.
Consider using page.waitForSelector
, page.waitForFunction
or other Puppeteer methods to wait for elements or conditions to be met.
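For example (the selectors and condition below are placeholders):

// Wait until the submit button is visible before clicking it.
await page.waitForSelector('#submit', { visible: true });
await page.click('#submit');

// Or wait for an arbitrary condition evaluated inside the page.
await page.waitForFunction(() => document.querySelectorAll('.result').length > 0);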
Stabilize Network Requests
Use page.waitForNavigation
to wait for page navigation to complete or page.waitForResponse
to wait for specific network requests to finish. For example, sometimes you might have to wait for a JavaScript file to finish loading on the page you are testing.
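For example (the link selector and API URL are placeholders):

// Trigger the navigation and wait for it to finish in parallel.
await Promise.all([
  page.waitForNavigation({ waitUntil: 'networkidle0' }),
  page.click('a.next-page'),
]);

// Wait for a specific API response before asserting on the result.
await page.waitForResponse(
  (response) => response.url().includes('/api/products') && response.ok()
);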
You can also use Puppeteer's page.setRequestInterception
to control network requests and stub/mock responses as required for testing.
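A minimal sketch, stubbing a hypothetical endpoint so the test no longer depends on a slow or flaky backend:

await page.setRequestInterception(true);
page.on('request', (request) => {
  if (request.url().includes('/api/recommendations')) {
    // Answer this request ourselves with a fixed response.
    request.respond({
      status: 200,
      contentType: 'application/json',
      body: JSON.stringify({ items: [] }),
    });
  } else {
    request.continue();
  }
});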
Check for JavaScript Errors with Puppeteer
Use page.on('pageerror')
and page.on('console')
to capture and handle JavaScript errors and console logs during testing. Fixing these JavaScript errors may prevent test failures.
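For example:

// Surface page-side errors and console output in the test logs for easier debugging.
page.on('pageerror', (error) => {
  console.error('Uncaught error in page:', error.message);
});
page.on('console', (msg) => {
  console.log(`[browser ${msg.type()}]`, msg.text());
});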
Update Puppeteer and other dependencies
Periodically update Puppeteer and other dependencies to take advantage of bug fixes and improvements that might address flakiness problems.
Implementing these best practices with Puppeteer can make your tests more reliable. It will reduce false positives and improve the general quality of your automated tests.
How TestingBot prevents flaky tests
TestingBot provides a grid of highly optimized virtual machines and physical devices, ready to run your tests. With years of experience, we've learned how to maximize the performance and reliability of our online service.
Minimising startup failures
When a user requests a new test session, TestingBot forwards the request to a single-use, brand new VM. In case of a failure when starting up the test, it will automatically retry on a brand new VM, without the user ever being aware.
Minimising failures during a test
TestingBot forwards all requests to an engine responsible for running the test. This may be one of the following engines: Selenium, Appium, Puppeteer, Playwright, Cypress, Espresso, XCUITest or others.
In case of a failure, TestingBot will automatically retry the request until it succeeds.