WebDriver drives a browser natively, as a user would, either locally
or on a remote machine using the Selenium server.
It marks a leap forward in terms of browser automation.
Selenium WebDriver refers to both the language bindings
and the implementations of the individual browser controlling code.
This is commonly referred to as just WebDriver.
WebDriver is designed as a simple
and more concise programming interface.
WebDriver is a compact object-oriented API.
It drives the browser effectively.
1 - Getting started
If you are new to Selenium, we have a few resources that can help you get up to speed right away.
Selenium supports automation of all the major browsers in the market
through the use of WebDriver.
WebDriver is an API and protocol that defines a language-neutral interface
for controlling the behaviour of web browsers.
Each browser is backed by a specific WebDriver implementation, called a driver.
The driver is the component responsible for delegating down to the browser,
and handles communication to and from Selenium and the browser.
This separation is part of a conscious effort to have browser vendors
take responsibility for the implementation for their browsers.
Selenium makes use of these third party drivers where possible,
but also provides its own drivers maintained by the project
for the cases when this is not a reality.
The Selenium framework ties all of these pieces together
through a user-facing interface that enables the different browser backends
to be used transparently,
enabling cross-browser and cross-platform automation.
Selenium setup is quite different from the setup of other commercial tools.
Before you can start writing Selenium code, you have to
install the language bindings libraries for your language of choice, the browser you
want to use, and the driver for that browser.
Follow the links below to get up and going with Selenium WebDriver.
If you wish to start with a low-code/record and playback tool, please check
Selenium IDE
Once you get things working, if you want to scale up your tests, check out the
Selenium Grid.
1.1 - Install a Selenium library
Setting up the Selenium library for your favourite programming language.
First you need to install the Selenium bindings for your automation project.
The installation process for libraries depends on the language you choose to use.
Make sure you check the Selenium downloads page to make sure
you are using the latest version.
Further items of note for using Visual Studio Code (vscode) and C#
Install the compatible .NET SDK as per the section above.
Also install the vscode extensions (Ctrl-Shift-X) for C# and NuGet.
Follow the instruction here
to create and run the “Hello World” console project using C#.
You may also create a NUnit starter project using the command line dotnet new NUnit.
Make sure the file %appdata%\NuGet\nuget.config is configured properly as some developers reported that it will be empty due to some issues.
If nuget.config is empty, or not configured properly, then .NET builds will fail for Selenium Projects.
Add the following section to the file nuget.config if it is empty:
For more info about nuget.configclick here.
You may have to customize nuget.config to meet you needs.
Now, go back to vscode, press Ctrl-Shift-P, and type “NuGet Add Package”, and enter the required Selenium packages such as Selenium.WebDriver.
Press Enter and select the version.
Now you can use the examples in the documentation related to C# with vscode.
You can see the minimum required version of Ruby for any given Selenium version
on rubygems.org
Everything Selenium does is send the browser commands to do something or send requests for information.
Most of what you’ll do with Selenium is a combination of these basic commands
Click on the link to “View full example on GitHub” to see the code in context.
1. Start the session
For more details on starting a session read our documentation on driver sessions
Synchronizing the code with the current state of the browser is one of the biggest challenges
with Selenium, and doing it well is an advanced topic.
Essentially you want to make sure that the element is on the page before you attempt to locate it
and the element is in an interactable state before you attempt to interact with it.
An implicit wait is rarely the best solution, but it’s the easiest to demonstrate here, so
we’ll use it as a placeholder.
This ends the driver process, which by default closes the browser as well.
No more commands can be sent to this driver instance.
See Quitting Sessions.
Most Selenium users execute many sessions and need to organize them to minimize duplication and keep the code
more maintainable. Read on to learn about how to put this code into context for your use case with
Using Selenium.
1.3 - Organizing and Executing Selenium Code
Scaling Selenium execution with an IDE and a Test Runner library
Content Help
Note: This section needs additional and/or updated content
This page is very incomplete and has placeholders for things that need to be added or expounded on.
If you want to run more than a handful of one-off scripts, you need to
be able to organize and work with your code. This page should give you
ideas for how to actually do productive things with your Selenium code.
Common Uses
Most people use Selenium to execute automated tests for web applications,
but Selenium supports any use case of browser automation.
Repetitive Tasks
Perhaps you need to log into a website and download something, or submit a form.
You can create a Selenium script to run with a service at preset times.
Web Scraping
Are you looking to collect data from a site that doesn’t have an API? Selenium
will let you do this, but please make sure you are familiar with the website’s
terms of service as some websites do not permit it and others will even block Selenium.
Testing
Running Selenium for testing requires making assertions on actions taken by Selenium.
So a good assertion library is required. Additional features to provide structure for tests
require use of Test Runner.
IDEs
Regardless of how you use Selenium code,
you won’t be very effective writing or executing it without a good
Integrated Developer Environment. Here are some common options…
Even if you aren’t using Selenium for testing, if you have advanced use cases, it might make
sense to use a test runner to better organize your code. Being able to use before/after hooks
and run things in groups or in parallel can be very useful.
Choosing
There are many different test runners available.
All the code examples in this documentation can be found in (or is being moved to) our
example directories that use test runners and get executed every release to ensure all the code is correct and updated.
Here is a list of test runners with links. The first item is the one that is used by this repository and the one
that will be used for all examples on this page.
JUnit - A widely-used testing framework for Java-based Selenium tests.
TestNG - Offers extra features like parallel test execution and parameterized tests.
pytest - A preferred choice for many, thanks to its simplicity and powerful plugins.
unittest - Python’s standard library testing framework.
NUnit - A popular unit-testing framework for .NET.
RSpec - The most widely used testing library for running Selenium tests in Ruby.
Minitest - A lightweight testing framework that comes with Ruby standard library.
Jest - Primarily known as a testing framework for React, it can also be used for Selenium tests.
Mocha - The most common JS library for running Selenium tests.
Installing
This is very similar to what was required in Install a Selenium Library.
This code is only showing examples for what is being used in our Documentation Examples project.
Maven
Gradle
To use it in a project, add it to the requirements.txt file:
in the project’s csproj file, specify the dependency as a PackageReference in ItemGroup:
Add to project’s gemfile
In your project’s package.json, add requirement to dependencies:
The primary unique argument for starting a remote driver includes information about where to execute the code.
Read the details in the Remote Driver Section
In Selenium 3, capabilities were defined in a session by using Desired Capabilities classes.
As of Selenium 4, you must use the browser options classes.
For remote driver sessions, a browser options instance is required as it determines which browser will be used.
These options are described in the w3c specification for Capabilities.
Each browser has custom options that may be defined in addition to the ones defined in the specification.
browserName
Browser name is set by default when using an Options class instance.
This capability is optional, this is used to set the available browser version at remote end.
In recent versions of Selenium, if the version is not found on the system,
it will be automatically downloaded by Selenium Manager
Three types of page load strategies are available.
The page load strategy queries the
document.readyState
as described in the table below:
Strategy
Ready State
Notes
normal
complete
Used by default, waits for all resources to download
eager
interactive
DOM access is ready, but other resources like images may still be loading
none
Any
Does not block WebDriver at all
The document.readyState property of a document describes the loading state of the current document.
When navigating to a new page via URL, by default, WebDriver will hold off on completing a navigation
method (e.g., driver.navigate().get()) until the document ready state is complete. This does not
necessarily mean that the page has finished loading, especially for sites like Single Page Applications
that use JavaScript to dynamically load content after the Ready State returns complete. Note also
that this behavior does not apply to navigation that is a result of clicking an element or submitting a form.
If a page takes a long time to load as a result of downloading assets (e.g., images, css, js)
that aren’t important to the automation, you can change from the default parameter of normal to
eager or none to speed up the session. This value applies to the entire session, so make sure
that your waiting strategy is sufficient to minimize
flakiness.
normal (default)
WebDriver waits until the load
event fire is returned.
This capability checks whether an expired (or)
invalid TLS Certificate is used while navigating
during a session.
If the capability is set to false, an
insecure certificate error
will be returned as navigation encounters any domain
certificate problems. If set to true, invalid certificate will be
trusted by the browser.
All self-signed certificates will be trusted by this capability by default.
Once set, acceptInsecureCerts capability will have an
effect for the entire session.
A WebDriver session is imposed with a certain session timeout
interval, during which the user can control the behaviour
of executing scripts or retrieving information from the browser.
Each session timeout is configured with
combination of different timeouts as described below:
Script Timeout
Specifies when to interrupt an executing script in
a current browsing context. The default timeout 30,000
is imposed when a new session is created by WebDriver.
Specifies the time interval in which web page
needs to be loaded in a current browsing context.
The default timeout 300,000 is imposed when a
new session is created by WebDriver. If page load limits
a given/default time frame, the script will be stopped by
TimeoutException.
This specifies the time to wait for the
implicit element location strategy when
locating elements. The default timeout 0
is imposed when a new session is created by WebDriver.
Specifies the state of current session’s user prompt handler.
Defaults to dismiss and notify state
User Prompt Handler
This defines what action must take when a
user prompt encounters at the remote-end. This is defined by
unhandledPromptBehavior capability and has the following states:
This new capability indicates if strict interactability checks
should be applied to input type=file elements. As strict interactability
checks are off by default, there is a change in behaviour
when using Element Send Keys with hidden file upload controls.
A proxy server acts as an intermediary for
requests between a client and a server. In simple terms,
the traffic flows through the proxy server
on its way to the address you requested and back.
A proxy server for automation scripts
with Selenium could be helpful for:
Capture network traffic
Mock backend calls made by the website
Access the required website under complex network
topologies or strict corporate restrictions/policies.
If you are in a corporate environment, and a
browser fails to connect to a URL, this is most
likely because the environment needs a proxy to be accessed.
Selenium WebDriver provides a way to proxy settings:
The Service classes are for managing the starting and stopping of local drivers.
They cannot be used with a Remote WebDriver session.
Service classes allow you to specify information about the driver,
like location and which port to use.
They also let you specify what arguments get passed
to the command line. Most of the useful arguments are related to logging.
Default Service instance
To start a driver with a default service instance:
Note: If you are using Selenium 4.6 or greater, you shouldn’t need to set a driver location.
If you cannot update Selenium or have an advanced use case, here is how to specify the driver location:
Logging functionality varies between browsers. Most browsers allow you to
specify location and level of logs. Take a look at the respective browser page:
Selenium lets you automate browsers on remote computers if
there is a Selenium Grid running on them. The computer that
executes the code is referred to as the client computer, and the computer with the browser and driver is
referred to as the remote computer or sometimes as an end-node.
To direct Selenium tests to the remote computer, you need to use a Remote WebDriver class
and pass the URL including the port of the grid on that machine. Please see the grid documentation
for all the various ways the grid can be configured.
Basic Example
The driver needs to know where to send commands to and which browser to start on the Remote computer. So an address
and an options instance are both required.
Uploading a file is more complicated for Remote WebDriver sessions because the file you want to
upload is likely on the computer executing the code, but the driver on the
remote computer is looking for the provided path on its local file system.
The solution is to use a Local File Detector. When one is set, Selenium will bundle
the file, and send it to the remote machine, so the driver can see the reference to it.
Some bindings include a basic local file detector by default, and all of them allow
for a custom file detector.
Java does not include a Local File Detector by default, so you must always add one to do uploads.
Chrome, Edge and Firefox each allow you to set the location of the download directory.
When you do this on a remote computer, though, the location is on the remote computer’s local file system.
Selenium allows you to enable downloads to get these files onto the client computer.
Enable Downloads in the Grid
Regardless of the client, when starting the grid in node or standalone mode,
you must add the flag:
--enable-managed-downloads true
Enable Downloads in the Client
The grid uses the se:downloadsEnabled capability to toggle whether to be responsible for managing the browser location.
Each of the bindings have a method in the options class to set this.
Be aware that Selenium is not waiting for files to finish downloading,
so the list is an immediate snapshot of what file names are currently in the directory for the given session.
Each browser has implemented special functionality that is available only to that browser.
Each of the Selenium bindings has implemented a different way to use those features in a Remote Session
Java requires you to use the Augmenter class, which allows it to automatically pull in implementations for
all interfaces that match the capabilities used with the RemoteWebDriver
This feature is only available for Java client binding (Beta onwards). The Remote WebDriver client sends requests to the Selenium Grid server, which passes them to the WebDriver. Tracing should be enabled at the server and client-side to trace the HTTP requests end-to-end. Both ends should have a trace exporter setup pointing to the visualization framework.
By default, tracing is enabled for both client and server.
To set up the visualization framework Jaeger UI and Selenium Grid 4, please refer to Tracing Setup for the desired version.
For client-side setup, follow the steps below.
Add the required dependencies
Installation of external libraries for tracing exporter can be done using Maven.
Add the opentelemetry-exporter-jaeger and grpc-netty dependency in your project pom.xml:
Each browser has custom capabilities and unique features.
Note : If your device's date and language settings are set to Arabic, you must change the localization settings of your Java Virtual Machine (JVM) to prevent startup failures. Add the following arguments to your JVM :-Duser.language=en -Duser.region=US
3.1 - Chrome specific functionality
These are capabilities and features specific to Google Chrome browsers.
By default, Selenium 4 is compatible with Chrome v75 and greater. Note that the version of
the Chrome browser and the version of chromedriver must match the major version.
Options
Capabilities common to all browsers are described on the Options page.
The args parameter is for a list of command line switches to be used when starting the browser.
There are two excellent resources for investigating these arguments:
The binary parameter takes the path of an alternate location of browser to use. With this parameter you can
use chromedriver to drive various Chromium based browsers.
Chromedriver has several default arguments it uses to start the browser.
If you do not want those arguments added, pass them into excludeSwitches.
A common example is to turn the popup blocker back on. A full list of default arguments
can be parsed from the
Chromium Source Code
Examples for creating a default Service object, and for setting driver location and port
can be found on the Driver Service page.
Log output
Getting driver logs can be helpful for debugging issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property; Property key: ChromeDriverService.CHROME_DRIVER_LOG_PROPERTY Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
There are 6 available log levels: ALL, DEBUG, INFO, WARNING, SEVERE, and OFF.
Note that --verbose is equivalent to --log-level=ALL and --silent is equivalent to --log-level=OFF,
so this example is just setting the log level generically:
Note: Java also allows setting log level by System Property: Property key: ChromeDriverService.CHROME_DRIVER_LOG_LEVEL_PROPERTY Property value: String representation of ChromiumDriverLogLevel enum
There are 2 features that are only available when logging to a file:
append log
readable timestamps
To use them, you need to also explicitly specify the log path and log level.
The log output will be managed by the driver, not the process, so minor differences may be seen.
Note: Java also allows toggling these features by System Property: Property keys: ChromeDriverService.CHROME_DRIVER_APPEND_LOG_PROPERTY and ChromeDriverService.CHROME_DRIVER_READABLE_TIMESTAMP Property value: "true" or "false"
Chromedriver and Chrome browser versions should match, and if they don’t the driver will error.
If you disable the build check, you can force the driver to be used with any version of Chrome.
Note that this is an unsupported feature, and bugs will not be investigated.
Note: Java also allows disabling build checks by System Property: Property key: ChromeDriverService.CHROME_DRIVER_DISABLE_BUILD_CHECK Property value: "true" or "false"
ChromiumNetworkConditionsnetworkConditions=newChromiumNetworkConditions();networkConditions.setOffline(false);networkConditions.setLatency(java.time.Duration.ofMillis(20));// 20 ms of latencynetworkConditions.setDownloadThroughput(2000*1024/8);// 2000 kbpsnetworkConditions.setUploadThroughput(2000*1024/8);// 2000 kbps((ChromeDriver)driver).setNetworkConditions(networkConditions);
network_conditions={"offline":False,"latency":20,# 20 ms of latency"download_throughput":2000*1024/8,# 2000 kbps"upload_throughput":2000*1024/8,# 2000 kbps}driver.set_network_conditions(**network_conditions)
See the Chrome DevTools section for more information about using Chrome DevTools
3.2 - Edge specific functionality
These are capabilities and features specific to Microsoft Edge browsers.
Microsoft Edge is implemented with Chromium, with the earliest supported version of v79. Similar to Chrome,
the major version number of edgedriver must match the major version of the Edge browser.
Options
Capabilities common to all browsers are described on the Options page.
The args parameter is for a list of command line switches to be used when starting the browser.
There are two excellent resources for investigating these arguments:
The binary parameter takes the path of an alternate location of browser to use. With this parameter you can
use chromedriver to drive various Chromium based browsers.
MSEdgedriver has several default arguments it uses to start the browser.
If you do not want those arguments added, pass them into excludeSwitches.
A common example is to turn the popup blocker back on. A full list of default arguments
can be parsed from the
Chromium Source Code
Examples for creating a default Service object, and for setting driver location and port
can be found on the Driver Service page.
Log output
Getting driver logs can be helpful for debugging issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property; Property key: EdgeDriverService.EDGE_DRIVER_LOG_PROPERTY Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
There are 6 available log levels: ALL, DEBUG, INFO, WARNING, SEVERE, and OFF.
Note that --verbose is equivalent to --log-level=ALL and --silent is equivalent to --log-level=OFF,
so this example is just setting the log level generically:
Note: Java also allows setting log level by System Property: Property key: EdgeDriverService.EDGE_DRIVER_LOG_LEVEL_PROPERTY Property value: String representation of ChromiumDriverLogLevel enum
There are 2 features that are only available when logging to a file:
append log
readable timestamps
To use them, you need to also explicitly specify the log path and log level.
The log output will be managed by the driver, not the process, so minor differences may be seen.
Note: Java also allows toggling these features by System Property: Property keys: EdgeDriverService.EDGE_DRIVER_APPEND_LOG_PROPERTY and EdgeDriverService.EDGE_DRIVER_READABLE_TIMESTAMP Property value: "true" or "false"
Edge browser and msedgedriver versions should match, and if they don’t the driver will error.
If you disable the build check, you can force the driver to be used with any version of Edge.
Note that this is an unsupported feature, and bugs will not be investigated.
Note: Java also allows disabling build checks by System Property: Property key: EdgeDriverService.EDGE_DRIVER_DISABLE_BUILD_CHECK Property value: "true" or "false"
Microsoft Edge can be driven in “Internet Explorer Compatibility Mode”, which uses
the Internet Explorer Driver classes in conjunction with Microsoft Edge.
Read the Internet Explorer page for more details.
Special Features
Some browsers have implemented additional features that are unique to them.
Casting
You can drive Chrome Cast devices with Edge, including sharing tabs
ChromiumNetworkConditionsnetworkConditions=newChromiumNetworkConditions();networkConditions.setOffline(false);networkConditions.setLatency(java.time.Duration.ofMillis(20));// 20 ms of latencynetworkConditions.setDownloadThroughput(2000*1024/8);// 2000 kbpsnetworkConditions.setUploadThroughput(2000*1024/8);// 2000 kbps
network_conditions={"offline":False,"latency":20,# 20 ms of latency"download_throughput":2000*1024/8,# 2000 kbps"upload_throughput":2000*1024/8,# 2000 kbps}driver.set_network_conditions(**network_conditions)
The args parameter is for a list of Command line switches used when starting the browser. Commonly used args include -headless and "-profile", "/path/to/profile"
The binary parameter takes the path of an alternate location of browser to use. For example, with this parameter you can
use geckodriver to drive Firefox Nightly instead of the production version when both are present on your computer.
const{Builder}=require("selenium-webdriver");constfirefox=require('selenium-webdriver/firefox');constoptions=newfirefox.Options();letprofile='/path to custom profile';options.setProfile(profile);constdriver=newBuilder().forBrowser('firefox').setFirefoxOptions(options).build();
Service settings common to all browsers are described on the Service page.
Log output
Getting driver logs can be helpful for debugging various issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property; Property key: GeckoDriverService.GECKO_DRIVER_LOG_PROPERTY Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
Note: Java also allows setting log level by System Property: Property key: GeckoDriverService.GECKO_DRIVER_LOG_LEVEL_PROPERTY Property value: String representation of FirefoxDriverLogLevel enum
The driver logs everything that gets sent to it, including string representations of large binaries, so
Firefox truncates lines by default. To turn off truncation:
Note: Java also allows setting log level by System Property: Property key: GeckoDriverService.GECKO_DRIVER_LOG_NO_TRUNCATE Property value: "true" or "false"
The default directory for profiles is the system temporary directory. If you do not have access to that directory,
or want profiles to be created some place specific, you can change the profile root directory:
When working with an unfinished or unpublished extension, it will likely not be signed. As such, it can only
be installed as “temporary.” This can be done by passing in either a zip file or a directory, here’s an
example with a directory:
These are capabilities and features specific to Microsoft Internet Explorer browsers.
As of June 2022, Selenium officially no longer supports standalone Internet Explorer.
The Internet Explorer driver still supports running Microsoft Edge in “IE Compatibility Mode.”
Special considerations
The IE Driver is the only driver maintained by the Selenium Project directly.
While binaries for both the 32-bit and 64-bit
versions of Internet Explorer are available, there are some
known limitations
with the 64-bit driver. As such it is recommended to use the 32-bit driver.
Additional information about using Internet Explorer can be found on the
IE Driver Server page
Options
Starting a Microsoft Edge browser in Internet Explorer Compatibility mode with basic defined options looks like this:
If IE is not present on the system (default in Windows 11), you do not need to
use the two parameters above. IE Driver will use Edge and will automatically locate it.
If IE and Edge are both present on the system, you only need to set attaching to Edge,
IE Driver will automatically locate Edge on your system.
Here are a few common use cases with different capabilities:
fileUploadDialogTimeout
In some environments, Internet Explorer may timeout when opening the
File Upload dialog. IEDriver has a default timeout of 1000ms, but you
can increase the timeout using the fileUploadDialogTimeout capability.
When set to true, this capability clears the Cache,
Browser History and Cookies for all running instances
of InternetExplorer including those started manually
or by the driver. By default, it is set to false.
Using this capability will cause performance drop while
launching the browser, as the driver will wait until the cache
gets cleared before launching the IE browser.
This capability accepts a Boolean value as parameter.
InternetExplorer driver expects the browser zoom level to be 100%,
else the driver will throw an exception. This default behaviour
can be disabled by setting the ignoreZoomSetting to true.
This capability accepts a Boolean value as parameter.
Whether to skip the Protected Mode check while launching
a new IE session.
If not set and Protected Mode settings are not same for
all zones, an exception will be thrown by the driver.
If capability is set to true, tests may
become flaky, unresponsive, or browsers may hang.
However, this is still by far a second-best choice,
and the first choice should always be to actually
set the Protected Mode settings of each zone manually.
If a user is using this property,
only a “best effort” at support will be given.
This capability accepts a Boolean value as parameter.
Internet Explorer includes several command-line options
that enable you to troubleshoot and configure the browser.
The following describes few supported command-line options
-private : Used to start IE in private browsing mode. This works for IE 8 and later versions.
-k : Starts Internet Explorer in kiosk mode.
The browser opens in a maximized window that does not display the address bar, the navigation buttons, or the status bar.
-extoff : Starts IE in no add-on mode.
This option specifically used to troubleshoot problems with browser add-ons. Works in IE 7 and later versions.
Note: forceCreateProcessApi should to enabled in-order for command line arguments to work.
Service settings common to all browsers are described on the Service page.
Log output
Getting driver logs can be helpful for debugging various issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property; Property key: InternetExplorerDriverService.IE_DRIVER_LOGFILE_PROPERTY Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
Note: Java also allows setting log level by System Property: Property key: InternetExplorerDriverService.IE_DRIVER_LOGLEVEL_PROPERTY Property value: String representation of InternetExplorerDriverLogLevel.DEBUG.toString() enum