WebDriver drives a browser natively, as a user would, either locally
or on a remote machine using the Selenium server,
marks a leap forward in terms of browser automation.
Selenium WebDriver refers to both the language bindings
and the implementations of the individual browser controlling code.
This is commonly referred to as just WebDriver.
WebDriver is designed as a simple
and more concise programming interface.
WebDriver is a compact object-oriented API.
It drives the browser effectively.
1 - Getting started
If you are new to Selenium, we have a few resources that can help you get up to speed right away.
Selenium supports automation of all the major browsers in the market
through the use of WebDriver.
WebDriver is an API and protocol that defines a language-neutral interface
for controlling the behaviour of web browsers.
Each browser is backed by a specific WebDriver implementation, called a driver.
The driver is the component responsible for delegating down to the browser,
and handles communication to and from Selenium and the browser.
This separation is part of a conscious effort to have browser vendors
take responsibility for the implementation for their browsers.
Selenium makes use of these third party drivers where possible,
but also provides its own drivers maintained by the project
for the cases when this is not a reality.
The Selenium framework ties all of these pieces together
through a user-facing interface that enables the different browser backends
to be used transparently,
enabling cross-browser and cross-platform automation.
Selenium setup is quite different from the setup of other commercial tools.
Before you can start writing Selenium code, you have to
install the language bindings libraries for your language of choice, the browser you
want to use, and the driver for that browser.
Follow the links below to get up and going with Selenium WebDriver.
If you wish to start with a low-code/record and playback tool, please check
Selenium IDE
Once you get things working, if you want to scale up your tests, check out the
Selenium Grid.
1.1 - Install a Selenium library
Setting up the Selenium library for your favourite programming language.
First you need to install the Selenium bindings for your automation project.
The installation process for libraries depends on the language you choose to use.
Make sure you check the Selenium downloads page to make sure
you are using the latest version.
Further items of note for using Visual Studio Code (vscode) and C#
Install the compatible .NET SDK as per the section above.
Also install the vscode extensions (Ctrl-Shift-X) for C# and NuGet.
Follow the instruction here
to create and run the “Hello World” console project using C#.
You may also create a NUnit starter project using the command line dotnet new NUnit.
Make sure the file %appdata%\NuGet\nuget.config is configured properly as some developers reported that it will be empty due to some issues.
If nuget.config is empty, or not configured properly, then .NET builds will fail for Selenium Projects.
Add the following section to the file nuget.config if it is empty:
For more info about nuget.configclick here.
You may have to customize nuget.config to meet you needs.
Now, go back to vscode, press Ctrl-Shift-P, and type “NuGet Add Package”, and enter the required Selenium packages such as Selenium.WebDriver.
Press Enter and select the version.
Now you can use the examples in the documentation related to C# with vscode.
You can see the minimum required version of Ruby for any given Selenium version
on rubygems.org
Note: if you get an error about drivers not found, please read about troubleshooting the
driver location error
Eight Basic Components
Everything Selenium does is send the browser commands to do something or send requests for information.
Most of what you’ll do with Selenium is a combination of these basic commands:
1. Start the session
For more details on starting a session read our documentation on driver sessions
Synchronizing the code with the current state of the browser is one of the biggest challenges
with Selenium, and doing it well is an advanced topic.
Essentially you want to make sure that the element is on the page before you attempt to locate it
and the element is in an interactable state before you attempt to interact with it.
An implicit wait is rarely the best solution, but it’s the easiest to demonstrate here, so
we’ll use it as a placeholder.
If you are using Selenium for testing,
you will want to execute your Selenium code using test runner tools.
Many of the code examples in this documentation can be found in our example repositories.
There are multiple options in each language, but here is what we are using in our examples:
Install JUnit 5 test runner using a build tool.
Maven
In the project’s pom.xml file, specify the dependency:
Take what you’ve learned and build out your Selenium code.
As you find more functionality that you need, read up on the rest of our
WebDriver documentation.
1.3 - Upgrade to Selenium 4
Are you still using Selenium 3? This guide will help you upgrade to the latest release!
Upgrading to Selenium 4 should be a painless process if you are using one of the officially
supported languages (Ruby, JavaScript, C#, Python, and Java). There might be some cases where
a few issues can happen, and this guide will help you to sort them out. We will go through
the steps to upgrade your project dependencies and understand the major deprecations and
changes the version upgrade brings.
These are the steps we will follow to upgrade to Selenium 4:
Preparing our test code
Upgrading dependencies
Potential errors and deprecation messages
Note: while Selenium 3.x versions were being developed, support for the W3C WebDriver standard
was implemented. Both this new protocol and the legacy JSON Wire Protocol were supported. Around
version 3.11, Selenium code became compliant with the level W3C 1 specification. The W3C compliant
code in the latest version of Selenium 3 will work as expected in Selenium 4.
Preparing our test code
Selenium 4 removes support for the legacy protocol and uses the W3C WebDriver standard by
default under the hood. For most things, this implementation will not affect end users.
The major exceptions are Capabilities and the Actions class.
Capabilities
If the test capabilities are not structured to be W3C compliant, may cause a session to not
be started. Here is the list of W3C WebDriver standard capabilities:
browserName
browserVersion (replaces version)
platformName (replaces platform)
acceptInsecureCerts
pageLoadStrategy
proxy
timeouts
unhandledPromptBehavior
An up-to-date list of standard capabilities can be found at
W3C WebDriver.
Any capability that is not contained in the list above, needs to include a vendor prefix.
This applies to browser specific capabilities as well as cloud vendor specific capabilities.
For example, if your cloud vendor uses build and name capabilities for your tests, you need
to wrap them in a cloud:options block (check with your cloud vendor for the appropriate prefix).
The utility methods to find elements in the Java bindings (FindsBy interfaces) have been removed
as they were meant for internal use only. The following code samples explain this better.
Check the subsections below to install Selenium 4 and have your project dependencies upgraded.
Java
The process of upgrading Selenium depends on which build tool is being used. We will cover the
most common ones for Java, which are Maven and
Gradle. The minimum Java version required is still 8.
Maven
Before
<dependencies><!-- more dependencies ... --><dependency><groupId>org.seleniumhq.selenium</groupId><artifactId>selenium-java</artifactId><version>3.141.59</version></dependency><!-- more dependencies ... --></dependencies>
After
<dependencies><!-- more dependencies ... --><dependency><groupId>org.seleniumhq.selenium</groupId><artifactId>selenium-java</artifactId><version>4.4.0</version></dependency><!-- more dependencies ... --></dependencies>
After making the change, you could execute mvn clean compile on the same directory where the
pom.xml file is.
Gradle
Before
plugins {
id 'java'
}
group 'org.example'
version '1.0-SNAPSHOT'
repositories {
mavenCentral()
}
dependencies {
testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
implementation group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '3.141.59'
}
test {
useJUnitPlatform()
}
After
plugins {
id 'java'
}
group 'org.example'
version '1.0-SNAPSHOT'
repositories {
mavenCentral()
}
dependencies {
testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
implementation group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '4.4.0'
}
test {
useJUnitPlatform()
}
After making the change, you could execute ./gradlew clean build
on the same directory where the build.gradle file is.
To check all the Java releases, you can head to MVNRepository.
C#
The place to get updates for Selenium 4 in C# is NuGet. Under the
Selenium.WebDriver package you
can get the instructions to update to the latest version. Inside of Visual Studio, through the
NuGet Package Manager you can execute:
The most important change to use Python is the minimum required version. Selenium 4 will
require a minimum Python 3.7 or higher. More details can be found at the
Python Package Index. To upgrade from the
command line, you can execute:
pip install selenium==4.4.3
Ruby
The update details for Selenium 4 can be seen at the
selenium-webdriver
gem in RubyGems. To install the latest version, you can execute:
gem install selenium-webdriver
To add it to your Gemfile:
gem 'selenium-webdriver', '~> 4.4.0'
JavaScript
The selenium-webdriver package can be found at the Node package manager,
npmjs. Selenium 4 can be found
here. To install it, you
could either execute:
Waits are also expecting different parameters now. WebDriverWait is now expecting a Duration
instead of a long for timeout in seconds and milliseconds. The withTimeout and pollingEvery
utility methods from FluentWait have switched from expecting (long time, TimeUnit unit) to
expect (Duration duration).
Merging capabilities is no longer changing the calling object
It was possible to merge a different set of capabilities into another set, and it was
mutating the calling object. Now, the result of the merge operation needs to be assigned.
Before
MutableCapabilitiescapabilities=newMutableCapabilities();capabilities.setCapability("platformVersion","Windows 10");FirefoxOptionsoptions=newFirefoxOptions();options.setHeadless(true);options.merge(capabilities);// As a result, the `options` object was getting modified.
After
MutableCapabilitiescapabilities=newMutableCapabilities();capabilities.setCapability("platformVersion","Windows 10");FirefoxOptionsoptions=newFirefoxOptions();options.setHeadless(true);options=options.merge(capabilities);// The result of the `merge` call needs to be assigned to an object.
Firefox Legacy
Before GeckoDriver was around, the Selenium project had a driver implementation to automate
Firefox (version <48). However, this implementation is not needed anymore as it does not work
in recent versions of Firefox. To avoid major issues when upgrading to Selenium 4, the setLegacy
option will be shown as deprecated. The recommendation is to stop using the old implementation
and rely only on GeckoDriver. The following code will show the setLegacy line deprecated after
upgrading.
Instead of it, AddAdditionalOption is recommended. Here is an example showing this:
Before
var browserOptions = new ChromeOptions();
browserOptions.PlatformName = "Windows 10";
browserOptions.BrowserVersion = "latest";
var cloudOptions = new Dictionary<string, object>();
browserOptions.AddAdditionalCapability("cloud:options", cloudOptions, true);
After
var browserOptions = new ChromeOptions();
browserOptions.PlatformName = "Windows 10";
browserOptions.BrowserVersion = "latest";
var cloudOptions = new Dictionary<string, object>();
browserOptions.AddAdditionalOption("cloud:options", cloudOptions);
Python
executable_path has been deprecated, please pass in a Service object
In Selenium 4, you’ll need to set the driver’s executable_path from a Service object to prevent deprecation warnings. (Or don’t set the path and instead make sure that the driver you need is on the System PATH.)
We went through the major changes to be taken into consideration when upgrading to Selenium 4.
Covering the different aspects to cover when test code is prepared for the upgrade, including
suggestions on how to prevent potential issues that can show up when using the new version of
Selenium. To finalize, we also covered a set of possible issues that you can bump into after
upgrading, and we shared potential fixes for those issues.
The primary unique argument for starting a remote driver includes information about where to execute the code.
Read the details in the Remote Driver Section
In Selenium 3, capabilities were defined in a session by using Desired Capabilities classes.
As of Selenium 4, you must use the browser options classes.
For remote driver sessions, a browser options instance is required as it determines which browser will be used.
These options are described in the w3c specification for Capabilities.
Each browser has custom options that may be defined in addition to the ones defined in the specification.
browserName
Browser name is set by default when using an Options class instance.
This capability is optional, this is used to set the available browser version at remote end.
In recent versions of Selenium, if the version is not found on the system,
it will be automatically downloaded by Selenium Manager
Three types of page load strategies are available.
The page load strategy queries the
document.readyState
as described in the table below:
Strategy
Ready State
Notes
normal
complete
Used by default, waits for all resources to download
eager
interactive
DOM access is ready, but other resources like images may still be loading
none
Any
Does not block WebDriver at all
The document.readyState property of a document describes the loading state of the current document.
When navigating to a new page via URL, by default, WebDriver will hold off on completing a navigation
method (e.g., driver.navigate().get()) until the document ready state is complete. This does not
necessarily mean that the page has finished loading, especially for sites like Single Page Applications
that use JavaScript to dynamically load content after the Ready State returns complete. Note also
that this behavior does not apply to navigation that is a result of clicking an element or submitting a form.
If a page takes a long time to load as a result of downloading assets (e.g., images, css, js)
that aren’t important to the automation, you can change from the default parameter of normal to
eager or none to speed up the session. This value applies to the entire session, so make sure
that your waiting strategy is sufficient to minimize
flakiness.
normal (default)
WebDriver waits until the load
event fire is returned.
importorg.openqa.selenium.PageLoadStrategy;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.chrome.ChromeOptions;importorg.openqa.selenium.chrome.ChromeDriver;publicclasspageLoadStrategy{publicstaticvoidmain(String[]args){ChromeOptionschromeOptions=newChromeOptions();chromeOptions.setPageLoadStrategy(PageLoadStrategy.NORMAL);WebDriverdriver=newChromeDriver(chromeOptions);try{// Navigate to Url
driver.get("https://google.com");}finally{driver.quit();}}}
it('Navigate using normal page loading strategy',asyncfunction(){letdriver=awaitenv.builder().setChromeOptions(options.setPageLoadStrategy('normal')).build();awaitdriver.get('https://www.selenium.dev/selenium/web/blank.html');
importorg.openqa.selenium.PageLoadStrategy;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.chrome.ChromeOptions;importorg.openqa.selenium.chrome.ChromeDriver;publicclasspageLoadStrategy{publicstaticvoidmain(String[]args){ChromeOptionschromeOptions=newChromeOptions();chromeOptions.setPageLoadStrategy(PageLoadStrategy.EAGER);WebDriverdriver=newChromeDriver(chromeOptions);try{// Navigate to Url
driver.get("https://google.com");}finally{driver.quit();}}}
importorg.openqa.selenium.PageLoadStrategy;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.chrome.ChromeOptions;importorg.openqa.selenium.chrome.ChromeDriver;publicclasspageLoadStrategy{publicstaticvoidmain(String[]args){ChromeOptionschromeOptions=newChromeOptions();chromeOptions.setPageLoadStrategy(PageLoadStrategy.NONE);WebDriverdriver=newChromeDriver(chromeOptions);try{// Navigate to Url
driver.get("https://google.com");}finally{driver.quit();}}}
This capability checks whether an expired (or)
invalid TLS Certificate is used while navigating
during a session.
If the capability is set to false, an
insecure certificate error
will be returned as navigation encounters any domain
certificate problems. If set to true, invalid certificate will be
trusted by the browser.
All self-signed certificates will be trusted by this capability by default.
Once set, acceptInsecureCerts capability will have an
effect for the entire session.
A WebDriver session is imposed with a certain session timeout
interval, during which the user can control the behaviour
of executing scripts or retrieving information from the browser.
Each session timeout is configured with
combination of different timeouts as described below:
Script Timeout
Specifies when to interrupt an executing script in
a current browsing context. The default timeout 30,000
is imposed when a new session is created by WebDriver.
Specifies the time interval in which web page
needs to be loaded in a current browsing context.
The default timeout 300,000 is imposed when a
new session is created by WebDriver. If page load limits
a given/default time frame, the script will be stopped by
TimeoutException.
This specifies the time to wait for the
implicit element location strategy when
locating elements. The default timeout 0
is imposed when a new session is created by WebDriver.
Specifies the state of current session’s user prompt handler.
Defaults to dismiss and notify state
User Prompt Handler
This defines what action must take when a
user prompt encounters at the remote-end. This is defined by
unhandledPromptBehavior capability and has the following states:
This new capability indicates if strict interactability checks
should be applied to input type=file elements. As strict interactability
checks are off by default, there is a change in behaviour
when using Element Send Keys with hidden file upload controls.
A proxy server acts as an intermediary for
requests between a client and a server. In simple,
the traffic flows through the proxy server
on its way to the address you requested and back.
A proxy server for automation scripts
with Selenium could be helpful for:
Capture network traffic
Mock backend calls made by the website
Access the required website under complex network
topologies or strict corporate restrictions/policies.
If you are in a corporate environment, and a
browser fails to connect to a URL, this is most
likely because the environment needs a proxy to be accessed.
Selenium WebDriver provides a way to proxy settings:
The Service classes are for managing the starting and stopping of local drivers.
They cannot be used with a Remote WebDriver session.
Service classes allow you to specify information about the driver,
like location and which port to use.
They also let you specify what arguments get passed
to the command line. Most of the useful arguments are related to logging.
Default Service instance
To start a driver with a default service instance:
Note: If you are using Selenium 4.6 or greater, you shouldn’t need to set a driver location.
If you cannot update Selenium or have an advanced use case, here is how to specify the driver location:
Logging functionality varies between browsers. Most browsers allow you to
specify location and level of logs. Take a look at the respective browser page:
You can use WebDriver remotely the same way you would use it
locally. The primary difference is that a remote WebDriver needs to be
configured so that it can run your tests on a separate machine.
A remote WebDriver is composed of two pieces: a client and a
server. The client is your WebDriver test and the server is simply a
Java servlet, which can be hosted in any modern JEE app server.
To run a remote WebDriver client, we first need to connect to the RemoteWebDriver.
We do this by pointing the URL to the address of the server running our tests.
In order to customize our configuration, we set desired capabilities.
Below is an example of instantiating a remote WebDriver object
pointing to our remote web server, www.example.com,
running our tests on Firefox.
The Local File Detector allows the transfer of files from the client
machine to the remote server. For example, if a test needs to upload a
file to a web application, a remote WebDriver can automatically transfer
the file from the local machine to the remote web server during
runtime. This allows the file to be uploaded from the remote machine
running the test. It is not enabled by default and can be enabled in
the following way:
This feature is only available for Java client binding (Beta onwards). The Remote WebDriver client sends requests to the Selenium Grid server, which passes them to the WebDriver. Tracing should be enabled at the server and client-side to trace the HTTP requests end-to-end. Both ends should have a trace exporter setup pointing to the visualization framework.
By default, tracing is enabled for both client and server.
To set up the visualization framework Jaeger UI and Selenium Grid 4, please refer to Tracing Setup for the desired version.
For client-side setup, follow the steps below.
Add the required dependencies
Installation of external libraries for tracing exporter can be done using Maven.
Add the opentelemetry-exporter-jaeger and grpc-netty dependency in your project pom.xml:
Each browser has custom capabilities and unique features.
3.1 - Chrome specific functionality
These are capabilities and features specific to Google Chrome browsers.
By default, Selenium 4 is compatible with Chrome v75 and greater. Note that the version of
the Chrome browser and the version of chromedriver must match the major version.
Options
Capabilities common to all browsers are described on the Options page.
The args parameter is for a list of command line switches to be used when starting the browser.
There are two excellent resources for investigating these arguments:
The binary parameter takes the path of an alternate location of browser to use. With this parameter you can
use chromedriver to drive various Chromium based browsers.
Chromedriver has several default arguments it uses to start the browser.
If you do not want those arguments added, pass them into excludeSwitches.
A common example is to turn the popup blocker back on. A full list of default arguments
can be parsed from the
Chromium Source Code
Examples for creating a default Service object, and for setting driver location and port
can be found on the Driver Service page.
Log output
Getting driver logs can be helpful for debugging issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property; Property key: ChromeDriverService.CHROME_DRIVER_LOG_PROPERTY Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
There are 6 available log levels: ALL, DEBUG, INFO, WARNING, SEVERE, and OFF.
Note that --verbose is equivalent to --log-level=ALL and --silent is equivalent to --log-level=OFF,
so this example is just setting the log level generically:
Note: Java also allows setting log level by System Property: Property key: ChromeDriverService.CHROME_DRIVER_LOG_LEVEL_PROPERTY Property value: String representation of ChromiumDriverLogLevel enum
There are 2 features that are only available when logging to a file:
append log
readable timestamps
To use them, you need to also explicitly specify the log path and log level.
The log output will be managed by the driver, not the process, so minor differences may be seen.
Note: Java also allows toggling these features by System Property: Property keys: ChromeDriverService.CHROME_DRIVER_APPEND_LOG_PROPERTY and ChromeDriverService.CHROME_DRIVER_READABLE_TIMESTAMP Property value: "true" or "false"
Chromedriver and Chrome browser versions should match, and if they don’t the driver will error.
If you disable the build check, you can force the driver to be used with any version of Chrome.
Note that this is an unsupported feature, and bugs will not be investigated.
Note: Java also allows disabling build checks by System Property: Property key: ChromeDriverService.CHROME_DRIVER_DISABLE_BUILD_CHECK Property value: "true" or "false"
See the Chrome DevTools section for more information about using Chrome DevTools
3.2 - Edge specific functionality
These are capabilities and features specific to Microsoft Edge browsers.
Microsoft Edge is implemented with Chromium, with the earliest supported version of v79. Similar to Chrome,
the major version number of edgedriver must match the major version of the Edge browser.
Options
Capabilities common to all browsers are described on the Options page.
The args parameter is for a list of command line switches to be used when starting the browser.
There are two excellent resources for investigating these arguments:
The binary parameter takes the path of an alternate location of browser to use. With this parameter you can
use chromedriver to drive various Chromium based browsers.
MSEdgedriver has several default arguments it uses to start the browser.
If you do not want those arguments added, pass them into excludeSwitches.
A common example is to turn the popup blocker back on. A full list of default arguments
can be parsed from the
Chromium Source Code
Examples for creating a default Service object, and for setting driver location and port
can be found on the Driver Service page.
Log output
Getting driver logs can be helpful for debugging issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property; Property key: EdgeDriverService.EDGE_DRIVER_LOG_PROPERTY Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
There are 6 available log levels: ALL, DEBUG, INFO, WARNING, SEVERE, and OFF.
Note that --verbose is equivalent to --log-level=ALL and --silent is equivalent to --log-level=OFF,
so this example is just setting the log level generically:
Note: Java also allows setting log level by System Property: Property key: EdgeDriverService.EDGE_DRIVER_LOG_LEVEL_PROPERTY Property value: String representation of ChromiumDriverLogLevel enum
There are 2 features that are only available when logging to a file:
append log
readable timestamps
To use them, you need to also explicitly specify the log path and log level.
The log output will be managed by the driver, not the process, so minor differences may be seen.
Note: Java also allows toggling these features by System Property: Property keys: EdgeDriverService.EDGE_DRIVER_APPEND_LOG_PROPERTY and EdgeDriverService.EDGE_DRIVER_READABLE_TIMESTAMP Property value: "true" or "false"
Edge browser and msedgedriver versions should match, and if they don’t the driver will error.
If you disable the build check, you can force the driver to be used with any version of Edge.
Note that this is an unsupported feature, and bugs will not be investigated.
Note: Java also allows disabling build checks by System Property: Property key: EdgeDriverService.EDGE_DRIVER_DISABLE_BUILD_CHECK Property value: "true" or "false"
Microsoft Edge can be driven in “Internet Explorer Compatibility Mode”, which uses
the Internet Explorer Driver classes in conjunction with Microsoft Edge.
Read the Internet Explorer page for more details.
Special Features
Some browsers have implemented additional features that are unique to them.
Casting
You can drive Chrome Cast devices with Edge, including sharing tabs
The args parameter is for a list of Command line switches used when starting the browser. Commonly used args include -headless and "-profile", "/path/to/profile"
The binary parameter takes the path of an alternate location of browser to use. For example, with this parameter you can
use geckodriver to drive Firefox Nightly instead of the production version when both are present on your computer.
const{Builder}=require("selenium-webdriver");constfirefox=require('selenium-webdriver/firefox');constoptions=newfirefox.Options();letprofile='/path to custom profile';options.setProfile(profile);constdriver=newBuilder().forBrowser('firefox').setFirefoxOptions(options).build();
Service settings common to all browsers are described on the Service page.
Log output
Getting driver logs can be helpful for debugging various issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property; Property key: GeckoDriverService.GECKO_DRIVER_LOG_PROPERTY Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
Note: Java also allows setting log level by System Property: Property key: GeckoDriverService.GECKO_DRIVER_LOG_LEVEL_PROPERTY Property value: String representation of FirefoxDriverLogLevel enum
The driver logs everything that gets sent to it, including string representations of large binaries, so
Firefox truncates lines by default. To turn off truncation:
Note: Java also allows setting log level by System Property: Property key: GeckoDriverService.GECKO_DRIVER_LOG_NO_TRUNCATE Property value: "true" or "false"
The default directory for profiles is the system temporary directory. If you do not have access to that directory,
or want profiles to be created some place specific, you can change the profile root directory:
When working with an unfinished or unpublished extension, it will likely not be signed. As such, it can only
be installed as “temporary.” This can be done by passing in either a zip file or a directory, here’s an
example with a directory:
These are capabilities and features specific to Microsoft Internet Explorer browsers.
As of June 2022, Selenium officially no longer supports standalone Internet Explorer.
The Internet Explorer driver still supports running Microsoft Edge in “IE Compatibility Mode.”
Special considerations
The IE Driver is the only driver maintained by the Selenium Project directly.
While binaries for both the 32-bit and 64-bit
versions of Internet Explorer are available, there are some
known limitations
with the 64-bit driver. As such it is recommended to use the 32-bit driver.
Additional information about using Internet Explorer can be found on the
IE Driver Server page
Options
Starting a Microsoft Edge browser in Internet Explorer Compatibility mode with basic defined options looks like this:
If IE is not present on the system (default in Windows 11), you do not need to
use the two parameters above. IE Driver will use Edge and will automatically locate it.
If IE and Edge are both present on the system, you only need to set attaching to Edge,
IE Driver will automatically locate Edge on your system.
Here are a few common use cases with different capabilities:
fileUploadDialogTimeout
In some environments, Internet Explorer may timeout when opening the
File Upload dialog. IEDriver has a default timeout of 1000ms, but you
can increase the timeout using the fileUploadDialogTimeout capability.
When set to true, this capability clears the Cache,
Browser History and Cookies for all running instances
of InternetExplorer including those started manually
or by the driver. By default, it is set to false.
Using this capability will cause performance drop while
launching the browser, as the driver will wait until the cache
gets cleared before launching the IE browser.
This capability accepts a Boolean value as parameter.
InternetExplorer driver expects the browser zoom level to be 100%,
else the driver will throw an exception. This default behaviour
can be disabled by setting the ignoreZoomSetting to true.
This capability accepts a Boolean value as parameter.
Whether to skip the Protected Mode check while launching
a new IE session.
If not set and Protected Mode settings are not same for
all zones, an exception will be thrown by the driver.
If capability is set to true, tests may
become flaky, unresponsive, or browsers may hang.
However, this is still by far a second-best choice,
and the first choice should always be to actually
set the Protected Mode settings of each zone manually.
If a user is using this property,
only a “best effort” at support will be given.
This capability accepts a Boolean value as parameter.
<p><ahref=/documentation/about/contributing/#creating-examples><spanclass="selenium-badge-code"data-bs-toggle="tooltip"data-bs-placement="right"title="This code example is missing. Examples are added to the examples directory; click for details in the contribution guide">AddExample</span></a></p>
Internet Explorer includes several command-line options
that enable you to troubleshoot and configure the browser.
The following describes few supported command-line options
-private : Used to start IE in private browsing mode. This works for IE 8 and later versions.
-k : Starts Internet Explorer in kiosk mode.
The browser opens in a maximized window that does not display the address bar, the navigation buttons, or the status bar.
-extoff : Starts IE in no add-on mode.
This option specifically used to troubleshoot problems with browser add-ons. Works in IE 7 and later versions.
Note: forceCreateProcessApi should to enabled in-order for command line arguments to work.
Service settings common to all browsers are described on the Service page.
Log output
Getting driver logs can be helpful for debugging various issues. The Service class lets you
direct where the logs will go. Logging output is ignored unless the user directs it somewhere.
File output
To change the logging output to save to a specific file:
Note: Java also allows setting console output by System Property; Property key: InternetExplorerDriverService.IE_DRIVER_LOGFILE_PROPERTY Property value: DriverService.LOG_STDOUT or DriverService.LOG_STDERR
Note: Java also allows setting log level by System Property: Property key: InternetExplorerDriverService.IE_DRIVER_LOGLEVEL_PROPERTY Property value: String representation of InternetExplorerDriverLogLevel.DEBUG.toString() enum
These are capabilities and features specific to Apple Safari browsers.
Unlike Chromium and Firefox drivers, the safaridriver is installed with the Operating System.
To enable automation on Safari, run the following command from the terminal:
safaridriver --enable
Options
Capabilities common to all browsers are described on the Options page.
Those looking to automate Safari on iOS should look to the Appium project.
Service
Service settings common to all browsers are described on the Service page.
Logging
Unlike other browsers, Safari doesn’t let you choose where logs are output, or change levels. The one option
available is to turn logs off or on. If logs are toggled on, they can be found at:~/Library/Logs/com.apple.WebDriver/.
Note: Java also allows setting console output by System Property; Property key: SafariDriverService.SAFARI_DRIVER_LOGGING Property value: "true" or "false"
Perhaps the most common challenge for browser automation is ensuring
that the web application is in a state to execute a particular
Selenium command as desired. The processes often end up in
a race condition where sometimes the browser gets into the right
state first (things work as intended) and sometimes the Selenium code
executes first (things do not work as intended). This is one of the
primary causes of flaky tests.
All navigation commands wait for a specific readyState value
based on the page load strategy (the
default value to wait for is "complete") before the driver returns control to the code.
The readyState only concerns itself with loading assets defined in the HTML,
but loaded JavaScript assets often result in changes to the site,
and elements that need to be interacted with may not yet be on the page
when the code is ready to execute the next Selenium command.
Similarly, in a lot of single page applications, elements get dynamically
added to a page or change visibility based on a click.
An element must be both present and
displayed on the page
in order for Selenium to interact with it.
Take this page for example: https://www.selenium.dev/selenium/web/dynamic.html
When the “Add a box!” button is clicked, a “div” element that does not exist is created.
When the “Reveal a new input” button is clicked, a hidden text field element is displayed.
In both cases the transition takes a couple seconds.
If the Selenium code is to click one of these buttons and interact with the resulting element,
it will do so before that element is ready and fail.
The first solution many people turn to is adding a sleep statement to
pause the code execution for a set period of time.
Because the code can’t know exactly how long it needs to wait, this
can fail when it doesn’t sleep long enough. Alternately, if the value is set too high
and a sleep statement is added in every place it is needed, the duration of
the session can become prohibitive.
Selenium provides two different mechanisms for synchronization that are better.
Implicit waits
Selenium has a built-in way to automatically wait for elements called an implicit wait.
An implicit wait value can be set either with the timeouts
capability in the browser options, or with a driver method (as shown below).
This is a global setting that applies to every element location call for the entire session.
The default value is 0, which means that if the element is not found, it will
immediately return an error. If an implicit wait is set, the driver will wait for the
duration of the provided value before returning the error. Note that as soon as the
element is located, the driver will return the element reference and the code will continue executing,
so a larger implicit wait value won’t necessarily increase the duration of the session.
Warning:
Do not mix implicit and explicit waits.
Doing so can cause unpredictable wait times.
For example, setting an implicit wait of 10 seconds
and an explicit wait of 15 seconds
could cause a timeout to occur after 20 seconds.
Solving our example with an implicit wait looks like this:
Explicit waits are loops added to the code that poll the application
for a specific condition to evaluate as true before it exits the loop and
continues to the next command in the code. If the condition is not met before a designated timeout value,
the code will give a timeout error. Since there are many ways for the application not to be in the desired state,
so explicit waits are a great choice to specify the exact condition to wait for
in each place it is needed.
Another nice feature is that, by default, the Selenium Wait class automatically waits for the designated element to exist.
This example shows the condition being waited for as a lambda. Java also supports
Expected Conditions
The Wait class can be instantiated with various parameters that will change how the conditions are evaluated.
This can include:
Changing how often the code is evaluated (polling interval)
Specifying which exceptions should be handled automatically
Changing the total timeout length
Customizing the timeout message
For instance, if the element not interactable error is retried by default, then we can
add an action on a method inside the code getting executed (we just need to
make sure that the code returns true when it is successful):
The easiest way to customize Waits in Java is to use the FluentWait class:
Identifying and working with element objects in the DOM.
The majority of most people’s Selenium code involves working with web elements.
5.1 - File Upload
The file upload dialog could be handled using Selenium,
when the input element is of type file.
An example of it, could be found on this
web page- https://the-internet.herokuapp.com/upload
We will require to have a file available with us,
which we need to upload.
The code to upload the file for different programming
languages will be as follows -
importjava.util.concurrent.TimeUnit;importorg.openqa.selenium.By;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.chrome.ChromeDriver;classfileUploadDoc{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();driver.manage().timeouts().implicitlyWait(10,TimeUnit.SECONDS);driver.get("https://the-internet.herokuapp.com/upload");//we want to import selenium-snapshot file.
driver.findElement(By.id("file-upload")).sendKeys("selenium-snapshot.jpg");driver.findElement(By.id("file-submit")).submit();if(driver.getPageSource().contains("File Uploaded!")){System.out.println("file uploaded");}else{System.out.println("file not uploaded");}driver.quit();}}
fromseleniumimportwebdriverdriver.implicitly_wait(10)driver.get("https://the-internet.herokuapp.com/upload");driver.find_element(By.ID,"file-upload").send_keys("selenium-snapshot.jpg")driver.find_element(By.ID,"file-submit").submit()if(driver.page_source.find("File Uploaded!")):print("file upload success")else:print("file upload not successful")driver.quit()
usingSystem;usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceSeleniumDocumentation.SeleniumPRs{classFileUploadExample{staticvoidMain(String[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://the-internet.herokuapp.com/upload");driver.FindElement(By.Id("file-upload")).SendKeys("selenium-snapshot.jpg");driver.FindElement(By.Id("file-submit")).Submit();if(driver.PageSource.Contains("File Uploaded!")){Console.WriteLine("file uploaded");}else{Console.WriteLine("file not uploaded");}driver.Quit