Selenium is an umbrella project for a range of tools and libraries
that enable and support the automation of web browsers.
It provides extensions to emulate user interaction with browsers,
a distribution server for scaling browser allocation,
and the infrastructure for implementations of the
W3C WebDriver specification
that lets you write interchangeable code for all major web browsers.
This project is made possible by volunteer contributors
who have put in thousands of hours of their own time,
and made the source code
freely available
for anyone to use, enjoy, and improve.
Selenium brings together browser vendors, engineers, and enthusiasts
to further an open discussion around automation of the web platform.
The project organises an annual conference
to teach and nurture the community.
At the core of Selenium is WebDriver,
an interface to write instruction sets that can be run interchangeably in many
browsers. Once you’ve installed everything, only a few lines of code get you inside
a browser. You can find a more comprehensive example in Writing your first Selenium script
See the Overview to check the different project
components and decide if Selenium is the right tool for you.
You should continue on to Getting Started
to understand how you can install Selenium and successfully use it as a test
automation tool, and scaling simple tests like this to run in large, distributed
environments on multiple browsers, on several different operating systems.
1 - Selenium overview
Is Selenium for you? See an overview of the different project components.
Selenium is not just one tool or API
but it composes many tools.
WebDriver
If you are beginning with desktop website or mobile website test automation, then you
are going to be using WebDriver APIs. WebDriver
uses browser automation APIs provided by browser vendors to control browser and
run tests. This is as if a real user is operating the browser. Since
WebDriver does not require its API to be compiled with application
code, it is not intrusive. Hence, you are testing the
same application which you push live.
IDE
IDE (Integrated Development Environment)
is the tool you use to develop your Selenium test cases. It’s an easy-to-use Chrome
and Firefox extension and is generally the most efficient way to develop
test cases. It records the users’ actions in the browser for you, using
existing Selenium commands, with parameters defined by the context of
that element. This is not only a time-saver but also an excellent way
of learning Selenium script syntax.
Grid
Selenium Grid allows you to run test cases in different
machines across different platforms. The control of
triggering the test cases is on the local end, and
when the test cases are triggered, they are automatically
executed by the remote end.
After the development of the WebDriver tests, you may face
the need of running your tests on multiple browser and
operating system combinations.
This is where Grid comes into the picture.
1.1 - Selenium components
Building a test suite using WebDriver will require you to understand and
effectively use a number of different components. As with everything in
software, different people use different terms for the same idea. Below is
a breakdown of how terms are used in this description.
Terminology
API: Application Programming Interface. This is the set of “commands”
you use to manipulate WebDriver.
Library: A code module which contains the APIs and the code necessary
to implement them. Libraries are specific to each language binding, eg .jar
files for Java, .dll files for .NET, etc.
Driver: Responsible for controlling the actual browser. Most drivers
are created by the browser vendors themselves. Drivers are generally
executable modules that run on the system with the browser itself,
not on the system executing the test suite. (Although those may be the
same system.) NOTE: Some people refer to the drivers as proxies.
Framework: An additional library used as a support for WebDriver
suites. These frameworks may be test frameworks such as JUnit or NUnit.
They may also be frameworks supporting natural language features such
as Cucumber or Robotium. Frameworks may also be written and used for
things such as manipulating or configuring the system under test, data
creation, test oracles, etc.
The Parts and Pieces
At its minimum, WebDriver talks to a browser through a driver. Communication
is two way: WebDriver passes commands to the browser through the driver, and
receives information back via the same route.
The driver is specific to the browser, such as ChromeDriver for Google’s
Chrome/Chromium, GeckoDriver for Mozilla’s Firefox, etc. The driver runs on
the same system as the browser. This may, or may not be, the same system where
the tests themselves are executing.
This simple example above is direct communication. Communication to the
browser may also be remote communication through Selenium Server or
RemoteWebDriver. RemoteWebDriver runs on the same system as the driver
and the browser.
Remote communication can also take place using Selenium Server or Selenium
Grid, both of which in turn talk to the driver on the host system
Where Frameworks fit in
WebDriver has one job and one job only: communicate with the browser via any
of the methods above. WebDriver does not know a thing about testing: it does not
know how to compare things, assert pass or fail, and it certainly does not know
a thing about reporting or Given/When/Then grammar.
This is where various frameworks come in to play. At a minimum you will need a
test framework that matches the language bindings, e.g. NUnit for .NET, JUnit
for Java, RSpec for Ruby, etc.
The test framework is responsible for running and executing your WebDriver
and related steps in your tests. As such, you can think of it looking akin
to the following image.
Natural language frameworks/tools such as Cucumber may exist as part of that
Test Framework box in the figure above, or they may wrap the Test Framework
entirely in their own implementation.
1.2 - A deeper look at Selenium
Selenium is an umbrella project for a range of tools and libraries that enable and support the automation of web browsers.
Selenium controls web browsers
Selenium is many things
but at its core, it is a toolset for web browser automation
that uses the best techniques available
to remotely control browser instances
and emulate a user’s interaction with the browser.
It allows users to simulate common activities performed by end-users;
entering text into fields,
selecting drop-down values and checking boxes,
and clicking links in documents.
It also provides many other controls such as mouse movement,
arbitrary JavaScript execution, and much more.
Although used primarily for front-end testing of websites,
Selenium is at its core a browser user agent library.
The interfaces are ubiquitous to their application,
which encourages composition with other libraries to suit your purpose.
One interface to rule them all
One of the project’s guiding principles
is to support a common interface for all (major) browser technologies.
Web browsers are incredibly complex, highly engineered applications,
performing their operations in completely different ways
but which frequently look the same while doing so.
Even though the text is rendered in the same fonts,
the images are displayed in the same place
and the links take you to the same destination.
What is happening underneath is as different as night and day.
Selenium “abstracts” these differences,
hiding their details and intricacies from the person writing the code.
This allows you to write several lines of code to perform a complicated workflow,
but these same lines will execute on Firefox,
Internet Explorer, Chrome, and all other supported browsers.
Tools and support
Selenium’s minimalist design approach gives it the
versatility to be included as a component in bigger applications.
The surrounding infrastructure provided under the Selenium umbrella
gives you the tools to put together
your grid of browsers
so tests can be run on different browsers and multiple operating systems
across a range of machines.
Imagine a bank of computers in your server room or data center
all firing up browsers at the same time
hitting your site’s links, forms,
and tables—testing your application 24 hours a day.
Through the simple programming interface
provided for the most common languages,
these tests will run tirelessly in parallel,
reporting back to you when errors occur.
It is an aim to help make this a reality for you,
by providing users with tools and documentation to not only control browsers
but to make it easy to scale and deploy such grids.
Who uses Selenium
Many of the most important companies in the world
have adopted Selenium for their browser-based testing,
often replacing years-long efforts involving other proprietary tools.
As it has grown in popularity, so have its requirements and challenges multiplied.
As the web becomes more complicated
and new technologies are added to websites,
it’s the mission of this project to keep up with them where possible.
Being an open source project,
this support is provided through the generous donation of time from many volunteers,
every one of which has a “day job”.
Another mission of the project is to encourage
more volunteers to partake in this effort,
and build a strong community
so that the project can continue to keep up with emerging technologies
and remain a dominant platform for functional test automation.
2 - WebDriver
WebDriver drives a browser natively, learn more about it.
WebDriver drives a browser natively, as a user would, either locally
or on a remote machine using the Selenium server,
marks a leap forward in terms of browser automation.
Selenium WebDriver refers to both the language bindings
and the implementations of the individual browser controlling code.
This is commonly referred to as just WebDriver.
WebDriver is designed as a simple
and more concise programming interface.
WebDriver is a compact object-oriented API.
It drives the browser effectively.
2.1 - Getting started
If you are new to Selenium, we have a few resources that can help you get up to speed right away.
Selenium supports automation of all the major browsers in the market
through the use of WebDriver.
WebDriver is an API and protocol that defines a language-neutral interface
for controlling the behaviour of web browsers.
Each browser is backed by a specific WebDriver implementation, called a driver.
The driver is the component responsible for delegating down to the browser,
and handles communication to and from Selenium and the browser.
This separation is part of a conscious effort to have browser vendors
take responsibility for the implementation for their browsers.
Selenium makes use of these third party drivers where possible,
but also provides its own drivers maintained by the project
for the cases when this is not a reality.
The Selenium framework ties all of these pieces together
through a user-facing interface that enables the different browser backends
to be used transparently,
enabling cross-browser and cross-platform automation.
Selenium setup is quite different from the setup of other commercial tools.
Before you can start writing Selenium code, you have to
install the language bindings libraries for your language of choice, the browser you
want to use, and the driver for that browser.
Follow the links below to get up and going with Selenium WebDriver.
If you wish to start with a low-code/record and playback tool, please check
Selenium IDE
Once you get things working, if you want to scale up your tests, check out the
Selenium Grid.
2.1.1 - Install a Selenium library
Setting up the Selenium library for your favourite programming language.
First you need to install the Selenium bindings for your automation project.
The installation process for libraries depends on the language you choose to use.
Make sure you check the Selenium downloads page to make sure
you are using the latest version.
Installation of Selenium libraries for Python can be done using pip:
pip install selenium
Alternatively you can download the PyPI source archive
(selenium-x.x.x.tar.gz) and install it using setup.py:
python setup.py install
Installation of Selenium libraries for C# can be done using NuGet in one of two ways
Using a Packet Manager:
Install-Package Selenium.WebDriver
Using .NET CLI
dotnet add package Selenium.WebDriver
Supported .NET Versions
Make sure to use the .NET SDK version compatible with relevant Selenium package.
Check the dependencies section to find out the supported .NET version.
At the time of this update, .NET 5.0 (Visual Studio 2019) is known to be supported, and .NET 6.0 is not supported.
You can download MSBuild Tools 2019 from here to install the needed components and dependencies such as .NET SDK and NuGet Package Manager.
Using Visual Studio Code (vscode) and C#
This is a quick guide to help you get started with vscode and C#, however, more research may be required.
Install the compatible .NET SDK as per the section above.
Also install the vscode extensions (Ctrl-Shift-X) for C# and NuGet.
Follow the instruction here
to create and run the “Hello World” console project using C#.
You may also create a NUnit starter project using the command line dotnet new NUnit.
Make sure the file %appdata%\NuGet\nuget.config is configured properly as some developers reported that it will be empty due to some issues.
If nuget.config is empty, or not configured properly, then .NET builds will fail for Selenium Projects.
Add the following section to the file nuget.config if it is empty:
For more info about nuget.configclick here.
You may have to customize nuget.config to meet you needs.
Now, go back to vscode, press Ctrl-Shift-P, and type “NuGet Add Package”, and enter the required Selenium packages such as Selenium.WebDriver.
Press Enter and select the version.
Now you can use the examples in the documentation related to C# with vscode.
Installation of Selenium libraries for Ruby can be done using gem:
gem install selenium-webdriver
Or add it to your Gemfile:
gem'selenium-webdriver','~> 4.0'
Installation of Selenium libraries for JavaScript can be done using npm:
Setting up your system to allow a browser to be automated.
Through WebDriver, Selenium supports all major browsers on the market
such as Chrome/Chromium, Firefox, Internet Explorer, Edge, Opera, and Safari.
Where possible, WebDriver drives the browser
using the browser’s built-in support for automation.
Since all the driver implementations except for Internet Explorer are provided by the
browser vendors themselves, they are not included in the standard Selenium distribution.
This section explains the basic requirements for getting you started with the different browsers.
Read about more advanced options for starting a driver
in our driver configuration documentation.
Note: The Opera driver does not support w3c syntax, so we recommend using chromedriver to work with Opera.
See the code example for opening an Opera browser.
Three Ways to Use Drivers
1. Driver Management Software
Most machines automatically update the browser, but the driver does not. To make sure you get
the correct driver for your browser, there are many third party libraries to assist you.
This option first requires manually downloading the driver (See Quick Reference Section for links).
This is a flexible option to change location of drivers without having to update your code, and will work
on multiple machines without requiring that each machine put the drivers in the same place.
You can either place the drivers in a directory that is already listed in PATH, or you can place them in a directory
and add it to PATH.
You can test if it has been added correctly by starting the driver:
chromedriver
To see what directories are already on PATH, open a Command Prompt and execute:
echo %PATH%
If the location to your driver is not already in a directory listed,
you can add a new directory to PATH:
setx PATH "%PATH%;C:\WebDriver\bin"
You can test if it has been added correctly by starting the driver:
chromedriver.exe
If your PATH is configured correctly above,
you will see some output relating to the startup of the driver:
Starting ChromeDriver 95.0.4638.54 (d31a821ec901f68d0d34ccdbaea45b4c86ce543e-refs/branch-heads/4638@{#871}) on port 9515
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
You can regain control of your command prompt by pressing Ctrl+C
3. Hard Coded Location
Similar to Option 2 above, you need to manually download the driver (See Quick Reference Section for links).
Specifying the location in the code itself has the advantage of not needing to figure out Environment Variables on
your system, but has the drawback of making the code much less flexible.
Typically, browsers are started with specific options that describe
which capabilities the browser must support, and how the browser should
behave during the session. Some capabilities are
shared by all browsers, and
some will be specific to the browser being used.
This page will show examples of starting a browser with the default capabilities.
By default, Selenium 4 is compatible with Chrome v75 and greater. Note that the version of
the Chrome browser and the version of chromedriver must match the major version.
Microsoft Edge is implemented with Chromium, with the earliest supported version of v79. Similar to Chrome,
the major version number of edgedriver must match the major version of the Edge browser.
The IE Driver is the only driver maintained by the Selenium Project directly.
While binaries for both the 32-bit and 64-bit
versions of Internet Explorer are available, there are some
limitations
with the 64-bit driver. As such it is recommended to use the 32-bit driver.
Legacy
The Selenium project aims to support the same releases that
Microsoft considers current.
Older releases may work, but will not be supported. Note that Internet Explorer 11 will end support for certain
operating systems, including Windows 10 on June 15, 2022.
It should be noted that as Internet Explorer
preferences are saved against the logged-in user’s account, some additional setup is required.
Additional information about using Internet Explorer can be found
on the Selenium wiki
Since the opera driver does not support w3c syntax, but is based on Chromium, it is recommended
to drive Opera browser with the chromedriver. Like all Chromium implementations,
make sure that the browser version matches the driver version.
Unlike Chromium and Firefox drivers, the safaridriver is installed with the Operating System.
To enable automation on Safari, run the following command from the terminal:
Everything Selenium does is send the browser commands to do something or send requests for information.
Most of what you’ll do with Selenium is a combination of these basic commands:
Synchronizing the code with the current state of the browser is one of the biggest challenges
with Selenium, and doing it well is an advanced topic.
Essentially you want to make sure that the element is on the page before you attempt to locate it
and the element is in an interactable state before you attempt to interact with it.
An implicit wait is rarely the best solution, but it’s the easiest to demonstrate here, so
we’ll use it as a placeholder.
Elements store a lot of information that can be requested.
Notice that we need to relocate the search box because the DOM has changed since we first located it.
Take what you’ve learned and build out your Selenium code.
As you find more functionality that you need, read up on the rest of our
WebDriver documentation.
2.1.5 - Upgrade to Selenium 4
Are you still using Selenium 3? This guide will help you upgrade to the latest release!
Upgrading to Selenium 4 should be a painless process if you are using one of the officially
supported languages (Ruby, JavaScript, C#, Python, and Java). There might be some cases where
a few issues can happen, and this guide will help you to sort them out. We will go through
the steps to upgrade your project dependencies and understand the major deprecations and
changes the version upgrade brings.
These are the steps we will follow to upgrade to Selenium 4:
Preparing our test code
Upgrading dependencies
Potential errors and deprecation messages
Note: while Selenium 3.x versions were being developed, support for the W3C WebDriver standard
was implemented. Both this new protocol and the legacy JSON Wire Protocol were supported. Around
version 3.11, Selenium code became compliant with the level W3C 1 specification. The W3C compliant
code in the latest version of Selenium 3 will work as expected in Selenium 4.
Preparing our test code
Selenium 4 removes support for the legacy protocol and uses the W3C WebDriver standard by
default under the hood. For most things, this implementation will not affect end users.
The major exceptions are Capabilities and the Actions class.
Capabilities
If the test capabilities are not structured to be W3C compliant, may cause a session to not
be started. Here is the list of W3C WebDriver standard capabilities:
browserName
browserVersion (replaces version)
platformName (replaces platform)
acceptInsecureCerts
pageLoadStrategy
proxy
timeouts
unhandledPromptBehavior
An up-to-date list of standard capabilities can be found at
W3C WebDriver.
Any capability that is not contained in the list above, needs to include a vendor prefix.
This applies to browser specific capabilities as well as cloud vendor specific capabilities.
For example, if your cloud vendor uses build and name capabilities for your tests, you need
to wrap them in a cloud:options block (check with your cloud vendor for the appropriate prefix).
The utility methods to find elements in the Java bindings (FindsBy interfaces) have been removed
as they were meant for internal use only. The following code samples explain this better.
Check the subsections below to install Selenium 4 and have your project dependencies upgraded.
Java
The process of upgrading Selenium depends on which build tool is being used. We will cover the
most common ones for Java, which are Maven and
Gradle. The minimum Java version required is still 8.
Maven
Before
<dependencies><!-- more dependencies ... --><dependency><groupId>org.seleniumhq.selenium</groupId><artifactId>selenium-java</artifactId><version>3.141.59</version></dependency><!-- more dependencies ... --></dependencies>
After
<dependencies><!-- more dependencies ... --><dependency><groupId>org.seleniumhq.selenium</groupId><artifactId>selenium-java</artifactId><version>4.0.0</version></dependency><!-- more dependencies ... --></dependencies>
After making the change, you could execute mvn clean compile on the same directory where the
pom.xml file is.
Gradle
Before
plugins {
id 'java'
}
group 'org.example'
version '1.0-SNAPSHOT'
repositories {
mavenCentral()
}
dependencies {
testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
implementation group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '3.141.59'
}
test {
useJUnitPlatform()
}
After
plugins {
id 'java'
}
group 'org.example'
version '1.0-SNAPSHOT'
repositories {
mavenCentral()
}
dependencies {
testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
implementation group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '4.0.0'
}
test {
useJUnitPlatform()
}
After making the change, you could execute ./gradlew clean build
on the same directory where the build.gradle file is.
To check all the Java releases, you can head to MVNRepository.
C#
The place to get updates for Selenium 4 in C# is NuGet. Under the
Selenium.WebDriver package you
can get the instructions to update to the latest version. Inside of Visual Studio, through the
NuGet Package Manager you can execute:
The most important change to use Python is the minimum required version. Selenium 4 will
require a minimum Python 3.7 or higher. More details can be found at the
Python Package Index. To upgrade from the
command line, you can execute:
pip install selenium==4.0.0
Ruby
The update details for Selenium 4 can be seen at the
selenium-webdriver
gem in RubyGems. To install the latest version, you can execute:
gem install selenium-webdriver
To add it to your Gemfile:
gem 'selenium-webdriver', '~> 4.0.0'
JavaScript
The selenium-webdriver package can be found at the Node package manager,
npmjs. Selenium 4 can be found
here. To install it, you
could either execute:
Waits are also expecting different parameters now. WebDriverWait is now expecting a Duration
instead of a long for timeout in seconds and milliseconds. The withTimeout and pollingEvery
utility methods from FluentWait have switched from expecting (long time, TimeUnit unit) to
expect (Duration duration).
Merging capabilities is no longer changing the calling object
It was possible to merge a different set of capabilities into another set, and it was
mutating the calling object. Now, the result of the merge operation needs to be assigned.
The result of the merge call needs to be assigned to an object.
Firefox Legacy
Before GeckoDriver was around, the Selenium project had a driver implementation to automate
Firefox (version <48). However, this implementation is not needed anymore as it does not work
in recent versions of Firefox. To avoid major issues when upgrading to Selenium 4, the setLegacy
option will be shown as deprecated. The recommendation is to stop using the old implementation
and rely only on GeckoDriver. The following code will show the setLegacy line deprecated after
upgrading.
executable_path has been deprecated, please pass in a Service object
In Selenium 4, you’ll need to set the driver’s executable_path from a Service object to prevent deprecation warnings. (Or don’t set the path and instead make sure that the driver you need is on the System PATH.)
We went through the major changes to be taken into consideration when upgrading to Selenium 4.
Covering the different aspects to cover when test code is prepared for the upgrade, including
suggestions on how to prevent potential issues that can show up when using the new version of
Selenium. To finalize, we also covered a set of possible issues that you can bump into after
upgrading, and we shared potential fixes for those issues.
In order to create a new session by Selenium WebDriver,
the local end should provide the basic capabilities to the remote end.
The remote end uses the same set of capabilities to
create a session and describes the current session features.
WebDriver provides capabilities that each remote
end will/should support the implementation.
The following capabilities are supported by WebDriver:
browserName:
This capability is used to set the browserName for a given session.
If the specified browser is not installed at the
remote end, the session creation will fail.
browserVersion:
This capability is optional, this is used to
set the available browser version at remote end.
For Example, if ask for Chrome version 75 on a system that
only has 80 installed, the session creation will fail.
pageLoadStrategy:
When navigating to a new page via URL, by default Selenium will wait
until the Document’s Ready State is “complete.”
The document.readyState property of a document describes the loading state of the current document.
By default, WebDriver will hold off on completing a navigation method (e.g., driver.navigate().get())
until the document ready state is complete. This does not
necessarily mean that the page has finished loading, especially for sites
like Single Page Applications that use a lot of JavaScript to dynamically load content
after the Ready State returns complete. Note also that this behavior does not apply to navigation
that is a result of clicking an element or submitting a form.
If a page takes a long time to load as a result of downloading assets (e.g., images, css, js)
that aren’t important to the automation,
you can change from the default parameter of normal to
eager or none to speed up the session. This value applies to the entire
session, so make sure that your waiting strategy is sufficient
to minimize flakiness.
The page load strategy queries the
document.readyState
as described in the table below:
Strategy
Ready State
Notes
normal
complete
Used by default, waits for all resources to download
eager
interactive
DOM access is ready, but other resources like images may still be loading
none
Any
Does not block WebDriver at all
normal
This will make Selenium WebDriver to wait for the entire page is loaded.
When set to normal, Selenium WebDriver waits until the
load event fire is returned.
By default normal is set to browser if none is provided.
importorg.openqa.selenium.PageLoadStrategy;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.chrome.ChromeOptions;importorg.openqa.selenium.chrome.ChromeDriver;publicclasspageLoadStrategy{publicstaticvoidmain(String[]args){ChromeOptionschromeOptions=newChromeOptions();chromeOptions.setPageLoadStrategy(PageLoadStrategy.NORMAL);WebDriverdriver=newChromeDriver(chromeOptions);try{// Navigate to Url
driver.get("https://google.com");}finally{driver.quit();}}}
const{Builder,Capabilities}=require('selenium-webdriver');constcaps=newCapabilities();// Set page load strategy to Normal
caps.setPageLoadStrategy("normal");(asyncfunctionnormalPageLoad(){letdriver=awaitnewBuilder().withCapabilities(caps).forBrowser('chrome').build();try{// Navigate to Url
awaitdriver.get('https://www.google.com');}finally{awaitdriver.quit();}})();
This will make Selenium WebDriver to wait until the
initial HTML document has been completely loaded and parsed,
and discards loading of stylesheets, images and subframes.
When set to eager, Selenium WebDriver waits until
DOMContentLoaded event fire is returned.
importorg.openqa.selenium.PageLoadStrategy;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.chrome.ChromeOptions;importorg.openqa.selenium.chrome.ChromeDriver;publicclasspageLoadStrategy{publicstaticvoidmain(String[]args){ChromeOptionschromeOptions=newChromeOptions();chromeOptions.setPageLoadStrategy(PageLoadStrategy.EAGER);WebDriverdriver=newChromeDriver(chromeOptions);try{// Navigate to Url
driver.get("https://google.com");}finally{driver.quit();}}}
const{Builder,Capabilities}=require('selenium-webdriver');constcaps=newCapabilities();// Set page load strategy to eager
caps.setPageLoadStrategy("eager");(asyncfunctionexample(){letdriver=awaitnewBuilder().withCapabilities(caps).forBrowser('chrome').build();try{// Navigate to Url
awaitdriver.get('https://www.google.com');}finally{awaitdriver.quit();}})();
importorg.openqa.selenium.PageLoadStrategy;importorg.openqa.selenium.WebDriver;importorg.openqa.selenium.chrome.ChromeOptions;importorg.openqa.selenium.chrome.ChromeDriver;publicclasspageLoadStrategy{publicstaticvoidmain(String[]args){ChromeOptionschromeOptions=newChromeOptions();chromeOptions.setPageLoadStrategy(PageLoadStrategy.NONE);WebDriverdriver=newChromeDriver(chromeOptions);try{// Navigate to Url
driver.get("https://google.com");}finally{driver.quit();}}}
const{Builder,Capabilities}=require('selenium-webdriver');constcaps=newCapabilities();// Set page load strategy to None
caps.setPageLoadStrategy("none");(asyncfunctionnormalPageLoad(){letdriver=awaitnewBuilder().withCapabilities(caps).forBrowser('chrome').build();try{// Navigate to Url
awaitdriver.get('https://www.google.com');}finally{awaitdriver.quit();}})();
This identifies the operating system at the remote-end,
fetching the platformName returns the OS name.
In cloud-based providers,
setting platformName sets the OS at the remote-end.
acceptInsecureCerts
This capability checks whether an expired (or)
invalid TLS Certificate is used while navigating
during a session.
If the capability is set to false, an
insecure certificate error
will be returned as navigation encounters any domain
certificate problems. If set to true, invalid certificate will be
trusted by the browser.
All self-signed certificates will be trusted by this capability by default.
Once set, acceptInsecureCerts capability will have an
effect for the entire session.
timeouts
A WebDriver session is imposed with a certain session timeout
interval, during which the user can control the behaviour
of executing scripts or retrieving information from the browser.
Each session timeout is configured with
combination of different timeouts as described below:
Script Timeout:
Specifies when to interrupt an executing script in
a current browsing context. The default timeout 30,000
is imposed when a new session is created by WebDriver.
Page Load Timeout:
Specifies the time interval in which web page
needs to be loaded in a current browsing context.
The default timeout 300,000 is imposed when a
new session is created by WebDriver. If page load limits
a given/default time frame, the script will be stopped by
TimeoutException.
Implicit Wait Timeout
This specifies the time to wait for the
implicit element location strategy when
locating elements. The default timeout 0
is imposed when a new session is created by WebDriver.
unhandledPromptBehavior
Specifies the state of current session’s user prompt handler.
Defaults to dismiss and notify state
User Prompt Handler
This defines what action must take when a
user prompt encounters at the remote-end. This is defined by
unhandledPromptBehavior capability and has the following states:
dismiss
accept
dismiss and notify
accept and notify
ignore
setWindowRect
This command alters the size and position of the current
browsing context window. This command acts as setter
to getWindowRect command which accepts width, height,
x, y as optional arguments.
During automation, the current browsing context will be associated
with window states, which describe the visibility
of the browser window. The window states are
maximized
minimized
normal
fullscreen
Setting Width or Height does not guaranteed that the resulting
window size will exactly match that which was requested. This is because
some drivers may not be able to resize in single-pixel increments.
Due to this, fetching the window state/details by getWindowRect
may not match the values set in the browser.
strictFileInteractability
This new capability indicates if strict interactability checks
should be applied to input type=file elements. As strict interactability
checks are off by default, there is a change in behaviour
when using Element Send Keys with hidden file upload controls.
proxy
A proxy server acts as an intermediary for
requests between a client and a server. In simple,
the traffic flows through the proxy server
on its way to the address you requested and back.
A proxy server for automation scripts
with Selenium could be helpful for:
Capture network traffic
Mock backend calls made by the website
Access the required website under complex network
topologies or strict corporate restrictions/policies.
If you are in a corporate environment, and a
browser fails to connect to a URL, this is most
likely because the environment needs a proxy to be accessed.
Selenium WebDriver provides a way to proxy settings:
const{Builder}=require("selenium-webdriver");constfirefox=require('selenium-webdriver/firefox');constoptions=newfirefox.Options();letprofile='/path to custom profile';options.setProfile(profile);constdriver=newBuilder().forBrowser('firefox').setFirefoxOptions(options).build();
2.2.4 - Capabilities specific to Internet Explorer browser
These capabilities are specific to Internet Explorer.
fileUploadDialogTimeout
In some environments, Internet Explorer may timeout when opening the
File Upload dialog. IEDriver has a default timeout of 1000ms, but you
can increase the timeout using the fileUploadDialogTimeout capability.
When set to true, this capability clears the Cache,
Browser History and Cookies for all running instances
of InternetExplorer including those started manually
or by the driver. By default, it is set to false.
Using this capability will cause performance drop while
launching the browser, as the driver will wait until the cache
gets cleared before launching the IE browser.
This capability accepts a Boolean value as parameter.
InternetExplorer driver expects the browser zoom level to be 100%,
else the driver will throw an exception. This default behaviour
can be disabled by setting the ignoreZoomSetting to true.
This capability accepts a Boolean value as parameter.
Whether to skip the Protected Mode check while launching
a new IE session.
If not set and Protected Mode settings are not same for
all zones, an exception will be thrown by the driver.
If capability is set to true, tests may
become flaky, unresponsive, or browsers may hang.
However, this is still by far a second-best choice,
and the first choice should always be to actually
set the Protected Mode settings of each zone manually.
If a user is using this property,
only a “best effort” at support will be given.
This capability accepts a Boolean value as parameter.
Internet Explorer includes several command-line options
that enable you to troubleshoot and configure the browser.
The following describes few supported command-line options
-private : Used to start IE in private browsing mode. This works for IE 8 and later versions.
-k : Starts Internet Explorer in kiosk mode.
The browser opens in a maximized window that does not display the address bar, the navigation buttons, or the status bar.
-extoff : Starts IE in no add-on mode.
This option specifically used to troubleshoot problems with browser add-ons. Works in IE 7 and later versions.
Note: forceCreateProcessApi should to enabled in-order for command line arguments to work.
2.3.2 - JavaScript alerts, prompts and confirmations
WebDriver provides an API for working with the three types of native
popup messages offered by JavaScript. These popups are styled by the
browser and offer limited customisation.
Alerts
The simplest of these is referred to as an alert, which shows a
custom message, and a single button which dismisses the alert, labelled
in most browsers as OK. It can also be dismissed in most browsers by
pressing the close button, but this will always do the same thing as
the OK button. See an example alert.
WebDriver can get the text from the popup and accept or dismiss these
alerts.
//Click the link to activate the alert
driver.findElement(By.linkText("See an example alert")).click();//Wait for the alert to be displayed and store it in a variable
Alertalert=wait.until(ExpectedConditions.alertIsPresent());//Store the alert text in a variable
Stringtext=alert.getText();//Press the OK button
alert.accept();
# Click the link to activate the alertdriver.find_element(By.LINK_TEXT,"See an example alert").click()# Wait for the alert to be displayed and store it in a variablealert=wait.until(expected_conditions.alert_is_present())# Store the alert text in a variabletext=alert.text# Press the OK buttonalert.accept()
//Click the link to activate the alertdriver.FindElement(By.LinkText("See an example alert")).Click();//Wait for the alert to be displayed and store it in a variableIAlertalert=wait.Until(ExpectedConditions.AlertIsPresent());//Store the alert text in a variablestringtext=alert.Text;//Press the OK buttonalert.Accept();
# Click the link to activate the alertdriver.find_element(:link_text,'See an example alert').click# Store the alert reference in a variablealert=driver.switch_to.alert# Store the alert text in a variablealert_text=alert.text# Press on OK buttonalert.accept
//Click the link to activate the alert
awaitdriver.findElement(By.linkText('See an example alert')).click();// Wait for the alert to be displayed
awaitdriver.wait(until.alertIsPresent());// Store the alert in a variable
letalert=awaitdriver.switchTo().alert();//Store the alert text in a variable
letalertText=awaitalert.getText();//Press the OK button
awaitalert.accept();// Note: To use await, the above code should be inside an async function
//Click the link to activate the alert
driver.findElement(By.linkText("See an example alert")).click()//Wait for the alert to be displayed and store it in a variable
valalert=wait.until(ExpectedConditions.alertIsPresent())//Store the alert text in a variable
valtext=alert.getText()//Press the OK button
alert.accept()
Confirm
A confirm box is similar to an alert, except the user can also choose
to cancel the message. See
a sample confirm.
This example also shows a different approach to storing an alert:
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample confirm")).click();//Wait for the alert to be displayed
wait.until(ExpectedConditions.alertIsPresent());//Store the alert in a variable
Alertalert=driver.switchTo().alert();//Store the alert in a variable for reuse
Stringtext=alert.getText();//Press the Cancel button
alert.dismiss();
# Click the link to activate the alertdriver.find_element(By.LINK_TEXT,"See a sample confirm").click()# Wait for the alert to be displayedwait.until(expected_conditions.alert_is_present())# Store the alert in a variable for reusealert=driver.switch_to.alert# Store the alert text in a variabletext=alert.text# Press the Cancel buttonalert.dismiss()
//Click the link to activate the alertdriver.FindElement(By.LinkText("See a sample confirm")).Click();//Wait for the alert to be displayedwait.Until(ExpectedConditions.AlertIsPresent());//Store the alert in a variableIAlertalert=driver.SwitchTo().Alert();//Store the alert in a variable for reusestringtext=alert.Text;//Press the Cancel buttonalert.Dismiss();
# Click the link to activate the alertdriver.find_element(:link_text,'See a sample confirm').click# Store the alert reference in a variablealert=driver.switch_to.alert# Store the alert text in a variablealert_text=alert.text# Press on Cancel buttonalert.dismiss
//Click the link to activate the alert
awaitdriver.findElement(By.linkText('See a sample confirm')).click();// Wait for the alert to be displayed
awaitdriver.wait(until.alertIsPresent());// Store the alert in a variable
letalert=awaitdriver.switchTo().alert();//Store the alert text in a variable
letalertText=awaitalert.getText();//Press the Cancel button
awaitalert.dismiss();// Note: To use await, the above code should be inside an async function
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample confirm")).click()//Wait for the alert to be displayed
wait.until(ExpectedConditions.alertIsPresent())//Store the alert in a variable
valalert=driver.switchTo().alert()//Store the alert in a variable for reuse
valtext=alert.text//Press the Cancel button
alert.dismiss()
Prompt
Prompts are similar to confirm boxes, except they also include a text
input. Similar to working with form elements, you can use WebDriver’s
send keys to fill in a response. This will completely replace the placeholder
text. Pressing the cancel button will not submit any text.
See a sample prompt.
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample prompt")).click();//Wait for the alert to be displayed and store it in a variable
Alertalert=wait.until(ExpectedConditions.alertIsPresent());//Type your message
alert.sendKeys("Selenium");//Press the OK button
alert.accept();
# Click the link to activate the alertdriver.find_element(By.LINK_TEXT,"See a sample prompt").click()# Wait for the alert to be displayedwait.until(expected_conditions.alert_is_present())# Store the alert in a variable for reusealert=Alert(driver)# Type your messagealert.send_keys("Selenium")# Press the OK buttonalert.accept()
//Click the link to activate the alertdriver.FindElement(By.LinkText("See a sample prompt")).Click();//Wait for the alert to be displayed and store it in a variableIAlertalert=wait.Until(ExpectedConditions.AlertIsPresent());//Type your messagealert.SendKeys("Selenium");//Press the OK buttonalert.Accept();
# Click the link to activate the alertdriver.find_element(:link_text,'See a sample prompt').click# Store the alert reference in a variablealert=driver.switch_to.alert# Type a messagealert.send_keys("selenium")# Press on Ok buttonalert.accept
//Click the link to activate the alert
awaitdriver.findElement(By.linkText('See a sample prompt')).click();// Wait for the alert to be displayed
awaitdriver.wait(until.alertIsPresent());// Store the alert in a variable
letalert=awaitdriver.switchTo().alert();//Type your message
awaitalert.sendKeys("Selenium");//Press the OK button
awaitalert.accept();//Note: To use await, the above code should be inside an async function
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample prompt")).click()//Wait for the alert to be displayed and store it in a variable
valalert=wait.until(ExpectedConditions.alertIsPresent())//Type your message
alert.sendKeys("Selenium")//Press the OK button
alert.accept()
2.3.3 - Working with cookies
A cookie is a small piece of data that is sent from a website and stored in your computer.
Cookies are mostly used to recognise the user and load the stored information.
WebDriver API provides a way to interact with cookies with built-in methods:
Add Cookie
It is used to add a cookie to the current browsing context.
Add Cookie only accepts a set of defined serializable JSON object. Here
is the link to the list of accepted JSON key values
First of all, you need to be on the domain that the cookie will be
valid for. If you are trying to preset cookies before
you start interacting with a site and your homepage is large / takes a while to load
an alternative is to find a smaller page on the site (typically the 404 page is small,
e.g. http://example.com/some404page)
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;publicclassaddCookie{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("http://www.example.com");// Adds the cookie into current browser context
driver.manage().addCookie(newCookie("key","value"));}finally{driver.quit();}}}
fromseleniumimportwebdriverdriver=webdriver.Chrome()driver.get("http://www.example.com")# Adds the cookie into current browser contextdriver.add_cookie({"name":"key","value":"value"})
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceAddCookie{classAddCookie{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://example.com");// Adds the cookie into current browser contextdriver.Manage().Cookies.AddCookie(newCookie("key","value"));}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'# Adds the cookie into current browser contextdriver.manage.add_cookie(name:"key",value:"value")ensuredriver.quitend
const{Builder}=require('selenium-webdriver');(asyncfunctionexample(){letdriver=newBuilder().forBrowser('chrome').build();awaitdriver.get('https://www.example.com');// set a cookie on the current domain
awaitdriver.manage().addCookie({name:'key',value:'value'});awaitdriver.quit();})();
importorg.openqa.selenium.Cookieimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://example.com")// Adds the cookie into current browser context
driver.manage().addCookie(Cookie("key","value"))}finally{driver.quit()}}
Get Named Cookie
It returns the serialized cookie data matching with the cookie name among all associated cookies.
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;publicclassgetCookieNamed{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("http://www.example.com");driver.manage().addCookie(newCookie("foo","bar"));// Get cookie details with named cookie 'foo'
Cookiecookie1=driver.manage().getCookieNamed("foo");System.out.println(cookie1);}finally{driver.quit();}}}
fromseleniumimportwebdriverdriver=webdriver.Chrome()# Navigate to urldriver.get("http://www.example.com")# Adds the cookie into current browser contextdriver.add_cookie({"name":"foo","value":"bar"})# Get cookie details with named cookie 'foo'print(driver.get_cookie("foo"))
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceGetCookieNamed{classGetCookieNamed{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://example.com");driver.Manage().Cookies.AddCookie(newCookie("foo","bar"));// Get cookie details with named cookie 'foo'varcookie=driver.Manage().Cookies.GetCookieNamed("foo");System.Console.WriteLine(cookie);}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'driver.manage.add_cookie(name:"foo",value:"bar")# Get cookie details with named cookie 'foo'putsdriver.manage.cookie_named('foo')ensuredriver.quitend
const{Builder}=require('selenium-webdriver');(asyncfunctiongetCookieByName(){letdriver=newBuilder().forBrowser('chrome').build();awaitdriver.get('https://www.example.com');// set a cookie on the current domain
awaitdriver.manage().addCookie({name:'foo',value:'bar'});// Get cookie details with named cookie 'foo'
awaitdriver.manage().getCookie('foo').then(function(cookie){console.log('cookie details => ',cookie);});awaitdriver.quit();})();
importorg.openqa.selenium.Cookieimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://example.com")driver.manage().addCookie(Cookie("foo","bar"))// Get cookie details with named cookie 'foo'
valcookie=driver.manage().getCookieNamed("foo")println(cookie)}finally{driver.quit()}}
Get All Cookies
It returns a ‘successful serialized cookie data’ for current browsing context.
If browser is no longer available it returns error.
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;importjava.util.Set;publicclassgetAllCookies{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("http://www.example.com");// Add few cookies
driver.manage().addCookie(newCookie("test1","cookie1"));driver.manage().addCookie(newCookie("test2","cookie2"));// Get All available cookies
Set<Cookie>cookies=driver.manage().getCookies();System.out.println(cookies);}finally{driver.quit();}}}
fromseleniumimportwebdriverdriver=webdriver.Chrome()# Navigate to urldriver.get("http://www.example.com")driver.add_cookie({"name":"test1","value":"cookie1"})driver.add_cookie({"name":"test2","value":"cookie2"})# Get all available cookiesprint(driver.get_cookies())
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceGetAllCookies{classGetAllCookies{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://example.com");driver.Manage().Cookies.AddCookie(newCookie("test1","cookie1"));driver.Manage().Cookies.AddCookie(newCookie("test2","cookie2"));// Get All available cookiesvarcookies=driver.Manage().Cookies.AllCookies;}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'driver.manage.add_cookie(name:"test1",value:"cookie1")driver.manage.add_cookie(name:"test2",value:"cookie2")# Get all available cookiesputsdriver.manage.all_cookiesensuredriver.quitend
const{Builder}=require('selenium-webdriver');(asyncfunctiongetAllCookies(){letdriver=newBuilder().forBrowser('chrome').build();awaitdriver.get('https://www.example.com');// Add few cookies
awaitdriver.manage().addCookie({name:'test1',value:'cookie1'});awaitdriver.manage().addCookie({name:'test2',value:'cookie2'});// Get all Available cookies
awaitdriver.manage().getCookies().then(function(cookies){console.log('cookie details => ',cookies);});awaitdriver.quit()})();
importorg.openqa.selenium.Cookieimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://example.com")driver.manage().addCookie(Cookie("test1","cookie1"))driver.manage().addCookie(Cookie("test2","cookie2"))// Get All available cookies
valcookies=driver.manage().cookiesprintln(cookies)}finally{driver.quit()}}
Delete Cookie
It deletes the cookie data matching with the provided cookie name.
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;publicclassdeleteCookie{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("http://www.example.com");driver.manage().addCookie(newCookie("test1","cookie1"));Cookiecookie1=newCookie("test2","cookie2");driver.manage().addCookie(cookie1);// delete a cookie with name 'test1'
driver.manage().deleteCookieNamed("test1");/*
Selenium Java bindings also provides a way to delete
cookie by passing cookie object of current browsing context
*/driver.manage().deleteCookie(cookie1);}finally{driver.quit();}}}
fromseleniumimportwebdriverdriver=webdriver.Chrome()# Navigate to urldriver.get("http://www.example.com")driver.add_cookie({"name":"test1","value":"cookie1"})driver.add_cookie({"name":"test2","value":"cookie2"})# Delete a cookie with name 'test1'driver.delete_cookie("test1")
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceDeleteCookie{classDeleteCookie{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://example.com");driver.Manage().Cookies.AddCookie(newCookie("test1","cookie1"));varcookie=newCookie("test2","cookie2");driver.Manage().Cookies.AddCookie(cookie);// delete a cookie with name 'test1' driver.Manage().Cookies.DeleteCookieNamed("test1");// Selenium .net bindings also provides a way to delete// cookie by passing cookie object of current browsing contextdriver.Manage().Cookies.DeleteCookie(cookie);}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'driver.manage.add_cookie(name:"test1",value:"cookie1")driver.manage.add_cookie(name:"test2",value:"cookie2")# delete a cookie with name 'test1'driver.manage.delete_cookie('test1')ensuredriver.quitend
const{Builder}=require('selenium-webdriver');(asyncfunctiondeleteCookie(){letdriver=newBuilder().forBrowser('chrome').build();try{awaitdriver.get('https://www.example.com');// Add few cookies
awaitdriver.manage().addCookie({name:'test1',value:'cookie1'});awaitdriver.manage().addCookie({name:'test2',value:'cookie2'});// Delete a cookie with name 'test1'
awaitdriver.manage().deleteCookie('test1');// Get all Available cookies
awaitdriver.manage().getCookies().then(function(cookies){console.log('cookie details => ',cookies);});awaitdriver.quit();}catch(e){console.log(e)}})();
importorg.openqa.selenium.Cookieimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://example.com")driver.manage().addCookie(Cookie("test1","cookie1"))valcookie1=Cookie("test2","cookie2")driver.manage().addCookie(cookie1)// delete a cookie with name 'test1'
driver.manage().deleteCookieNamed("test1")// delete cookie by passing cookie object of current browsing context.
driver.manage().deleteCookie(cookie1)}finally{driver.quit()}}
Delete All Cookies
It deletes all the cookies of the current browsing context.
importorg.openqa.selenium.*;importorg.openqa.selenium.chrome.ChromeDriver;publicclassdeleteAllCookies{publicstaticvoidmain(String[]args){WebDriverdriver=newChromeDriver();try{driver.get("http://www.example.com");driver.manage().addCookie(newCookie("test1","cookie1"));driver.manage().addCookie(newCookie("test2","cookie2"));// deletes all cookies
driver.manage().deleteAllCookies();}finally{driver.quit();}}}
fromseleniumimportwebdriverdriver=webdriver.Chrome()# Navigate to urldriver.get("http://www.example.com")driver.add_cookie({"name":"test1","value":"cookie1"})driver.add_cookie({"name":"test2","value":"cookie2"})# Deletes all cookiesdriver.delete_all_cookies()
usingOpenQA.Selenium;usingOpenQA.Selenium.Chrome;namespaceDeleteAllCookies{classDeleteAllCookies{publicstaticvoidMain(string[]args){IWebDriverdriver=newChromeDriver();try{// Navigate to Urldriver.Navigate().GoToUrl("https://example.com");driver.Manage().Cookies.AddCookie(newCookie("test1","cookie1"));driver.Manage().Cookies.AddCookie(newCookie("test2","cookie2"));// deletes all cookiesdriver.Manage().Cookies.DeleteAllCookies();}finally{driver.Quit();}}}}
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'driver.manage.add_cookie(name:"test1",value:"cookie1")driver.manage.add_cookie(name:"test2",value:"cookie2")# deletes all cookiesdriver.manage.delete_all_cookiesensuredriver.quitend
const{Builder}=require('selenium-webdriver');(asyncfunctionexample(){letdriver=newBuilder().forBrowser('chrome').build();try{awaitdriver.get('https://www.example.com');// Add few cookies
awaitdriver.manage().addCookie({name:'test1',value:'cookie1'});awaitdriver.manage().addCookie({name:'test2',value:'cookie2'});// Delete all cookies
awaitdriver.manage().deleteAllCookies();awaitdriver.quit();}catch(e){console.log(e)}})();
importorg.openqa.selenium.Cookieimportorg.openqa.selenium.chrome.ChromeDriverfunmain(){valdriver=ChromeDriver()try{driver.get("https://example.com")driver.manage().addCookie(Cookie("test1","cookie1"))driver.manage().addCookie(Cookie("test2","cookie2"))// deletes all cookies
driver.manage().deleteAllCookies()}finally{driver.quit()}}
Same-Site Cookie Attribute
It allows a user to instruct browsers to control whether cookies
are sent along with the request initiated by third party sites.
It is introduced to prevent CSRF (Cross-Site Request Forgery) attacks.
Same-Site cookie attribute accepts two parameters as instructions
Strict:
When the sameSite attribute is set as Strict,
the cookie will not be sent along with
requests initiated by third party websites.
Lax:
When you set a cookie sameSite attribute to Lax,
the cookie will be sent along with the GET
request initiated by third party website.
Note: As of now this feature is landed in chrome(80+version),
Firefox(79+version) and works with Selenium 4 and later versions.
fromseleniumimportwebdriverdriver=webdriver.Chrome()driver.get("http://www.example.com")# Adds the cookie into current browser context with sameSite 'Strict' (or) 'Lax'driver.add_cookie({"name":"foo","value":"value",'sameSite':'Strict'})driver.add_cookie({"name":"foo1","value":"value",'sameSite':'Lax'})cookie1=driver.get_cookie('foo')cookie2=driver.get_cookie('foo1')print(cookie1)print(cookie2)
require'selenium-webdriver'driver=Selenium::WebDriver.for:chromebegindriver.get'https://www.example.com'# Adds the cookie into current browser context with sameSite 'Strict' (or) 'Lax'driver.manage.add_cookie(name:"foo",value:"bar",same_site:"Strict")driver.manage.add_cookie(name:"foo1",value:"bar",same_site:"Lax")putsdriver.manage.cookie_named('foo')putsdriver.manage.cookie_named('foo1')ensuredriver.quitend
const{Builder}=require('selenium-webdriver');(asyncfunctionsameSiteCookieExample(){letdriver=newBuilder().forBrowser('chrome').build();try{awaitdriver.get('https://www.example.com');// set a cookie on the current domain with sameSite 'Strict' (or) 'Lax'
awaitdriver.manage().addCookie({name:'key',value:'value',sameSite:'Strict'});awaitdriver.manage().addCookie({name:'key',value:'value',sameSite:'Lax'});awaitdriver.quit();}catch(e){console.log(e);}})();
Frames are a now deprecated means of building a site layout from
multiple documents on the same domain. You are unlikely to work with
them unless you are working with an pre HTML5 webapp. Iframes allow
the insertion of a document from an entirely different domain, and are
still commonly used.
If you need to work with frames or iframes, WebDriver allows you to
work with them in the same way. Consider a button within an iframe.
If we inspect the element using the browser development tools, we might
see the following:
# This won't workdriver.find_element(:tag_name,'button').click
// This won't work
awaitdriver.findElement(By.css('button')).click();
//This won't work
driver.findElement(By.tagName("button")).click()
However, if there are no buttons outside of the iframe, you might
instead get a no such element error. This happens because Selenium is
only aware of the elements in the top level document. To interact with
the button, we will need to first switch to the frame, in a similar way
to how we switch windows. WebDriver offers three ways of switching to
a frame.
Using a WebElement
Switching using a WebElement is the most flexible option. You can
find the frame using your preferred selector and switch to it.
//Store the web element
WebElementiframe=driver.findElement(By.cssSelector("#modal>iframe"));//Switch to the frame
driver.switchTo().frame(iframe);//Now we can click the button
driver.findElement(By.tagName("button")).click();
# Store iframe web elementiframe=driver.find_element(By.CSS_SELECTOR,"#modal > iframe")# switch to selected iframedriver.switch_to.frame(iframe)# Now click on buttondriver.find_element(By.TAG_NAME,'button').click()
//Store the web elementIWebElementiframe=driver.FindElement(By.CssSelector("#modal>iframe"));//Switch to the framedriver.SwitchTo().Frame(iframe);//Now we can click the buttondriver.FindElement(By.TagName("button")).Click();
# Store iframe web elementiframe=driver.find_element(:css,'#modal > iframe')# Switch to the framedriver.switch_to.frameiframe# Now, Click on the buttondriver.find_element(:tag_name,'button').click
// Store the web element
constiframe=driver.findElement(By.css('#modal > iframe'));// Switch to the frame
awaitdriver.switchTo().frame(iframe);// Now we can click the button
awaitdriver.findElement(By.css('button')).click();
//Store the web element
valiframe=driver.findElement(By.cssSelector("#modal>iframe"))//Switch to the frame
driver.switchTo().frame(iframe)//Now we can click the button
driver.findElement(By.tagName("button")).click()
Using a name or ID
If your frame or iframe has an id or name attribute, this can be used
instead. If the name or ID is not unique on the page, then the first
one found will be switched to.
//Using the ID
driver.switchTo().frame("buttonframe");//Or using the name instead
driver.switchTo().frame("myframe");//Now we can click the button
driver.findElement(By.tagName("button")).click();
# Switch frame by iddriver.switch_to.frame('buttonframe')# Now, Click on the buttondriver.find_element(By.TAG_NAME,'button').click()
//Using the IDdriver.SwitchTo().Frame("buttonframe");//Or using the name insteaddriver.SwitchTo().Frame("myframe");//Now we can click the buttondriver.FindElement(By.TagName("button")).Click();
# Switch by IDdriver.switch_to.frame'buttonframe'# Now, Click on the buttondriver.find_element(:tag_name,'button').click
// Using the ID
awaitdriver.switchTo().frame('buttonframe');// Or using the name instead
awaitdriver.switchTo().frame('myframe');// Now we can click the button
awaitdriver.findElement(By.css('button')).click();
//Using the ID
driver.switchTo().frame("buttonframe")//Or using the name instead
driver.switchTo().frame("myframe")//Now we can click the button
driver.findElement(By.tagName("button")).click()
Using an index
It is also possible to use the index of the frame, such as can be
queried using window.frames in JavaScript.
// Switches to the second frame
driver.switchTo().frame(1);
# Switch to the second framedriver.switch_to.frame(1)
// Switches to the second framedriver.SwitchTo().Frame(1);
# switching to second iframe based on indexiframe=driver.find_elements_by_tag_name('iframe')[1]# switch to selected iframedriver.switch_to.frame(iframe)
// Switches to the second frame
awaitdriver.switchTo().frame(1);
// Switches to the second frame
driver.switchTo().frame(1)
Leaving a frame
To leave an iframe or frameset, switch back to the default content
like so:
// Return to the top level
driver.switchTo().defaultContent();
# switch back to default contentdriver.switch_to.default_content()
// Return to the top leveldriver.SwitchTo().DefaultContent();
# Return to the top leveldriver.switch_to.default_content
// Return to the top level
awaitdriver.switchTo().defaultContent();
// Return to the top level
driver.switchTo().defaultContent()
2.3.5 - Working with windows and tabs
Windows and tabs
Get window handle
WebDriver does not make the distinction between windows and tabs. If
your site opens a new tab or window, Selenium will let you work with it
using a window handle. Each window has a unique identifier which remains
persistent in a single session. You can get the window handle of the
current window by using:
Clicking a link which opens in a
new window
will focus the new window or tab on screen, but WebDriver will not know which
window the Operating System considers active. To work with the new window
you will need to switch to it. If you have only two tabs or windows open,
and you know which window you start with, by the process of elimination
you can loop over both windows or tabs that WebDriver can see, and switch
to the one which is not the original.
However, Selenium 4 provides a new api NewWindow
which creates a new tab (or) new window and automatically switches to it.
//Store the ID of the original window
StringoriginalWindow=driver.getWindowHandle();//Check we don't have other windows open already
assertdriver.getWindowHandles().size()==1;//Click the link which opens in a new window
driver.findElement(By.linkText("new window")).click();//Wait for the new window or tab
wait.until(numberOfWindowsToBe(2));//Loop through until we find a new window handle
for(StringwindowHandle:driver.getWindowHandles()){if(!originalWindow.contentEquals(windowHandle)){driver.switchTo().window(windowHandle);break;}}//Wait for the new tab to finish loading content
wait.until(titleIs("Selenium documentation"));
fromseleniumimportwebdriverfromselenium.webdriver.support.uiimportWebDriverWaitfromselenium.webdriver.supportimportexpected_conditionsasECwithwebdriver.Firefox()asdriver:# Open URLdriver.get("https://seleniumhq.github.io")# Setup wait for laterwait=WebDriverWait(driver,10)# Store the ID of the original windoworiginal_window=driver.current_window_handle# Check we don't have other windows open alreadyassertlen(driver.window_handles)==1# Click the link which opens in a new windowdriver.find_element(By.LINK_TEXT,"new window").click()# Wait for the new window or tabwait.until(EC.number_of_windows_to_be(2))# Loop through until we find a new window handleforwindow_handleindriver.window_handles:ifwindow_handle!=original_window:driver.switch_to.window(window_handle)break# Wait for the new tab to finish loading contentwait.until(EC.title_is("SeleniumHQ Browser Automation"))
//Store the ID of the original windowstringoriginalWindow=driver.CurrentWindowHandle;//Check we don't have other windows open alreadyAssert.AreEqual(driver.WindowHandles.Count,1);//Click the link which opens in a new windowdriver.FindElement(By.LinkText("new window")).Click();//Wait for the new window or tabwait.Until(wd=>wd.WindowHandles.Count==2);//Loop through until we find a new window handleforeach(stringwindowindriver.WindowHandles){if(originalWindow!=window){driver.SwitchTo().Window(window);break;}}//Wait for the new tab to finish loading contentwait.Until(wd=>wd.Title=="Selenium documentation");
# Store the ID of the original windoworiginal_window=driver.window_handle# Check we don't have other windows open alreadyassert(driver.window_handles.length==1,'Expected one window')# Click the link which opens in a new windowdriver.find_element(link:'new window').click# Wait for the new window or tabwait.until{driver.window_handles.length==2}#Loop through until we find a new window handledriver.window_handles.eachdo|handle|ifhandle!=original_windowdriver.switch_to.windowhandlebreakendend#Wait for the new tab to finish loading contentwait.until{driver.title=='Selenium documentation'}
//Store the ID of the original window
constoriginalWindow=awaitdriver.getWindowHandle();//Check we don't have other windows open already
assert((awaitdriver.getAllWindowHandles()).length===1);//Click the link which opens in a new window
awaitdriver.findElement(By.linkText('new window')).click();//Wait for the new window or tab
awaitdriver.wait(async()=>(awaitdriver.getAllWindowHandles()).length===2,10000);//Loop through until we find a new window handle
constwindows=awaitdriver.getAllWindowHandles();windows.forEach(asynchandle=>{if(handle!==originalWindow){awaitdriver.switchTo().window(handle);}});//Wait for the new tab to finish loading content
awaitdriver.wait(until.titleIs('Selenium documentation'),10000);
//Store the ID of the original window
valoriginalWindow=driver.getWindowHandle()//Check we don't have other windows open already
assert(driver.getWindowHandles().size()===1)//Click the link which opens in a new window
driver.findElement(By.linkText("new window")).click()//Wait for the new window or tab
wait.until(numberOfWindowsToBe(2))//Loop through until we find a new window handle
for(windowHandleindriver.getWindowHandles()){if(!originalWindow.contentEquals(windowHandle)){driver.switchTo().window(windowHandle)break}}//Wait for the new tab to finish loading content
wait.until(titleIs("Selenium documentation"))
Create new window (or) new tab and switch
Creates a new window (or) tab and will focus the new window or tab on screen.
You don’t need to switch to work with the new window (or) tab. If you have more than two windows
(or) tabs opened other than the new window, you can loop over both windows or tabs that WebDriver can see,
and switch to the one which is not the original.
Note: This feature works with Selenium 4 and later versions.
// Opens a new tab and switches to new tab
driver.switchTo().newWindow(WindowType.TAB);// Opens a new window and switches to new window
driver.switchTo().newWindow(WindowType.WINDOW);
# Opens a new tab and switches to new tabdriver.switch_to.new_window('tab')# Opens a new window and switches to new windowdriver.switch_to.new_window('window')
// Opens a new tab and switches to new tabdriver.SwitchTo().NewWindow(WindowType.Tab)// Opens a new window and switches to new windowdriver.SwitchTo().NewWindow(WindowType.Window)
# Note: The new_window in ruby only opens a new tab (or) Window and will not switch automatically# The user has to switch to new tab (or) new window# Opens a new tab and switches to new tabdriver.manage.new_window(:tab)# Opens a new window and switches to new windowdriver.manage.new_window(:window)
// Opens a new tab and switches to new tab
awaitdriver.switchTo().newWindow('tab');// Opens a new window and switches to new window
awaitdriver.switchTo().newWindow('window');
// Opens a new tab and switches to new tab
driver.switchTo().newWindow(WindowType.TAB)// Opens a new window and switches to new window
driver.switchTo().newWindow(WindowType.WINDOW)
Closing a window or tab
When you are finished with a window or tab and it is not the
last window or tab open in your browser, you should close it and switch
back to the window you were using previously. Assuming you followed the
code sample in the previous section you will have the previous window
handle stored in a variable. Put this together and you will get:
//Close the tab or window
driver.close();//Switch back to the old tab or window
driver.switchTo().window(originalWindow);
#Close the tab or windowdriver.close()#Switch back to the old tab or windowdriver.switch_to.window(original_window)
//Close the tab or windowdriver.Close();//Switch back to the old tab or windowdriver.SwitchTo().Window(originalWindow);
#Close the tab or windowdriver.close#Switch back to the old tab or windowdriver.switch_to.windoworiginal_window
//Close the tab or window
awaitdriver.close();//Switch back to the old tab or window
awaitdriver.switchTo().window(originalWindow);
//Close the tab or window
driver.close()//Switch back to the old tab or window
driver.switchTo().window(originalWindow)
Forgetting to switch back to another window handle after closing a
window will leave WebDriver executing on the now closed page, and will
trigger a No Such Window Exception. You must switch
back to a valid window handle in order to continue execution.
Quitting the browser at the end of a session
When you are finished with the browser session you should call quit,
instead of close:
/**
* Example using JUnit
* https://junit.org/junit5/docs/current/api/org/junit/jupiter/api/AfterAll.html
*/@AfterAllpublicstaticvoidtearDown(){driver.quit();}
/*
Example using Visual Studio's UnitTesting
https://msdn.microsoft.com/en-us/library/microsoft.visualstudio.testtools.unittesting.aspx
*/[TestCleanup]publicvoidTearDown(){driver.Quit();}
/**
* Example using Mocha
* https://mochajs.org/#hooks
*/after('Tear down',asyncfunction(){awaitdriver.quit();});
/**
* Example using JUnit
* https://junit.org/junit5/docs/current/api/org/junit/jupiter/api/AfterAll.html
*/@AfterAllfuntearDown(){driver.quit()}
If not running WebDriver in a test context, you may consider using
try / finally which is offered by most languages so that an exception
will still clean up the WebDriver session.
Python’s WebDriver now supports the python context manager,
which when using the with keyword can automatically quit the driver at
the end of execution.
withwebdriver.Firefox()asdriver:# WebDriver code here...# WebDriver will automatically quit after indentation
Window management
Screen resolution can impact how your web application renders, so
WebDriver provides mechanisms for moving and resizing the browser
window.
//Access each dimension individually
intwidth=driver.manage().window().getSize().getWidth();intheight=driver.manage().window().getSize().getHeight();//Or store the dimensions and query them later
Dimensionsize=driver.manage().window().getSize();intwidth1=