This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

The Selenium Browser Automation Project

Selenium is an umbrella project for a range of tools and libraries that enable and support the automation of web browsers.

It provides extensions to emulate user interaction with browsers, a distribution server for scaling browser allocation, and the infrastructure for implementations of the W3C WebDriver specification that lets you write interchangeable code for all major web browsers.

This project is made possible by volunteer contributors who have put in thousands of hours of their own time, and made the source code freely available for anyone to use, enjoy, and improve.

Selenium brings together browser vendors, engineers, and enthusiasts to further an open discussion around automation of the web platform. The project organises an annual conference to teach and nurture the community.

At the core of Selenium is WebDriver, an interface to write instruction sets that can be run interchangeably in many browsers. Once you’ve installed everything, only a few lines of code get you inside a browser. You can find a more comprehensive example in Writing your first Selenium script

package dev.selenium.hello;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;

public class HelloSelenium {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();

        driver.get("https://selenium.dev");

        driver.quit();
    }
}
from selenium import webdriver


driver = webdriver.Chrome()

driver.get("http://selenium.dev")

driver.quit()
using OpenQA.Selenium.Chrome;

namespace SeleniumDocs.Hello
{
    public class HelloSelenium
    {
        public static void Main()
        {
            var driver = new ChromeDriver();
            
            driver.Navigate().GoToUrl("https://selenium.dev");
            
            driver.Quit();
        }
    }
}
require 'selenium-webdriver'

driver = Selenium::WebDriver.for :chrome

driver.get 'https://selenium.dev'

driver.quit
const {Builder} = require('selenium-webdriver');

(async function helloSelenium() {
    let driver = await new Builder().forBrowser('chrome').build();

    await driver.get('https://selenium.dev');

    await driver.quit();
})();
package dev.selenium.hello

import org.openqa.selenium.chrome.ChromeDriver

fun main() {
    val driver = ChromeDriver()

    driver.get("https://selenium.dev")

    driver.quit()
}

See the Overview to check the different project components and decide if Selenium is the right tool for you.

You should continue on to Getting Started to understand how you can install Selenium and successfully use it as a test automation tool, and scaling simple tests like this to run in large, distributed environments on multiple browsers, on several different operating systems.

1 - Selenium overview

Is Selenium for you? See an overview of the different project components.

Selenium is not just one tool or API but it composes many tools.

WebDriver

If you are beginning with desktop website or mobile website test automation, then you are going to be using WebDriver APIs. WebDriver uses browser automation APIs provided by browser vendors to control browser and run tests. This is as if a real user is operating the browser. Since WebDriver does not require its API to be compiled with application code, it is not intrusive. Hence, you are testing the same application which you push live.

IDE

IDE (Integrated Development Environment) is the tool you use to develop your Selenium test cases. It’s an easy-to-use Chrome and Firefox extension and is generally the most efficient way to develop test cases. It records the users’ actions in the browser for you, using existing Selenium commands, with parameters defined by the context of that element. This is not only a time-saver but also an excellent way of learning Selenium script syntax.

Grid

Selenium Grid allows you to run test cases in different machines across different platforms. The control of triggering the test cases is on the local end, and when the test cases are triggered, they are automatically executed by the remote end.

After the development of the WebDriver tests, you may face the need of running your tests on multiple browser and operating system combinations. This is where Grid comes into the picture.

1.1 - Selenium components

Building a test suite using WebDriver will require you to understand and effectively use a number of different components. As with everything in software, different people use different terms for the same idea. Below is a breakdown of how terms are used in this description.

Terminology

  • API: Application Programming Interface. This is the set of “commands” you use to manipulate WebDriver.
  • Library: A code module which contains the APIs and the code necessary to implement them. Libraries are specific to each language binding, eg .jar files for Java, .dll files for .NET, etc.
  • Driver: Responsible for controlling the actual browser. Most drivers are created by the browser vendors themselves. Drivers are generally executable modules that run on the system with the browser itself, not on the system executing the test suite. (Although those may be the same system.) NOTE: Some people refer to the drivers as proxies.
  • Framework: An additional library used as a support for WebDriver suites. These frameworks may be test frameworks such as JUnit or NUnit. They may also be frameworks supporting natural language features such as Cucumber or Robotium. Frameworks may also be written and used for things such as manipulating or configuring the system under test, data creation, test oracles, etc.

The Parts and Pieces

At its minimum, WebDriver talks to a browser through a driver. Communication is two way: WebDriver passes commands to the browser through the driver, and receives information back via the same route.

Basic Communication

The driver is specific to the browser, such as ChromeDriver for Google’s Chrome/Chromium, GeckoDriver for Mozilla’s Firefox, etc. The driver runs on the same system as the browser. This may, or may not be, the same system where the tests themselves are executing.

This simple example above is direct communication. Communication to the browser may also be remote communication through Selenium Server or RemoteWebDriver. RemoteWebDriver runs on the same system as the driver and the browser.

Remote Communication

Remote communication can also take place using Selenium Server or Selenium Grid, both of which in turn talk to the driver on the host system

Remote Communication with Grid

Where Frameworks fit in

WebDriver has one job and one job only: communicate with the browser via any of the methods above. WebDriver does not know a thing about testing: it does not know how to compare things, assert pass or fail, and it certainly does not know a thing about reporting or Given/When/Then grammar.

This is where various frameworks come in to play. At a minimum you will need a test framework that matches the language bindings, e.g. NUnit for .NET, JUnit for Java, RSpec for Ruby, etc.

The test framework is responsible for running and executing your WebDriver and related steps in your tests. As such, you can think of it looking akin to the following image.

Test Framework

Natural language frameworks/tools such as Cucumber may exist as part of that Test Framework box in the figure above, or they may wrap the Test Framework entirely in their own implementation.

1.2 - A deeper look at Selenium

Selenium is an umbrella project for a range of tools and libraries that enable and support the automation of web browsers.

Selenium controls web browsers

Selenium is many things but at its core, it is a toolset for web browser automation that uses the best techniques available to remotely control browser instances and emulate a user’s interaction with the browser.

It allows users to simulate common activities performed by end-users; entering text into fields, selecting drop-down values and checking boxes, and clicking links in documents. It also provides many other controls such as mouse movement, arbitrary JavaScript execution, and much more.

Although used primarily for front-end testing of websites, Selenium is at its core a browser user agent library. The interfaces are ubiquitous to their application, which encourages composition with other libraries to suit your purpose.

One interface to rule them all

One of the project’s guiding principles is to support a common interface for all (major) browser technologies. Web browsers are incredibly complex, highly engineered applications, performing their operations in completely different ways but which frequently look the same while doing so. Even though the text is rendered in the same fonts, the images are displayed in the same place and the links take you to the same destination. What is happening underneath is as different as night and day. Selenium “abstracts” these differences, hiding their details and intricacies from the person writing the code. This allows you to write several lines of code to perform a complicated workflow, but these same lines will execute on Firefox, Internet Explorer, Chrome, and all other supported browsers.

Tools and support

Selenium’s minimalist design approach gives it the versatility to be included as a component in bigger applications. The surrounding infrastructure provided under the Selenium umbrella gives you the tools to put together your grid of browsers so tests can be run on different browsers and multiple operating systems across a range of machines.

Imagine a bank of computers in your server room or data center all firing up browsers at the same time hitting your site’s links, forms, and tables—testing your application 24 hours a day. Through the simple programming interface provided for the most common languages, these tests will run tirelessly in parallel, reporting back to you when errors occur.

It is an aim to help make this a reality for you, by providing users with tools and documentation to not only control browsers but to make it easy to scale and deploy such grids.

Who uses Selenium

Many of the most important companies in the world have adopted Selenium for their browser-based testing, often replacing years-long efforts involving other proprietary tools. As it has grown in popularity, so have its requirements and challenges multiplied.

As the web becomes more complicated and new technologies are added to websites, it’s the mission of this project to keep up with them where possible. Being an open source project, this support is provided through the generous donation of time from many volunteers, every one of which has a “day job”.

Another mission of the project is to encourage more volunteers to partake in this effort, and build a strong community so that the project can continue to keep up with emerging technologies and remain a dominant platform for functional test automation.

2 - WebDriver

WebDriver drives a browser natively, learn more about it.

WebDriver drives a browser natively, as a user would, either locally or on a remote machine using the Selenium server, marks a leap forward in terms of browser automation.

Selenium WebDriver refers to both the language bindings and the implementations of the individual browser controlling code. This is commonly referred to as just WebDriver.

Selenium WebDriver is a W3C Recommendation

  • WebDriver is designed as a simple and more concise programming interface.

  • WebDriver is a compact object-oriented API.

  • It drives the browser effectively.

2.1 - Getting started

If you are new to Selenium, we have a few resources that can help you get up to speed right away.

Selenium supports automation of all the major browsers in the market through the use of WebDriver. WebDriver is an API and protocol that defines a language-neutral interface for controlling the behaviour of web browsers. Each browser is backed by a specific WebDriver implementation, called a driver. The driver is the component responsible for delegating down to the browser, and handles communication to and from Selenium and the browser.

This separation is part of a conscious effort to have browser vendors take responsibility for the implementation for their browsers. Selenium makes use of these third party drivers where possible, but also provides its own drivers maintained by the project for the cases when this is not a reality.

The Selenium framework ties all of these pieces together through a user-facing interface that enables the different browser backends to be used transparently, enabling cross-browser and cross-platform automation.

Selenium setup is quite different from the setup of other commercial tools. Before you can start writing Selenium code, you have to install the language bindings libraries for your language of choice, the browser you want to use, and the driver for that browser.

Follow the links below to get up and going with Selenium WebDriver.

If you wish to start with a low-code/record and playback tool, please check Selenium IDE

Once you get things working, if you want to scale up your tests, check out the Selenium Grid.

2.1.1 - Install a Selenium library

Setting up the Selenium library for your favourite programming language.

First you need to install the Selenium bindings for your automation project. The installation process for libraries depends on the language you choose to use. Make sure you check the Selenium downloads page to make sure you are using the latest version.

Requirements by language

Installation of Selenium libraries for Java is accomplished using a build tool. You can see all available versions on Maven Repository

For Maven, add the selenium-java dependency in your project pom.xml file:

<dependency>
  <groupId>org.seleniumhq.selenium</groupId>
  <artifactId>selenium-java</artifactId>
  <version>4.0.0</version>
</dependency>

For Gradle, add the selenium-java dependency in your project build.gradle file:

dependencies {
    compile group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '4.0.0'

Installation of Selenium libraries for Python can be done using pip:

pip install selenium

Alternatively you can download the PyPI source archive (selenium-x.x.x.tar.gz) and install it using setup.py:

python setup.py install

Installation of Selenium libraries for C# can be done using NuGet in one of two ways

  • Using a Packet Manager:
Install-Package Selenium.WebDriver
  • Using .NET CLI
dotnet add package Selenium.WebDriver

Supported .NET Versions

Make sure to use the .NET SDK version compatible with relevant Selenium package. Check the dependencies section to find out the supported .NET version. At the time of this update, .NET 5.0 (Visual Studio 2019) is known to be supported, and .NET 6.0 is not supported. You can download MSBuild Tools 2019 from here to install the needed components and dependencies such as .NET SDK and NuGet Package Manager.

Using Visual Studio Code (vscode) and C#

This is a quick guide to help you get started with vscode and C#, however, more research may be required. Install the compatible .NET SDK as per the section above. Also install the vscode extensions (Ctrl-Shift-X) for C# and NuGet. Follow the instruction here to create and run the “Hello World” console project using C#. You may also create a NUnit starter project using the command line dotnet new NUnit. Make sure the file %appdata%\NuGet\nuget.config is configured properly as some developers reported that it will be empty due to some issues. If nuget.config is empty, or not configured properly, then .NET builds will fail for Selenium Projects. Add the following section to the file nuget.config if it is empty:

<configuration>
  <packageSources>
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" protocolVersion="3" />
    <add key="nuget.org" value="https://www.nuget.org/api/v2/" />   
  </packageSources>
...

For more info about nuget.config click here. You may have to customize nuget.config to meet you needs.

Now, go back to vscode, press Ctrl-Shift-P, and type “NuGet Add Package”, and enter the required Selenium packages such as Selenium.WebDriver. Press Enter and select the version. Now you can use the examples in the documentation related to C# with vscode.

Installation of Selenium libraries for Ruby can be done using gem:

gem install selenium-webdriver

Or add it to your Gemfile:

gem 'selenium-webdriver', '~> 4.0'

Installation of Selenium libraries for JavaScript can be done using npm:

npm install selenium-webdriver
Use the Java bindings for Kotlin.

Next Step

Install the browser drivers

2.1.2 - Install browser drivers

Setting up your system to allow a browser to be automated.

Through WebDriver, Selenium supports all major browsers on the market such as Chrome/Chromium, Firefox, Internet Explorer, Edge, Opera, and Safari. Where possible, WebDriver drives the browser using the browser’s built-in support for automation.

Since all the driver implementations except for Internet Explorer are provided by the browser vendors themselves, they are not included in the standard Selenium distribution. This section explains the basic requirements for getting you started with the different browsers.

Read about more advanced options for starting a driver in our driver configuration documentation.

Quick Reference

BrowserSupported OSMaintained byDownloadIssue Tracker
Chromium/ChromeWindows/macOS/LinuxGoogleDownloadsIssues
FirefoxWindows/macOS/LinuxMozillaDownloadsIssues
EdgeWindows/macOSMicrosoftDownloadsIssues
Internet ExplorerWindowsSelenium ProjectDownloadsIssues
SafarimacOS High Sierra and newerAppleBuilt inIssues

Note: The Opera driver does not support w3c syntax, so we recommend using chromedriver to work with Opera. See the code example for opening an Opera browser.

Three Ways to Use Drivers

1. Driver Management Software

Most machines automatically update the browser, but the driver does not. To make sure you get the correct driver for your browser, there are many third party libraries to assist you.

  1. Import WebDriverManager
import io.github.bonigarcia.wdm.WebDriverManager;
  1. Calling setup() automatically puts the correct browser driver where the code will see it:
WebDriverManager.chromedriver().setup();
  1. Just initialize the driver as you normally would:
ChromeDriver driver = new ChromeDriver();
  1. Import WebDriver Manager for Python
from webdriver_manager.chrome import ChromeDriverManager
  1. Use install() to get the location used by the manager and pass it into service class
service = Service(executable_path=ChromeDriverManager().install())
  1. Use Service instance when initializing the driver:
driver = webdriver.Chrome(service=service)
  1. Import WebDriver Manager Package
using WebDriverManager;
using WebDriverManager.DriverConfigs.Impl;
  1. Use the SetUpDriver() which requires a config class:
new DriverManager().SetUpDriver(new ChromeConfig());
  1. Initialize your driver as you normally would:
var driver = new ChromeDriver()
  1. Add webdrivers gem to Gemfile:
gem 'webdrivers', '~> 5.0'
  1. Require webdrivers in your project:
require 'webdrivers'

3 Initialize driver as you normally would:

driver = Selenium::WebDriver.for :chrome
There is not a recommended driver manager for JavaScript at this time
  1. Import WebDriver Manager
import io.github.bonigarcia.wdm.WebDriverManager;
  1. Call the setup method before initializing the driver as you normally would:
fun chrome(): WebDriver {
    WebDriverManager.chromedriver().setup()
    return ChromeDriver()
}

2. The PATH Environment Variable

This option first requires manually downloading the driver (See Quick Reference Section for links).

This is a flexible option to change location of drivers without having to update your code, and will work on multiple machines without requiring that each machine put the drivers in the same place.

You can either place the drivers in a directory that is already listed in PATH, or you can place them in a directory and add it to PATH.

To see what directories are already on PATH, open a Terminal and execute:

echo $PATH

If the location to your driver is not already in a directory listed, you can add a new directory to PATH:

echo 'export PATH=$PATH:/path/to/driver' >> ~/.bash_profile
source ~/.bash_profile

You can test if it has been added correctly by starting the driver:

chromedriver

To see what directories are already on PATH, open a Terminal and execute:

echo $PATH

If the location to your driver is not already in a directory listed, you can add a new directory to PATH:

echo 'export PATH=$PATH:/path/to/driver' >> ~/.zshenv
source ~/.zshenv

You can test if it has been added correctly by starting the driver:

chromedriver

To see what directories are already on PATH, open a Command Prompt and execute:

echo %PATH%

If the location to your driver is not already in a directory listed, you can add a new directory to PATH:

setx PATH "%PATH%;C:\WebDriver\bin"

You can test if it has been added correctly by starting the driver:

chromedriver.exe

If your PATH is configured correctly above, you will see some output relating to the startup of the driver:

Starting ChromeDriver 95.0.4638.54 (d31a821ec901f68d0d34ccdbaea45b4c86ce543e-refs/branch-heads/4638@{#871}) on port 9515
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.

You can regain control of your command prompt by pressing Ctrl+C

3. Hard Coded Location

Similar to Option 2 above, you need to manually download the driver (See Quick Reference Section for links). Specifying the location in the code itself has the advantage of not needing to figure out Environment Variables on your system, but has the drawback of making the code much less flexible.

System.setProperty("webdriver.chrome.driver","/path/to/chromedriver");
ChromeDriver driver = new ChromeDriver();
from selenium.webdriver.chrome.service import Service
from selenium import webdriver

service = Service(executable_path="/path/to/chromedriver")
driver = webdriver.Chrome(service=service)
var driver = new ChromeDriver(@"C:\WebDriver\bin");
service = Selenium::WebDriver::Service.chrome(path: '/path/to/chromedriver')
driver = Selenium::WebDriver.for :chrome, service: service
const {Builder} = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');

const service = new chrome.ServiceBuilder('/path/to/chromedriver');
const driver = new Builder().forBrowser('chrome').setChromeService(service).build();
import org.openqa.selenium.chrome.ChromeDriver

fun main(args: Array<String>) {
    System.setProperty("webdriver.chrome.driver", "/path/to/chromedriver")
    val driver = ChromeDriver()
}

Advanced Configuration

More information on how you can change the driver behavior can be found on the Configuring driver parameters page.

Next Step

Open and close a browser

2.1.3 - Open and close a browser with Selenium

Code examples for starting and stopping a session with each browser.

Once you have a Selenium library installed, and your desired browser driver, you can start and stop a session with a browser.

Typically, browsers are started with specific options that describe which capabilities the browser must support, and how the browser should behave during the session. Some capabilities are shared by all browsers, and some will be specific to the browser being used. This page will show examples of starting a browser with the default capabilities.

After learning how to start a session, check out the next session on how to write your first Selenium script

Chrome

By default, Selenium 4 is compatible with Chrome v75 and greater. Note that the version of the Chrome browser and the version of chromedriver must match the major version.

In addition to the shared capabilities, there are specific Chrome capabilities that can be used.

  ChromeOptions options = new ChromeOptions();
  driver = new ChromeDriver(options);
  
  driver.quit();
  
  options = ChromeOptions()
  driver = webdriver.Chrome(options=options)

  driver.quit()
  
  var options = new ChromeOptions();
  var driver = new ChromeDriver(options);

  driver.Quit();
  
  options = Selenium::WebDriver::Options.chrome
  driver = Selenium::WebDriver.for :chrome, options: options

  driver.quit
  
const {Builder} = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');

(async function openChromeTest() {
  try {
    let options = new chrome.Options();
    let driver = await new Builder()
                .setChromeOptions(options)
                .forBrowser('chrome')
                .build();
    await driver.get('https://www.google.com');
    await driver.quit();
  } catch (error) {
    console.log(error)
  }
})();
  val options = ChromeOptions()
  val driver = ChromeDriver(options)

  driver.quit()
  

Edge

Microsoft Edge is implemented with Chromium, with the earliest supported version of v79. Similar to Chrome, the major version number of edgedriver must match the major version of the Edge browser.

  EdgeOptions options = new EdgeOptions();
  driver = new EdgeDriver(options);

  driver.quit();
  
  options = EdgeOptions()
  driver = webdriver.Edge(options=options)

  driver.quit()
  
  var options = new EdgeOptions();
  var driver = new EdgeDriver(options);
  
  driver.Quit();
  
  options = Selenium::WebDriver::Options.edge
  driver = Selenium::WebDriver.for :edge, options: options

  driver.quit
  
const {Builder} = require('selenium-webdriver');
const edge = require('selenium-webdriver/edge');

(async function openEdgeTest() {
  try {
    let options = new edge.Options();
    let driver = await new Builder()
                .setChromeOptions(options)
                .forBrowser('edge')
                .build();
    await driver.get('https://www.google.com');
    await driver.quit();
  } catch (error) {
    console.log(error)
  }
})();
  val options = EdgeOptions()
  val driver = EdgeDriver(options)

  driver.quit()
  

Firefox

Selenium 4 requires Firefox 78 or greater. It is recommended to always use the latest version of geckodriver.

  FirefoxOptions options = new FirefoxOptions();
  driver = new FirefoxDriver(options);

  driver.quit();
  
  options = FirefoxOptions()
  driver = webdriver.Firefox(options=options)

  driver.quit()
  
  var options = new FirefoxOptions();
  var driver = new FirefoxDriver(options);
  
  driver.Quit();
  
  options = Selenium::WebDriver::Options.firefox
  driver = Selenium::WebDriver.for :firefox, options: options

  driver.quit
  
const {Builder} = require('selenium-webdriver');
const firefox = require('selenium-webdriver/firefox');

(async function openFirefoxTest() {
  try {
    let options = new firefox.Options();
    let driver = await new Builder()
                .setChromeOptions(options)
                .forBrowser('firefox')
                .build();
    await driver.get('https://www.google.com');
    await driver.quit();
  } catch (error) {
    console.log(error)
  }
})();
  val options = FirefoxOptions()
  val driver = FirefoxDriver(options)

  driver.quit()
  

Internet Explorer

The IE Driver is the only driver maintained by the Selenium Project directly. While binaries for both the 32-bit and 64-bit versions of Internet Explorer are available, there are some limitations with the 64-bit driver. As such it is recommended to use the 32-bit driver.

Legacy

The Selenium project aims to support the same releases that Microsoft considers current. Older releases may work, but will not be supported. Note that Internet Explorer 11 will end support for certain operating systems, including Windows 10 on June 15, 2022.

It should be noted that as Internet Explorer preferences are saved against the logged-in user’s account, some additional setup is required.

Additional information about using Internet Explorer can be found on the Selenium wiki

  InternetExplorerOptions options = new InternetExplorerOptions();
  driver = new InternetExplorerDriver(options);

  driver.quit();
  
  options = IEOptions()
  driver = webdriver.Ie(options=options)

  driver.quit()
  
  var options = new InternetExplorerOptions();
  var driver = new InternetExplorerDriver(options);
  
  driver.Quit();
  
  options = Selenium::WebDriver::Options.ie
  driver = Selenium::WebDriver.for :ie, options: options

  driver.quit
  
  const { Builder } = require("selenium-webdriver");
  const ie = require('selenium-webdriver/ie');

  let options = new ie.Options();
  let driver = await new Builder()
    .forBrowser('internetExplorer')
    .setIeOptions(options)
    .build();

  await driver.quit();
 
  val options = InternetExplorerOptions()
  val driver = InternetExplorerDriver(options)

  driver.quit()
  

Compatibility Mode

Microsoft Edge can be used in IE compatibility mode using the IE Driver.

  InternetExplorerOptions options = new InternetExplorerOptions();
  options.attachToEdgeChrome();
  options.withEdgeExecutablePath("/path/to/edge/browser");
  
  driver = new InternetExplorerDriver(options);
  
  driver.quit();
  
  options = IEOptions()
  options.attach_to_edge_chrome = True
  options.edge_executable_path = "/path/to/edge/browser"
  driver = webdriver.Ie(options=options)

  driver.quit()
  
  var options = new InternetExplorerOptions
  {
    AttachToEdgeChrome = true,
    EdgeExecutablePath = "/path/to/edge/browser"
  };
  var driver = new InternetExplorerDriver(options);
  
  driver.Quit();
  
  options = Selenium::WebDriver::Options.ie
  options.attach_to_edge_chrome = true
  options.edge_executable_path = "/path/to/edge/browser"
  driver = Selenium::WebDriver.for :ie, options: options

  driver.quit
  
  let options = new ie.Options();
  options.setEdgeChromium(true);
  options.setEdgePath("/path/to/edge/browser);

  let driver = await new Builder()
    .forBrowser('internet explorer')
    .setIEOptions(options)
    .build();

  await driver.quit();
  
  val options = InternetExplorerOptions()
  options.attachToEdgeChrome()
  options.withEdgeExecutablePath("/path/to/edge/browser")
  val driver = InternetExplorerDriver(options)

  driver.quit()
  

Opera

Since the opera driver does not support w3c syntax, but is based on Chromium, it is recommended to drive Opera browser with the chromedriver. Like all Chromium implementations, make sure that the browser version matches the driver version.

  ChromeOptions options = new ChromeOptions();
  options.setBinary("/path/to/opera/browser");
  
  driver = new ChromeDriver(options);
  
  driver.quit();
  
  options = ChromeOptions()
  options.binary_location = "path/to/opera/browser"
  driver = webdriver.Chrome(options=options)

  driver.quit()
  
  var options = new ChromeOptions
  {
    BinaryLocation = "/path/to/opera/browser"
  };
  var driver = new ChromeDriver(options);
  
  driver.Quit();
  
  options = Selenium::WebDriver::Options.chrome
  options.binary = '/path/to/opera/browser'
  driver = Selenium::WebDriver.for :chrome, options: options

  driver.quit
  
  const { Builder } = require("selenium-webdriver");
  const chrome = require('selenium-webdriver/chrome');

  let options = new chrome.Options();
  options.setChromeBinaryPath("/path/to/opera/browser");

  let driver = await new Builder()
    .forBrowser('chrome')
    .setChromeOptions(options)
    .build();

  await driver.quit();
  
  val options = ChromeOptions()
  options.setBinary("/path/to/opera/browser")
  val driver = ChromeDriver(options)
  
  driver.quit()
  

Safari

Desktop

Unlike Chromium and Firefox drivers, the safaridriver is installed with the Operating System. To enable automation on Safari, run the following command from the terminal:

safaridriver --enable
  SafariOptions options = new SafariOptions();
  driver = new SafariDriver(options);
  
  driver.quit();
  driver = webdriver.Safari()

  driver.quit()
  
  var options = new SafariOptions();
  var driver = new SafariDriver(options);

  driver.Quit();
  
  options = Selenium::WebDriver::Options.safari
  driver = Selenium::WebDriver.for :safari, options: options

  driver.quit
  
  const { Builder } = require("selenium-webdriver");
  const safari = require('selenium-webdriver/safari');

  let options = new safari.Options();
  let driver = await new Builder()
    .forBrowser('safari')
    .setSafariOptions(options)
    .build();

  await driver.quit();
  
  val options = SafariOptions()
  val driver = SafariDriver(options)

  driver.quit()
  

Mobile

Those looking to automate Safari on iOS should look to the Appium project.

Next Step

Create your first Selenium script

2.1.4 - Write your first Selenium script

Step-by-step instructions for constructing a Selenium script

Once you have Selenium installed and Drivers installed, you’re ready to write Selenium code.

Eight Basic Components

Everything Selenium does is send the browser commands to do something or send requests for information. Most of what you’ll do with Selenium is a combination of these basic commands:

1. Start the session

For more details on starting a session read our documentation on opening and closing a browser

        driver = new ChromeDriver();
    driver = webdriver.Chrome()
            var driver = new ChromeDriver();
    driver = Selenium::WebDriver.for :chrome
    let driver = await new Builder().forBrowser('chrome').build();
        driver = ChromeDriver()

2. Take action on browser

In this example we are navigating to a web page.

        driver.get("https://google.com");
    driver.get("https://google.com")
            driver.Navigate().GoToUrl("https://google.com");
    driver.get('https://google.com')
    await driver.get('https://www.google.com');
        driver.get("https://google.com")

3. Request browser information

There are a bunch of types of information about the browser you can request, including window handles, browser size / position, cookies, alerts, etc.

        String title = driver.getTitle();
    title = driver.title
            var title = driver.Title;
    title = driver.title
    await driver.getTitle();
        title = driver.title

4. Establish Waiting Strategy

Synchronizing the code with the current state of the browser is one of the biggest challenges with Selenium, and doing it well is an advanced topic.

Essentially you want to make sure that the element is on the page before you attempt to locate it and the element is in an interactable state before you attempt to interact with it.

An implicit wait is rarely the best solution, but it’s the easiest to demonstrate here, so we’ll use it as a placeholder.

Read more about Waiting strategies.

        driver.manage().timeouts().implicitlyWait(Duration.ofMillis(500));
    driver.implicitly_wait(0.5)
            driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromMilliseconds(500);
    driver.manage.timeouts.implicit_wait = 500
    await driver.manage().setTimeouts({implicit: 1000})
        driver.manage().timeouts().implicitlyWait(Duration.ofMillis(500))

5. Find an element

The majority of commands in most Selenium sessions are element related, and you can’t interact with one without first finding an element

        WebElement searchBox = driver.findElement(By.name("q"));
        WebElement searchButton = driver.findElement(By.name("btnK"));
    search_box = driver.find_element(by=By.NAME, value="q")
    search_button = driver.find_element(by=By.NAME, value="btnK")
            var searchBox = driver.FindElement(By.Name("q"));
            var searchButton = driver.FindElement(By.Name("btnK"));
    search_box = driver.find_element(name: 'q')
    search_button = driver.find_element(name: 'btnK')
    let searchBox = await driver.findElement(By.name('q'));
    let searchButton = await driver.findElement(By.name('btnK'));
        var searchBox = driver.findElement(By.name("q"))
        val searchButton = driver.findElement(By.name("btnK"))

6. Take action on element

There are only a handful of actions to take on an element, but you will use them frequently.

        searchBox.sendKeys("Selenium");
        searchButton.click();
    search_box.send_keys("Selenium")
    search_button.click()
            searchBox.SendKeys("Selenium");
            searchButton.Click();
    search_box.send_keys('Selenium')
    search_button.click
    await searchBox.sendKeys('Selenium');
    await searchButton.click();
        searchBox.sendKeys("Selenium")
        searchButton.click()

7. Request element information

Elements store a lot of information that can be requested. Notice that we need to relocate the search box because the DOM has changed since we first located it.

        String value = searchBox.getAttribute("value");
    value = search_box.get_attribute("value")
            var value = searchBox.GetAttribute("value");
    value = search_box.attribute('value')
    let value = await searchBox.getAttribute("value");
        val value = searchBox.getAttribute("value")

8. End the session

This ends the driver process, which by default closes the browser as well. No more commands can be sent to this driver instance.

        driver.quit();
    driver.quit()
            driver.Quit();
    await driver.quit();
        driver.quit()

Putting everything together

Let’s combine these 8 things into a complete script with assertions that can be executed by a test runner.

package dev.selenium.getting_started;

import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

import java.time.Duration;

public class FirstScriptTest {
    public WebDriver driver;

    @Test
    public void eightComponents() {
        driver = new ChromeDriver();

        driver.get("https://google.com");

        String title = driver.getTitle();
        Assertions.assertEquals("Google", title);

        driver.manage().timeouts().implicitlyWait(Duration.ofMillis(500));

        WebElement searchBox = driver.findElement(By.name("q"));
        WebElement searchButton = driver.findElement(By.name("btnK"));

        searchBox.sendKeys("Selenium");
        searchButton.click();

        searchBox = driver.findElement(By.name("q"));
        String value = searchBox.getAttribute("value");
        Assertions.assertEquals("Selenium", value);

        driver.quit();
    }
}
from selenium import webdriver
from selenium.webdriver.common.by import By


def test_eight_components():
    driver = webdriver.Chrome()

    driver.get("https://google.com")

    title = driver.title
    assert title == "Google"

    driver.implicitly_wait(0.5)

    search_box = driver.find_element(by=By.NAME, value="q")
    search_button = driver.find_element(by=By.NAME, value="btnK")

    search_box.send_keys("Selenium")
    search_button.click()

    search_box = driver.find_element(by=By.NAME, value="q")
    value = search_box.get_attribute("value")
    assert value == "Selenium"

    driver.quit()
using System;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace SeleniumDocs.GettingStarted
{
    [TestClass]
    public class FirstScriptTest
    {

        [TestMethod]
        public void ChromeSession()
        {
            var driver = new ChromeDriver();

            driver.Navigate().GoToUrl("https://google.com");

            var title = driver.Title;
            Assert.AreEqual("Google", title);

            driver.Manage().Timeouts().ImplicitWait = TimeSpan.FromMilliseconds(500);

            var searchBox = driver.FindElement(By.Name("q"));
            var searchButton = driver.FindElement(By.Name("btnK"));
            
            searchBox.SendKeys("Selenium");
            searchButton.Click();
            
            searchBox = driver.FindElement(By.Name("q"));
            var value = searchBox.GetAttribute("value");
            Assert.AreEqual("Selenium", value);

            driver.Quit();
        }
    }
}
# frozen_string_literal: true

RSpec.describe 'First Script' do
  it 'uses eight components' do
    driver = Selenium::WebDriver.for :chrome

    driver.get('https://google.com')

    title = driver.title
    expect(title).to eq('Google')

    driver.manage.timeouts.implicit_wait = 500

    search_box = driver.find_element(name: 'q')
    search_button = driver.find_element(name: 'btnK')

    search_box.send_keys('Selenium')
    search_button.click

    search_box = driver.find_element(name: 'q')
    value = search_box.attribute('value')
    expect(value).to eq('Selenium')

    driver.quit
  end
end
const {Builder, By} = require('selenium-webdriver');
const assert = require('assert');

(async function firstScript() {
  try {
    let driver = await new Builder().forBrowser('chrome').build();

    await driver.get('https://www.google.com');

    await driver.getTitle();

    await driver.manage().setTimeouts({implicit: 1000})

    let searchBox = await driver.findElement(By.name('q'));
    let searchButton = await driver.findElement(By.name('btnK'));

    await searchBox.sendKeys('Selenium');
    await searchButton.click();

    let value = await searchBox.getAttribute("value");
    assert.deepStrictEqual(value, "Selenium")

    await driver.quit();
  } catch (error) {
    console.log(error)
  }
})();
package dev.selenium.getting_started

import io.github.bonigarcia.wdm.WebDriverManager
import org.junit.jupiter.api.*
import org.junit.jupiter.api.Assertions.assertEquals
import org.openqa.selenium.By
import org.openqa.selenium.WebDriver
import org.openqa.selenium.chrome.ChromeDriver
import java.time.Duration

@TestInstance(TestInstance.Lifecycle.PER_CLASS)
class FirstScriptTest {
    private lateinit var driver: WebDriver

    @Test
    fun eightComponents() {
        driver = ChromeDriver()

        driver.get("https://google.com")

        title = driver.title
        assertEquals("Google", title)

        driver.manage().timeouts().implicitlyWait(Duration.ofMillis(500))

        var searchBox = driver.findElement(By.name("q"))
        val searchButton = driver.findElement(By.name("btnK"))

        searchBox.sendKeys("Selenium")
        searchButton.click()

        searchBox = driver.findElement(By.name("q"))
        val value = searchBox.getAttribute("value")
        assertEquals("Selenium", value)

        driver.quit()
    }
}

Next Steps

Take what you’ve learned and build out your Selenium code.

As you find more functionality that you need, read up on the rest of our WebDriver documentation.

2.1.5 - Upgrade to Selenium 4

Are you still using Selenium 3? This guide will help you upgrade to the latest release!

Upgrading to Selenium 4 should be a painless process if you are using one of the officially supported languages (Ruby, JavaScript, C#, Python, and Java). There might be some cases where a few issues can happen, and this guide will help you to sort them out. We will go through the steps to upgrade your project dependencies and understand the major deprecations and changes the version upgrade brings.

These are the steps we will follow to upgrade to Selenium 4:

  • Preparing our test code
  • Upgrading dependencies
  • Potential errors and deprecation messages

Note: while Selenium 3.x versions were being developed, support for the W3C WebDriver standard was implemented. Both this new protocol and the legacy JSON Wire Protocol were supported. Around version 3.11, Selenium code became compliant with the level W3C 1 specification. The W3C compliant code in the latest version of Selenium 3 will work as expected in Selenium 4.

Preparing our test code

Selenium 4 removes support for the legacy protocol and uses the W3C WebDriver standard by default under the hood. For most things, this implementation will not affect end users. The major exceptions are Capabilities and the Actions class.

Capabilities

If the test capabilities are not structured to be W3C compliant, may cause a session to not be started. Here is the list of W3C WebDriver standard capabilities:

  • browserName
  • browserVersion (replaces version)
  • platformName (replaces platform)
  • acceptInsecureCerts
  • pageLoadStrategy
  • proxy
  • timeouts
  • unhandledPromptBehavior

An up-to-date list of standard capabilities can be found at W3C WebDriver.

Any capability that is not contained in the list above, needs to include a vendor prefix. This applies to browser specific capabilities as well as cloud vendor specific capabilities. For example, if your cloud vendor uses build and name capabilities for your tests, you need to wrap them in a cloud:options block (check with your cloud vendor for the appropriate prefix).

Before

DesiredCapabilities caps = DesiredCapabilities.firefox();
caps.setCapability("platform", "Windows 10");
caps.setCapability("version", "92");
caps.setCapability("build", myTestBuild);
caps.setCapability("name", myTestName);
WebDriver driver = new RemoteWebDriver(new URL(cloudUrl), caps);
caps = {};
caps['browserName'] = 'Firefox';
caps['platform'] = 'Windows 10';
caps['version'] = '92';
caps['build'] = myTestBuild;
caps['name'] = myTestName;
DesiredCapabilities caps = new DesiredCapabilities();
caps.SetCapability("browserName", "firefox");
caps.SetCapability("platform", "Windows 10");
caps.SetCapability("version", "92");
caps.SetCapability("build", myTestBuild);
caps.SetCapability("name", myTestName);
var driver = new RemoteWebDriver(new Uri(CloudURL), caps);
caps = Selenium::WebDriver::Remote::Capabilities.firefox
caps[:platform] = 'Windows 10'
caps[:version] = '92'
caps[:build] = my_test_build
caps[:name] = my_test_name
driver = Selenium::WebDriver.for :remote, url: cloud_url, desired_capabilities: caps
caps = {}
caps['browserName'] = 'firefox'
caps['platform'] = 'Windows 10'
caps['version'] = '92'
caps['build'] = my_test_build
caps['name'] = my_test_name
driver = webdriver.Remote(cloud_url, desired_capabilities=caps)

After

FirefoxOptions browserOptions = new FirefoxOptions();
browserOptions.setPlatformName("Windows 10");
browserOptions.setBrowserVersion("92");
Map<String, Object> cloudOptions = new HashMap<>();
cloudOptions.put("build", myTestBuild);
cloudOptions.put("name", myTestName);
browserOptions.setCapability("cloud:options", cloudOptions);
WebDriver driver = new RemoteWebDriver(new URL(cloudUrl), browserOptions);
capabilities = {
  browserName: 'firefox',
  browserVersion: '92',
  platformName: 'Windows 10',
  'cloud:options': {
     build: myTestBuild,
     name: myTestName,
  }
}
var browserOptions = new FirefoxOptions();
browserOptions.PlatformName = "Windows 10";
browserOptions.BrowserVersion = "92";
var cloudOptions = new Dictionary<string, object>();
cloudOptions.Add("build", myTestBuild);
cloudOptions.Add("name", myTestName);
browserOptions.AddAdditionalOption("cloud:options", cloudOptions);
var driver = new RemoteWebDriver(new Uri(CloudURL), browserOptions);
options = Selenium::WebDriver::Options.firefox
options.browser_version = 'latest'
options.platform_name = 'Windows 10'
cloud_options = {}
cloud_options[:build] = my_test_build
cloud_options[:name] = my_test_name
options.add_option('cloud:options', cloud_options)
driver = Selenium::WebDriver.for :remote, url: cloud_url, capabilities: options
from selenium.webdriver.firefox.options import Options as FirefoxOptions
options = FirefoxOptions()
options.browser_version = '92'
options.platform_name = 'Windows 10'
cloud_options = {}
cloud_options['build'] = my_test_build
cloud_options['name'] = my_test_name
options.set_capability('cloud:options', cloud_options)
driver = webdriver.Remote(cloud_url, options=options)

Find element(s) utility methods in Java

The utility methods to find elements in the Java bindings (FindsBy interfaces) have been removed as they were meant for internal use only. The following code samples explain this better.

Finding a single element with findElement*

Before

driver.findElementByClassName("className");
driver.findElementByCssSelector(".className");
driver.findElementById("elementId");
driver.findElementByLinkText("linkText");
driver.findElementByName("elementName");
driver.findElementByPartialLinkText("partialText");
driver.findElementByTagName("elementTagName");
driver.findElementByXPath("xPath");

After

driver.findElement(By.className("className"));
driver.findElement(By.cssSelector(".className"));
driver.findElement(By.id("elementId"));
driver.findElement(By.linkText("linkText"));
driver.findElement(By.name("elementName"));
driver.findElement(By.partialLinkText("partialText"));
driver.findElement(By.tagName("elementTagName"));
driver.findElement(By.xpath("xPath"));

Finding a multiple elements with findElements*

Before

driver.findElementsByClassName("className");
driver.findElementsByCssSelector(".className");
driver.findElementsById("elementId");
driver.findElementsByLinkText("linkText");
driver.findElementsByName("elementName");
driver.findElementsByPartialLinkText("partialText");
driver.findElementsByTagName("elementTagName");
driver.findElementsByXPath("xPath");

After

driver.findElements(By.className("className"));
driver.findElements(By.cssSelector(".className"));
driver.findElements(By.id("elementId"));
driver.findElements(By.linkText("linkText"));
driver.findElements(By.name("elementName"));
driver.findElements(By.partialLinkText("partialText"));
driver.findElements(By.tagName("elementTagName"));
driver.findElements(By.xpath("xPath"));

Upgrading dependencies

Check the subsections below to install Selenium 4 and have your project dependencies upgraded.

Java

The process of upgrading Selenium depends on which build tool is being used. We will cover the most common ones for Java, which are Maven and Gradle. The minimum Java version required is still 8.

Maven

Before

<dependencies>
  <!-- more dependencies ... -->
  <dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>3.141.59</version>
  </dependency>
  <!-- more dependencies ... -->
</dependencies>

After

<dependencies>
    <!-- more dependencies ... -->
    <dependency>
        <groupId>org.seleniumhq.selenium</groupId>
        <artifactId>selenium-java</artifactId>
        <version>4.0.0</version>
    </dependency>
    <!-- more dependencies ... -->
</dependencies>

After making the change, you could execute mvn clean compile on the same directory where the pom.xml file is.

Gradle

Before

plugins {
    id 'java'
}
group 'org.example'
version '1.0-SNAPSHOT'
repositories {
    mavenCentral()
}
dependencies {
    testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
    testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
    implementation group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '3.141.59'
}
test {
    useJUnitPlatform()
}

After

plugins {
    id 'java'
}
group 'org.example'
version '1.0-SNAPSHOT'
repositories {
    mavenCentral()
}
dependencies {
    testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
    testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
    implementation group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '4.0.0'
}
test {
    useJUnitPlatform()
}

After making the change, you could execute ./gradlew clean build on the same directory where the build.gradle file is.

To check all the Java releases, you can head to MVNRepository.

C#

The place to get updates for Selenium 4 in C# is NuGet. Under the Selenium.WebDriver package you can get the instructions to update to the latest version. Inside of Visual Studio, through the NuGet Package Manager you can execute:

PM> Install-Package Selenium.WebDriver -Version 4.0.0

Python

The most important change to use Python is the minimum required version. Selenium 4 will require a minimum Python 3.7 or higher. More details can be found at the Python Package Index. To upgrade from the command line, you can execute:

pip install selenium==4.0.0

Ruby

The update details for Selenium 4 can be seen at the selenium-webdriver gem in RubyGems. To install the latest version, you can execute:

gem install selenium-webdriver

To add it to your Gemfile:

gem 'selenium-webdriver', '~> 4.0.0'

JavaScript

The selenium-webdriver package can be found at the Node package manager, npmjs. Selenium 4 can be found here. To install it, you could either execute:

npm install selenium-webdriver

Or, update your package.json and run npm install:

{
  "name": "selenium-tests",
  "version": "1.0.0",
  "dependencies": {
    "selenium-webdriver": "^4.0.0"
  }
}

Potential errors and deprecation messages

Here is a set of code examples that will help to overcome the deprecation messages you might encounter after upgrading to Selenium 4.

Java

Waits and Timeout

The parameters received in Timeout have switched from expecting (long time, TimeUnit unit) to expect (Duration duration).

Before

driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
driver.manage().timeouts().setScriptTimeout(2, TimeUnit.MINUTES);
driver.manage().timeouts().pageLoadTimeout(10, TimeUnit.SECONDS);

After

driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));
driver.manage().timeouts().scriptTimeout(Duration.ofMinutes(2));
driver.manage().timeouts().pageLoadTimeout(Duration.ofSeconds(10));

Waits are also expecting different parameters now. WebDriverWait is now expecting a Duration instead of a long for timeout in seconds and milliseconds. The withTimeout and pollingEvery utility methods from FluentWait have switched from expecting (long time, TimeUnit unit) to expect (Duration duration).

Before

new WebDriverWait(driver, 3)
.until(ExpectedConditions.elementToBeClickable(By.cssSelector("#id")));

Wait<WebDriver> wait = new FluentWait<WebDriver>(driver)
  .withTimeout(30, TimeUnit.SECONDS)
  .pollingEvery(5, TimeUnit.SECONDS)
  .ignoring(NoSuchElementException.class);

After

new WebDriverWait(driver, Duration.ofSeconds(3))
  .until(ExpectedConditions.elementToBeClickable(By.cssSelector("#id")));

  Wait<WebDriver> wait = new FluentWait<WebDriver>(driver)
  .withTimeout(Duration.ofSeconds(30))
  .pollingEvery(Duration.ofSeconds(5))
  .ignoring(NoSuchElementException.class);

Merging capabilities is no longer changing the calling object

It was possible to merge a different set of capabilities into another set, and it was mutating the calling object. Now, the result of the merge operation needs to be assigned.

Before

MutableCapabilities capabilities = new MutableCapabilities();
capabilities.setCapability("platformVersion", "Windows 10");
FirefoxOptions options = new FirefoxOptions();
options.setHeadless(true);
options.merge(capabilities);

As a result, the options object was getting modified.

After

MutableCapabilities capabilities = new MutableCapabilities();
capabilities.setCapability("platformVersion", "Windows 10");
FirefoxOptions options = new FirefoxOptions();
options.setHeadless(true);
options = options.merge(capabilities);

The result of the merge call needs to be assigned to an object.

Firefox Legacy

Before GeckoDriver was around, the Selenium project had a driver implementation to automate Firefox (version <48). However, this implementation is not needed anymore as it does not work in recent versions of Firefox. To avoid major issues when upgrading to Selenium 4, the setLegacy option will be shown as deprecated. The recommendation is to stop using the old implementation and rely only on GeckoDriver. The following code will show the setLegacy line deprecated after upgrading.

FirefoxOptions options = new FirefoxOptions();
options.setLegacy(true);

BrowserType

The BrowserType interface has been around for a long time, however it is getting deprecated in favour of the new Browser interface.

Before

MutableCapabilities capabilities = new MutableCapabilities();
capabilities.setCapability("browserVersion", "92");
capabilities.setCapability("browserName", BrowserType.FIREFOX);

After

MutableCapabilities capabilities = new MutableCapabilities();
capabilities.setCapability("browserVersion", "92");
capabilities.setCapability("browserName", Browser.FIREFOX);

C#

AddAdditionalCapability is deprecated

Instead of it, AddAdditionalOption is recommended. Here is an example showing this:

Before

var browserOptions = new ChromeOptions();
browserOptions.PlatformName = "Windows 10";
browserOptions.BrowserVersion = "latest";
var cloudOptions = new Dictionary<string, object>();
browserOptions.AddAdditionalCapability("cloud:options", cloudOptions, true);

After

var browserOptions = new ChromeOptions();
browserOptions.PlatformName = "Windows 10";
browserOptions.BrowserVersion = "latest";
var cloudOptions = new Dictionary<string, object>();
browserOptions.AddAdditionalOption("cloud:options", cloudOptions);

Python

executable_path has been deprecated, please pass in a Service object

In Selenium 4, you’ll need to set the driver’s executable_path from a Service object to prevent deprecation warnings. (Or don’t set the path and instead make sure that the driver you need is on the System PATH.)

Before

from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
driver = webdriver.Chrome(executable_path=CHROMEDRIVER_PATH, options=options)

After

from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
options = webdriver.ChromeOptions()
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
service = ChromeService(executable_path=CHROMEDRIVER_PATH)
driver = webdriver.Chrome(service=service, options=options)

Summary

We went through the major changes to be taken into consideration when upgrading to Selenium 4. Covering the different aspects to cover when test code is prepared for the upgrade, including suggestions on how to prevent potential issues that can show up when using the new version of Selenium. To finalize, we also covered a set of possible issues that you can bump into after upgrading, and we shared potential fixes for those issues.

This was originally posted at https://saucelabs.com/resources/articles/how-to-upgrade-to-selenium-4

2.2 - WebDriver Capabilities

2.2.1 - Shared capabilities

These capabilities are shared by all browsers.

In order to create a new session by Selenium WebDriver, the local end should provide the basic capabilities to the remote end. The remote end uses the same set of capabilities to create a session and describes the current session features.

WebDriver provides capabilities that each remote end will/should support the implementation. The following capabilities are supported by WebDriver:

browserName:

This capability is used to set the browserName for a given session. If the specified browser is not installed at the remote end, the session creation will fail.

browserVersion:

This capability is optional, this is used to set the available browser version at remote end. For Example, if ask for Chrome version 75 on a system that only has 80 installed, the session creation will fail.

pageLoadStrategy:

When navigating to a new page via URL, by default Selenium will wait until the Document’s Ready State is “complete.” The document.readyState property of a document describes the loading state of the current document. By default, WebDriver will hold off on completing a navigation method (e.g., driver.navigate().get()) until the document ready state is complete. This does not necessarily mean that the page has finished loading, especially for sites like Single Page Applications that use a lot of JavaScript to dynamically load content after the Ready State returns complete. Note also that this behavior does not apply to navigation that is a result of clicking an element or submitting a form.

If a page takes a long time to load as a result of downloading assets (e.g., images, css, js) that aren’t important to the automation, you can change from the default parameter of normal to eager or none to speed up the session. This value applies to the entire session, so make sure that your waiting strategy is sufficient to minimize flakiness.

The page load strategy queries the document.readyState as described in the table below:

StrategyReady StateNotes
normalcompleteUsed by default, waits for all resources to download
eagerinteractiveDOM access is ready, but other resources like images may still be loading
noneAnyDoes not block WebDriver at all

normal

This will make Selenium WebDriver to wait for the entire page is loaded. When set to normal, Selenium WebDriver waits until the load event fire is returned.

By default normal is set to browser if none is provided.

import org.openqa.selenium.PageLoadStrategy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.chrome.ChromeDriver;

public class pageLoadStrategy {
public static void main(String[] args) {
ChromeOptions chromeOptions = new ChromeOptions();
chromeOptions.setPageLoadStrategy(PageLoadStrategy.NORMAL);
WebDriver driver = new ChromeDriver(chromeOptions);
try {
// Navigate to Url
driver.get("https://google.com");
} finally {
driver.quit();
}
}
}
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.page_load_strategy = 'normal'
driver = webdriver.Chrome(options=options)
driver.get("http://www.google.com")
driver.quit()
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace pageLoadStrategy {
class pageLoadStrategy {
public static void Main(string[] args) {
var chromeOptions = new ChromeOptions();
chromeOptions.PageLoadStrategy = PageLoadStrategy.Normal;
IWebDriver driver = new ChromeDriver(chromeOptions);
try {
driver.Navigate().GoToUrl("https://example.com");
} finally {
driver.Quit();
}
}
}
}
require 'selenium-webdriver'
caps = Selenium::WebDriver::Remote::Capabilities.chrome
caps.page_load_strategy='normal'

driver = Selenium::WebDriver.for :chrome, :desired_capabilities => caps
driver.get('https://www.google.com')
const {Builder, Capabilities} = require('selenium-webdriver');
const caps = new Capabilities();

// Set page load strategy to Normal
caps.setPageLoadStrategy("normal");

(async function normalPageLoad() {
    let driver = await new Builder().
    withCapabilities(caps).
    forBrowser('chrome').
    build();
    try {
        // Navigate to Url
        await driver.get('https://www.google.com');
    }
    finally {
        await driver.quit();
    }
})();
import org.openqa.selenium.PageLoadStrategy
import org.openqa.selenium.chrome.ChromeDriver
import org.openqa.selenium.chrome.ChromeOptions

fun main() {
val chromeOptions = ChromeOptions()
chromeOptions.setPageLoadStrategy(PageLoadStrategy.NORMAL)
val driver = ChromeDriver(chromeOptions)
try {
driver.get("https://www.google.com")
}
finally {
driver.quit()
}
}

eager

This will make Selenium WebDriver to wait until the initial HTML document has been completely loaded and parsed, and discards loading of stylesheets, images and subframes.

When set to eager, Selenium WebDriver waits until DOMContentLoaded event fire is returned.

import org.openqa.selenium.PageLoadStrategy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.chrome.ChromeDriver;

public class pageLoadStrategy {
public static void main(String[] args) {
ChromeOptions chromeOptions = new ChromeOptions();
chromeOptions.setPageLoadStrategy(PageLoadStrategy.EAGER);
WebDriver driver = new ChromeDriver(chromeOptions);
try {
// Navigate to Url
driver.get("https://google.com");
} finally {
driver.quit();
}
}
}
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.page_load_strategy = 'eager'
driver = webdriver.Chrome(options=options)
driver.get("http://www.google.com")
driver.quit()
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace pageLoadStrategy {
class pageLoadStrategy {
public static void Main(string[] args) {
var chromeOptions = new ChromeOptions();
chromeOptions.PageLoadStrategy = PageLoadStrategy.Eager;
IWebDriver driver = new ChromeDriver(chromeOptions);
try {
driver.Navigate().GoToUrl("https://example.com");
} finally {
driver.Quit();
}
}
}
}
require 'selenium-webdriver'
caps = Selenium::WebDriver::Remote::Capabilities.chrome
caps.page_load_strategy='eager'

driver = Selenium::WebDriver.for :chrome, :desired_capabilities => caps
driver.get('https://www.google.com')
const {Builder, Capabilities} = require('selenium-webdriver');
const caps = new Capabilities();

// Set page load strategy to eager
caps.setPageLoadStrategy("eager");
(async function example() {
    let driver = await new Builder().
    withCapabilities(caps).
    forBrowser('chrome').
    build();
    try {
        // Navigate to Url
        await driver.get('https://www.google.com');
    } finally {
        await driver.quit();
    }
})();
import org.openqa.selenium.PageLoadStrategy
import org.openqa.selenium.chrome.ChromeDriver
import org.openqa.selenium.chrome.ChromeOptions

fun main() {
val chromeOptions = ChromeOptions()
chromeOptions.setPageLoadStrategy(PageLoadStrategy.EAGER)
val driver = ChromeDriver(chromeOptions)
try {
driver.get("https://www.google.com")
}
finally {
driver.quit()
}
}

none

When set to none Selenium WebDriver only waits until the initial page is downloaded.

import org.openqa.selenium.PageLoadStrategy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.chrome.ChromeDriver;

public class pageLoadStrategy {
public static void main(String[] args) {
ChromeOptions chromeOptions = new ChromeOptions();
chromeOptions.setPageLoadStrategy(PageLoadStrategy.NONE);
WebDriver driver = new ChromeDriver(chromeOptions);
try {
// Navigate to Url
driver.get("https://google.com");
} finally {
driver.quit();
}
}
}
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.page_load_strategy = 'none'
driver = webdriver.Chrome(options=options)
driver.get("http://www.google.com")
driver.quit()
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace pageLoadStrategy {
class pageLoadStrategy {
public static void Main(string[] args) {
var chromeOptions = new ChromeOptions();
chromeOptions.PageLoadStrategy = PageLoadStrategy.None;
IWebDriver driver = new ChromeDriver(chromeOptions);
try {
driver.Navigate().GoToUrl("https://example.com");
} finally {
driver.Quit();
}
}
}
}
require 'selenium-webdriver'
caps = Selenium::WebDriver::Remote::Capabilities.chrome
caps.page_load_strategy='none'

driver = Selenium::WebDriver.for :chrome, :desired_capabilities => caps
driver.get('https://www.google.com')
const {Builder, Capabilities} = require('selenium-webdriver');
const caps = new Capabilities();

// Set page load strategy to None
caps.setPageLoadStrategy("none");
(async function normalPageLoad() {
    let driver = await new Builder().
    withCapabilities(caps).
    forBrowser('chrome').
    build();
    try {
        // Navigate to Url
        await driver.get('https://www.google.com');
    }
    finally {
        await driver.quit();
    }
})();
import org.openqa.selenium.PageLoadStrategy
import org.openqa.selenium.chrome.ChromeDriver
import org.openqa.selenium.chrome.ChromeOptions

fun main() {
val chromeOptions = ChromeOptions()
chromeOptions.setPageLoadStrategy(PageLoadStrategy.NONE)
val driver = ChromeDriver(chromeOptions)
try {
driver.get("https://www.google.com")
}
finally {
driver.quit()
}
}

platformName

This identifies the operating system at the remote-end, fetching the platformName returns the OS name.

In cloud-based providers, setting platformName sets the OS at the remote-end.

acceptInsecureCerts

This capability checks whether an expired (or) invalid TLS Certificate is used while navigating during a session.

If the capability is set to false, an insecure certificate error will be returned as navigation encounters any domain certificate problems. If set to true, invalid certificate will be trusted by the browser.

All self-signed certificates will be trusted by this capability by default. Once set, acceptInsecureCerts capability will have an effect for the entire session.

timeouts

A WebDriver session is imposed with a certain session timeout interval, during which the user can control the behaviour of executing scripts or retrieving information from the browser.

Each session timeout is configured with combination of different timeouts as described below:

Script Timeout:

Specifies when to interrupt an executing script in a current browsing context. The default timeout 30,000 is imposed when a new session is created by WebDriver.

Page Load Timeout:

Specifies the time interval in which web page needs to be loaded in a current browsing context. The default timeout 300,000 is imposed when a new session is created by WebDriver. If page load limits a given/default time frame, the script will be stopped by TimeoutException.

Implicit Wait Timeout

This specifies the time to wait for the implicit element location strategy when locating elements. The default timeout 0 is imposed when a new session is created by WebDriver.

unhandledPromptBehavior

Specifies the state of current session’s user prompt handler. Defaults to dismiss and notify state

User Prompt Handler

This defines what action must take when a user prompt encounters at the remote-end. This is defined by unhandledPromptBehavior capability and has the following states:

  • dismiss
  • accept
  • dismiss and notify
  • accept and notify
  • ignore

setWindowRect

This command alters the size and position of the current browsing context window. This command acts as setter to getWindowRect command which accepts width, height, x, y as optional arguments.

During automation, the current browsing context will be associated with window states, which describe the visibility of the browser window. The window states are

  • maximized
  • minimized
  • normal
  • fullscreen

Setting Width or Height does not guaranteed that the resulting window size will exactly match that which was requested. This is because some drivers may not be able to resize in single-pixel increments. Due to this, fetching the window state/details by getWindowRect may not match the values set in the browser.

strictFileInteractability

This new capability indicates if strict interactability checks should be applied to input type=file elements. As strict interactability checks are off by default, there is a change in behaviour when using Element Send Keys with hidden file upload controls.

proxy

A proxy server acts as an intermediary for requests between a client and a server. In simple, the traffic flows through the proxy server on its way to the address you requested and back.

A proxy server for automation scripts with Selenium could be helpful for:

  • Capture network traffic
  • Mock backend calls made by the website
  • Access the required website under complex network topologies or strict corporate restrictions/policies.

If you are in a corporate environment, and a browser fails to connect to a URL, this is most likely because the environment needs a proxy to be accessed.

Selenium WebDriver provides a way to proxy settings:

import org.openqa.selenium.Proxy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

public class proxyTest {
public static void main(String[] args) {
Proxy proxy = new Proxy();
proxy.setHttpProxy("<HOST:PORT>");
ChromeOptions options = new ChromeOptions();
options.setCapability("proxy", proxy);
WebDriver driver = new ChromeDriver(options);
driver.get("https://www.google.com/");
driver.manage().window().maximize();
driver.quit();
}
}
from selenium import webdriver

PROXY = "<HOST:PORT>"
webdriver.DesiredCapabilities.FIREFOX['proxy'] = {
"httpProxy": PROXY,
"ftpProxy": PROXY,
"sslProxy": PROXY,
"proxyType": "MANUAL",

}

with webdriver.Firefox() as driver:
driver.get("https://selenium.dev")
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

public class ProxyTest{
public static void Main() {
ChromeOptions options = new ChromeOptions();
Proxy proxy = new Proxy();
proxy.Kind = ProxyKind.Manual;
proxy.IsAutoDetect = false;
proxy.SslProxy = "<HOST:PORT>";
options.Proxy = proxy;
options.AddArgument("ignore-certificate-errors");
IWebDriver driver = new ChromeDriver(options);
driver.Navigate().GoToUrl("https://www.selenium.dev/");
}
}
proxy = Selenium::WebDriver::Proxy.new(http: '<HOST:PORT>')
cap   = Selenium::WebDriver::Remote::Capabilities.chrome(proxy: proxy)

driver = Selenium::WebDriver.for(:chrome, capabilities: cap)
driver.get('http://google.com')
let webdriver = require('selenium-webdriver');
let chrome = require('selenium-webdriver/chrome');
let proxy = require('selenium-webdriver/proxy');
let opts = new chrome.Options();

(async function example() {
opts.setProxy(proxy.manual({http: '<HOST:PORT>'}));
let driver = new webdriver.Builder()
.forBrowser('chrome')
.setChromeOptions(opts)
.build();
try {
await driver.get("https://selenium.dev");
}
finally {
await driver.quit();
}
}());
import org.openqa.selenium.Proxy
import org.openqa.selenium.WebDriver
import org.openqa.selenium.chrome.ChromeDriver
import org.openqa.selenium.chrome.ChromeOptions

class proxyTest {
fun main() {

        val proxy = Proxy()
        proxy.setHttpProxy("<HOST:PORT>")
        val options = ChromeOptions()
        options.setCapability("proxy", proxy)
        val driver: WebDriver = ChromeDriver(options)
        driver["https://www.google.com/"]
        driver.manage().window().maximize()
        driver.quit()
    }
}

2.2.2 - Capabilities specific to Chromium browsers

These capabilities are specific to Chromium based browsers.

These Capabilities apply to:

  • Chrome
  • Chromium
  • Edge

2.2.3 - Capabilities specific to Firefox browser

These capabilities are specific to Firefox.

Define Capabilities using FirefoxOptions

FirefoxOptions is the new way to define capabilities for the Firefox browser and should generally be used in preference to DesiredCapabilities.

FirefoxOptions options = new FirefoxOptions();
options.addPreference("network.proxy.type", 0);
driver = new RemoteWebDriver(options);
  
from selenium.webdriver.firefox.options import Options
options = Options()
options.headless = True
driver = webdriver.Firefox(options=options)
  
var options = new FirefoxOptions();
options.Proxy.Kind = ProxyKind.Direct;
var driver = new FirefoxDriver(options);
  
require 'selenium-webdriver'
opts = Selenium::WebDriver::Firefox::Options.new(args: ['-headless'])
driver = Selenium::WebDriver.for(:firefox, options: opts)
  
const { Builder } = require("selenium-webdriver");
const firefox = require('selenium-webdriver/firefox');

const options = new firefox.Options();
options.headless();
const driver = new Builder()
    .forBrowser('firefox')
    .setFirefoxOptions(options)
    .build();
  
val options = new FirefoxOptions()
options.addPreference("network.proxy.type", 0)
driver = RemoteWebDriver(options)
  

Setting a custom profile

It is possible to create a custom profile for Firefox as demonstrated below.

FirefoxProfile profile = new FirefoxProfile();
FirefoxOptions options = new FirefoxOptions();
options.setProfile(profile);
driver = new RemoteWebDriver(options);
  
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
options=Options()
firefox_profile = FirefoxProfile()
firefox_profile.set_preference("javascript.enabled", False)
options.profile = firefox_profile
  
var options = new FirefoxOptions();
var profile = new FirefoxProfile();
options.Profile = profile;
var driver = new RemoteWebDriver(options);
  
profile = Selenium::WebDriver::Firefox::Profile.new
profile['browser.download.dir'] = "/tmp/webdriver-downloads"
options = Selenium::WebDriver::Firefox::Options.new(profile: profile)
driver = Selenium::WebDriver.for :firefox, options: options
  
const { Builder } = require("selenium-webdriver");
const firefox = require('selenium-webdriver/firefox');

const options = new firefox.Options();
let profile = '/path to custom profile';
options.setProfile(profile);
const driver = new Builder()
    .forBrowser('firefox')
    .setFirefoxOptions(options)
    .build();
  
val options = FirefoxOptions()
options.profile = FirefoxProfile()
driver = RemoteWebDriver(options)
  

2.2.4 - Capabilities specific to Internet Explorer browser

These capabilities are specific to Internet Explorer.

fileUploadDialogTimeout

In some environments, Internet Explorer may timeout when opening the File Upload dialog. IEDriver has a default timeout of 1000ms, but you can increase the timeout using the fileUploadDialogTimeout capability.

InternetExplorerOptions options = new InternetExplorerOptions();
options.waitForUploadDialogUpTo(Duration.ofSeconds(2));
WebDriver driver = new RemoteWebDriver(options);
  
from selenium import webdriver

options = webdriver.IeOptions()
options.file_upload_dialog_timeout = 2000
driver = webdriver.Ie(options=options)

driver.get("http://www.google.com")

driver.quit()
  
var options = new InternetExplorerOptions();
options.FileUploadDialogTimeout = TimeSpan.FromMilliseconds(2000);
var driver = new RemoteWebDriver(options);
  
options = Selenium::WebDriver::IE::Options.new
options.file_upload_dialog_timeout = 2000
driver = Selenium::WebDriver.for(:ie, options: options)
  
const ie = require('selenium-webdriver/ie');
let options = new ie.Options().fileUploadDialogTimeout(2000);
let driver = await Builder()
          .setIeOptions(options)
          .build(); 
  
val options = InternetExplorerOptions()
options.waitForUploadDialogUpTo(Duration.ofSeconds(2))
val driver = RemoteWebDriver(options)
  

ensureCleanSession

When set to true, this capability clears the Cache, Browser History and Cookies for all running instances of InternetExplorer including those started manually or by the driver. By default, it is set to false.

Using this capability will cause performance drop while launching the browser, as the driver will wait until the cache gets cleared before launching the IE browser.

This capability accepts a Boolean value as parameter.

InternetExplorerOptions options = new InternetExplorerOptions();
options.destructivelyEnsureCleanSession();
WebDriver driver = new RemoteWebDriver(options);
  
from selenium import webdriver

options = webdriver.IeOptions()
options.ensure_clean_session = True
driver = webdriver.Ie(options=options)

driver.get("http://www.google.com")

driver.quit()
  
var options = new InternetExplorerOptions();
options.EnsureCleanSession = true;
var driver = new RemoteWebDriver(options);
  
options = Selenium::WebDriver::IE::Options.new
options.ensure_clean_session = true
driver = Selenium::WebDriver.for(:ie, options: options)
  
const ie = require('selenium-webdriver/ie');
let options = new ie.Options().ensureCleanSession(true);
let driver = await Builder()
          .setIeOptions(options)
          .build(); 
  
val options = InternetExplorerOptions()
options.destructivelyEnsureCleanSession()
val driver = RemoteWebDriver(options)
  

ignoreZoomSetting

InternetExplorer driver expects the browser zoom level to be 100%, else the driver will throw an exception. This default behaviour can be disabled by setting the ignoreZoomSetting to true.

This capability accepts a Boolean value as parameter.

InternetExplorerOptions options = new InternetExplorerOptions();
options.ignoreZoomSettings();
WebDriver driver = new RemoteWebDriver(options);
  
from selenium import webdriver

options = webdriver.IeOptions()
options.ignore_zoom_level = True
driver = webdriver.Ie(options=options)

driver.get("http://www.google.com")

driver.quit()
  
var options = new InternetExplorerOptions();
options.IgnoreZoomLevel = true;
var driver = new RemoteWebDriver(options);
  
options = Selenium::WebDriver::IE::Options.new
options.ignore_zoom_level = true
driver = Selenium::WebDriver.for(:ie, options: options)
  
const ie = require('selenium-webdriver/ie');
let options = new ie.Options().ignoreZoomSetting(true);
let driver = await Builder()
          .setIeOptions(options)
          .build(); 
  
val options = InternetExplorerOptions()
options.ignoreZoomSettings()
val driver = RemoteWebDriver(options)
  

ignoreProtectedModeSettings

Whether to skip the Protected Mode check while launching a new IE session.

If not set and Protected Mode settings are not same for all zones, an exception will be thrown by the driver.

If capability is set to true, tests may become flaky, unresponsive, or browsers may hang. However, this is still by far a second-best choice, and the first choice should always be to actually set the Protected Mode settings of each zone manually. If a user is using this property, only a “best effort” at support will be given.

This capability accepts a Boolean value as parameter.

InternetExplorerOptions options = new InternetExplorerOptions();
options.introduceFlakinessByIgnoringSecurityDomains();
WebDriver driver = new RemoteWebDriver(options);
  
from selenium import webdriver

options = webdriver.IeOptions()
options.ignore_protected_mode_settings = True
driver = webdriver.Ie(options=options)

driver.get("http://www.google.com")

driver.quit()
  
var options = new InternetExplorerOptions();
options.IntroduceInstabilityByIgnoringProtectedModeSettings = true;
var driver = new RemoteWebDriver(options);
  
options = Selenium::WebDriver::IE::Options.new
options.ignore_protected_mode_settings = true
driver = Selenium::WebDriver.for(:ie, options: options)
  
const ie = require('selenium-webdriver/ie');
let options = new ie.Options().introduceFlakinessByIgnoringProtectedModeSettings(true);
let driver = await Builder()
          .setIeOptions(options)
          .build(); 
  
val options = InternetExplorerOptions()
options.introduceFlakinessByIgnoringSecurityDomains()
val driver = RemoteWebDriver(options)
  

silent

When set to true, this capability suppresses the diagnostic output of the IEDriverServer.

This capability accepts a Boolean value as parameter.

InternetExplorerOptions options = new InternetExplorerOptions();
options.setCapability("silent", true);
WebDriver driver = new InternetExplorerDriver(options);
  
from selenium import webdriver

options = webdriver.IeOptions()
options.set_capability("silent", True)
driver = webdriver.Ie(options=options)

driver.get("http://www.google.com")

driver.quit()
  
InternetExplorerOptions options = new InternetExplorerOptions();
options.AddAdditionalInternetExplorerOption("silent", true);
IWebDriver driver = new InternetExplorerDriver(options);
  
    # Please raise a PR to add code sample
  
const {Builder,By, Capabilities} = require('selenium-webdriver');
let caps = Capabilities.ie();
caps.set('silent', true);

(async function example() {
    let driver = await new Builder()
        .forBrowser('internet explorer')
        .withCapabilities(caps)
        .build();
    try {
        await driver.get('http://www.google.com/ncr');
    }
    finally {
        await driver.quit();
    }
})();
  
import org.openqa.selenium.Capabilities
import org.openqa.selenium.ie.InternetExplorerDriver
import org.openqa.selenium.ie.InternetExplorerOptions

fun main() {
    val options = InternetExplorerOptions()
    options.setCapability("silent", true)
    val driver = InternetExplorerDriver(options)
    try {
        driver.get("https://google.com/ncr")
        val caps = driver.getCapabilities()
        println(caps)
    } finally {
        driver.quit()
    }
}
  

Command-Line Options

Internet Explorer includes several command-line options that enable you to troubleshoot and configure the browser.

The following describes few supported command-line options

  • -private : Used to start IE in private browsing mode. This works for IE 8 and later versions.

  • -k : Starts Internet Explorer in kiosk mode. The browser opens in a maximized window that does not display the address bar, the navigation buttons, or the status bar.

  • -extoff : Starts IE in no add-on mode. This option specifically used to troubleshoot problems with browser add-ons. Works in IE 7 and later versions.

Note: forceCreateProcessApi should to enabled in-order for command line arguments to work.

import org.openqa.selenium.Capabilities;
import org.openqa.selenium.ie.InternetExplorerDriver;
import org.openqa.selenium.ie.InternetExplorerOptions;

public class ieTest {
    public static void main(String[] args) {
        InternetExplorerOptions options = new InternetExplorerOptions();
        options.useCreateProcessApiToLaunchIe();
        options.addCommandSwitches("-k");
        InternetExplorerDriver driver = new InternetExplorerDriver(options);
        try {
            driver.get("https://google.com/ncr");
            Capabilities caps = driver.getCapabilities();
            System.out.println(caps);
        } finally {
            driver.quit();
        }
    }
}
  
from selenium import webdriver

options = webdriver.IeOptions()
options.add_argument('-private')
options.force_create_process_api = True
driver = webdriver.Ie(options=options)

driver.get("http://www.google.com")

driver.quit()
  
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.IE;

namespace ieTest {
 class Program {
  static void Main(string[] args) {
   InternetExplorerOptions options = new InternetExplorerOptions();
   options.ForceCreateProcessApi = true;
   options.BrowserCommandLineArguments = "-k";
   IWebDriver driver = new InternetExplorerDriver(options);
   driver.Url = "https://google.com/ncr";
  }
 }
}
  
require 'selenium-webdriver'
options = Selenium::WebDriver::IE::Options.new
options.force_create_process_api = true
options.add_argument('-k')
driver = Selenium::WebDriver.for(:ie, options: options)

begin
  driver.get 'https://google.com'
  puts(driver.capabilities.to_json)
ensure
  driver.quit
end
  
const ie = require('selenium-webdriver/ie');
let options = new ie.Options();
options.addBrowserCommandSwitches('-k');
options.addBrowserCommandSwitches('-private');
options.forceCreateProcessApi(true);

driver = await env.builder()
          .setIeOptions(options)
          .build();
  
import org.openqa.selenium.Capabilities
import org.openqa.selenium.ie.InternetExplorerDriver
import org.openqa.selenium.ie.InternetExplorerOptions

fun main() {
    val options = InternetExplorerOptions()
    options.useCreateProcessApiToLaunchIe()
    options.addCommandSwitches("-k")
    val driver = InternetExplorerDriver(options)
    try {
        driver.get("https://google.com/ncr")
        val caps = driver.getCapabilities()
        println(caps)
    } finally {
        driver.quit()
    }
}
  

forceCreateProcessApi

Forces launching Internet Explorer using the CreateProcess API. The default value is false.

For IE 8 and above, this option requires the “TabProcGrowth” registry value to be set to 0.

import org.openqa.selenium.Capabilities;
import org.openqa.selenium.ie.InternetExplorerDriver;
import org.openqa.selenium.ie.InternetExplorerOptions;

public class ieTest {
    public static void main(String[] args) {
        InternetExplorerOptions options = new InternetExplorerOptions();
        options.useCreateProcessApiToLaunchIe();
        InternetExplorerDriver driver = new InternetExplorerDriver(options);
        try {
            driver.get("https://google.com/ncr");
            Capabilities caps = driver.getCapabilities();
            System.out.println(caps);
        } finally {
            driver.quit();
        }
    }
}
  
from selenium import webdriver

options = webdriver.IeOptions()
options.force_create_process_api = True
driver = webdriver.Ie(options=options)

driver.get("http://www.google.com")

driver.quit()
  
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.IE;

namespace ieTest {
 class Program {
  static void Main(string[] args) {
   InternetExplorerOptions options = new InternetExplorerOptions();
   options.ForceCreateProcessApi = true;
   IWebDriver driver = new InternetExplorerDriver(options);
   driver.Url = "https://google.com/ncr";
  }
 }
}
  
require 'selenium-webdriver'
options = Selenium::WebDriver::IE::Options.new
options.force_create_process_api = true
driver = Selenium::WebDriver.for(:ie, options: options)

begin
  driver.get 'https://google.com'
  puts(driver.capabilities.to_json)
ensure
  driver.quit
end
  
const ie = require('selenium-webdriver/ie');
let options = new ie.Options();
options.forceCreateProcessApi(true);

driver = await env.builder()
          .setIeOptions(options)
          .build();
  
import org.openqa.selenium.Capabilities
import org.openqa.selenium.ie.InternetExplorerDriver
import org.openqa.selenium.ie.InternetExplorerOptions

fun main() {
    val options = InternetExplorerOptions()
    options.useCreateProcessApiToLaunchIe()
    val driver = InternetExplorerDriver(options)
    try {
        driver.get("https://google.com/ncr")
        val caps = driver.getCapabilities()
        println(caps)
    } finally {
        driver.quit()
    }
}
  

2.2.5 - Capabilities specific to Safari browser

These capabilities are specific to Safari.

2.3 - Browser

Get browser information

Get title

You can read the current page title from the browser:

driver.getTitle();
driver.title
driver.Title;
driver.title
await driver.getTitle();
driver.title

Get current URL

You can read the current URL from the browser’s address bar using:

driver.getCurrentUrl();
driver.current_url
driver.Url;
driver.current_url
await driver.getCurrentUrl();
driver.currentUrl

2.3.1 - Browser navigation

The first thing you will want to do after launching a browser is to open your website. This can be achieved in a single line:

//Convenient
driver.get("https://selenium.dev");

//Longer way
driver.navigate().to("https://selenium.dev");
  
driver.get("https://selenium.dev")
  
driver.Navigate().GoToUrl(@"https://selenium.dev");
  
driver.navigate.to 'https://selenium.dev'
  
await driver.get('https://selenium.dev');
  
//Convenient
driver.get("https://selenium.dev")

//Longer way
driver.navigate().to("https://selenium.dev")
  

Back

Pressing the browser’s back button:

driver.navigate().back();
driver.back()
driver.Navigate().Back();
driver.navigate.back
await driver.navigate().back();
driver.navigate().back() 

Forward

Pressing the browser’s forward button:

driver.navigate().forward();
driver.forward()
driver.Navigate().Forward();
driver.navigate.forward
await driver.navigate().forward();
driver.navigate().forward()

Refresh

Refresh the current page:

driver.navigate().refresh();
driver.refresh()
driver.Navigate().Refresh();
driver.navigate.refresh
await driver.navigate().refresh();
driver.navigate().refresh()

2.3.2 - JavaScript alerts, prompts and confirmations

WebDriver provides an API for working with the three types of native popup messages offered by JavaScript. These popups are styled by the browser and offer limited customisation.

Alerts

The simplest of these is referred to as an alert, which shows a custom message, and a single button which dismisses the alert, labelled in most browsers as OK. It can also be dismissed in most browsers by pressing the close button, but this will always do the same thing as the OK button. See an example alert.

WebDriver can get the text from the popup and accept or dismiss these alerts.

//Click the link to activate the alert
driver.findElement(By.linkText("See an example alert")).click();

//Wait for the alert to be displayed and store it in a variable
Alert alert = wait.until(ExpectedConditions.alertIsPresent());

//Store the alert text in a variable
String text = alert.getText();

//Press the OK button
alert.accept();
  
# Click the link to activate the alert
driver.find_element(By.LINK_TEXT, "See an example alert").click()

# Wait for the alert to be displayed and store it in a variable
alert = wait.until(expected_conditions.alert_is_present())

# Store the alert text in a variable
text = alert.text

# Press the OK button
alert.accept()
  
//Click the link to activate the alert
driver.FindElement(By.LinkText("See an example alert")).Click();

//Wait for the alert to be displayed and store it in a variable
IAlert alert = wait.Until(ExpectedConditions.AlertIsPresent());

//Store the alert text in a variable
string text = alert.Text;

//Press the OK button
alert.Accept();
  
# Click the link to activate the alert
driver.find_element(:link_text, 'See an example alert').click

# Store the alert reference in a variable
alert = driver.switch_to.alert

# Store the alert text in a variable
alert_text = alert.text

# Press on OK button
alert.accept
  
//Click the link to activate the alert
await driver.findElement(By.linkText('See an example alert')).click();

// Wait for the alert to be displayed
await driver.wait(until.alertIsPresent());

// Store the alert in a variable
let alert = await driver.switchTo().alert();

//Store the alert text in a variable
let alertText = await alert.getText();

//Press the OK button
await alert.accept();

// Note: To use await, the above code should be inside an async function
  
//Click the link to activate the alert
driver.findElement(By.linkText("See an example alert")).click()

//Wait for the alert to be displayed and store it in a variable
val alert = wait.until(ExpectedConditions.alertIsPresent())

//Store the alert text in a variable
val text = alert.getText()

//Press the OK button
alert.accept()
  

Confirm

A confirm box is similar to an alert, except the user can also choose to cancel the message. See a sample confirm.

This example also shows a different approach to storing an alert:

//Click the link to activate the alert
driver.findElement(By.linkText("See a sample confirm")).click();

//Wait for the alert to be displayed
wait.until(ExpectedConditions.alertIsPresent());

//Store the alert in a variable
Alert alert = driver.switchTo().alert();

//Store the alert in a variable for reuse
String text = alert.getText();

//Press the Cancel button
alert.dismiss();
  
# Click the link to activate the alert
driver.find_element(By.LINK_TEXT, "See a sample confirm").click()

# Wait for the alert to be displayed
wait.until(expected_conditions.alert_is_present())

# Store the alert in a variable for reuse
alert = driver.switch_to.alert

# Store the alert text in a variable
text = alert.text

# Press the Cancel button
alert.dismiss()
  
//Click the link to activate the alert
driver.FindElement(By.LinkText("See a sample confirm")).Click();

//Wait for the alert to be displayed
wait.Until(ExpectedConditions.AlertIsPresent());

//Store the alert in a variable
IAlert alert = driver.SwitchTo().Alert();

//Store the alert in a variable for reuse
string text = alert.Text;

//Press the Cancel button
alert.Dismiss();
  
# Click the link to activate the alert
driver.find_element(:link_text, 'See a sample confirm').click

# Store the alert reference in a variable
alert = driver.switch_to.alert

# Store the alert text in a variable
alert_text = alert.text

# Press on Cancel button
alert.dismiss
  
//Click the link to activate the alert
await driver.findElement(By.linkText('See a sample confirm')).click();

// Wait for the alert to be displayed
await driver.wait(until.alertIsPresent());

// Store the alert in a variable
let alert = await driver.switchTo().alert();

//Store the alert text in a variable
let alertText = await alert.getText();

//Press the Cancel button
await alert.dismiss();

// Note: To use await, the above code should be inside an async function
  
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample confirm")).click()

//Wait for the alert to be displayed
wait.until(ExpectedConditions.alertIsPresent())

//Store the alert in a variable
val alert = driver.switchTo().alert()

//Store the alert in a variable for reuse
val text = alert.text

//Press the Cancel button
alert.dismiss()
  

Prompt

Prompts are similar to confirm boxes, except they also include a text input. Similar to working with form elements, you can use WebDriver’s send keys to fill in a response. This will completely replace the placeholder text. Pressing the cancel button will not submit any text. See a sample prompt.

//Click the link to activate the alert
driver.findElement(By.linkText("See a sample prompt")).click();

//Wait for the alert to be displayed and store it in a variable
Alert alert = wait.until(ExpectedConditions.alertIsPresent());

//Type your message
alert.sendKeys("Selenium");

//Press the OK button
alert.accept();
  
# Click the link to activate the alert
driver.find_element(By.LINK_TEXT, "See a sample prompt").click()

# Wait for the alert to be displayed
wait.until(expected_conditions.alert_is_present())

# Store the alert in a variable for reuse
alert = Alert(driver)

# Type your message
alert.send_keys("Selenium")

# Press the OK button
alert.accept()
  
//Click the link to activate the alert
driver.FindElement(By.LinkText("See a sample prompt")).Click();

//Wait for the alert to be displayed and store it in a variable
IAlert alert = wait.Until(ExpectedConditions.AlertIsPresent());

//Type your message
alert.SendKeys("Selenium");

//Press the OK button
alert.Accept();
  
# Click the link to activate the alert
driver.find_element(:link_text, 'See a sample prompt').click

# Store the alert reference in a variable
alert = driver.switch_to.alert

# Type a message
alert.send_keys("selenium")

# Press on Ok button
alert.accept
  
//Click the link to activate the alert
await driver.findElement(By.linkText('See a sample prompt')).click();

// Wait for the alert to be displayed
await driver.wait(until.alertIsPresent());

// Store the alert in a variable
let alert = await driver.switchTo().alert();

//Type your message
await alert.sendKeys("Selenium");

//Press the OK button
await alert.accept();

//Note: To use await, the above code should be inside an async function
  
//Click the link to activate the alert
driver.findElement(By.linkText("See a sample prompt")).click()

//Wait for the alert to be displayed and store it in a variable
val alert = wait.until(ExpectedConditions.alertIsPresent())

//Type your message
alert.sendKeys("Selenium")

//Press the OK button
alert.accept()
  

2.3.3 - Working with cookies

A cookie is a small piece of data that is sent from a website and stored in your computer. Cookies are mostly used to recognise the user and load the stored information.

WebDriver API provides a way to interact with cookies with built-in methods:

It is used to add a cookie to the current browsing context. Add Cookie only accepts a set of defined serializable JSON object. Here is the link to the list of accepted JSON key values

First of all, you need to be on the domain that the cookie will be valid for. If you are trying to preset cookies before you start interacting with a site and your homepage is large / takes a while to load an alternative is to find a smaller page on the site (typically the 404 page is small, e.g. http://example.com/some404page)

import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;

public class addCookie {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("http://www.example.com");

            // Adds the cookie into current browser context
            driver.manage().addCookie(new Cookie("key", "value"));
        } finally {
            driver.quit();
        }
    }
}
  
from selenium import webdriver

driver = webdriver.Chrome()

driver.get("http://www.example.com")

# Adds the cookie into current browser context
driver.add_cookie({"name": "key", "value": "value"})
  
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace AddCookie {
 class AddCookie {
  public static void Main(string[] args) {
   IWebDriver driver = new ChromeDriver();
   try {
    // Navigate to Url
    driver.Navigate().GoToUrl("https://example.com");

    // Adds the cookie into current browser context
    driver.Manage().Cookies.AddCookie(new Cookie("key", "value"));
   } finally {
    driver.Quit();
   }
  }
 }
}
  
require 'selenium-webdriver'
driver = Selenium::WebDriver.for :chrome

begin
  driver.get 'https://www.example.com'
  
  # Adds the cookie into current browser context
  driver.manage.add_cookie(name: "key", value: "value")
ensure
  driver.quit
end
  
const {Builder} = require('selenium-webdriver');
(async function example() {
  let driver = new Builder()
    .forBrowser('chrome')
    .build();

  await driver.get('https://www.example.com');
  // set a cookie on the current domain
  await driver.manage().addCookie({name:'key', value: 'value'});
  await driver.quit();
})();
import org.openqa.selenium.Cookie
import org.openqa.selenium.chrome.ChromeDriver

fun main() {
    val driver = ChromeDriver()
    try {
        driver.get("https://example.com")

        // Adds the cookie into current browser context
        driver.manage().addCookie(Cookie("key", "value"))
    } finally {
        driver.quit()
    }
}
  

It returns the serialized cookie data matching with the cookie name among all associated cookies.

import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;

public class getCookieNamed {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("http://www.example.com");
            driver.manage().addCookie(new Cookie("foo", "bar"));

            // Get cookie details with named cookie 'foo'
            Cookie cookie1 = driver.manage().getCookieNamed("foo");
            System.out.println(cookie1);
        } finally {
            driver.quit();
        }
    }
}
  
from selenium import webdriver

driver = webdriver.Chrome()

# Navigate to url
driver.get("http://www.example.com")

# Adds the cookie into current browser context
driver.add_cookie({"name": "foo", "value": "bar"})

# Get cookie details with named cookie 'foo'
print(driver.get_cookie("foo"))
  
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace GetCookieNamed {
 class GetCookieNamed {
  public static void Main(string[] args) {
   IWebDriver driver = new ChromeDriver();
   try {
    // Navigate to Url
    driver.Navigate().GoToUrl("https://example.com");
    driver.Manage().Cookies.AddCookie(new Cookie("foo", "bar"));

    // Get cookie details with named cookie 'foo'
    var cookie = driver.Manage().Cookies.GetCookieNamed("foo");
    System.Console.WriteLine(cookie);
   } finally {
    driver.Quit();
   }
  }
 }
}
  
require 'selenium-webdriver'
driver = Selenium::WebDriver.for :chrome

begin
  driver.get 'https://www.example.com'
  driver.manage.add_cookie(name: "foo", value: "bar")

  # Get cookie details with named cookie 'foo'
  puts driver.manage.cookie_named('foo')
ensure
  driver.quit
end
  
const {Builder} = require('selenium-webdriver');

(async function getCookieByName() {
  let driver = new Builder()
    .forBrowser('chrome')
    .build();

  await driver.get('https://www.example.com');
  // set a cookie on the current domain
  await driver.manage().addCookie({name:'foo', value: 'bar'});
  // Get cookie details with named cookie 'foo'
  await driver.manage().getCookie('foo').then(function (cookie) {
    console.log('cookie details => ', cookie);
  });
  await driver.quit();
})();
import org.openqa.selenium.Cookie
import org.openqa.selenium.chrome.ChromeDriver

fun main() {
    val driver = ChromeDriver()
    try {
        driver.get("https://example.com")
        driver.manage().addCookie(Cookie("foo", "bar"))

        // Get cookie details with named cookie 'foo'
        val cookie = driver.manage().getCookieNamed("foo")
        println(cookie)
    } finally {
        driver.quit()
    }
}  
  

Get All Cookies

It returns a ‘successful serialized cookie data’ for current browsing context. If browser is no longer available it returns error.

import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;
import java.util.Set;

public class getAllCookies {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("http://www.example.com");
            // Add few cookies
            driver.manage().addCookie(new Cookie("test1", "cookie1"));
            driver.manage().addCookie(new Cookie("test2", "cookie2"));

            // Get All available cookies
            Set<Cookie> cookies = driver.manage().getCookies();
            System.out.println(cookies);
        } finally {
            driver.quit();
        }
    }
}
  
from selenium import webdriver

driver = webdriver.Chrome()

# Navigate to url
driver.get("http://www.example.com")

driver.add_cookie({"name": "test1", "value": "cookie1"})
driver.add_cookie({"name": "test2", "value": "cookie2"})

# Get all available cookies
print(driver.get_cookies())
  
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace GetAllCookies {
 class GetAllCookies {
  public static void Main(string[] args) {
   IWebDriver driver = new ChromeDriver();
   try {
    // Navigate to Url
    driver.Navigate().GoToUrl("https://example.com");
    driver.Manage().Cookies.AddCookie(new Cookie("test1", "cookie1"));
    driver.Manage().Cookies.AddCookie(new Cookie("test2", "cookie2"));

    // Get All available cookies
    var cookies = driver.Manage().Cookies.AllCookies;
   } finally {
    driver.Quit();
   }
  }
 }
}
  
require 'selenium-webdriver'
driver = Selenium::WebDriver.for :chrome

begin
  driver.get 'https://www.example.com'
  driver.manage.add_cookie(name: "test1", value: "cookie1")
  driver.manage.add_cookie(name: "test2", value: "cookie2")

  # Get all available cookies
  puts driver.manage.all_cookies
ensure
  driver.quit
end
  
const {Builder} = require('selenium-webdriver');

(async function getAllCookies() {
  let driver = new Builder()
    .forBrowser('chrome')
    .build();

  await driver.get('https://www.example.com');
  // Add few cookies
  await driver.manage().addCookie({name:'test1', value:'cookie1'});
  await driver.manage().addCookie({name:'test2', value:'cookie2'});
  // Get all Available cookies
  await driver.manage().getCookies().then(function (cookies) {
    console.log('cookie details => ', cookies);
  });
  await driver.quit()
})();
import org.openqa.selenium.Cookie
import org.openqa.selenium.chrome.ChromeDriver

fun main() {
    val driver = ChromeDriver()
    try {
        driver.get("https://example.com")
        driver.manage().addCookie(Cookie("test1", "cookie1"))
        driver.manage().addCookie(Cookie("test2", "cookie2"))

        // Get All available cookies
        val cookies = driver.manage().cookies
        println(cookies)
    } finally {
        driver.quit()
    }
}  
  

It deletes the cookie data matching with the provided cookie name.

import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;

public class deleteCookie {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("http://www.example.com");
            driver.manage().addCookie(new Cookie("test1", "cookie1"));
            Cookie cookie1 = new Cookie("test2", "cookie2");
            driver.manage().addCookie(cookie1);

            // delete a cookie with name 'test1'
            driver.manage().deleteCookieNamed("test1");

            /*
             Selenium Java bindings also provides a way to delete
             cookie by passing cookie object of current browsing context
             */
            driver.manage().deleteCookie(cookie1);
        } finally {
            driver.quit();
        }
    }
}
  
from selenium import webdriver
driver = webdriver.Chrome()

# Navigate to url
driver.get("http://www.example.com")
driver.add_cookie({"name": "test1", "value": "cookie1"})
driver.add_cookie({"name": "test2", "value": "cookie2"})

# Delete a cookie with name 'test1'
driver.delete_cookie("test1")
  
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace DeleteCookie {
 class DeleteCookie {
  public static void Main(string[] args) {
   IWebDriver driver = new ChromeDriver();
   try {
    // Navigate to Url
    driver.Navigate().GoToUrl("https://example.com");
    driver.Manage().Cookies.AddCookie(new Cookie("test1", "cookie1"));
    var cookie = new Cookie("test2", "cookie2");
    driver.Manage().Cookies.AddCookie(cookie);

    // delete a cookie with name 'test1'	
    driver.Manage().Cookies.DeleteCookieNamed("test1");

    // Selenium .net bindings also provides a way to delete
    // cookie by passing cookie object of current browsing context
    driver.Manage().Cookies.DeleteCookie(cookie);
   } finally {
    driver.Quit();
   }
  }
 }
}
  
require 'selenium-webdriver'
driver = Selenium::WebDriver.for :chrome

begin
  driver.get 'https://www.example.com'
  driver.manage.add_cookie(name: "test1", value: "cookie1")
  driver.manage.add_cookie(name: "test2", value: "cookie2")

  # delete a cookie with name 'test1'
  driver.manage.delete_cookie('test1')
ensure
  driver.quit
end
  
const {Builder} = require('selenium-webdriver');

(async function deleteCookie() {
  let driver = new Builder()
    .forBrowser('chrome')
    .build();
  try {
    await driver.get('https://www.example.com');
    // Add few cookies
    await driver.manage().addCookie({name: 'test1', value: 'cookie1'});
    await driver.manage().addCookie({name: 'test2', value: 'cookie2'});

    // Delete a cookie with name 'test1'
    await driver.manage().deleteCookie('test1');

    // Get all Available cookies
    await driver.manage().getCookies().then(function (cookies) {
      console.log('cookie details => ', cookies);
    });
    await driver.quit();
  } catch (e){
    console.log(e)
  }
})();

import org.openqa.selenium.Cookie
import org.openqa.selenium.chrome.ChromeDriver

fun main() {
    val driver = ChromeDriver()
    try {
        driver.get("https://example.com")
        driver.manage().addCookie(Cookie("test1", "cookie1"))
        val cookie1 = Cookie("test2", "cookie2")
        driver.manage().addCookie(cookie1)

        // delete a cookie with name 'test1'
        driver.manage().deleteCookieNamed("test1")
        
        // delete cookie by passing cookie object of current browsing context.
        driver.manage().deleteCookie(cookie1)
    } finally {
        driver.quit()
    }
}  
  

Delete All Cookies

It deletes all the cookies of the current browsing context.

import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;

public class deleteAllCookies {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("http://www.example.com");
            driver.manage().addCookie(new Cookie("test1", "cookie1"));
            driver.manage().addCookie(new Cookie("test2", "cookie2"));

            // deletes all cookies
            driver.manage().deleteAllCookies();
        } finally {
            driver.quit();
        }
    }
}
  
from selenium import webdriver
driver = webdriver.Chrome()

# Navigate to url
driver.get("http://www.example.com")
driver.add_cookie({"name": "test1", "value": "cookie1"})
driver.add_cookie({"name": "test2", "value": "cookie2"})

#  Deletes all cookies
driver.delete_all_cookies()
  
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace DeleteAllCookies {
 class DeleteAllCookies {
  public static void Main(string[] args) {
   IWebDriver driver = new ChromeDriver();
   try {
    // Navigate to Url
    driver.Navigate().GoToUrl("https://example.com");
    driver.Manage().Cookies.AddCookie(new Cookie("test1", "cookie1"));
    driver.Manage().Cookies.AddCookie(new Cookie("test2", "cookie2"));

    // deletes all cookies
    driver.Manage().Cookies.DeleteAllCookies();
   } finally {
    driver.Quit();
   }
  }
 }
}
  
require 'selenium-webdriver'
driver = Selenium::WebDriver.for :chrome

begin
  driver.get 'https://www.example.com'
  driver.manage.add_cookie(name: "test1", value: "cookie1")
  driver.manage.add_cookie(name: "test2", value: "cookie2")

  # deletes all cookies
  driver.manage.delete_all_cookies
ensure
  driver.quit
end
  
const {Builder} = require('selenium-webdriver');

(async function example() {
  let driver = new Builder()
    .forBrowser('chrome')
    .build();

  try {
    await driver.get('https://www.example.com');
    // Add few cookies
    await driver.manage().addCookie({name: 'test1', value: 'cookie1'});
    await driver.manage().addCookie({name: 'test2', value: 'cookie2'});
    // Delete all cookies
    await driver.manage().deleteAllCookies();
    await driver.quit();
  } catch(e) {
    console.log(e)
  }
})();
import org.openqa.selenium.Cookie
import org.openqa.selenium.chrome.ChromeDriver

fun main() {
    val driver = ChromeDriver()
    try {
        driver.get("https://example.com")
        driver.manage().addCookie(Cookie("test1", "cookie1"))
        driver.manage().addCookie(Cookie("test2", "cookie2"))

        // deletes all cookies
        driver.manage().deleteAllCookies()
    } finally {
        driver.quit()
    }
}  
  

It allows a user to instruct browsers to control whether cookies are sent along with the request initiated by third party sites. It is introduced to prevent CSRF (Cross-Site Request Forgery) attacks.

Same-Site cookie attribute accepts two parameters as instructions

Strict:

When the sameSite attribute is set as Strict, the cookie will not be sent along with requests initiated by third party websites.

Lax:

When you set a cookie sameSite attribute to Lax, the cookie will be sent along with the GET request initiated by third party website.

Note: As of now this feature is landed in chrome(80+version), Firefox(79+version) and works with Selenium 4 and later versions.

import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;

public class cookieTest {
  public static void main(String[] args) {
    WebDriver driver = new ChromeDriver();
    try {
      driver.get("http://www.example.com");
      Cookie cookie = new Cookie.Builder("key", "value").sameSite("Strict").build();
      Cookie cookie1 = new Cookie.Builder("key", "value").sameSite("Lax").build();
      driver.manage().addCookie(cookie);
      driver.manage().addCookie(cookie1);
      System.out.println(cookie.getSameSite());
      System.out.println(cookie1.getSameSite());
    } finally {
      driver.quit();
    }
  }
}
  
from selenium import webdriver

driver = webdriver.Chrome()

driver.get("http://www.example.com")
# Adds the cookie into current browser context with sameSite 'Strict' (or) 'Lax'
driver.add_cookie({"name": "foo", "value": "value", 'sameSite': 'Strict'})
driver.add_cookie({"name": "foo1", "value": "value", 'sameSite': 'Lax'})
cookie1 = driver.get_cookie('foo')
cookie2 = driver.get_cookie('foo1')
print(cookie1)
print(cookie2)
  
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace SameSiteCookie {
  class SameSiteCookie {
    static void Main(string[] args) {
      IWebDriver driver = new ChromeDriver();
      try {
        driver.Navigate().GoToUrl("http://www.example.com");

        var cookie1Dictionary = new System.Collections.Generic.Dictionary<string, object>() {
          { "name", "test1" }, { "value", "cookie1" }, { "sameSite", "Strict" } };
        var cookie1 = Cookie.FromDictionary(cookie1Dictionary);

        var cookie2Dictionary = new System.Collections.Generic.Dictionary<string, object>() {
          { "name", "test2" }, { "value", "cookie2" }, { "sameSite", "Lax" } };
        var cookie2 = Cookie.FromDictionary(cookie2Dictionary);

        driver.Manage().Cookies.AddCookie(cookie1);
        driver.Manage().Cookies.AddCookie(cookie2);

        System.Console.WriteLine(cookie1.SameSite);
        System.Console.WriteLine(cookie2.SameSite);
      } finally {
        driver.Quit();
      }
    }
  }
}
  
require 'selenium-webdriver'
driver = Selenium::WebDriver.for :chrome

begin
  driver.get 'https://www.example.com'
  # Adds the cookie into current browser context with sameSite 'Strict' (or) 'Lax'
  driver.manage.add_cookie(name: "foo", value: "bar", same_site: "Strict")
  driver.manage.add_cookie(name: "foo1", value: "bar", same_site: "Lax")
  puts driver.manage.cookie_named('foo')
  puts driver.manage.cookie_named('foo1')
ensure
  driver.quit
end
  
const {Builder} = require('selenium-webdriver');

(async function sameSiteCookieExample() {
  let driver = new Builder()
    .forBrowser('chrome')
    .build();

  try {
    await driver.get('https://www.example.com');
    // set a cookie on the current domain with sameSite 'Strict' (or) 'Lax'
    await driver.manage().addCookie({name: 'key', value: 'value', sameSite: 'Strict'});
    await driver.manage().addCookie({name: 'key', value: 'value', sameSite: 'Lax'});
    await driver.quit();
  } catch (e) {
    console.log(e);
  }
})();
import org.openqa.selenium.Cookie
import org.openqa.selenium.chrome.ChromeDriver

fun main() {
    val driver = ChromeDriver()
    try {
        driver.get("http://www.example.com")
        val cookie = Cookie.Builder("key", "value").sameSite("Strict").build()
        val cookie1 = Cookie.Builder("key", "value").sameSite("Lax").build()
        driver.manage().addCookie(cookie)
        driver.manage().addCookie(cookie1)
        println(cookie.getSameSite())
        println(cookie1.getSameSite())
    } finally {
        driver.quit()
    }
}
  

2.3.4 - Working with iFrames and frames

Frames are a now deprecated means of building a site layout from multiple documents on the same domain. You are unlikely to work with them unless you are working with an pre HTML5 webapp. Iframes allow the insertion of a document from an entirely different domain, and are still commonly used.

If you need to work with frames or iframes, WebDriver allows you to work with them in the same way. Consider a button within an iframe. If we inspect the element using the browser development tools, we might see the following:

<div id="modal">
  <iframe id="buttonframe" name="myframe"  src="https://seleniumhq.github.io">
   <button>Click here</button>
 </iframe>
</div>

If it was not for the iframe we would expect to click on the button using something like:

//This won't work
driver.findElement(By.tagName("button")).click();
  
    # This Wont work
driver.find_element(By.TAG_NAME, 'button').click()
  
//This won't work
driver.FindElement(By.TagName("button")).Click();
  
    # This won't work
driver.find_element(:tag_name,'button').click
  
// This won't work
await driver.findElement(By.css('button')).click();
  
//This won't work
driver.findElement(By.tagName("button")).click()
  

However, if there are no buttons outside of the iframe, you might instead get a no such element error. This happens because Selenium is only aware of the elements in the top level document. To interact with the button, we will need to first switch to the frame, in a similar way to how we switch windows. WebDriver offers three ways of switching to a frame.

Using a WebElement

Switching using a WebElement is the most flexible option. You can find the frame using your preferred selector and switch to it.

//Store the web element
WebElement iframe = driver.findElement(By.cssSelector("#modal>iframe"));

//Switch to the frame
driver.switchTo().frame(iframe);

//Now we can click the button
driver.findElement(By.tagName("button")).click();
  
    # Store iframe web element
iframe = driver.find_element(By.CSS_SELECTOR, "#modal > iframe")

    # switch to selected iframe
driver.switch_to.frame(iframe)

    # Now click on button
driver.find_element(By.TAG_NAME, 'button').click()
  
//Store the web element
IWebElement iframe = driver.FindElement(By.CssSelector("#modal>iframe"));

//Switch to the frame
driver.SwitchTo().Frame(iframe);

//Now we can click the button
driver.FindElement(By.TagName("button")).Click();
  
    # Store iframe web element
iframe = driver.find_element(:css,'#modal > iframe')

    # Switch to the frame
driver.switch_to.frame iframe

    # Now, Click on the button
driver.find_element(:tag_name,'button').click
  
// Store the web element
const iframe = driver.findElement(By.css('#modal > iframe'));

// Switch to the frame
await driver.switchTo().frame(iframe);

// Now we can click the button
await driver.findElement(By.css('button')).click();
  
//Store the web element
val iframe = driver.findElement(By.cssSelector("#modal>iframe"))

//Switch to the frame
driver.switchTo().frame(iframe)

//Now we can click the button
driver.findElement(By.tagName("button")).click()
  

Using a name or ID

If your frame or iframe has an id or name attribute, this can be used instead. If the name or ID is not unique on the page, then the first one found will be switched to.

//Using the ID
driver.switchTo().frame("buttonframe");

//Or using the name instead
driver.switchTo().frame("myframe");

//Now we can click the button
driver.findElement(By.tagName("button")).click();
  
    # Switch frame by id
driver.switch_to.frame('buttonframe')

    # Now, Click on the button
driver.find_element(By.TAG_NAME, 'button').click()
  
//Using the ID
driver.SwitchTo().Frame("buttonframe");

//Or using the name instead
driver.SwitchTo().Frame("myframe");

//Now we can click the button
driver.FindElement(By.TagName("button")).Click();
  
    # Switch by ID
driver.switch_to.frame 'buttonframe'

    # Now, Click on the button
driver.find_element(:tag_name,'button').click
  
// Using the ID
await driver.switchTo().frame('buttonframe');

// Or using the name instead
await driver.switchTo().frame('myframe');

// Now we can click the button
await driver.findElement(By.css('button')).click();
  
//Using the ID
driver.switchTo().frame("buttonframe")

//Or using the name instead
driver.switchTo().frame("myframe")

//Now we can click the button
driver.findElement(By.tagName("button")).click()
  

Using an index

It is also possible to use the index of the frame, such as can be queried using window.frames in JavaScript.

// Switches to the second frame
driver.switchTo().frame(1);
  
    # Switch to the second frame
driver.switch_to.frame(1)
  
// Switches to the second frame
driver.SwitchTo().Frame(1);
  
    # switching to second iframe based on index
iframe = driver.find_elements_by_tag_name('iframe')[1]

    # switch to selected iframe
driver.switch_to.frame(iframe)
  
// Switches to the second frame
await driver.switchTo().frame(1);
  
// Switches to the second frame
driver.switchTo().frame(1)
  

Leaving a frame

To leave an iframe or frameset, switch back to the default content like so:

// Return to the top level
driver.switchTo().defaultContent();
  
    # switch back to default content
driver.switch_to.default_content()
  
// Return to the top level
driver.SwitchTo().DefaultContent();
  
    # Return to the top level
driver.switch_to.default_content
  
// Return to the top level
await driver.switchTo().defaultContent();
  
// Return to the top level
driver.switchTo().defaultContent()
  

2.3.5 - Working with windows and tabs

Windows and tabs

Get window handle

WebDriver does not make the distinction between windows and tabs. If your site opens a new tab or window, Selenium will let you work with it using a window handle. Each window has a unique identifier which remains persistent in a single session. You can get the window handle of the current window by using:

driver.getWindowHandle();
driver.current_window_handle
driver.CurrentWindowHandle;
driver.window_handle
await driver.getWindowHandle();
driver.windowHandle

Switching windows or tabs

Clicking a link which opens in a new window will focus the new window or tab on screen, but WebDriver will not know which window the Operating System considers active. To work with the new window you will need to switch to it. If you have only two tabs or windows open, and you know which window you start with, by the process of elimination you can loop over both windows or tabs that WebDriver can see, and switch to the one which is not the original.

However, Selenium 4 provides a new api NewWindow which creates a new tab (or) new window and automatically switches to it.

//Store the ID of the original window
String originalWindow = driver.getWindowHandle();

//Check we don't have other windows open already
assert driver.getWindowHandles().size() == 1;

//Click the link which opens in a new window
driver.findElement(By.linkText("new window")).click();

//Wait for the new window or tab
wait.until(numberOfWindowsToBe(2));

//Loop through until we find a new window handle
for (String windowHandle : driver.getWindowHandles()) {
    if(!originalWindow.contentEquals(windowHandle)) {
        driver.switchTo().window(windowHandle);
        break;
    }
}

//Wait for the new tab to finish loading content
wait.until(titleIs("Selenium documentation"));
  
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

with webdriver.Firefox() as driver:
    # Open URL
    driver.get("https://seleniumhq.github.io")

    # Setup wait for later
    wait = WebDriverWait(driver, 10)

    # Store the ID of the original window
    original_window = driver.current_window_handle

    # Check we don't have other windows open already
    assert len(driver.window_handles) == 1

    # Click the link which opens in a new window
    driver.find_element(By.LINK_TEXT, "new window").click()

    # Wait for the new window or tab
    wait.until(EC.number_of_windows_to_be(2))

    # Loop through until we find a new window handle
    for window_handle in driver.window_handles:
        if window_handle != original_window:
            driver.switch_to.window(window_handle)
            break

    # Wait for the new tab to finish loading content
    wait.until(EC.title_is("SeleniumHQ Browser Automation"))
  
//Store the ID of the original window
string originalWindow = driver.CurrentWindowHandle;

//Check we don't have other windows open already
Assert.AreEqual(driver.WindowHandles.Count, 1);

//Click the link which opens in a new window
driver.FindElement(By.LinkText("new window")).Click();

//Wait for the new window or tab
wait.Until(wd => wd.WindowHandles.Count == 2);

//Loop through until we find a new window handle
foreach(string window in driver.WindowHandles)
{
    if(originalWindow != window)
    {
        driver.SwitchTo().Window(window);
        break;
    }
}
//Wait for the new tab to finish loading content
wait.Until(wd => wd.Title == "Selenium documentation");
  
    # Store the ID of the original window
original_window = driver.window_handle

    # Check we don't have other windows open already
assert(driver.window_handles.length == 1, 'Expected one window')

    # Click the link which opens in a new window
driver.find_element(link: 'new window').click

    # Wait for the new window or tab
wait.until { driver.window_handles.length == 2 }

    #Loop through until we find a new window handle
driver.window_handles.each do |handle|
    if handle != original_window
        driver.switch_to.window handle
        break
    end
end

    #Wait for the new tab to finish loading content
wait.until { driver.title == 'Selenium documentation'}
  
//Store the ID of the original window
const originalWindow = await driver.getWindowHandle();

//Check we don't have other windows open already
assert((await driver.getAllWindowHandles()).length === 1);

//Click the link which opens in a new window
await driver.findElement(By.linkText('new window')).click();

//Wait for the new window or tab
await driver.wait(
    async () => (await driver.getAllWindowHandles()).length === 2,
    10000
  );

//Loop through until we find a new window handle
const windows = await driver.getAllWindowHandles();
windows.forEach(async handle => {
  if (handle !== originalWindow) {
    await driver.switchTo().window(handle);
  }
});

//Wait for the new tab to finish loading content
await driver.wait(until.titleIs('Selenium documentation'), 10000);
  
//Store the ID of the original window
val originalWindow = driver.getWindowHandle()

//Check we don't have other windows open already
assert(driver.getWindowHandles().size() === 1)

//Click the link which opens in a new window
driver.findElement(By.linkText("new window")).click()

//Wait for the new window or tab
wait.until(numberOfWindowsToBe(2))

//Loop through until we find a new window handle
for (windowHandle in driver.getWindowHandles()) {
    if (!originalWindow.contentEquals(windowHandle)) {
        driver.switchTo().window(windowHandle)
        break
    }
}

//Wait for the new tab to finish loading content
wait.until(titleIs("Selenium documentation"))

  

Create new window (or) new tab and switch

Creates a new window (or) tab and will focus the new window or tab on screen. You don’t need to switch to work with the new window (or) tab. If you have more than two windows (or) tabs opened other than the new window, you can loop over both windows or tabs that WebDriver can see, and switch to the one which is not the original.

Note: This feature works with Selenium 4 and later versions.

// Opens a new tab and switches to new tab
driver.switchTo().newWindow(WindowType.TAB);

// Opens a new window and switches to new window
driver.switchTo().newWindow(WindowType.WINDOW);
  
    # Opens a new tab and switches to new tab
driver.switch_to.new_window('tab')

    # Opens a new window and switches to new window
driver.switch_to.new_window('window')
  
// Opens a new tab and switches to new tab
driver.SwitchTo().NewWindow(WindowType.Tab)

// Opens a new window and switches to new window
driver.SwitchTo().NewWindow(WindowType.Window)
  
    # Note: The new_window in ruby only opens a new tab (or) Window and will not switch automatically
    # The user has to switch to new tab (or) new window

    # Opens a new tab and switches to new tab
driver.manage.new_window(:tab)

    # Opens a new window and switches to new window
driver.manage.new_window(:window)
  
// Opens a new tab and switches to new tab
await driver.switchTo().newWindow('tab');

// Opens a new window and switches to new window
await driver.switchTo().newWindow('window');

  
// Opens a new tab and switches to new tab
driver.switchTo().newWindow(WindowType.TAB)

// Opens a new window and switches to new window
driver.switchTo().newWindow(WindowType.WINDOW)
  

Closing a window or tab

When you are finished with a window or tab and it is not the last window or tab open in your browser, you should close it and switch back to the window you were using previously. Assuming you followed the code sample in the previous section you will have the previous window handle stored in a variable. Put this together and you will get:

//Close the tab or window
driver.close();

//Switch back to the old tab or window
driver.switchTo().window(originalWindow);
  
    #Close the tab or window
driver.close()

    #Switch back to the old tab or window
driver.switch_to.window(original_window)
  
//Close the tab or window
driver.Close();

//Switch back to the old tab or window
driver.SwitchTo().Window(originalWindow);
  
    #Close the tab or window
driver.close

    #Switch back to the old tab or window
driver.switch_to.window original_window
  
//Close the tab or window
await driver.close();

//Switch back to the old tab or window
await driver.switchTo().window(originalWindow);
  
//Close the tab or window
driver.close()

//Switch back to the old tab or window
driver.switchTo().window(originalWindow)

  

Forgetting to switch back to another window handle after closing a window will leave WebDriver executing on the now closed page, and will trigger a No Such Window Exception. You must switch back to a valid window handle in order to continue execution.

Quitting the browser at the end of a session

When you are finished with the browser session you should call quit, instead of close:

driver.quit();
driver.quit()
driver.Quit();
driver.quit
await driver.quit();
driver.quit()
  • Quit will:
    • Close all the windows and tabs associated with that WebDriver session
    • Close the browser process
    • Close the background driver process
    • Notify Selenium Grid that the browser is no longer in use so it can be used by another session (if you are using Selenium Grid)

Failure to call quit will leave extra background processes and ports running on your machine which could cause you problems later.

Some test frameworks offer methods and annotations which you can hook into to tear down at the end of a test.

/**
 * Example using JUnit
 * https://junit.org/junit5/docs/current/api/org/junit/jupiter/api/AfterAll.html
 */
@AfterAll
public static void tearDown() {
    driver.quit();
}
  
    # unittest teardown
    # https://docs.python.org/3/library/unittest.html?highlight=teardown#unittest.TestCase.tearDown
def tearDown(self):
    self.driver.quit()
  
/*
    Example using Visual Studio's UnitTesting
    https://msdn.microsoft.com/en-us/library/microsoft.visualstudio.testtools.unittesting.aspx
*/
[TestCleanup]
public void TearDown()
{
    driver.Quit();
}
  
    # UnitTest Teardown
    # https://www.rubydoc.info/github/test-unit/test-unit/Test/Unit/TestCase
def teardown
    @driver.quit
end
  
/**
 * Example using Mocha
 * https://mochajs.org/#hooks
 */
after('Tear down', async function () {
  await driver.quit();
});
  
/**
 * Example using JUnit
 * https://junit.org/junit5/docs/current/api/org/junit/jupiter/api/AfterAll.html
 */
@AfterAll
fun tearDown() {
    driver.quit()
}
  

If not running WebDriver in a test context, you may consider using try / finally which is offered by most languages so that an exception will still clean up the WebDriver session.

try {
    //WebDriver code here...
} finally {
    driver.quit();
}
  
try:
    #WebDriver code here...
finally:
    driver.quit()
  
try {
    //WebDriver code here...
} finally {
    driver.Quit();
}
  
begin
    #WebDriver code here...
ensure
    driver.quit
end
  
try {
    //WebDriver code here...
} finally {
    await driver.quit();
}
  
try {
    //WebDriver code here...
} finally {
    driver.quit()
}
  

Python’s WebDriver now supports the python context manager, which when using the with keyword can automatically quit the driver at the end of execution.

with webdriver.Firefox() as driver:
  # WebDriver code here...

# WebDriver will automatically quit after indentation

Window management

Screen resolution can impact how your web application renders, so WebDriver provides mechanisms for moving and resizing the browser window.

Get window size

Fetches the size of the browser window in pixels.

//Access each dimension individually
int width = driver.manage().window().getSize().getWidth();
int height = driver.manage().window().getSize().getHeight();

//Or store the dimensions and query them later
Dimension size = driver.manage().window().getSize();
int width1 =