WebDriver drives a browser natively, as a user would, either locally
or on a remote machine using the Selenium server.
It marks a leap forward in terms of browser automation.
Selenium WebDriver refers to both the language bindings
and the implementations of the individual browser controlling code.
This is commonly referred to as just WebDriver.
WebDriver is designed as a simple
and more concise programming interface.
WebDriver is a compact object-oriented API.
It drives the browser effectively.
1 - Getting started
If you are new to Selenium, we have a few resources that can help you get up to speed right away.
Selenium supports automation of all the major browsers in the market
through the use of WebDriver.
WebDriver is an API and protocol that defines a language-neutral interface
for controlling the behaviour of web browsers.
Each browser is backed by a specific WebDriver implementation, called a driver.
The driver is the component responsible for delegating down to the browser,
and handles communication to and from Selenium and the browser.
This separation is part of a conscious effort to have browser vendors
take responsibility for the implementation for their browsers.
Selenium makes use of these third party drivers where possible,
but also provides its own drivers maintained by the project
for the cases when this is not a reality.
The Selenium framework ties all of these pieces together
through a user-facing interface that enables the different browser backends
to be used transparently,
enabling cross-browser and cross-platform automation.
Selenium setup is quite different from the setup of other commercial tools.
Before you can start writing Selenium code, you have to
install the language bindings libraries for your language of choice, the browser you
want to use, and the driver for that browser.
Follow the links below to get up and going with Selenium WebDriver.
If you wish to start with a low-code, record-and-playback tool, check out
Selenium IDE.
Once you get things working, if you want to scale up your tests, check out the
Selenium Grid.
1.1 - Install a Selenium library
Setting up the Selenium library for your favourite programming language.
First you need to install the Selenium bindings for your automation project.
The installation process for libraries depends on the language you choose to use.
Check the Selenium downloads page to make sure
you are using the latest version.
Further items of note for using Visual Studio Code (vscode) and C#
Install the compatible .NET SDK as per the section above.
Also install the vscode extensions (Ctrl-Shift-X) for C# and NuGet.
Follow the instructions here
to create and run the “Hello World” console project using C#.
You may also create a NUnit starter project using the command line dotnet new NUnit.
Make sure the file %appdata%\NuGet\nuget.config is configured properly, as some developers have reported that it can be empty.
If nuget.config is empty, or not configured properly, then .NET builds will fail for Selenium projects.
Add the following section to the file nuget.config if it is empty:
For more information about nuget.config, click here.
You may have to customize nuget.config to meet your needs.
Now go back to vscode, press Ctrl-Shift-P, type “NuGet Add Package”, and enter the required Selenium packages, such as Selenium.WebDriver.
Press Enter and select the version.
Now you can use the examples in the documentation related to C# with vscode.
You can see the minimum required version of Ruby for any given Selenium version
on rubygems.org
Setting up your system to allow a browser to be automated.
Through WebDriver, Selenium supports all major browsers on the market
such as Chrome/Chromium, Firefox, Internet Explorer, Edge, and Safari.
Where possible, WebDriver drives the browser
using the browser’s built-in support for automation.
Since all the driver implementations except for Internet Explorer are provided by the
browser vendors themselves, they are not included in the standard Selenium distribution.
This section explains the basic requirements for getting started with the different browsers.
Read about more advanced options for starting a driver
in our driver configuration documentation.
Selenium Manager helps you to get a working environment to run Selenium out of the box. Beta 1
of Selenium Manager will configure the drivers for Chrome, Firefox, and Edge if they are not
found on the PATH. No extra configuration is needed. Future releases of Selenium Manager
will eventually even download browsers if necessary.
Most machines automatically update the browser, but the driver does not. To make sure you get
the correct driver for your browser, there are many third party libraries to assist you.
This option first requires manually downloading the driver (See Quick Reference Section for links).
This is a flexible option to change location of drivers without having to update your code, and will work
on multiple machines without requiring that each machine put the drivers in the same place.
You can either place the drivers in a directory that is already listed in PATH, or you can place them in a directory
and add it to PATH.
You can test if it has been added correctly by starting the driver:
chromedriver
To see what directories are already on PATH, open a Command Prompt and execute:
echo %PATH%
If the location to your driver is not already in a directory listed,
you can add a new directory to PATH:
setx PATH "%PATH%;C:\WebDriver\bin"
You can test if it has been added correctly by starting the driver:
chromedriver.exe
If your PATH is configured correctly above,
you will see some output relating to the startup of the driver:
Starting ChromeDriver 95.0.4638.54 (d31a821ec901f68d0d34ccdbaea45b4c86ce543e-refs/branch-heads/4638@{#871}) on port 9515
Only local connections are allowed.
Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
ChromeDriver was started successfully.
You can regain control of your command prompt by pressing Ctrl+C
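Before starting a driver by hand, you can also check from a script whether it is discoverable. The following is a minimal sketch that uses only the Python standard library; the helper name `find_driver` and the optional `extra_dir` argument are illustrative assumptions, not part of Selenium.

```python
import os
import shutil


def find_driver(name, extra_dir=None):
    """Return the full path of a driver executable found on PATH, or None."""
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        # Hypothetical extra directory, searched after the existing PATH entries.
        search_path = search_path + os.pathsep + extra_dir
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    # Report which of the common drivers can be located on this machine.
    for driver_name in ("chromedriver", "geckodriver", "msedgedriver"):
        location = find_driver(driver_name)
        print(driver_name, "->", location or "not found on PATH")
```

If a driver is reported as not found, add its directory to PATH as described above and run the check again in a new terminal session.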
4. Hard Coded Location
Similar to Option 3 above, you need to manually download the driver (See Quick Reference Section for links).
Specifying the location in the code itself has the advantage of not needing to figure out Environment Variables on
your system, but has the drawback of making the code much less flexible.
Everything Selenium does is send the browser commands to do something or send requests for information.
Most of what you’ll do with Selenium is a combination of these basic commands:
1. Start the session
For more details on starting a session read our documentation on driver sessions
Synchronizing the code with the current state of the browser is one of the biggest challenges
with Selenium, and doing it well is an advanced topic.
Essentially you want to make sure that the element is on the page before you attempt to locate it
and the element is in an interactable state before you attempt to interact with it.
An implicit wait is rarely the best solution, but it’s the easiest to demonstrate here, so
we’ll use it as a placeholder.
If you are using Selenium for testing,
you will want to execute your Selenium code using test runner tools.
Many of the code examples in this documentation can be found in our example repositories.
There are multiple options in each language, but here is what we are using in our examples:
Install the Mocha test runner using the command below in your terminal:
npm install mocha
Then run your tests using the command below:
mocha firstScript.spec.js
Next Steps
Take what you’ve learned and build out your Selenium code.
As you find more functionality that you need, read up on the rest of our
WebDriver documentation.
1.4 - Upgrade to Selenium 4
Are you still using Selenium 3? This guide will help you upgrade to the latest release!
Upgrading to Selenium 4 should be a painless process if you are using one of the officially
supported languages (Ruby, JavaScript, C#, Python, and Java). There might be some cases where
a few issues can happen, and this guide will help you to sort them out. We will go through
the steps to upgrade your project dependencies and understand the major deprecations and
changes the version upgrade brings.
These are the steps we will follow to upgrade to Selenium 4:
Preparing our test code
Upgrading dependencies
Potential errors and deprecation messages
Note: while Selenium 3.x versions were being developed, support for the W3C WebDriver standard
was implemented. Both this new protocol and the legacy JSON Wire Protocol were supported. Around
version 3.11, Selenium code became compliant with the W3C Level 1 specification. The W3C compliant
code in the latest version of Selenium 3 will work as expected in Selenium 4.
Preparing our test code
Selenium 4 removes support for the legacy protocol and uses the W3C WebDriver standard by
default under the hood. For most things, this implementation will not affect end users.
The major exceptions are Capabilities and the Actions class.
Capabilities
Test capabilities that are not structured to be W3C compliant may prevent a session from
starting. Here is the list of W3C WebDriver standard capabilities:
browserName
browserVersion (replaces version)
platformName (replaces platform)
acceptInsecureCerts
pageLoadStrategy
proxy
timeouts
unhandledPromptBehavior
An up-to-date list of standard capabilities can be found at
W3C WebDriver.
Any capability that is not contained in the list above needs to include a vendor prefix.
This applies to browser specific capabilities as well as cloud vendor specific capabilities.
For example, if your cloud vendor uses build and name capabilities for your tests, you need
to wrap them in a cloud:options block (check with your cloud vendor for the appropriate prefix).
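The restructuring described above can be sketched with plain dictionaries. This is an illustration only, not part of the Selenium API: the function name `to_w3c_capabilities` is hypothetical, and `cloud:options` is the placeholder prefix from the text, so check with your cloud vendor for the real one.

```python
# Standard capability names, taken from the list above.
W3C_STANDARD_CAPABILITIES = {
    "browserName",
    "browserVersion",
    "platformName",
    "acceptInsecureCerts",
    "pageLoadStrategy",
    "proxy",
    "timeouts",
    "unhandledPromptBehavior",
}

# Legacy names and their W3C replacements, as noted in the list above.
LEGACY_RENAMES = {"version": "browserVersion", "platform": "platformName"}


def to_w3c_capabilities(capabilities, vendor_prefix="cloud:options"):
    """Return a new dict with non-standard keys wrapped in a vendor block."""
    converted = {}
    vendor_block = {}
    for key, value in capabilities.items():
        key = LEGACY_RENAMES.get(key, key)
        if key in W3C_STANDARD_CAPABILITIES or ":" in key:
            # Standard or already vendor-prefixed: keep at the top level.
            converted[key] = value
        else:
            vendor_block[key] = value
    if vendor_block:
        # Merge into any vendor block the caller already supplied.
        existing = dict(converted.get(vendor_prefix, {}))
        existing.update(vendor_block)
        converted[vendor_prefix] = existing
    return converted


legacy = {
    "browserName": "firefox",
    "version": "75",
    "platform": "Windows 10",
    "build": "My Build",
    "name": "My Test",
}
print(to_w3c_capabilities(legacy))
```

The `build` and `name` keys end up inside the vendor block, while `version` and `platform` are renamed to their W3C equivalents.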
The utility methods to find elements in the Java bindings (FindsBy interfaces) have been removed
as they were meant for internal use only. The following code samples explain this better.
Check the subsections below to install Selenium 4 and have your project dependencies upgraded.
Java
The process of upgrading Selenium depends on which build tool is being used. We will cover the
most common ones for Java, which are Maven and
Gradle. The minimum Java version required is still 8.
Maven
Before
<dependencies>
  <!-- more dependencies ... -->
  <dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>3.141.59</version>
  </dependency>
  <!-- more dependencies ... -->
</dependencies>
After
<dependencies>
  <!-- more dependencies ... -->
  <dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>4.4.0</version>
  </dependency>
  <!-- more dependencies ... -->
</dependencies>
After making the change, you can run mvn clean compile in the directory that contains the
pom.xml file.
Gradle
Before
plugins {
    id 'java'
}
group 'org.example'
version '1.0-SNAPSHOT'
repositories {
    mavenCentral()
}
dependencies {
    testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
    testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
    implementation group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '3.141.59'
}
test {
    useJUnitPlatform()
}
After
plugins {
    id 'java'
}
group 'org.example'
version '1.0-SNAPSHOT'
repositories {
    mavenCentral()
}
dependencies {
    testImplementation 'org.junit.jupiter:junit-jupiter-api:5.7.0'
    testRuntimeOnly 'org.junit.jupiter:junit-jupiter-engine:5.7.0'
    implementation group: 'org.seleniumhq.selenium', name: 'selenium-java', version: '4.4.0'
}
test {
    useJUnitPlatform()
}
After making the change, you can run ./gradlew clean build
in the directory that contains the build.gradle file.
To check all the Java releases, you can head to MVNRepository.
C#
The place to get updates for Selenium 4 in C# is NuGet. Under the
Selenium.WebDriver package you
can get the instructions to update to the latest version. Inside of Visual Studio, through the
NuGet Package Manager you can execute:
The most important change for Python users is the minimum required version. Selenium 4
requires Python 3.7 or higher. More details can be found at the
Python Package Index. To upgrade from the
command line, you can execute:
pip install selenium==4.4.3
Ruby
The update details for Selenium 4 can be seen at the
selenium-webdriver
gem in RubyGems. To install the latest version, you can execute:
gem install selenium-webdriver
To add it to your Gemfile:
gem 'selenium-webdriver', '~> 4.4.0'
JavaScript
The selenium-webdriver package can be found at the Node package manager,
npmjs. Selenium 4 can be found
here. To install it, you
could either execute:
Waits also expect different parameters now. WebDriverWait now expects a Duration
instead of a long for the timeout in seconds and milliseconds. The withTimeout and pollingEvery
utility methods from FluentWait have switched from expecting (long time, TimeUnit unit) to
expecting (Duration duration).
Merging capabilities no longer changes the calling object
It was possible to merge a different set of capabilities into another set, which
mutated the calling object. Now, the result of the merge call needs to be assigned to an object.
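The difference between mutating the calling object and returning a new one can be shown with plain dictionaries. This is a language-neutral sketch, not the Selenium merge API; the `merge` function here is a hypothetical stand-in that only mirrors the "assign the result" behaviour described above.

```python
def merge(base, other):
    """Return a new dict with both sets of entries; neither argument is changed."""
    merged = dict(base)
    merged.update(other)
    return merged


base = {"browserName": "firefox"}
extra = {"platformName": "linux"}

# Discarding the return value has no effect: base is left untouched.
merge(base, extra)

# The merged capabilities only exist in the assigned result.
combined = merge(base, extra)
print(base)
print(combined)
```

Code that relied on the old in-place behaviour needs to be updated to keep and use the returned object.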
Firefox Legacy
Before GeckoDriver was around, the Selenium project had a driver implementation to automate
Firefox (version <48). However, this implementation is not needed anymore as it does not work
in recent versions of Firefox. To avoid major issues when upgrading to Selenium 4, the setLegacy
option will be shown as deprecated. The recommendation is to stop using the old implementation
and rely only on GeckoDriver. The following code will show the setLegacy line deprecated after
upgrading.
executable_path has been deprecated, please pass in a Service object
In Selenium 4, you’ll need to set the driver’s executable_path from a Service object to prevent deprecation warnings. (Or don’t set the path and instead make sure that the driver you need is on the System PATH.)
We went through the major changes to take into consideration when upgrading to Selenium 4.
We covered the different aspects to address when preparing test code for the upgrade, including
suggestions on how to prevent potential issues that can show up when using the new version of
Selenium. Finally, we covered a set of possible issues that you can run into after
upgrading, and we shared potential fixes for those issues.
The primary unique argument for starting a remote driver includes information about where to execute the code.
Read the details in the Remote Driver Section
In Selenium 3, capabilities were defined in a session by using Desired Capabilities classes.
As of Selenium 4, you must use the browser options classes.
For remote driver sessions, a browser options instance is required as it determines which browser will be used.
These options are described in the W3C specification for Capabilities.
Each browser has custom options that may be defined in addition to the ones defined in the specification.
browserName
This capability is used to set the browserName for a given session.
If the specified browser is not installed at the
remote end, the session creation will fail.
browserVersion
This capability is optional. It is used to
set the browser version expected at the remote end.
For example, if you ask for Chrome version 75 on a system that
only has 80 installed, the session creation will fail.
pageLoadStrategy
Three types of page load strategies are available.
The page load strategy queries the
document.readyState
as described in the table below:
Strategy | Ready State | Notes
normal | complete | Used by default, waits for all resources to download
eager | interactive | DOM access is ready, but other resources like images may still be loading
none | Any | Does not block WebDriver at all
The document.readyState property of a document describes the loading state of the current document.
When navigating to a new page via URL, by default, WebDriver will hold off on completing a navigation
method (e.g., driver.get()) until the document ready state is complete. This does not
necessarily mean that the page has finished loading, especially for sites like Single Page Applications
that use JavaScript to dynamically load content after the Ready State returns complete. Note also
that this behavior does not apply to navigation that is a result of clicking an element or submitting a form.
If a page takes a long time to load as a result of downloading assets (e.g., images, css, js)
that aren’t important to the automation, you can change from the default parameter of normal to
eager or none to speed up the session. This value applies to the entire session, so make sure
that your waiting strategy is sufficient to minimize
flakiness.
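The relationship between the strategy and document.readyState can be modelled in a few lines. This is a sketch of the rule in the table above, not Selenium code; the function name `navigation_unblocks` is hypothetical, and the only WebDriver-facing part is the `pageLoadStrategy` capability key.

```python
# document.readyState values in the order a page moves through them.
READY_STATE_ORDER = ["loading", "interactive", "complete"]

# Ready state each strategy waits for; None means "do not wait at all".
REQUIRED_READY_STATE = {
    "normal": "complete",
    "eager": "interactive",
    "none": None,
}


def navigation_unblocks(strategy, ready_state):
    """Return True if a navigation call would return at this ready state."""
    required = REQUIRED_READY_STATE[strategy]
    if required is None:
        return True
    return READY_STATE_ORDER.index(ready_state) >= READY_STATE_ORDER.index(required)


# The strategy is requested through the standard capability of the same name.
capabilities = {"browserName": "chrome", "pageLoadStrategy": "eager"}

for state in READY_STATE_ORDER:
    print(state, navigation_unblocks(capabilities["pageLoadStrategy"], state))
```

With eager, navigation returns once the document is interactive, so any element that depends on images or late scripts still needs its own wait.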
normal (default)
WebDriver waits until the load
event fires.
import org.openqa.selenium.PageLoadStrategy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.chrome.ChromeDriver;

public class pageLoadStrategy {
  public static void main(String[] args) {
    ChromeOptions chromeOptions = new ChromeOptions();
    chromeOptions.setPageLoadStrategy(PageLoadStrategy.NORMAL);
    WebDriver driver = new ChromeDriver(chromeOptions);
    try {
      // Navigate to Url
      driver.get("https://google.com");
    } finally {
      driver.quit();
    }
  }
}
it('Navigate using normal page loading strategy', async function () {
  let driver = await env
    .builder()
    .setChromeOptions(options.setPageLoadStrategy('normal'))
    .build();
  await driver.get('https://www.google.com');
});
import org.openqa.selenium.PageLoadStrategy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.chrome.ChromeDriver;

public class pageLoadStrategy {
  public static void main(String[] args) {
    ChromeOptions chromeOptions = new ChromeOptions();
    chromeOptions.setPageLoadStrategy(PageLoadStrategy.EAGER);
    WebDriver driver = new ChromeDriver(chromeOptions);
    try {
      // Navigate to Url
      driver.get("https://google.com");
    } finally {
      driver.quit();
    }
  }
}
import org.openqa.selenium.PageLoadStrategy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.chrome.ChromeDriver;

public class pageLoadStrategy {
  public static void main(String[] args) {
    ChromeOptions chromeOptions = new ChromeOptions();
    chromeOptions.setPageLoadStrategy(PageLoadStrategy.NONE);
    WebDriver driver = new ChromeDriver(chromeOptions);
    try {
      // Navigate to Url
      driver.get("https://google.com");
    } finally {
      driver.quit();
    }
  }
}
This identifies the operating system at the remote end;
fetching the platformName returns the OS name.
With cloud-based providers,
setting platformName sets the OS at the remote end.
acceptInsecureCerts
This capability checks whether an expired or
invalid TLS certificate is used while navigating
during a session.
If the capability is set to false, an
insecure certificate error
will be returned when navigation encounters any domain
certificate problems. If set to true, invalid certificates will be
trusted by the browser.
All self-signed certificates will be trusted by this capability by default.
Once set, acceptInsecureCerts capability will have an
effect for the entire session.
timeouts
A WebDriver session is given a certain session timeout
interval, during which the user can control the behaviour
of executing scripts or retrieving information from the browser.
Each session timeout is configured with
a combination of the different timeouts described below:
Script Timeout
Specifies when to interrupt an executing script in
the current browsing context. The default timeout of 30,000 ms
is imposed when a new session is created by WebDriver.
Page Load Timeout
Specifies the time interval in which a web page
needs to be loaded in the current browsing context.
The default timeout of 300,000 ms is imposed when a
new session is created by WebDriver. If the page load exceeds
the given or default time frame, the script is stopped by a
TimeoutException.
Implicit Wait Timeout
Specifies the time to wait for the
implicit element location strategy when
locating elements. The default timeout of 0 ms
is imposed when a new session is created by WebDriver.
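The three timeouts travel together in the `timeouts` capability, with values in milliseconds. The sketch below builds that capability as a plain dictionary using the W3C key names `script`, `pageLoad`, and `implicit`; the helper name `timeouts_capability` is a hypothetical illustration, not a Selenium function.

```python
# Defaults described above, in milliseconds.
DEFAULT_TIMEOUTS_MS = {"script": 30000, "pageLoad": 300000, "implicit": 0}


def timeouts_capability(**overrides):
    """Return a {"timeouts": {...}} capability with any overrides applied."""
    unknown = set(overrides) - set(DEFAULT_TIMEOUTS_MS)
    if unknown:
        raise ValueError("unknown timeout name(s): " + ", ".join(sorted(unknown)))
    timeouts = dict(DEFAULT_TIMEOUTS_MS)
    timeouts.update(overrides)
    return {"timeouts": timeouts}


# Example: allow pages one minute to load and keep the other defaults.
print(timeouts_capability(pageLoad=60000))
```

Rejecting unknown names early is a design choice of this sketch; it catches typos such as `pageload` before a session is requested.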
unhandledPromptBehavior
Specifies the state of the current session’s user prompt handler.
Defaults to the dismiss and notify state.
User Prompt Handler
This defines what action must be taken when a
user prompt is encountered at the remote end. This is defined by the
unhandledPromptBehavior capability and has the following states:
This new capability indicates if strict interactability checks
should be applied to input type=file elements. As strict interactability
checks are off by default, there is a change in behaviour
when using Element Send Keys with hidden file upload controls.
proxy
A proxy server acts as an intermediary for
requests between a client and a server. Put simply,
the traffic flows through the proxy server
on its way to the address you requested and back.
A proxy server for automation scripts
with Selenium can be helpful for:
Capturing network traffic
Mocking backend calls made by the website
Accessing the required website under complex network
topologies or strict corporate restrictions/policies.
If you are in a corporate environment, and a
browser fails to connect to a URL, this is most
likely because the environment needs a proxy to be accessed.
Selenium WebDriver provides a way to configure proxy settings:
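At the protocol level, proxy settings are carried in the standard `proxy` capability. The sketch below builds a manual proxy configuration as a plain dictionary using the W3C field names `proxyType`, `httpProxy`, `sslProxy`, and `noProxy`. The helper name and the address `proxy.example.com:8080` are placeholders assumed for illustration.

```python
def manual_proxy_capability(host_and_port, no_proxy=()):
    """Return a {"proxy": {...}} capability for a manually configured proxy."""
    proxy = {
        "proxyType": "manual",
        "httpProxy": host_and_port,  # used for HTTP traffic
        "sslProxy": host_and_port,   # used for HTTPS traffic
    }
    if no_proxy:
        # Hosts that should bypass the proxy.
        proxy["noProxy"] = list(no_proxy)
    return {"proxy": proxy}


capabilities = {"browserName": "firefox"}
capabilities.update(
    manual_proxy_capability("proxy.example.com:8080", no_proxy=["localhost"])
)
print(capabilities)
```

The language bindings wrap this structure in their own proxy and options classes, so in practice you would set it through those rather than by hand.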
You can use WebDriver remotely the same way you would use it
locally. The primary difference is that a remote WebDriver needs to be
configured so that it can run your tests on a separate machine.
A remote WebDriver is composed of two pieces: a client and a
server. The client is your WebDriver test and the server is simply a
Java servlet, which can be hosted in any modern JEE app server.
To run a remote WebDriver client, we first need to connect to the RemoteWebDriver.
We do this by pointing the URL to the address of the server running our tests.
In order to customize our configuration, we set desired capabilities.
Below is an example of instantiating a remote WebDriver object
pointing to our remote web server, www.example.com,
running our tests on Firefox.
The Local File Detector allows the transfer of files from the client
machine to the remote server. For example, if a test needs to upload a
file to a web application, a remote WebDriver can automatically transfer
the file from the local machine to the remote web server during
runtime. This allows the file to be uploaded from the remote machine
running the test. It is not enabled by default and can be enabled in
the following way:
This feature is only available for Java client binding (Beta onwards). The Remote WebDriver client sends requests to the Selenium Grid server, which passes them to the WebDriver. Tracing should be enabled at the server and client-side to trace the HTTP requests end-to-end. Both ends should have a trace exporter setup pointing to the visualization framework.
By default, tracing is enabled for both client and server.
To set up the visualization framework Jaeger UI and Selenium Grid 4, please refer to Tracing Setup for the desired version.
For client-side setup, follow the steps below.
Add the required dependencies
Installation of external libraries for tracing exporter can be done using Maven.
Add the opentelemetry-exporter-jaeger and grpc-netty dependency in your project pom.xml:
Each browser has custom capabilities and unique features.
3.1 - Chrome specific functionality
These are capabilities and features specific to Google Chrome browsers.
By default, Selenium 4 is compatible with Chrome v75 and greater. Note that the version of
the Chrome browser and the version of chromedriver must match the major version.
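The major-version rule can be checked with a small comparison. This is a minimal sketch with hypothetical helper names; apart from the driver version shown earlier in this document, the version strings are made up for illustration.

```python
def major_version(version):
    """Return the leading number of a dotted version string as an int."""
    return int(version.strip().split(".")[0])


def versions_compatible(browser_version, driver_version):
    """Return True when browser and driver share the same major version."""
    return major_version(browser_version) == major_version(driver_version)


# Same major version (95): compatible.
print(versions_compatible("95.0.4638.69", "95.0.4638.54"))
# Browser has auto-updated to 96 while the driver is still 95: not compatible.
print(versions_compatible("96.0.4664.45", "95.0.4638.54"))
```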
Options
Capabilities common to all browsers are described on the Options page.
Here are a few common use cases with different capabilities:
Arguments
The args parameter is for a list of Command Line Switches
used when starting the browser.
Commonly used args include --start-maximized and --headless=new
The binary parameter takes the path of an alternate location of browser to use. With this parameter you can
use chromedriver to drive various Chromium based browsers.
Chrome adds various arguments. If you do not want those arguments added, pass them into excludeSwitches.
A common example is to turn the popup blocker back on.
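The args, binary, and excludeSwitches parameters described above are sent to chromedriver inside the `goog:chromeOptions` capability. The sketch below assembles that structure as a plain dictionary; the helper name `chrome_options_capability` is hypothetical, and `disable-popup-blocking` is used on the assumption that it is the switch behind the popup blocker example. Switch names in excludeSwitches are written without the leading dashes.

```python
def chrome_options_capability(args=(), binary=None, exclude_switches=()):
    """Return capabilities containing a goog:chromeOptions block."""
    options = {}
    if args:
        options["args"] = list(args)
    if binary:
        # Path to an alternate Chromium-based browser executable.
        options["binary"] = binary
    if exclude_switches:
        options["excludeSwitches"] = list(exclude_switches)
    return {"browserName": "chrome", "goog:chromeOptions": options}


capabilities = chrome_options_capability(
    args=["--start-maximized", "--headless=new"],
    exclude_switches=["disable-popup-blocking"],
)
print(capabilities)
```

In the language bindings, the browser options class produces this block for you; the dictionary form is mainly useful for understanding what is sent over the wire.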
See the Chrome DevTools section for more information about using Chrome DevTools
3.2 - Edge specific functionality
These are capabilities and features specific to Microsoft Edge browsers.
Microsoft Edge is implemented with Chromium, with the earliest supported version of v79. Similar to Chrome,
the major version number of edgedriver must match the major version of the Edge browser.
All capabilities and options found on the Chrome page work for Edge as well.
Options
Starting an Edge session with basic defined options looks like this:
The args parameter is for a list of Command Line Switches
used when starting the browser.
Commonly used args include --start-maximized and --headless=new
Microsoft Edge can be driven in “Internet Explorer Compatibility Mode”, which uses
the Internet Explorer Driver classes in conjunction with Microsoft Edge.
Read the Internet Explorer page for more details.
3.3 - Firefox specific functionality
These are capabilities and features specific to Mozilla Firefox browsers.
Selenium 4 requires Firefox 78 or greater. It is recommended to always use the latest version of geckodriver.
Options
Capabilities common to all browsers are described on the Options page.
Capabilities unique to Firefox can be found at Mozilla’s page for firefoxOptions
Starting a Firefox session with basic defined options looks like this:
The args parameter is for a list of Command line switches used when starting the browser. Commonly used args include -headless and "-profile", "/path/to/profile"
The binary parameter takes the path of an alternate location of browser to use. For example, with this parameter you can
use geckodriver to drive Firefox Nightly instead of the production version when both are present on your computer.
Add a browser location to options:
const { Builder } = require("selenium-webdriver");
const firefox = require('selenium-webdriver/firefox');

const options = new firefox.Options();
let profile = '/path to custom profile';
options.setProfile(profile);
const driver = new Builder()
  .forBrowser('firefox')
  .setFirefoxOptions(options)
  .build();
When working with an unfinished or unpublished extension, it will likely not be signed. As such, it can only
be installed as “temporary.” This can be done by passing in either a zip file or a directory, here’s an
example with a directory:
These are capabilities and features specific to Microsoft Internet Explorer browsers.
As of June 2022, Selenium officially no longer supports standalone Internet Explorer.
The Internet Explorer driver still supports running Microsoft Edge in “IE Compatibility Mode.”
Special considerations
The IE Driver is the only driver maintained by the Selenium Project directly.
While binaries for both the 32-bit and 64-bit
versions of Internet Explorer are available, there are some
known limitations
with the 64-bit driver. As such it is recommended to use the 32-bit driver.
Additional information about using Internet Explorer can be found on the
IE Driver Server page
Options
Starting a Microsoft Edge browser in Internet Explorer Compatibility mode with basic defined options looks like this:
If IE is not present on the system (default in Windows 11), you do not need to
use the two parameters above. IE Driver will use Edge and will automatically locate it.
If IE and Edge are both present on the system, you only need to set attaching to Edge;
IE Driver will automatically locate Edge on your system.
As of Internet Explorer Driver v4.7.0:
You no longer need to set Ignore Zoom Level for Edge in IE Mode.
Here are a few common use cases with different capabilities:
fileUploadDialogTimeout
In some environments, Internet Explorer may timeout when opening the
File Upload dialog. IEDriver has a default timeout of 1000ms, but you
can increase the timeout using the fileUploadDialogTimeout capability.
When set to true, this capability clears the Cache,
Browser History and Cookies for all running instances
of InternetExplorer including those started manually
or by the driver. By default, it is set to false.
Using this capability will cause a performance drop while
launching the browser, as the driver will wait until the cache
gets cleared before launching the IE browser.
This capability accepts a Boolean value as parameter.
InternetExplorer driver expects the browser zoom level to be 100%,
else the driver will throw an exception. This default behaviour
can be disabled by setting the ignoreZoomSetting to true.
This capability accepts a Boolean value as parameter.
Whether to skip the Protected Mode check while launching
a new IE session.
If not set and Protected Mode settings are not same for
all zones, an exception will be thrown by the driver.
If capability is set to true, tests may
become flaky, unresponsive, or browsers may hang.
However, this is still by far a second-best choice,
and the first choice should always be to actually
set the Protected Mode settings of each zone manually.
If a user is using this property,
only a “best effort” at support will be given.
This capability accepts a Boolean value as parameter.
Internet Explorer includes several command-line options
that enable you to troubleshoot and configure the browser.
The following describes a few supported command-line options:
-private : Used to start IE in private browsing mode. This works for IE 8 and later versions.
-k : Starts Internet Explorer in kiosk mode.
The browser opens in a maximized window that does not display the address bar, the navigation buttons, or the status bar.
-extoff : Starts IE in no add-on mode.
This option is specifically used to troubleshoot problems with browser add-ons. Works in IE 7 and later versions.
Note: forceCreateProcessApi must be enabled in order for command-line arguments to work.
These are capabilities and features specific to Apple Safari browsers.
Unlike Chromium and Firefox drivers, the safaridriver is installed with the Operating System.
To enable automation on Safari, run the following command from the terminal:
safaridriver --enable
Options
Capabilities common to all browsers are described on the Options page.
WebDriver can generally be said to have a blocking API.
Because it is an out-of-process library that
instructs the browser what to do,
and because the web platform has an intrinsically asynchronous nature,
WebDriver does not track the active, real-time state of the DOM.
This comes with some challenges that we will discuss here.
From experience,
most intermittent issues that arise from use of Selenium and WebDriver
are connected to race conditions that occur between
the browser and the user’s instructions.
An example could be that the user instructs the browser to navigate to a page,
then gets a no such element error
when trying to find an element.
Consider the following document:
<!doctype html>
<meta charset=utf-8>
<title>Race Condition Example</title>
<script>
  var initialised = false;
  window.addEventListener("load", function() {
    var newElement = document.createElement("p");
    newElement.textContent = "Hello from JavaScript!";
    document.body.appendChild(newElement);
    initialised = true;
  });
</script>
The WebDriver instructions might look innocent enough:
driver.get("file:///race_condition.html");
WebElement element = driver.findElement(By.tagName("p"));
assertEquals(element.getText(), "Hello from JavaScript!");

driver.get("file:///race_condition.html")
el = driver.find_element(By.TAG_NAME, "p")
assert el.text == "Hello from JavaScript!"

driver.Navigate().GoToUrl("file:///race_condition.html");
IWebElement element = driver.FindElement(By.TagName("p"));
assertEquals(element.Text, "Hello from JavaScript!");

require 'selenium-webdriver'
driver = Selenium::WebDriver.for :firefox
begin
  # Navigate to URL
  driver.get 'file:///race_condition.html'
  # Get and store Paragraph Text
  search_form = driver.find_element(:css, 'p').text
  "Hello from JavaScript!".eql? search_form
ensure
  driver.quit
end

await driver.get('file:///race_condition.html');
const element = driver.findElement(By.css('p'));
assert.strictEqual(await element.getText(), 'Hello from JavaScript!');

driver.get("file:///race_condition.html")
val element = driver.findElement(By.tagName("p"))
assert(element.text == "Hello from JavaScript!")
The issue here is that the default
page load strategy
used in WebDriver listens for the document.readyState
to change to "complete" before returning from the call to navigate.
Because the p element is
added after the document has completed loading,
this WebDriver script might be intermittent.
It “might” be intermittent because no guarantees can be made
about elements or events that trigger asynchronously
without explicitly waiting—or blocking—on those events.
Fortunately, the normal instruction set available on
the WebElement interface, such
as WebElement.click and WebElement.sendKeys, is
guaranteed to be synchronous,
in that the function calls will not return
(or the callback will not trigger in callback-style languages)
until the command has been completed in the browser.
The advanced user interaction APIs,
Keyboard
and Mouse,
are exceptions as they are explicitly intended as
“do what I say” asynchronous commands.
Waiting means pausing the automated task
for a certain amount of time before continuing with the next step.
To overcome the problem of race conditions
between the browser and your WebDriver script,
most Selenium clients ship with a wait package.
When employing a wait,
you are using what is commonly referred to
as an explicit wait.
Explicit wait
Explicit waits are available to Selenium clients
for imperative, procedural languages.
They allow your code to halt program execution,
or freeze the thread,
until the condition you pass it resolves.
The condition is called with a certain frequency
until the timeout of the wait is elapsed.
This means that for as long as the condition returns a falsy value,
it will keep trying and waiting.
Since explicit waits allow you to wait for a condition to occur,
they make a good fit for synchronising the state between the browser and its DOM,
and your WebDriver script.
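The polling behaviour described above can be sketched in plain Python. This is a minimal model under stated assumptions, not Selenium's own code: the names wait_until and WaitTimeout are hypothetical, and the ignored parameter is an added assumption that models how a wait utility can treat selected errors as "not ready yet". In real scripts the client library's wait class plays this role.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class WaitTimeout(Exception):
    """Raised when the condition never returns a truthy value in time."""


def wait_until(
    condition: Callable[[], T],
    timeout: float = 10.0,
    poll_interval: float = 0.5,
    ignored: Tuple[Type[BaseException], ...] = (),
) -> T:
    """Call `condition` repeatedly until it returns a truthy value.

    Exceptions listed in `ignored` are treated as "not ready yet".
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            value = condition()
            if value:
                # The truthy value becomes the result of the wait.
                return value
        except ignored:
            pass  # count as a falsy result and poll again
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


attempts = {"count": 0}


def element_text() -> str:
    # Hypothetical condition: reports "not ready" for the first two polls.
    attempts["count"] += 1
    return "Hello from JavaScript!" if attempts["count"] >= 3 else ""


result = wait_until(element_text, timeout=2.0, poll_interval=0.01)
print(result, attempts["count"])  # Hello from JavaScript! 3
```

The condition is checked at least once, even with a timeout of zero, and the script blocks only for as long as the condition stays falsy.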
To remedy our buggy instruction set from earlier,
we could employ a wait to have the findElement call
wait until the dynamically added element from the script
has been added to the DOM:
WebDriver driver = new ChromeDriver();
driver.get("https://google.com/ncr");
driver.findElement(By.name("q")).sendKeys("cheese" + Keys.ENTER);
// Initialize and wait till element(link) became clickable - timeout in 10 seconds
WebElement firstResult = new WebDriverWait(driver, Duration.ofSeconds(10))
        .until(ExpectedConditions.elementToBeClickable(By.xpath("//a/h3")));
// Print the first result
System.out.println(firstResult.getText());

from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait

def document_initialised(driver):
    return driver.execute_script("return initialised")

driver.get("file:///race_condition.html")
WebDriverWait(driver, timeout=10).until(document_initialised)
el = driver.find_element(By.TAG_NAME, "p")
assert el.text == "Hello from JavaScript!"

require 'selenium-webdriver'
driver = Selenium::WebDriver.for :firefox
wait = Selenium::WebDriver::Wait.new(:timeout => 10)

def document_initialised(driver)
  driver.execute_script('return initialised')
end

begin
  driver.get 'file:///race_condition.html'
  wait.until { document_initialised driver }
  search_form = driver.find_element(:css, 'p').text
  "Hello from JavaScript!".eql? search_form
ensure
  driver.quit
end

const documentInitialised = () =>
    driver.executeScript('return initialised');

await driver.get('file:///race_condition.html');
await driver.wait(() => documentInitialised(), 10000);
const element = await driver.findElement(By.css('p'));
assert.strictEqual(await element.getText(), 'Hello from JavaScript!');

driver.get("https://google.com/ncr")
driver.findElement(By.name("q")).sendKeys("cheese" + Keys.ENTER)
// Initialize and wait till element(link) became clickable - timeout in 10 seconds
val firstResult = WebDriverWait(driver, Duration.ofSeconds(10))
        .until(ExpectedConditions.elementToBeClickable(By.xpath("//a/h3")))
// Print the first result
println(firstResult.text)
We pass in the condition as a function reference
that the wait will run repeatedly until its return value is truthy.
A “truthy” return value is anything that evaluates to boolean true
in the language at hand, such as a string, number, a boolean,
an object (including a WebElement),
or a populated (non-empty) sequence or list.
That means an empty list evaluates to false.
When the condition is truthy and the blocking wait ends,
the return value from the condition becomes the return value of the wait.
With this knowledge,
and because the wait utility ignores no such element errors by default,
we can refactor our instructions to be more concise:
WebElement foo = new WebDriverWait(driver, Duration.ofSeconds(3))
        .until(d -> d.findElement(By.tagName("p")));
assertEquals(foo.getText(), "Hello from JavaScript!");

from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait

driver.get("file:///race_condition.html")
el = WebDriverWait(driver, timeout=3).until(lambda d: d.find_element(By.TAG_NAME, "p"))
assert el.text == "Hello from JavaScript!"

using (var driver = new FirefoxDriver())
{
    driver.Navigate().GoToUrl("file:///race_condition.html");
    var foo = new WebDriverWait(driver, TimeSpan.FromSeconds(3))
        .Until(drv => drv.FindElement(By.TagName("p")));
    Debug.Assert(foo.Text.Equals("Hello from JavaScript!"));
}

driver.get 'file:///race_condition.html'
wait = Selenium::WebDriver::Wait.new(:timeout => 10)
ele = wait.until { driver.find_element(css: 'p') }
foo = ele.text
assert_match 'Hello from JavaScript', foo

let ele = await driver.wait(until.elementLocated(By.css('p')), 10000);
let foo = await ele.getText();
assert(foo == "Hello from JavaScript!");

driver.get("file:///race_condition.html")
val ele = WebDriverWait(driver, Duration.ofSeconds(10))
        .until(ExpectedConditions.presenceOfElementLocated(By.tagName("p")))
assert(ele.text == "Hello from JavaScript!")
In that example, we pass in an anonymous function
(but we could also define it explicitly as we did earlier so it may be reused).
The first and only argument that is passed to our condition
is always a reference to our driver object, WebDriver.
In a multi-threaded environment, you should be careful
to operate on the driver reference passed in to the condition
rather than the reference to the driver in the outer scope.
Because the wait will swallow no such element errors
that are raised when the element is not found,
the condition will retry until the element is found.
Then it will take the return value, a WebElement,
and pass it back through to our script.
If the condition fails,
i.e. a truthy return value from the condition is never reached before the timeout,
the wait will throw (or raise) a timeout error.
Options
The wait condition can be customised to match your needs.
Sometimes it is unnecessary to wait the full extent of the default timeout,
as the penalty for not hitting a successful condition can be expensive.
The wait lets you pass in an argument to override the timeout, as the explicit timeouts in the examples above do.
Because it is quite a common occurrence
to have to synchronise the DOM and your instructions,
most clients also come with a set of predefined expected conditions.
As might be obvious by the name,
they are conditions that are predefined for frequent wait operations.
The conditions available in the different language bindings vary,
but this is a non-exhaustive list of a few:
alert is present
element exists
element is visible
title contains
title is
element staleness
visible text
You can refer to the API documentation for each client binding
to find an exhaustive list of expected conditions.
There is a second type of wait that is distinct from
explicit wait called implicit wait.
By implicitly waiting, WebDriver polls the DOM
for a certain duration when trying to find any element.
This can be useful when certain elements on the webpage
are not available immediately and need some time to load.
Implicit waiting for elements to appear is disabled by default
and will need to be manually enabled on a per-session basis.
Mixing explicit waits and implicit waits
will cause unintended consequences, namely waits sleeping for the maximum
time even if the element is available or condition is true.
Warning:
Do not mix implicit and explicit waits.
Doing so can cause unpredictable wait times.
For example, setting an implicit wait of 10 seconds
and an explicit wait of 15 seconds
could cause a timeout to occur after 20 seconds.
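One plausible way to arrive at the 20-second figure can be worked through with a simulated clock. This is a simplified model, not Selenium's implementation: it assumes the explicit wait only checks its deadline between lookups, and that each lookup of a missing element blocks for the full implicit wait. The names FakeClock, lookup_missing_element, and explicit_wait are hypothetical.

```python
class FakeClock:
    """A manually advanced clock so the example runs instantly."""

    def __init__(self) -> None:
        self.now = 0.0

    def advance(self, seconds: float) -> None:
        self.now += seconds


def lookup_missing_element(clock: FakeClock, implicit_wait: float) -> bool:
    """Model a lookup for an element that never appears.

    The lookup blocks for the whole implicit wait before reporting failure.
    """
    clock.advance(implicit_wait)
    return False


def explicit_wait(clock: FakeClock, timeout: float, implicit_wait: float,
                  poll_interval: float = 0.0) -> float:
    """Poll the lookup until `timeout` has passed; return the elapsed time."""
    start = clock.now
    deadline = start + timeout
    while True:
        if lookup_missing_element(clock, implicit_wait):
            break
        if clock.now >= deadline:
            break  # a real wait would raise a timeout error here
        clock.advance(poll_interval)
    return clock.now - start


# Implicit wait of 10 seconds combined with an explicit wait of 15 seconds.
elapsed = explicit_wait(FakeClock(), timeout=15.0, implicit_wait=10.0)
print(elapsed)  # 20.0
```

In this model the second lookup starts at 10 seconds, before the 15-second deadline, and then blocks until 20 seconds, so the explicit timeout is overshot by 5 seconds.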
An implicit wait tells WebDriver to poll the DOM
for a certain amount of time when trying to find an element or elements
if they are not immediately available.
The default setting is 0, meaning disabled.
Once set, the implicit wait is set for the life of the session.
A FluentWait instance defines the maximum amount of time to wait for a condition,
as well as the frequency with which to check the condition.
Users may configure the wait to ignore specific types of exceptions whilst waiting,
such as NoSuchElementException when searching for an element on the page.
// Waiting 30 seconds for an element to be present on the page, checking
// for its presence once every 5 seconds.
Wait<WebDriver> wait = new FluentWait<WebDriver>(driver)
        .withTimeout(Duration.ofSeconds(30))
        .pollingEvery(Duration.ofSeconds(5))
        .ignoring(NoSuchElementException.class);

// The lambda parameter is named d so it does not clash with the local variable driver
WebElement foo = wait.until(d -> {
    return d.findElement(By.id("foo"));
});

require 'selenium-webdriver'
driver = Selenium::WebDriver.for :firefox
exception = Selenium::WebDriver::Error::NoSuchElementError

begin
  driver.get 'http://somedomain/url_that_delays_loading'
  wait = Selenium::WebDriver::Wait.new(timeout: 30, interval: 5, message: 'Timed out after 30 sec', ignore: exception)
  foo = wait.until { driver.find_element(id: 'foo') }
ensure
  driver.quit
end

const {Builder, By, until} = require('selenium-webdriver');

(async function example() {
    let driver = await new Builder().forBrowser('firefox').build();
    await driver.get('http://somedomain/url_that_delays_loading');
    // Waiting 30 seconds for an element to be present on the page, checking
    // for its presence once every 5 seconds.
    let foo = await driver.wait(until.elementLocated(By.id('foo')), 30000, 'Timed out after 30 seconds', 5000);
})();
Identifying and working with element objects in the DOM.
The majority of most people’s Selenium code involves working with web elements.
5.1 - File Upload
Selenium can handle a file upload
when the input element is of type file,
by sending the file path to that element.
An example can be found on this
web page: https://the-internet.herokuapp.com/upload
You need to have the file you want to upload
available locally.
The code to upload the file in the different programming
languages is as follows:
import java.time.Duration;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import io.github.bonigarcia.wdm.WebDriverManager;

class fileUploadDoc {
    public static void main(String[] args) {
        WebDriverManager.chromedriver().setup();
        WebDriver driver = new ChromeDriver();
        driver.manage().timeouts().implicitlyWait(Duration.ofSeconds(10));
        driver.get("https://the-internet.herokuapp.com/upload");
        // we want to import selenium-snapshot file.
        driver.findElement(By.id("file-upload")).sendKeys("selenium-snapshot.jpg");
        driver.findElement(By.id("file-submit")).submit();
        if (driver.getPageSource().contains("File Uploaded!")) {
            System.out.println("file uploaded");
        } else {
            System.out.println("file not uploaded");
        }
        driver.quit();
    }
}

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.implicitly_wait(10)
driver.get("https://the-internet.herokuapp.com/upload")
driver.find_element(By.ID, "file-upload").send_keys("selenium-snapshot.jpg")
driver.find_element(By.ID, "file-submit").submit()
# str.find returns -1 (truthy) when the text is missing, so test membership instead
if "File Uploaded!" in driver.page_source:
    print("file upload success")
else:
    print("file upload not successful")
driver.quit()

using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace SeleniumDocumentation.SeleniumPRs
{
    class FileUploadExample
    {
        static void Main(String[] args)
        {
            IWebDriver driver = new ChromeDriver();
            try
            {
                // Navigate to Url
                driver.Navigate().GoToUrl("https://the-internet.herokuapp.com/upload");
                driver.FindElement(By.Id("file-upload")).SendKeys("selenium-snapshot.jpg");
                driver.FindElement(By.Id("file-submit")).Submit();
                if (driver.PageSource.Contains("File Uploaded!"))
                {
                    Console.WriteLine("file uploaded");
                }
                else
                {
                    Console.WriteLine("file not uploaded");
                }
            }
            finally
            {
                driver.Quit();
            }
        }
    }
}

require 'selenium-webdriver'

driver = Selenium::WebDriver.for :chrome
driver.get("https://the-internet.herokuapp.com/upload")
driver.find_element(:id, "file-upload").send_keys("selenium-snapshot.jpg")
driver.find_element(:id, "file-submit").submit()
if driver.page_source().include? "File Uploaded!"
  puts "file upload success"
else
  puts "file upload not successful"
end
driver.quit

import {Builder, By} from "selenium-webdriver"

let driver = await new Builder().forBrowser('chrome').build()
await driver.get("https://the-internet.herokuapp.com/upload");
await driver.findElement(By.id("file-upload")).sendKeys("selenium-snapshot.jpg")
await driver.findElement(By.id("file-submit")).submit()
const result = await driver.getPageSource()
// indexOf returns -1 (truthy) when the text is missing, so use includes instead
if (result.includes("File Uploaded!")) {
    console.log("file upload success")
} else {
    console.log("file upload not successful")
}
await driver.quit()

import org.openqa.selenium.By
import org.openqa.selenium.chrome.ChromeDriver

fun main() {
    val driver = ChromeDriver()
    driver.get("https://the-internet.herokuapp.com/upload")
    driver.findElement(By.id("file-upload")).sendKeys("selenium-snapshot.jpg")
    driver.findElement(By.id("file-submit")).submit()
    if (driver.pageSource.contains("File Uploaded!")) {
        println("file uploaded")
    } else {
        println("file not uploaded")
    }
    driver.quit()
}
The examples above show
how to upload a file using Selenium.
5.2 - Locator strategies
Ways to identify one or more specific elements in the DOM.
A locator is a way to identify elements on a page. It is the argument passed to the
Finding element methods.
Check out our encouraged test practices for tips on
locators, including which to use when and
why to declare locators separately from the finding methods.
Traditional Locators
Selenium provides support for these 8 traditional location strategies in WebDriver:
Locator | Description
class name | Locates elements whose class name contains the search value (compound class names are not permitted)
css selector | Locates elements matching a CSS selector
id | Locates elements whose ID attribute matches the search value
name | Locates elements whose NAME attribute matches the search value
link text | Locates anchor elements whose visible text matches the search value
partial link text | Locates anchor elements whose visible text contains the search value. If multiple elements are matching, only the first one will be selected.
tag name | Locates elements whose tag name matches the search value
xpath | Locates elements matching an XPath expression
Selenium 4 introduces Relative Locators (previously
called Friendly Locators). These locators are helpful when it is not easy to construct a locator for
the desired element, but easy to describe spatially where the element is in relation to an element that does have
an easily constructed locator.
How it works
Selenium uses the JavaScript function
getBoundingClientRect()
to determine the size and position of elements on the page, and can use this information to locate neighboring elements.
Relative locator methods can take as the argument for the point of origin, either a previously located element reference,
or another locator. In these examples we’ll be using locators only, but you could swap the locator in the final method with
an element object and it will work the same.
Consider the following example to understand relative locators: a form with an email field above a password field, and a cancel button to the left of a submit button.
Available relative locators
Above
If the email text field element is not easily identifiable for some reason, but the password text field element is,
we can locate the text field element using the fact that it is an “input” element “above” the password element.
Below
If the password text field element is not easily identifiable for some reason, but the email text field element is,
we can locate the text field element using the fact that it is an “input” element “below” the email element.
Left of
If the cancel button is not easily identifiable for some reason, but the submit button element is,
we can locate the cancel button element using the fact that it is a “button” element “to the left of” the submit element.
Right of
If the submit button is not easily identifiable for some reason, but the cancel button element is,
we can locate the submit button element using the fact that it is a “button” element “to the right of” the cancel element.
Near
If the relative positioning is not obvious, or it varies based on window size, you can use the near method to
identify an element that is at most 50px away from the provided locator.
One great use case for this is to work with a form element that doesn’t have an easily constructed locator,
but its associated input label element does.
Chaining relative locators
You can also chain locators if needed. Sometimes the element is most easily identified as being both above/below one element and right/left of another.
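The spatial relations above can be illustrated with a small geometric model built on bounding rectangles. This is a simplified sketch and not Selenium's actual algorithm: the Rect class, the is_* helper names, and all coordinates are hypothetical, and the real implementation may order or filter candidates differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Position and size of an element, as a bounding rectangle reports them."""
    x: float
    y: float
    width: float
    height: float

    @property
    def right(self) -> float:
        return self.x + self.width

    @property
    def bottom(self) -> float:
        return self.y + self.height


def is_above(candidate: Rect, anchor: Rect) -> bool:
    return candidate.bottom <= anchor.y


def is_below(candidate: Rect, anchor: Rect) -> bool:
    return candidate.y >= anchor.bottom


def is_left_of(candidate: Rect, anchor: Rect) -> bool:
    return candidate.right <= anchor.x


def is_right_of(candidate: Rect, anchor: Rect) -> bool:
    return candidate.x >= anchor.right


def is_near(candidate: Rect, anchor: Rect, max_distance: float = 50.0) -> bool:
    # Gap between the rectangles along each axis (0 when they overlap).
    dx = max(anchor.x - candidate.right, candidate.x - anchor.right, 0.0)
    dy = max(anchor.y - candidate.bottom, candidate.y - anchor.bottom, 0.0)
    return (dx ** 2 + dy ** 2) ** 0.5 <= max_distance


# Hypothetical coordinates for the form fields and buttons described above.
form = {
    "email": Rect(20, 20, 200, 30),
    "password": Rect(20, 70, 200, 30),
    "cancel": Rect(20, 120, 80, 30),
    "submit": Rect(120, 120, 80, 30),
}

above_password = [n for n, r in form.items() if is_above(r, form["password"])]
left_of_submit = [n for n, r in form.items() if is_left_of(r, form["submit"])]
# Chaining: below the password field and to the right of the cancel button.
chained = [n for n, r in form.items()
           if is_below(r, form["password"]) and is_right_of(r, form["cancel"])]

print(above_password)  # ['email']
print(left_of_submit)  # ['cancel']
print(chained)         # ['submit']
print(is_near(form["email"], form["password"]))  # True
```

Chaining simply combines predicates, which is why an element that is hard to pin down with one relation can often be identified with two.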
Locating the elements based on the provided locator values.
One of the most fundamental aspects of using Selenium is obtaining element references to work with.
Selenium offers a number of built-in locator strategies to uniquely identify an element.
There are many ways to use the locators in very advanced scenarios. For the purposes of this documentation,
let’s consider this HTML snippet:
<ol id="vegetables">
  <li class="potatoes">…
  <li class="onions">…
  <li class="tomatoes"><span>Tomato is a Vegetable</span>…
</ol>
<ul id="fruits">
  <li class="bananas">…
  <li class="apples">…
  <li class="tomatoes"><span>Tomato is a Fruit</span>…
</ul>
First matching element
Many locators will match multiple elements on the page. The singular find element method will return a reference to the
first element found within a given context.
Evaluating entire DOM
When the find element method is called on the driver instance, it
returns a reference to the first element in the DOM that matches with the provided locator.
This value can be stored and used for future element actions. In our example HTML above, there are
two elements that have a class name of “tomatoes” so this method will return the element in the “vegetables” list.
Rather than finding a unique locator in the entire DOM, it is often useful to narrow the search to the scope
of another located element. In the above example there are two elements with a class name of “tomatoes” and
it is a little more challenging to get the reference for the second one.
One solution is to locate an element with a unique attribute that is an ancestor of the desired element and not an
ancestor of the undesired element, then call find element on that object.
Java and C# WebDriver, WebElement and ShadowRoot classes all implement a SearchContext interface, which is
considered a role-based interface. Role-based interfaces allow you to determine whether a particular
driver implementation supports a given feature. These interfaces are clearly defined and try
to adhere to having only a single role of responsibility.
Optimized locator
A nested lookup might not be the most effective location strategy since it requires two
separate commands to be issued to the browser.
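The difference between the whole-document lookup, the nested lookup, and a single combined lookup can be modelled without a browser. The sketch below uses Python's standard xml.etree.ElementTree module rather than Selenium, on a well-formed copy of the HTML snippet above with the elided content left out; the body wrapper element is an assumption added so the markup parses. With Selenium itself, the combined form would typically be one CSS selector such as #fruits .tomatoes.

```python
import xml.etree.ElementTree as ET

# A well-formed copy of the snippet; the elided "…" content is left out.
DOCUMENT = """
<body>
  <ol id="vegetables">
    <li class="potatoes"/>
    <li class="onions"/>
    <li class="tomatoes"><span>Tomato is a Vegetable</span></li>
  </ol>
  <ul id="fruits">
    <li class="bananas"/>
    <li class="apples"/>
    <li class="tomatoes"><span>Tomato is a Fruit</span></li>
  </ul>
</body>
"""

root = ET.fromstring(DOCUMENT)

# 1. Whole document: the first match in document order wins.
first = root.find(".//li[@class='tomatoes']")
print(first.find("span").text)  # Tomato is a Vegetable

# 2. Nested lookup: find a unique ancestor, then search inside it (two lookups).
fruits = root.find(".//ul[@id='fruits']")
nested = fruits.find(".//li[@class='tomatoes']")
print(nested.find("span").text)  # Tomato is a Fruit

# 3. Combined expression: the same element found with a single lookup.
combined = root.find(".//ul[@id='fruits']/li[@class='tomatoes']")
print(combined.find("span").text)  # Tomato is a Fruit
```

Against a real browser, each lookup is a separate command, so the combined form saves one round trip compared with the nested form.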
There are several use cases for needing to get references to all elements that match a locator, rather
than just the first one. The plural find elements methods return a collection of element references.
If there are no matches, an empty list is returned. In this case,
references to all fruits and vegetable list items will be returned in a collection.
Often you get a collection of elements but want to work with a specific element, which means you
need to iterate over the collection and identify the one you want.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()

# Navigate to Url
driver.get("https://www.example.com")

# Get all the elements available with tag name 'p'
elements = driver.find_elements(By.TAG_NAME, 'p')

for e in elements:
    print(e.text)

using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;
using System.Collections.Generic;

namespace FindElementsExample
{
    class FindElementsExample
    {
        public static void Main(string[] args)
        {
            IWebDriver driver = new FirefoxDriver();
            try
            {
                // Navigate to Url
                driver.Navigate().GoToUrl("https://example.com");

                // Get all the elements available with tag name 'p'
                IList<IWebElement> elements = driver.FindElements(By.TagName("p"));
                foreach (IWebElement e in elements)
                {
                    System.Console.WriteLine(e.Text);
                }
            }
            finally
            {
                driver.Quit();
            }
        }
    }
}

require 'selenium-webdriver'
driver = Selenium::WebDriver.for :firefox

begin
  # Navigate to URL
  driver.get 'https://www.example.com'

  # Get all the elements available with tag name 'p'
  elements = driver.find_elements(:tag_name, 'p')

  elements.each { |e| puts e.text }
ensure
  driver.quit
end

const {Builder, By} = require('selenium-webdriver');

(async function example() {
    let driver = await new Builder().forBrowser('firefox').build();
    try {
        // Navigate to Url
        await driver.get('https://www.example.com');

        // Get all the elements available with tag 'p'
        let elements = await driver.findElements(By.css('p'));
        for (let e of elements) {
            console.log(await e.getText());
        }
    } finally {
        await driver.quit();
    }
})();

import org.openqa.selenium.By
import org.openqa.selenium.firefox.FirefoxDriver

fun main() {
    val driver = FirefoxDriver()
    try {
        driver.get("https://example.com")
        // Get all the elements available with tag name 'p'
        val elements = driver.findElements(By.tagName("p"))
        for (element in elements) {
            println("Paragraph text:" + element.text)
        }
    } finally {
        driver.quit()
    }
}
Find Elements From Element
This finds the list of matching child WebElements within the context of a parent element.
To achieve this, call ‘findElements’ on the parent WebElement rather than on the driver.
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import java.util.List;

public class findElementsFromElement {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("https://example.com");

            // Get element with tag name 'div'
            WebElement element = driver.findElement(By.tagName("div"));

            // Get all the elements available with tag name 'p'
            List<WebElement> elements = element.findElements(By.tagName("p"));
            for (WebElement e : elements) {
                System.out.println(e.getText());
            }
        } finally {
            driver.quit();
        }
    }
}

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.example.com")

# Get element with tag name 'div'
element = driver.find_element(By.TAG_NAME, 'div')

# Get all the elements available with tag name 'p'
elements = element.find_elements(By.TAG_NAME, 'p')
for e in elements:
    print(e.text)

using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using System.Collections.Generic;

namespace FindElementsFromElement
{
    class FindElementsFromElement
    {
        public static void Main(string[] args)
        {
            IWebDriver driver = new ChromeDriver();
            try
            {
                driver.Navigate().GoToUrl("https://example.com");

                // Get element with tag name 'div'
                IWebElement element = driver.FindElement(By.TagName("div"));

                // Get all the elements available with tag name 'p'
                IList<IWebElement> elements = element.FindElements(By.TagName("p"));
                foreach (IWebElement e in elements)
                {
                    System.Console.WriteLine(e.Text);
                }
            }
            finally
            {
                driver.Quit();
            }
        }
    }
}

require 'selenium-webdriver'
driver = Selenium::WebDriver.for :chrome

begin
  # Navigate to URL
  driver.get 'https://www.example.com'

  # Get element with tag name 'div'
  element = driver.find_element(:tag_name, 'div')

  # Get all the elements available with tag name 'p'
  elements = element.find_elements(:tag_name, 'p')

  elements.each { |e| puts e.text }
ensure
  driver.quit
end

const {Builder, By} = require('selenium-webdriver');

(async function example() {
    let driver = await new Builder().forBrowser('chrome').build();
    try {
        await driver.get('https://www.example.com');

        // Get element with tag name 'div'
        let element = await driver.findElement(By.css("div"));

        // Get all the elements available with tag name 'p'
        let elements = await element.findElements(By.css("p"));
        for (let e of elements) {
            console.log(await e.getText());
        }
    } finally {
        await driver.quit();
    }
})();

import org.openqa.selenium.By
import org.openqa.selenium.chrome.ChromeDriver

fun main() {
    val driver = ChromeDriver()
    try {
        driver.get("https://example.com")

        // Get element with tag name 'div'
        val element = driver.findElement(By.tagName("div"))

        // Get all the elements available with tag name 'p'
        val elements = element.findElements(By.tagName("p"))
        for (e in elements) {
            println(e.text)
        }
    } finally {
        driver.quit()
    }
}
Get Active Element
This finds the DOM element that has the focus in the current browsing context.
import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;

public class activeElementTest {
    public static void main(String[] args) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get("http://www.google.com");
            driver.findElement(By.cssSelector("[name='q']")).sendKeys("webElement");

            // Get attribute of current active element
            String attr = driver.switchTo().activeElement().getAttribute("title");
            System.out.println(attr);
        } finally {
            driver.quit();
        }
    }
}

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.google.com")
driver.find_element(By.CSS_SELECTOR, '[name="q"]').send_keys("webElement")

# Get attribute of current active element
attr = driver.switch_to.active_element.get_attribute("title")
print(attr)

using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

namespace ActiveElement
{
    class ActiveElement
    {
        public static void Main(string[] args)
        {
            IWebDriver driver = new ChromeDriver();
            try
            {
                // Navigate to Url
                driver.Navigate().GoToUrl("https://www.google.com");
                driver.FindElement(By.CssSelector("[name='q']")).SendKeys("webElement");

                // Get attribute of current active element
                string attr = driver.SwitchTo().ActiveElement().GetAttribute("title");
                System.Console.WriteLine(attr);
            }
            finally
            {
                driver.Quit();
            }
        }
    }
}

require 'selenium-webdriver'
driver = Selenium::WebDriver.for :chrome

begin
  driver.get 'https://www.google.com'
  driver.find_element(css: '[name="q"]').send_keys('webElement')

  # Get attribute of current active element
  attr = driver.switch_to.active_element.attribute('title')
  puts attr
ensure
  driver.quit
end

const {Builder, By} = require('selenium-webdriver');

(async function example() {
    let driver = await new Builder().forBrowser('chrome').build();
    await driver.get('https://www.google.com');
    await driver.findElement(By.css('[name="q"]')).sendKeys("webElement");

    // Get attribute of current active element
    let attr = await driver.switchTo().activeElement().getAttribute("title");
    console.log(`${attr}`)
})();

import org.openqa.selenium.By
import org.openqa.selenium.chrome.ChromeDriver

fun main() {
    val driver = ChromeDriver()
    try {
        driver.get("https://www.google.com")
        driver.findElement(By.cssSelector("[name='q']")).sendKeys("webElement")

        // Get attribute of current active element
        val attr = driver.switchTo().activeElement().getAttribute("title")
        print(attr)
    } finally {
        driver.quit()
    }
}
5.4 - Interacting with web elements
A high-level instruction set for manipulating form controls.
There are only 5 basic commands that can be executed on an element:
These methods are designed to closely emulate a user’s experience, so,
unlike the Actions API, they attempt to perform two things
before attempting the specified action.
First, if the element is determined to be outside the viewport, Selenium
scrolls the element into view; specifically,
it aligns the bottom of the element with the bottom of the viewport.
Second, Selenium ensures the element is interactable
before taking the action. This check can fail because the scrolling was unsuccessful, or because the
element is not otherwise displayed. Determining if an element is displayed on a page was too difficult to
define directly in the WebDriver specification,
so Selenium sends an execute command with a JavaScript atom that checks for things that would keep
the element from being displayed. If it determines an element is not in the viewport, not displayed, not
keyboard-interactable, or not
pointer-interactable,
it returns an element not interactable error.
The element send keys command
types the provided keys into an editable element.
Typically, this means an element is an input element of a form with a text type or an element
with a content-editable attribute. If it is not editable,
an invalid element state error is returned.
Here is the list of
possible keystrokes that WebDriver Supports.
import org.openqa.selenium.By;
import org.openqa.selenium.Keys;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class HelloSelenium {
    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();
        try {
            // Navigate to Url
            driver.get("https://google.com");

            // Enter text "q" and perform keyboard action "Enter"
            driver.findElement(By.name("q")).sendKeys("q" + Keys.ENTER);
        } finally {
            driver.quit();
        }
    }
}

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Firefox()

# Navigate to url
driver.get("http://www.google.com")

# Enter "webdriver" text and perform "ENTER" keyboard action
driver.find_element(By.NAME, "q").send_keys("webdriver" + Keys.ENTER)

using (var driver = new FirefoxDriver())
{
    // Navigate to Url
    driver.Navigate().GoToUrl("https://google.com");

    // Enter "webdriver" text and perform "ENTER" keyboard action
    driver.FindElement(By.Name("q")).SendKeys("webdriver" + Keys.Enter);
}

require 'selenium-webdriver'
driver = Selenium::WebDriver.for :firefox

begin
  # Navigate to URL
  driver.get 'https://google.com'

  # Enter "webdriver" text and perform "ENTER" keyboard action
  driver.find_element(name: 'q').send_keys 'webdriver', :return
ensure
  driver.quit
end