BiDirectional functionality

1: Chrome DevTools

1.1: Chrome DevTools Protocol Endpoint
1.2: Chrome DevTools Protocol API
1.3: Chrome DevTools Protocol with BiDi API

2: BiDirectional API (W3C compliant)

2.1: Browsing Context
2.2: Input
2.3: Log
2.4: Network
2.5: Script

Selenium is working with browser vendors to create the WebDriver BiDirectional Protocol, a stable, cross-browser API whose bidirectional functionality is useful for both browser automation generally and testing specifically. Until now, users seeking this functionality have had to rely on CDP (Chrome DevTools Protocol), with all of its frustrations and limitations.

The traditional WebDriver model of strict request/response commands will be supplemented with the ability to stream events from the user agent to the controlling software via WebSockets, better matching the evented nature of the browser DOM.
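The difference between the two models can be illustrated with a small in-memory sketch. It does not use Selenium or a real WebSocket; `EventModelSketch` and its methods are hypothetical names made up for illustration, and only the two event names are taken from the WebDriver BiDi specification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical in-memory stand-in for a browser connection; not a Selenium class. */
public class EventModelSketch {

    private final List<Consumer<String>> listeners = new ArrayList<>();

    /** Request/response style: the caller asks and gets exactly one answer back. */
    public String sendCommand(String command) {
        return "result of " + command;
    }

    /** Event style: the caller registers interest once and is notified later. */
    public void onEvent(Consumer<String> listener) {
        listeners.add(listener);
    }

    /** Simulates the browser pushing an event that nobody explicitly requested. */
    public void browserEmits(String event) {
        for (Consumer<String> listener : listeners) {
            listener.accept(event);
        }
    }

    public static void main(String[] args) {
        EventModelSketch connection = new EventModelSketch();
        List<String> seen = new ArrayList<>();

        // Subscribe first, then carry on issuing ordinary commands.
        connection.onEvent(seen::add);
        System.out.println(connection.sendCommand("navigate"));

        // Events arrive whenever the browser produces them.
        connection.browserEmits("log.entryAdded");
        connection.browserEmits("network.beforeRequestSent");
        System.out.println(seen);
    }
}
```

In the request/response style the test has to poll for state it cares about; in the event style it subscribes once and reacts when the browser reports something.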

Because CDP functionality depends on the browser version, and it is not a good idea to tie your tests to a specific version of any browser, the Selenium project recommends using WebDriver BiDi wherever possible.

While the specification is in the works, browser vendors are implementing the WebDriver BiDirectional Protocol in parallel. Refer to the web-platform-tests dashboard to see how far along the browser vendors are. Selenium is trying to keep up with the browser vendors and has started implementing the W3C BiDi APIs. The goal is to ensure the APIs are W3C compliant and uniform among the different language bindings.

However, until the specification and the corresponding Selenium implementation are complete, CDP still offers many useful features. Selenium provides some helper classes that use CDP.

1 - Chrome DevTools

Many browsers provide “DevTools” – a set of tools that are integrated with the browser that developers can use to debug web apps and explore the performance of their pages. Google Chrome’s DevTools make use of a protocol called the Chrome DevTools Protocol (or “CDP” for short). As the name suggests, this is not designed for testing, nor to have a stable API, so functionality is highly dependent on the version of the browser.

The WebDriver BiDirectional Protocol is the next generation of the W3C WebDriver protocol and aims to provide a stable API implemented by all browsers, but it’s not yet complete. Until it is, Selenium provides access to CDP for those browsers that implement it (such as Google Chrome, Microsoft Edge, and Firefox), allowing you to enhance your tests in interesting ways. Some examples of what you can do with it are given below.

Ways to Use Chrome DevTools With Selenium

There are three different ways to access Chrome DevTools in Selenium. If you look for other examples online, you will likely see each of these mixed and matched.

The CDP Endpoint was the first option available to users. It only works for the simplest things (setting state, getting basic information), and you have to know the “magic strings” for the domain, the methods, and the key-value pairs. For basic requirements, this might be simpler than the other options. These methods are only temporarily supported.
The CDP API is an improvement on just using the endpoint because you can do things asynchronously. Instead of a String and a Map, you can access the supported classes, methods and parameters in the code. These methods are also only temporarily supported.
The BiDi API option should be used whenever possible because it abstracts away the implementation details entirely and will work with either CDP or WebDriver-BiDi when Selenium moves away from CDP.

Examples With Limited Value

There are a number of commonly cited examples for using CDP that are of limited practical value.

Geo Location — almost all sites use the IP address to determine physical location, so setting an emulated geolocation rarely has the desired effect.
Overriding Device Metrics — Chrome provides a great API for setting Mobile Emulation in the Options classes, which is generally superior to attempting to do this with CDP.

Check out the examples in these documents for ways to do additional useful things:

1.1 - Chrome DevTools Protocol Endpoint

Google provides a /cdp/execute endpoint that can be accessed directly. Each Selenium binding provides a method that allows you to pass the CDP domain as a String, and the required parameters as a simple Map.

These methods will eventually be removed. It is recommended to use the BiDi API or WebDriver BiDi methods where possible to ensure future compatibility.

Usage

Generally you should prefer the CDP API over this approach, but sometimes the syntax here is cleaner or significantly simpler.

Limitations include:

It only works for use cases that are limited to setting or getting information; any actual asynchronous interactions require another implementation
You have to know the exact “magic strings” for domains and keys
It is possible that an update to Chrome will change the required parameters
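One way to soften the last two limitations is to keep the strings in a single place. The following is a minimal sketch under that assumption; `CdpCommandSketch`, its constants, and `setCookieParams` are hypothetical names for illustration and are not part of Selenium. Only the method name `Network.setCookie` and its parameter keys come from the example in this document.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; none of these names come from Selenium. */
public class CdpCommandSketch {

    // The "magic strings" live here, so a change in Chrome means one edit.
    public static final String NETWORK_SET_COOKIE = "Network.setCookie";
    public static final String KEY_NAME = "name";
    public static final String KEY_VALUE = "value";
    public static final String KEY_DOMAIN = "domain";
    public static final String KEY_SECURE = "secure";

    /** Builds the parameter Map for Network.setCookie from typed arguments. */
    public static Map<String, Object> setCookieParams(
            String name, String value, String domain, boolean secure) {
        Map<String, Object> params = new HashMap<>();
        params.put(KEY_NAME, name);
        params.put(KEY_VALUE, value);
        params.put(KEY_DOMAIN, domain);
        params.put(KEY_SECURE, secure);
        return params;
    }

    public static void main(String[] args) {
        Map<String, Object> params =
                setCookieParams("cheese", "gouda", "www.selenium.dev", true);
        System.out.println(NETWORK_SET_COOKIE + " with " + params.size() + " parameters");
    }
}
```

This does not make the strings any more stable; it only means a typo or a renamed parameter is fixed once instead of in every test.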

Examples

An alternate implementation can be found at CDP API Set Cookie

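    // Requires: java.util.HashMap, java.util.Map, org.openqa.selenium.chromium.HasCdp
    // Parameters for the CDP Network.setCookie method, passed as a plain Map.
    Map<String, Object> cookie = new HashMap<>();
    cookie.put("name", "cheese");
    cookie.put("value", "gouda");
    cookie.put("domain", "www.selenium.dev");
    cookie.put("secure", true);

    // The driver must implement HasCdp (ChromeDriver and EdgeDriver do).
    ((HasCdp) driver).executeCdpCommand("Network.setCookie", cookie);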
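To make the String-plus-Map shape of this call more concrete, here is a rough sketch of the kind of JSON body a binding might send to the endpoint. It assumes the command name and parameters travel under the keys `cmd` and `params`; treat those key names as an assumption to check against your driver rather than as documented behavior. `CdpRequestBodySketch` and its tiny serializer are hypothetical and handle only the value types used here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical illustration of a request body; not part of Selenium. */
public class CdpRequestBodySketch {

    /** Serializes strings, booleans, numbers and nested maps; enough for this example only. */
    public static String toJson(Object value) {
        if (value instanceof Map) {
            StringJoiner joiner = new StringJoiner(",", "{", "}");
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                joiner.add(toJson(String.valueOf(entry.getKey())) + ":" + toJson(entry.getValue()));
            }
            return joiner.toString();
        }
        if (value instanceof String) {
            // Escape backslashes and quotes so the output stays valid JSON.
            return "\"" + ((String) value).replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        }
        return String.valueOf(value); // booleans, numbers, null
    }

    /** Wraps the command name and parameters; the "cmd"/"params" keys are an assumption. */
    public static String requestBody(String command, Map<String, Object> params) {
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("cmd", command);
        body.put("params", params);
        return toJson(body);
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the printed JSON is predictable.
        Map<String, Object> cookie = new LinkedHashMap<>();
        cookie.put("name", "cheese");
        cookie.put("value", "gouda");
        cookie.put("domain", "www.selenium.dev");
        cookie.put("secure", true);
        System.out.println(requestBody("Network.setCookie", cookie));
    }
}
```

Nothing in this payload is checked before it reaches the browser, which is why a misspelled domain, method, or key only fails at runtime.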
