Selenium is working with browser vendors to create the
WebDriver BiDirectional Protocol
as a means to provide a stable, cross-browser API that uses the bidirectional
functionality useful for both browser automation generally and testing specifically.
Before now, users seeking this functionality have had to rely on CDP (Chrome DevTools Protocol)
with all of its frustrations and limitations.
The traditional WebDriver model of strict request/response commands will be supplemented
with the ability to stream events from the user agent to the controlling software via WebSockets,
better matching the evented nature of the browser DOM.
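To make the difference between the two message flows concrete, here is a minimal sketch of how a client might route incoming messages. It does not use Selenium or a real WebSocket: `BidiRouter`, `sendRaw`, and `handleMessage` are hypothetical names chosen for illustration, and the message shapes (`id`/`method`/`params` for commands, `type: "success"` or `type: "error"` for replies, `type: "event"` for events) are taken from the WebDriver BiDi specification rather than from any Selenium binding.

```javascript
// Minimal sketch: routes WebDriver BiDi messages without a real WebSocket.
// BidiRouter and its methods are hypothetical names, not Selenium APIs.
class BidiRouter {
  constructor(sendRaw) {
    this.sendRaw = sendRaw      // function that would write text to the socket
    this.nextId = 1
    this.pending = new Map()    // command id -> { resolve, reject }
    this.listeners = new Map()  // event name -> array of callbacks
  }

  // Request/response: every command carries an id that the reply echoes back.
  send(method, params = {}) {
    const id = this.nextId++
    const reply = new Promise((resolve, reject) => {
      this.pending.set(id, { resolve, reject })
    })
    this.sendRaw(JSON.stringify({ id, method, params }))
    return reply
  }

  // Events: the browser pushes these whenever something happens.
  on(eventName, callback) {
    const callbacks = this.listeners.get(eventName) || []
    callbacks.push(callback)
    this.listeners.set(eventName, callbacks)
  }

  // Called with each text frame received from the browser.
  handleMessage(text) {
    const message = JSON.parse(text)
    if (message.type === 'event') {
      for (const callback of this.listeners.get(message.method) || []) {
        callback(message.params)
      }
      return
    }
    const waiting = this.pending.get(message.id)
    if (!waiting) return        // reply to an unknown command; ignore it
    this.pending.delete(message.id)
    if (message.type === 'success') {
      waiting.resolve(message.result)
    } else {
      waiting.reject(new Error(`${message.error}: ${message.message}`))
    }
  }
}
```

In this sketch, a listener registered with `on('log.entryAdded', ...)` fires as soon as an event message arrives, even if no command was sent first. The strict request/response model cannot express this.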
As it is not a good idea to tie your tests to a specific version of any browser, the
Selenium project recommends using WebDriver BiDi wherever possible.
While the specification is still in progress, browser vendors are implementing
the WebDriver BiDirectional Protocol in parallel.
Refer to the web-platform-tests dashboard
to see how far along each browser vendor is.
Selenium is trying to keep up with the browser vendors and has started implementing W3C BiDi APIs.
The goal is to ensure APIs are W3C compliant and uniform among the different language bindings.
However, until the specification and the corresponding Selenium implementation are complete, CDP still offers
many useful features. Selenium provides some helper classes that use CDP.
1 - Chrome DevTools
Many browsers provide “DevTools” – a set of tools that are integrated with the browser that
developers can use to debug web apps and explore the performance of their pages. Google Chrome’s
DevTools make use of a protocol called the Chrome DevTools Protocol (or “CDP” for short).
As the name suggests, this is not designed for testing, nor to have a stable API, so functionality
is highly dependent on the version of the browser.
The WebDriver BiDirectional Protocol is the next generation of the
W3C WebDriver protocol and aims to provide a stable API implemented by all browsers, but it’s not yet complete.
Until it is, Selenium provides access to
the CDP for those browsers that implement it (such as Google Chrome, Microsoft Edge, and
Firefox), allowing you to enhance your tests in interesting ways. Some examples of what you can
do with it are given below.
Ways to Use Chrome DevTools With Selenium
There are three different ways to access Chrome DevTools in Selenium. If you look for other examples online,
you will likely see each of these mixed and matched.
The CDP Endpoint was the first option available to users.
It only works for the simplest things (setting state, getting basic information), and you
have to know the “magic strings” for the domains, methods, and key-value pairs.
For basic requirements, this might be simpler than the other options. These methods are only temporarily supported.
The CDP API is an improvement on just using the endpoint because you can
do things asynchronously. Instead of a String and a Map, you can access the supported classes,
methods and parameters in the code. These methods are also only temporarily supported.
The BiDi API option should be used whenever possible because it
abstracts away the implementation details entirely and will work with either CDP or WebDriver-BiDi
when Selenium moves away from CDP.
Examples With Limited Value
There are a number of commonly cited examples for using CDP that are of limited practical value.
Geo Location — almost all sites use the IP address to determine physical location,
so setting an emulated geolocation rarely has the desired effect.
Overriding Device Metrics — Chrome provides a great API for setting Mobile Emulation
in the Options classes, which is generally superior to attempting to do this with CDP.
Check out the examples in these documents for ways to do additional useful things:
1.1 - Chrome DevTools Protocol Endpoint
Google provides a /cdp/execute endpoint that can be accessed directly. Each Selenium binding provides a method that allows you to pass the CDP domain as a String, and the required parameters as a simple Map.
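As a rough illustration of what the String and Map turn into on the wire, the sketch below builds the HTTP request without sending it. `buildCdpExecuteRequest` is a hypothetical helper and `abc123` is a placeholder session id. The `goog` vendor prefix in the path and the `cmd`/`params` body keys are assumptions about how ChromeDriver exposes this endpoint for Chrome. The bindings normally hide all of this behind a single method call.

```javascript
// Sketch only: builds the HTTP request for a CDP Endpoint call, without sending it.
// buildCdpExecuteRequest is a hypothetical helper; the "goog" vendor prefix
// is an assumption that applies to Chrome.
function buildCdpExecuteRequest(sessionId, cmd, params = {}) {
  return {
    method: 'POST',
    path: `/session/${encodeURIComponent(sessionId)}/goog/cdp/execute`,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ cmd, params }),
  }
}

// "Browser.getVersion" and "Emulation.setTimezoneOverride" are CDP method names;
// a typo in either string is only detected once the browser rejects the call.
const getVersion = buildCdpExecuteRequest('abc123', 'Browser.getVersion')
const setTimezone = buildCdpExecuteRequest('abc123', 'Emulation.setTimezoneOverride', {
  timezoneId: 'Europe/Berlin',
})
```

Both calls only get or set a value, which is the kind of use case this endpoint is suited to.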
These methods will eventually be removed. It is recommended to use WebDriver BiDi or the BiDi API
methods where possible to ensure future compatibility.
Usage
Generally you should prefer the CDP API over this approach,
but sometimes the syntax is cleaner or significantly simpler.
Limitations include:
It only works for use cases that are limited to setting or getting information;
any actual asynchronous interactions require another implementation
You have to know the exact “magic strings” for domains and keys
It is possible that an update to Chrome will change the required parameters
Each of the Selenium bindings dynamically generates classes and methods for the various CDP domains and features; these are tied to specific versions of Chrome.
While Selenium 4 provides direct access to the Chrome DevTools Protocol (CDP), these
methods will eventually be removed. It is recommended to use the WebDriver BiDi API
methods where possible to ensure future compatibility.
Usage
If your use case has been implemented by WebDriver BiDi or
the BiDi API, you should use those implementations instead of this one.
Generally you should prefer this approach over executing with the CDP Endpoint,
especially in Ruby.
Wait for a download to finish before continuing.
Because getting download status requires setting a listener, this cannot be done with a CDP Endpoint implementation.
These examples are currently implemented with CDP, but the same code should work when the functionality is re-implemented with WebDriver-BiDi.
Usage
The following list of APIs will be growing as the Selenium
project works through supporting real world use cases. If there
is additional functionality you’d like to see, please raise a
feature request.
As these examples are re-implemented with the WebDriver BiDi protocol, they will
be moved to the WebDriver BiDi pages.
Examples
Basic authentication
Some applications make use of browser authentication to secure pages.
It used to be common to pass the credentials in the URL, but browsers stopped supporting this.
With BiDi, you can now provide the credentials when necessary.
This can be especially useful when executing on a remote server. For example,
whenever you check the visibility of an element, or whenever you use
the classic get attribute method, Selenium is sending the contents of a js file
to the script execution endpoint. These files are each about 50kB, which adds up.
The following list of APIs will grow as the WebDriver BiDirectional Protocol matures
and browser vendors implement it.
Additionally, Selenium will try to support real-world use cases that internally use a combination of W3C BiDi protocol APIs.
If there is additional functionality you’d like to see, please raise a
feature request.
2.1 - Browsing Context
This section contains the APIs related to browsing context commands.
A reference browsing context is a top-level browsing context.
The API allows passing the reference browsing context, which is used to create a new window. The implementation is operating-system specific.
A reference browsing context is a top-level browsing context.
The API allows passing the reference browsing context, which is used to create a new tab. The implementation is operating-system specific.
Provides a tree of all browsing contexts descending from the parent browsing context, including the parent browsing context itself, up to the depth value passed.
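The commands described above can be sketched at the protocol level as plain objects. The method names (`browsingContext.create`, `browsingContext.getTree`) and parameter names (`type`, `referenceContext`, `root`, `maxDepth`) come from the WebDriver BiDi specification. `createContextCommand` and `getTreeCommand` are hypothetical helpers for illustration only, and `ctx-1` in the usage note is a placeholder context id.

```javascript
// Sketch only: builds WebDriver BiDi command payloads as plain objects.
// createContextCommand and getTreeCommand are hypothetical helpers, not Selenium APIs.
function createContextCommand(id, type, referenceContext) {
  if (type !== 'tab' && type !== 'window') {
    throw new Error(`type must be "tab" or "window", got "${type}"`)
  }
  const params = { type }
  // The reference context is optional; omit the key when none is given.
  if (referenceContext !== undefined) params.referenceContext = referenceContext
  return { id, method: 'browsingContext.create', params }
}

function getTreeCommand(id, root, maxDepth) {
  const params = {}
  // root: the parent browsing context the tree starts from.
  if (root !== undefined) params.root = root
  // maxDepth: how many levels of child contexts to include.
  if (maxDepth !== undefined) params.maxDepth = maxDepth
  return { id, method: 'browsingContext.getTree', params }
}
```

For example, `createContextCommand(1, 'tab', 'ctx-1')` describes a request for a new tab relative to the context `ctx-1`, and `getTreeCommand(2, 'ctx-1', 1)` asks for that context plus one level of children.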
Listen for JavaScript exceptions
and register callbacks to process the exception details.
logInspector.onJavaScriptLog(future::complete);

driver.get("https://www.selenium.dev/selenium/web/bidi/logEntryAdded.html");
driver.findElement(By.id("jsException")).click();

JavascriptLogEntry logEntry = future.get(5, TimeUnit.SECONDS);

Assertions.assertEquals("Error: Not working", logEntry.getText());
const inspector = await LogInspector(driver)
await inspector.onJavascriptException(function (log) {
  logEntry = log
})

await driver.get('https://www.selenium.dev/selenium/web/bidi/logEntryAdded.html')
await driver.findElement({ id: 'jsException' }).click()

assert.equal(logEntry.text, 'Error: Not working')
assert.equal(logEntry.type, 'javascript')
assert.equal(logEntry.level, 'error')