These docs, like the code itself, are maintained 100% by volunteers
within the Selenium community.
Many have been using it since its inception,
but many more have only been using it for a short while,
and have given their time to help improve the onboarding experience
for new users.
If there is an issue with the documentation, we want to know!
The best way to communicate an issue is to visit
https://github.com/seleniumhq/seleniumhq.github.io/issues
and search to see whether or not the issue has been filed already.
If not, feel free to open one!
Many members of the community
are present in the #selenium channel
on Libera.Chat.
Feel free to drop in and ask questions
and if you get help which you think could be of use within these documents,
be sure to add your contribution!
We can update these documents,
but it is much easier for everyone when we get contributions
from outside the normal committers.
1 - Copyright and attributions
Copyright, contributions and all attributions for the different projects under the Selenium umbrella.
The Documentation of Selenium
Every effort has been made to make this documentation
as complete and as accurate as possible,
but no warranty or fitness is implied.
The information provided is on an “as-is” basis.
The authors and the publisher shall have
neither liability nor responsibility to any person or entity
with respect to any loss or damages arising
from the information contained in this book.
No patent liability is assumed with respect
to the use of the information contained herein.
All code and documentation originating from the Selenium project
is licensed under the Apache 2.0 license,
with the Software Freedom Conservancy
as the copyright holder.
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
2 - Contributing to the Selenium site & documentation
Information on improving documentation and code examples for Selenium
Selenium is a big software project, and its site and documentation are key
to understanding how things work and learning effective ways to exploit
its potential.
This project contains both Selenium’s site and documentation. This is
an ongoing effort (not targeted at any specific release) to provide
updated information on how to use Selenium effectively, how to get
involved and how to contribute to Selenium.
Contributions toward the site and docs follow the process described in
the section on contributions below.
The Selenium project welcomes contributions from everyone. There are a
number of ways you can help:
Report an issue
When reporting a new issue or commenting on existing issues, please
make sure discussions are related to concrete technical issues with the
Selenium software, its site and/or documentation.
All of the Selenium components change quite fast over time, so the
documentation may fall out of date. If you find this to be the case,
don’t hesitate to create an issue for it. If you know how to bring the
documentation up to date, please send us a pull request with the related
changes.
If you are not sure whether what you have found is an issue,
please ask through the communication channels described at
https://selenium.dev/support.
We want to be able to run all of our code examples in the CI to ensure that people can copy and paste and
execute everything on the site. So we put the code where it belongs in the
examples directory.
Each page in the documentation correlates to a test file in each of the languages, and should follow naming conventions.
For instance, examples for this page https://www.selenium.dev/documentation/webdriver/browsers/chrome/ get added in these
files:
Each example should get its own test. Ideally each test has an assertion that verifies the code works as intended.
Once the code is copied to its own test in the proper file, it needs to be referenced in the markdown file.
For example, the tab in Ruby would look like this:
The line numbers at the end represent only the line or lines of code that actually represent the item being displayed.
If a user wants more context, they can click the link to the GitHub page that will show the full context.
If you add a test to the page, make sure that all the other line numbers in the markdown file are still
correct. Adding a test at the top of a page means updating every single reference in the documentation that has a line
number for that file.
Everything from the Creating Examples section applies, with one addition.
Make sure the tab includes text=true. By default, the tabs get formatted
for code, so to use markdown or other shortcode statements (like gh-codeblock) it needs to be declared as text.
For most examples, the tabpane declares the text=true, but if some of the tabs have code examples, the tabpane
cannot specify it, and it must be specified in the tabs that do not need automatic code formatting.
Contribution Mechanics
The Selenium project welcomes new contributors. Individuals making
significant and valuable contributions over time are made Committers
and given commit-access to the project.
This guide will walk you through the contribution process.
Step 1: Fork
Fork the project on GitHub
and check out your copy locally.
% git clone git@github.com:seleniumhq/seleniumhq.github.io.git
% cd seleniumhq.github.io
Dependencies: Hugo
We use Hugo and the Docsy theme
to build and render the site. You will need the “extended”
Sass/SCSS version of the Hugo binary to work on this site. We recommend
using Hugo 0.110.0 or higher.
Please follow the Install Hugo
instructions from Docsy.
Step 2: Branch
Create a feature branch and start hacking:
% git checkout -b my-feature-branch
We practice HEAD-based development, which means all changes are applied
directly on top of trunk.
Step 3: Make changes
The repository contains the site and docs. To make changes to the site,
work in the website_and_docs directory. To see a live preview of
your changes, run hugo server in the site’s root directory.
% cd website_and_docs
% hugo server
The project loads code from GitHub. If that code has been updated and the update isn’t
reflected in your preview, you can run hugo without the cache: hugo server --ignoreCache
See Style Guide for more information on our conventions for contribution.
Step 4: Commit
First make sure git knows your name and email address:
Writing good commit messages is important. A commit message
should describe what changed, why, and reference issues fixed (if
any). Follow these guidelines when writing one:
The first line should be around 50 characters or less and contain a
short description of the change.
Keep the second line blank.
Wrap all other lines at 72 columns.
Include Fixes #N, where N is the issue number the commit
fixes, if any.
A good commit message can look like this:
explain commit normatively in one line
Body of commit message is a few lines of text, explaining things
in more detail, possibly giving some background about the issue
being fixed, etc.
The body of the commit message can be several paragraphs, and
please do proper word-wrap and keep columns shorter than about
72 characters or so. That way `git log` will show things
nicely even when it is indented.
Fixes #141
The first line must be meaningful as it’s what people see when they
run git shortlog or git log --oneline.
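The guidelines above are mechanical enough to check automatically. The following is a minimal sketch, not part of the Selenium tooling: the function name check_commit_message and the wording of its messages are hypothetical, and the 50 and 72 character limits are taken from the guidelines as hard limits even though the text describes them as approximate. The optional Fixes #N line is not checked.

```python
def check_commit_message(message):
    """Return a list of guideline violations; an empty list means none were found.

    Hypothetical helper for illustration only; it is not part of the Selenium repo.
    """
    problems = []
    lines = message.rstrip("\n").split("\n")

    # Guideline: the first line is a short description, around 50 characters or less.
    if not lines[0].strip():
        problems.append("first line is empty")
    elif len(lines[0]) > 50:
        problems.append(f"first line has {len(lines[0])} characters (aim for 50 or fewer)")

    # Guideline: keep the second line blank.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line is not blank")

    # Guideline: wrap all other lines at 72 columns.
    for number, line in enumerate(lines[2:], start=3):
        if len(line) > 72:
            problems.append(f"line {number} has {len(line)} characters (wrap at 72)")

    return problems


good = """explain commit normatively in one line

Body of commit message is a few lines of text, explaining things
in more detail.

Fixes #141"""

print(check_commit_message(good))  # prints []
```

A check like this could run from a local git hook, but treat the limits as guidance rather than a gate, since the guidelines themselves say "around".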
Step 5: Rebase
Use git rebase (not git merge) to sync your work from time to time.
% git fetch origin
% git rebase origin/trunk
Step 6: Test
Always remember to run the local server,
so you can be sure that your changes have not broken anything.
Pull requests are usually reviewed within a few days. If there are
comments to address, apply your changes in new commits (preferably
fixups) and push to the same
branch.
Step 8: Integration
When code review is complete, a committer will take your PR and
integrate it on the repository’s trunk branch. Because we like to keep a
linear history on the trunk branch, we will normally squash and rebase
your branch history.
Communication
All details on how to communicate with the project contributors
and the community overall can be found at https://selenium.dev/support
3 - Style guide for Selenium documentation
Conventions for contributions to the Selenium documentation and code examples
Read our contributing documentation for complete instructions on
how to add content to this documentation.
Alerts
Alerts have been added to direct potential contributors to where specific content is missing.
{{<alert-content/>}}
or
{{<alert-content>}}
Additional information about what specific content is needed
{{</alert-content>}}
Which gets displayed like this:
Content Help
Note:
This section needs additional and/or updated content
Additional information about what specific content is needed
Our documentation uses Title Capitalization for linkTitle, which should be short,
and Sentence capitalization for title, which can be longer and more descriptive.
For example, a linkTitle of Special Heading might have a title of
The importance of a special heading in documentation
Line length
When editing the documentation’s source,
which is written in plain HTML,
limit your line lengths to around 100 characters.
Some of us take this one step further
and use what is called
semantic linefeeds,
which is a technique whereby the HTML source lines,
which are not read by the public,
are split at ‘natural breaks’ in the prose.
In other words, sentences are split
at natural breaks between clauses.
Instead of fussing with the lines of each paragraph
so that they all end near the right margin,
linefeeds can be added anywhere
that there is a break between ideas.
This can make diffs very easy to read
when collaborating through git,
but it is not something we require contributors to use.
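A quick way to spot lines that drift past the suggested length is a small script. This is a sketch under stated assumptions: the function name overlong_lines is hypothetical, and the limit of 100 is used as a strict threshold although the guideline says "around 100 characters".

```python
def overlong_lines(text, limit=100):
    """Return (line_number, length) pairs for lines longer than `limit` characters.

    Hypothetical helper for illustration; `limit` defaults to the suggested 100.
    """
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


sample = "Short line.\n" + "x" * 120 + "\nAnother short line."
print(overlong_lines(sample))  # prints [(2, 120)]
```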
Translations
Selenium now has official translators for each of the supported languages.
If you add a code example to the important_documentation.en.md file,
also add it to important_documentation.ja.md, important_documentation.pt-br.md,
and important_documentation.zh-cn.md.
If you make text changes in the English version, just make a Pull Request.
The new process is for issues to be created and tagged as needs translation based on
changes made in a given PR.
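To avoid forgetting a translated file when adding a code example, the sibling filenames can be derived from the English one. The sketch below assumes the naming pattern shown in the filenames above (a language suffix before .md); the function name translation_siblings and the TRANSLATION_SUFFIXES tuple are hypothetical and would need updating if the set of supported languages changes.

```python
from pathlib import PurePosixPath

# Language suffixes taken from the filenames listed in this section (assumed complete).
TRANSLATION_SUFFIXES = ("ja", "pt-br", "zh-cn")


def translation_siblings(english_file):
    """Given a path ending in `.en.md`, return the matching translated paths."""
    path = PurePosixPath(english_file)
    if not path.name.endswith(".en.md"):
        raise ValueError(f"expected a .en.md file, got {path.name}")
    stem = path.name[: -len(".en.md")]
    return [str(path.with_name(f"{stem}.{suffix}.md")) for suffix in TRANSLATION_SUFFIXES]


print(translation_siblings("important_documentation.en.md"))
# prints ['important_documentation.ja.md', 'important_documentation.pt-br.md', 'important_documentation.zh-cn.md']
```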
Code examples
All references to code should be language independent,
and the code itself should be placed inside code tabs.
To generate the above tabs, this is what you need to write.
Note that the tabpane includes langEqualsHeader=true.
This auto-formats the code in each tab to match the header name,
and ensures that all tabs on the page with a language are set to the same thing.
To ensure that all code is kept up to date, our goal is to write the code in the repo where it
can be executed when Selenium versions are updated to ensure that everything is correct.
This code can be automatically displayed in the documentation using the gh-codeblock shortcode.
The shortcode automatically generates its own html, so we do not want it to auto-format with the language header.
If all tabs are using this shortcode, set text=true in the tabpane and remove langEqualsHeader=true.
If only some tabs are using this shortcode, keep langEqualsHeader=true in the tabpane and add text=true
to the tab. Note that the gh-codeblock line cannot be indented at all.
One great thing about using gh-codeblock is that it adds a link to the full example.
This means you don’t have to include any additional context code, just the line(s) that
are needed, and the user can navigate to the repo to see how to use it.
If you want your example to include something other than code (default) or html (from gh-codeblock),
you need to first set text=true,
then change the Hugo syntax for the tab to use % instead of < and > with curly braces:
This is preferred to writing code comments because those will not be translated.
Only include the code that is needed for the documentation, and avoid over-explaining.
Finally, remember not to indent plain text or it will be rendered as a codeblock.
4 - Musings about how things came to be
Details mostly of interest to Selenium devs about how and why certain parts of the project were created
This is a work in progress. Feel free to add things you know or remember.
How did the Automation Atoms come about?
On 2012-04-04, jimevans asked on the #selenium IRC channel:
“What I wanted to ask you about was the history of the automation atoms. I seem to remember them springing fully formed, as if from the head of Zeus, and I’m sure that wasn’t the case. Can you refresh my memory as to how the concept happened?”
simonstewart then proceeded to tell us a nice little story:
Sure. Are we sitting comfortably? Then I’ll begin. (Brit joke, there)
Imagine wavy lines as the screen dissolves and we’re transported back to when selenium and webdriver were different projects. Before the projects merged, there was an awful lot of congruent code in webdriver. Congruent, but not shared. The Firefox driver was in JS. The IE driver was mostly C++. The Chrome driver was mostly JS, but different JS from the Firefox driver. And HtmlUnit was unique.
We then added Selenium Core to the mix. Yet more JS that did basically the same thing.
Within Google, I was becoming the TL of the browser automation team. And was corralling a framework of our own into the mix. Which was written in JS, and had once been based on Core before it span off on its own path.
So: multiple codebases, lots of JS doing more or less the same thing. And loads of bugs. Weird mismatches of behaviour in edge-cases.
*shudder*
So I had a bit of a think. (Dangerous, I know) The idea was to extract the “best of breed” code from all three frameworks (Core, WebDriver and the Google tool). Break them down into code that could be shared. “The smallest, indivisible unit of browser automation”.
Or “atoms” for short.
These could be used as the basis for everything. Consistent behaviour between browsers. And apis. The other important point was that the JS code in webdriver and core was grown organically. Which is a polite way of saying “I’d rather never edit it again”. Which is a polite way of saying that it was of dubious quality. In places.
So: high quality was important. And I wanted the code broken up into modules. Because editing a 10k LOC file isn’t a bright idea.
Within Google we had a library called Closure. Which not only allowed modularization, but “denormalization” of modules into a single file via compilation. And I knew it was being open sourced. So we started building the library in the google codebase. (Where we had access to the unreleased library, code review tools and our amazing testing infrastructure). Using Closure Library.
“dom.js” was probably the first file I wrote. (We can check). Greg Dennis and Jason Leyba joined in the fun. And the atoms have been growing ever since.
Technically, we should be calling anything outside of “javascript/atoms” molecules. But then we can’t say that we have atomic drivers. and use imagery from the 50s to describe them.
*sigh*
jimevans replied: “molecular drivers?”
And simonstewart finished with:
Indeed :) The idea is that the atoms are the lowest level. And we compose the atoms to conform to the WebDriver or RC apis in “javascript/{selenium,webdriver}-atoms” respectively. And then suck those in as necessary.
A Story of Crazy-Fun
Simon Stewart:
So, let’s go back to the very beginning of the project
When it was me, on my own (the webdriver project, that is, not selenium itself)
I knew that I wanted to cover multiple different languages, and so wanted a build tool that could work with all of them
That is, that didn't have a built in preference for one that made working with other languages painful
ant is java biased. As is maven. nant and msbuild are .net biased
rake, otoh, supports nothing very well
But, and this is key, any valid rake script is also a valid ruby program
It's possible to extend rake to build anything
So: rake it was
The initial rake file was pretty small and manageable
But as the project grew, so did the Rakefile
Until there was only one person who could deal with it (me), and even then it was pretty shaky
So, rather than have a project that couldn't be built, I extracted some helper methods to do some of the heavy lifting
Which made the Rakefile comprehensible again
But the project kept. getting. bigger
And the Rakefile got harder and harder to grok
At the time, I was working at Google, who have a wonderful build system
Google's system is declarative and works across multiple different languages consistently
And, most important, it breaks up the build from a single file into little fragments
I asked the OSS chaps at Google if it was okay to open source the build grammar, and they gave it the green light
So we layered that build grammar into the selenium codebase
With one minor change (we handle dictionary args)
But that grammar sits on top of rake still, after all this time
And there's a problem
And that's that rake is single threaded
So our builds are constrained to run serially
We could use "multitask" types to improve things, but when I've tried that things got very messy, very fast
So, our next hurdle is that crazyfun.rb is slow: we need to go faster
Which implies a rewrite of crazyfun
I'm most comfortable in java
So, I've spiked a new version in java that handles the java and js compilation
It's significantly faster
But, and this is also important, it's a spike
The code was designed to be disposable. Now that things have been proved out, I'd really like to do a clean implementation
But I'm torn
Do I "finish" the new, very fast crazyfun java enough to replace the ruby version?
A story of driver executables
jimevans:
noob_einsteinsfo: alright, story time, then. are we sitting comfortably? then we'll begin.
noob_einsteinsfo: back when i first started working on the project (circa 2010), the drivers for all of the browsers were built and maintained by the project. at the time, that meant IE, firefox, and chrome.
all of those drivers were packaged as part of the selenium standalone server, and were also packaged in with the various language bindings.
this was a conscious decision, so that if one were running locally, there would be no need for the java runtime on the machine just to automate a given browser.
there were two factors that led to the development of browser drivers as separate executables.
as a quick aside, remember that the webdriver philosophy is to automate the browser using the "best-fit" mechanism for that particular browser. for IE, that means using the COM interfaces; for firefox at the time, that meant using a browser extension; for chrome, it also meant a browser extension.
so that meant that the IE driver was developed as a DLL in C++ that was loaded by the language bindings, and communicated with via whatever native-code mechanism was provided by the language (JNI for java, P/Invoke for .NET, ctypes for python, etc.).
it also meant that the firefox driver was developed as a browser extension that was packaged inside the various language bindings, and extracted, and used in a profile in firefox.
as i said, the IE driver was implemented as a DLL, loaded and communicated with using different mechanisms for different language bindings.
the problem is that each of those language-specific mechanisms had different load/unload semantics. ruby, for example, would never call the windows FreeLibrary API after loading the DLL into memory, making multiple instances really challenging.
*process* semantics, however, as in, starting, stopping, and managing the lifetime of a process on the OS, whatever the OS, are remarkably similar across all languages.
so when the IE driver rewrite was completed in 2010, the development team (me) decided to make it a separate executable, so that the load/unload semantics could be consistent no matter what language bindings one was using.
concurrently with this, the chromium team made the decision to follow opera's lead and provide a driver implementation for chrome. an implementation that they would develop, enhance, and maintain going forward, relieving the selenium project of the burden of maintaining a chrome driver.
XgizmoX:
and that driver is part of the browser?
jimevans:
XgizmoX: not really, but i believe there may be some smarts built into chrome itself that knows when it's being automated via chromedriver. one of the googlers would be a better person to ask about that.
anyway, knowing the difference in shared library (.dll/.so/.dylib) loading semantics, the chromium team (with my encouragement) decided to release their chromedriver implementation as a separate executable.
fast-forward a couple of years, and you begin to see the effort to make webdriver a w3c standard.
a working group with the w3c created a specification (still in progress, but getting close to finished with the first version), which codified the behavior of webdriver, and how a browser should react to its methods.
furthermore, it standardized the protocol used to communicate between language bindings and a driver for a particular browser.
i can't emphasize how important and groundbreaking this was.
because the w3c and the webdriver working group within it are made up of representatives from the browser vendors themselves, it ensures that the solution will be supported directly by the browser vendors.
mozilla created their webdriver implementation (geckodriver) for firefox. the most efficient mechanism for distribution of that browser driver, while maintaining the proper semantics for the language bindings, was to ship as a separate executable.
note, this is a gross oversimplification of the geckodriver architecture; the actual executable acts as a relatively thin shim, translating from the wire protocol of the spec to their internal marionette protocol, but the point still stands.
anyway, the landscape is currently evolving regarding browser-vendor-provided driver implementation. microsoft has one for edge, apple has one for safari (10 and above), the chromium team (largely staffed by googlers) has one for chrome, and now mozilla has one for firefox.
given the limited utility of the legacy firefox driver going forward, breaking it out into a separate executable would be wasted effort. this is particularly so, since all of the communication bits that are normally handled by the executable (listening for and responding to http requests from the language bindings) are handled entirely by the browser extension. there's literally no need for the legacy firefox driver to be a separate executable. moreover, making it independent of a language runtime would be a significant amount of work (because a .NET shop might reasonably balk at being required to install, say, the java runtime just to automate firefox). so historically speaking, noob-einsteinsfo, that's the general reason why separate executables have become the norm, and why that paradigm wasn't extended to include the legacy firefox driver. does that make sense? okay. now. about geckodriver. the tale of geckodriver is intimately bound with the status of the aforementioned w3c webdriver spec. level 1 of the specification is mostly done, though it took a number of years of effort to get there. it took a large effort from some very smart people (AutomatedTester among them) to mold the initial documentation of what the webdriver open source software (OSS) project did into proper specification language that could be interpreted and turned into actionable stuff by a browser vendor or other implementor. when beginning the geckodriver (née marionette) project, mozilla decided to base their implementation on the spec, and only the spec, not following the OSS implementation. this created something of a chicken-and-egg problem, in that while the spec language wasn't completed, it couldn't be implemented. it's only been in the last six months or so that the language concerning the advanced user interactions api (the Actions class in java and .NET) has been made robust enough to actually implement.
accordingly, that's the single biggest missing chunk of functionality in geckodriver at present. it wasn't implementable via the spec, so it hasn't been implemented. i do know that it's a very high priority for AutomatedTester and his team to get that implementation done and available. as for why geckodriver is mandatory, and the default implementation for automating firefox in 3.x, that also comes down to some decisions made by mozilla.
TheSchaf so i guess there is no other choice than to use the old FF as long as required features are missing
WhereIsMySpoon TheSchaf: if you need those features, yes or use another browser
TheSchaf well, moveTo and sendKeys should be pretty basic :p
jimevans TheSchaf: element.sendKeys works just fine. it's Actions.sendKeys that would be broken. in firefox version fortysomething (i misremember the exact version), there was a feature added that blocked browser extensions that hadn't been signed by the mozilla security team. remember that the legacy firefox driver was built as a browser extension? well, with that feature of the browser enabled, the legacy driver couldn't be loaded by the browser. now, for several versions of firefox, it was possible to disable this feature of the browser, and allow unsigned extensions to continue to be loaded. and selenium did this, by virtue of the settings used in the anonymous profile the bindings created when launching firefox. until firefox 48, at which point it was no longer possible to disable the signing requirement for extensions. at that point, geckodriver was the only way forward. now, two more slight points, then i'll be done with story time. first, by nature of what the legacy driver extension does, it's not possible to get it to pass the certification process of the mozilla security team. we asked, were denied, and were told it wouldn't happen ever, full stop. and that's perfectly reasonable, since what that extension does is a security hole big enough to drive a whole fleet of lorries through. second, it turns out there may, in fact, be a way to privately sign the legacy extension so that it can be loaded and used privately by firefox versions 48 and higher. that's still a less-than-ideal approach, because there's no way that our merry band of open source developers can know how to automate firefox better than the development teams at mozilla, who create the browser in the first place. i totally get the frustration that geckodriver doesn't have full feature parity with the legacy implementation, especially when it feels like one is being forced to move to it.
raging at the selenium project about that decision is directing one's ire in entirely the wrong direction. however, before going off and saying horrible things about mozilla's decisions, do know that mozilla has several people who are constantly engaged in the project, a few of them right here in this very channel (AutomatedTester and davehunt, to name two). i'm sure i've glossed over or mischaracterized some of the historical details of these things, and i'm happy to be corrected. i'm old, after all, and the memory isn't what it used to be. but that, my friends, is the (not so very) short history of why we have separate executables for drivers, why geckodriver is the way forward, and why a move to it was necessary when the move was made, even though some functionality was lacking.
jimevans feels like he's become an unofficial historian of the webdriver project