What is the difference between cssSelector & Xpath and which is better with respect to performance for cross browser testing?

I am working with the Selenium WebDriver 2.25.0 on multilingual web application & mainly test the page content (For different languages like Arabic, English, Russian & so on).

For my application which is better according to performance & make sure that it should be support for all the browsers (i.e. IE 7,8,9, FF, Chrome etc).

Thanks in advance for your valueable suggestions.


Solution 1:

CSS selectors perform far better than Xpath and it is well documented in Selenium community. Here are some reasons,

  • Xpath engines are different in each browser, hence make them inconsistent
  • IE does not have a native xpath engine, therefore selenium injects its own xpath engine for compatibility of its API. Hence we lose the advantage of using native browser features that WebDriver inherently promotes.
  • Xpath tend to become complex and hence make hard to read in my opinion

However there are some situations where, you need to use xpath, for example, searching for a parent element or searching element by its text (I wouldn't recommend the later).

You can read blog from Simon here . He also recommends CSS over Xpath.

If you are testing content then do not use selectors that are dependent on the content of the elements. That will be a maintenance nightmare for every locale. Try talking with developers and use techniques that they used to externalize the text in the application, like dictionaries or resource bundles etc. Here is my blog that explains it in detail.

edit 1

Thanks to @parishodak, here is the link which provides the numbers proving that CSS performance is better

Solution 2:

I’m going to hold the unpopular on SO selenium tag opinion that XPath is preferable to CSS in the longer run.

This long post has two sections - first I'll put a back-of-the-napkin proof the performance difference between the two is 0.1-0.3 milliseconds (yes; that's 100 microseconds), and then I'll share my opinion why XPath is more powerful.


Performance difference

Let's first tackle "the elephant in the room" – that xpath is slower than css.

With the current cpu power (read: anything x86 produced since 2013), even on browserstack/saucelabs/aws VMs, and the development of the browsers (read: all the popular ones in the last 5 years) that is hardly the case. The browser's engines have developed, the support of xpath is uniform, IE is out of the picture (hopefully for most of us). This comparison in the other answer is being cited all over the place, but it is very contextual – how many are running – or care about – automation against IE8?

If there is a difference, it is in a fraction of a millisecond.

Yet, most higher-level frameworks add at least 1ms of overhead over the raw selenium call anyways (wrappers, handlers, state storing etc); my personal weapon of choice – RobotFramework – adds at least 2ms, which I am more than happy to sacrifice for what it provides. A network roundtrip from an AWS us-east-1 to BrowserStack's hub is usually 11 milliseconds.

So with remote browsers if there is a difference between xpath and css, it is overshadowed by everything else, in orders of magnitude.


The measurements

There are not that many public comparisons (I've really seen only the cited one), so – here's a rough single-case, dummy and simple one.
It will locate an element by the two strategies X times, and compare the average time for that.

The target – BrowserStack's landing page, and its "Sign Up" button; a screenshot of the html as writing this post:

enter image description here

Here's the test code (python):

from selenium import webdriver
import timeit


if __name__ == '__main__':

    xpath_locator = '//div[@class="button-section col-xs-12 row"]'
    css_locator = 'div.button-section.col-xs-12.row'

    repetitions = 1000

    driver = webdriver.Chrome()
    driver.get('https://www.browserstack.com/')

    css_time = timeit.timeit("driver.find_element_by_css_selector(css_locator)", 
                             number=repetitions, globals=globals())
    xpath_time = timeit.timeit('driver.find_element_by_xpath(xpath_locator)', 
                             number=repetitions, globals=globals())

    driver.quit()

    print("css total time {} repeats: {:.2f}s, per find: {:.2f}ms".
          format(repetitions, css_time, (css_time/repetitions)*1000))
    print("xpath total time for {} repeats: {:.2f}s, per find: {:.2f}ms".
          format(repetitions, xpath_time, (xpath_time/repetitions)*1000))

For those not familiar with Python – it opens the page, and finds the element – first with the css locator, then with the xpath; the find operation is repeated 1,000 times. The output is the total time in seconds for the 1,000 repetitions, and average time for one find in milliseconds.

The locators are:

  • for xpath – "a div element having this exact class value, somewhere in the DOM";
  • the css is similar – "a div element with this class, somewhere in the DOM".

Deliberately chosen not to be over-tuned; also, the class selector is cited for the css as "the second fastest after an id".

The environment – Chrome v66.0.3359.139, chromedriver v2.38, cpu: ULV Core M-5Y10 usually running at 1.5GHz (yes, a "word-processing" one, not even a regular i7 beast).

Here's the output:

css total time 1000 repeats: 8.84s, per find: 8.84ms

xpath total time for 1000 repeats: 8.52s, per find: 8.52ms

Obviously the per find timings are pretty close; the difference is 0.32 milliseconds. Don't jump "the xpath is faster" – sometimes it is, sometimes it's css.


Let's try with another set of locators, a tiny-bit more complicated – an attribute having a substring (common approach at least for me, going after an element's class when a part of it bears functional meaning):

xpath_locator = '//div[contains(@class, "button-section")]'
css_locator = 'div[class~=button-section]'

The two locators are again semantically the same – "find a div element having in its class attribute this substring".
Here are the results:

css total time 1000 repeats: 8.60s, per find: 8.60ms

xpath total time for 1000 repeats: 8.75s, per find: 8.75ms

Diff of 0.15ms.


As an exercise - the same test as done in the linked blog in the comments/other answer - the test page is public, and so is the testing code.

They are doing a couple of things in the code - clicking on a column to sort by it, then getting the values, and checking the UI sort is correct.
I'll cut it - just get the locators, after all - this is the root test, right?

The same code as above, with these changes in:

  • The url is now http://the-internet.herokuapp.com/tables; there are 2 tests.

  • The locators for the first one - "Finding Elements By ID and Class" - are:

css_locator = '#table2 tbody .dues'
xpath_locator = "//table[@id='table2']//tr/td[contains(@class,'dues')]"

And here is the outcome:

css total time 1000 repeats: 8.24s, per find: 8.24ms

xpath total time for 1000 repeats: 8.45s, per find: 8.45ms

Diff of 0.2 milliseconds.

The "Finding Elements By Traversing":

css_locator = '#table1 tbody tr td:nth-of-type(4)'
xpath_locator = "//table[@id='table1']//tr/td[4]"

The result:

css total time 1000 repeats: 9.29s, per find: 9.29ms

xpath total time for 1000 repeats: 8.79s, per find: 8.79ms

This time it is 0.5 ms (in reverse, xpath turned out "faster" here).

So 5 years later (better browsers engines) and focusing only on the locators performance (no actions like sorting in the UI, etc), the same testbed - there is practically no difference between CSS and XPath.


So, out of xpath and css, which of the two to choose for performance? The answer is simple – choose locating by id.

Long story short, if the id of an element is unique (as it's supposed to be according to the specs), its value plays an important role in the browser's internal representation of the DOM, and thus is usually the fastest.

Yet, unique and constant (e.g. not auto-generated) ids are not always available, which brings us to "why XPath if there's CSS?"


The XPath advantage

With the performance out of the picture, why do I think xpath is better? Simple – versatility, and power.

Xpath is a language developed for working with XML documents; as such, it allows for much more powerful constructs than css.
For example, navigation in every direction in the tree – find an element, then go to its grandparent and search for a child of it having certain properties.
It allows embedded boolean conditions – cond1 and not(cond2 or not(cond3 and cond4)); embedded selectors – "find a div having these children with these attributes, and then navigate according to it".
XPath allows searching based on a node's value (its text) – however frowned upon this practice is, it does come in handy especially in badly structured documents (no definite attributes to step on, like dynamic ids and classes - locate the element by its text content).

The stepping in css is definitely easier – one can start writing selectors in a matter of minutes; but after a couple of days of usage, the power and possibilities xpath has quickly overcomes css.
And purely subjective – a complex css is much harder to read than a complex xpath expression.

Outro ;)

Finally, again very subjective - which one to chose?

IMO, there is no right or wrong choice - they are different solutions to the same problem, and whatever is more suitable for the job should be picked.

Being "a fan" of XPath I'm not shy to use in my projects a mix of both - heck, sometimes it is much faster to just throw a CSS one, if I know it will do the work just fine.

Solution 3:

The debate between cssSelector vs XPath would remain as one of the most subjective debate in the Selenium Community. What we already know so far can be summarized as:

  • People in favor of cssSelector say that it is more readable and faster (especially when running against Internet Explorer).
  • While those in favor of XPath tout it's ability to transverse the page (while cssSelector cannot).
  • Traversing the DOM in older browsers like IE8 does not work with cssSelector but is fine with XPath.
  • XPath can walk up the DOM (e.g. from child to parent), whereas cssSelector can only traverse down the DOM (e.g. from parent to child)
  • However not being able to traverse the DOM with cssSelector in older browsers isn't necessarily a bad thing as it is more of an indicator that your page has poor design and could benefit from some helpful markup.
  • Ben Burton mentions you should use cssSelector because that's how applications are built. This makes the tests easier to write, talk about, and have others help maintain.
  • Adam Goucher says to adopt a more hybrid approach -- focusing first on IDs, then cssSelector, and leveraging XPath only when you need it (e.g. walking up the DOM) and that XPath will always be more powerful for advanced locators.

Dave Haeffner carried out a test on a page with two HTML data tables, one table is written without helpful attributes (ID and Class), and the other with them. I have analyzed the test procedure and the outcome of this experiment in details in the discussion Why should I ever use cssSelector selectors as opposed to XPath for automated testing?. While this experiment demonstrated that each Locator Strategy is reasonably equivalent across browsers, it didn't adequately paint the whole picture for us. Dave Haeffner in the other discussion Css Vs. X Path, Under a Microscope mentioned, in an an end-to-end test there were a lot of other variables at play Sauce startup, Browser start up, and latency to and from the application under test. The unfortunate takeaway from that experiment could be that one driver may be faster than the other (e.g. IE vs Firefox), when in fact, that's wasn't the case at all. To get a real taste of what the performance difference is between cssSelector and XPath, we needed to dig deeper. We did that by running everything from a local machine while using a performance benchmarking utility. We also focused on a specific Selenium action rather than the entire test run, and run things numerous times. I have analyzed the specific test procedure and the outcome of this experiment in details in the discussion cssSelector vs XPath for selenium. But the tests were still missing one aspect i.e. more browser coverage (e.g., Internet Explorer 9 and 10) and testing against a larger and deeper page.

Dave Haeffner in another discussion Css Vs. X Path, Under a Microscope (Part 2) mentions, in order to make sure the required benchmarks are covered in the best possible way we need to consider an example that demonstrates a large and deep page.


Test SetUp

To demonstrate this detailed example, a Windows XP virtual machine was setup and Ruby (1.9.3) was installed. All the available browsers and their equivalent browser drivers for Selenium was also installed. For benchmarking, Ruby's standard lib benchmark was used.


Test Code

require_relative 'base'
require 'benchmark'

class LargeDOM < Base

  LOCATORS = {
    nested_sibling_traversal: {
      css: "div#siblings > div:nth-of-type(1) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3) > div:nth-of-type(3)",
      xpath: "//div[@id='siblings']/div[1]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]/div[3]"
    },
    nested_sibling_traversal_by_class: {
      css: "div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1 > div.item-1",
      xpath: "//div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]/div[contains(@class, 'item-1')]"
    },
    table_header_id_and_class: {
      css: "table#large-table thead .column-50",
      xpath: "//table[@id='large-table']//thead//*[@class='column-50']"
    },
    table_header_id_class_and_direct_desc: {
      css: "table#large-table > thead .column-50",
      xpath: "//table[@id='large-table']/thead//*[@class='column-50']"
    },
    table_header_traversing: {
      css: "table#large-table thead tr th:nth-of-type(50)",
      xpath: "//table[@id='large-table']//thead//tr//th[50]"
    },
    table_header_traversing_and_direct_desc: {
      css: "table#large-table > thead > tr > th:nth-of-type(50)",
      xpath: "//table[@id='large-table']/thead/tr/th[50]"
    },
    table_cell_id_and_class: {
      css: "table#large-table tbody .column-50",
      xpath: "//table[@id='large-table']//tbody//*[@class='column-50']"
    },
    table_cell_id_class_and_direct_desc: {
      css: "table#large-table > tbody .column-50",
      xpath: "//table[@id='large-table']/tbody//*[@class='column-50']"
    },
    table_cell_traversing: {
      css: "table#large-table tbody tr td:nth-of-type(50)",
      xpath: "//table[@id='large-table']//tbody//tr//td[50]"
    },
    table_cell_traversing_and_direct_desc: {
      css: "table#large-table > tbody > tr > td:nth-of-type(50)",
      xpath: "//table[@id='large-table']/tbody/tr/td[50]"
    }
  }

  attr_reader :driver

  def initialize(driver)
    @driver = driver
    visit '/large'
    is_displayed?(id: 'siblings')
    super
  end

  # The benchmarking approach was borrowed from
  # http://rubylearning.com/blog/2013/06/19/how-do-i-benchmark-ruby-code/
  def benchmark
    Benchmark.bmbm(27) do |bm|
      LOCATORS.each do |example, data|
    data.each do |strategy, locator|
      bm.report(example.to_s + " using " + strategy.to_s) do
        begin
          ENV['iterations'].to_i.times do |count|
         find(strategy => locator)
          end
        rescue Selenium::WebDriver::Error::NoSuchElementError => error
          puts "( 0.0 )"
        end
      end
    end
      end
    end
  end

end

Results

NOTE: The output is in seconds, and the results are for the total run time of 100 executions.

In Table Form:

css_xpath_under_microscopev2

In Chart Form:

  • Chrome:

chart-chrome

  • Firefox:

chart-firefox

  • Internet Explorer 8:

chart-ie8

  • Internet Explorer 9:

chart-ie9

  • Internet Explorer 10:

chart-ie10

  • Opera:

chart-opera


Analyzing the Results

  • Chrome and Firefox are clearly tuned for faster cssSelector performance.
  • Internet Explorer 8 is a grab bag of cssSelector that won't work, an out of control XPath traversal that takes ~65 seconds, and a 38 second table traversal with no cssSelector result to compare it against.
  • In IE 9 and 10, XPath is faster overall. In Safari, it's a toss up, except for a couple of slower traversal runs with XPath. And across almost all browsers, the nested sibling traversal and table cell traversal done with XPath are an expensive operation.
  • These shouldn't be that surprising since the locators are brittle and inefficient and we need to avoid them.

Summary

  • Overall there are two circumstances where XPath is markedly slower than cssSelector. But they are easily avoidable.
  • The performance difference is slightly in favor of css-selectors for non-IE browsers and slightly in favor of xpath for IE browsers.

Trivia

You can perform the bench-marking on your own, using this library where Dave Haeffner wrapped up all the code.