XPath for you and me

I learned XPath a few years ago and always found its documentation frustrating. A few basic concepts kept tripping me up as I learned it. My hope with this brief article is to make things a little easier for the next person who needs to learn this small but mighty tool.

What is XPath?

XPath stands for XML Path Language.

XPath is designed to point to parts of an XML document. We use it to select nodes in a document tree, such as the DOM, by matching patterns against them. It is used in XSLT, Selenium, and other areas where DOM navigation is useful.

When reading an XPath query, think of the DOM as a file hierarchy that we are navigating, much like the path in a URL. It makes more intuitive sense that way. Each parent element is a "folder" that can contain other folders (child elements).
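To make the folder analogy concrete, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`, which implements only a subset of XPath. The sample document and variable names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring(
    "<html>"
    "<body>"
    "<div><a href='/home'>Home</a></div>"
    "<div><p>No link here</p></div>"
    "</body>"
    "</html>"
)

# Read the query like a folder path:
# from <html>, into <body>, into each <div>, down to each <a>.
# ElementTree paths are relative to the element they are called on,
# hence the leading "./" instead of a bare "/".
links = doc.findall("./body/div/a")
link_targets = [a.get("href") for a in links]
print(link_targets)  # ['/home']
```

Only the first `<div>` contains an `<a>`, so only one link is found, just as a file path only resolves where the folders actually exist.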

The general syntax also has a lot in common with CSS selectors and, in its pattern-matching spirit, with regex.

XPath query structure

XPath queries are made up of four parts.

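The prefix determines the starting point of the query.
The axis describes the relationship between the context node and the nodes we want to select (child, descendant, parent, and so on).
The step names the element we're looking for (the XPath specification calls this the node test). Once matched, it becomes the context node for whatever follows.
The predicate, written in square brackets, narrows the step to the nodes that meet a condition.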

Note: The less specific an XPath query is, the more expensive it tends to be to evaluate, since a query such as `//a` may have to visit every node in the document. As with CSS selectors, there is a trade-off between specificity, flexibility, and performance.

Parts of an XPath query
`//`	`ul`	`/`	`a[@id='link']`
Prefix	Step	Axis	Step with predicate
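
As a rough illustration of how those four parts work together, the sketch below runs the same query with Python's standard-library `xml.etree.ElementTree`. Two assumptions to note: ElementTree supports only a subset of XPath and does not accept an absolute `//` prefix on an element, so the query is written as `.//`; and the sample markup is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, invented for this illustration.
doc = ET.fromstring(
    "<div>"
    "<ul><a id='link'>inside list</a><a id='other'>also inside</a></ul>"
    "<p><a id='link'>outside list</a></p>"
    "</div>"
)

# ".//"            prefix: search every descendant of the starting element
# "ul"             step: the element name to match
# "/"              axis: move to the direct children of the matched <ul>
# "a[@id='link']"  step with predicate: <a> elements whose id is "link"
matches = doc.findall(".//ul/a[@id='link']")
matched_text = [a.text for a in matches]
print(matched_text)  # ['inside list']
```

The `<a id='link'>` inside the `<p>` is skipped because it is not a child of a `<ul>`, and the `<a id='other'>` is skipped because it fails the predicate.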

Axis selector examples

Axis selectors allow us to "drill down" into the structure we're processing to access the node we're looking for.

Axis selector	Examples	Context
`//`	`/section/div//a`	Any descendant. As a prefix, matches anywhere in the document; between steps, matches at any depth below the preceding step
`./`	`./a`	Child relative to the current node
`/`	`/html/body/div`	As a prefix, starts at the document root; between steps, selects direct children of the preceding step
`.`	`.`	Self node
`..`	`..`	Parent node
`*`	`./*`	Any element, regardless of name (here, any child element of the current node)
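
The difference between these selectors is easiest to see side by side. This is a small sketch, again assuming Python's standard-library `xml.etree.ElementTree` (a subset of XPath, with paths relative to the element they are called on) and an invented sample document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
doc = ET.fromstring(
    "<section>"
    "<div><p><a>deep link</a></p><a>shallow link</a></div>"
    "<a>section link</a>"
    "</section>"
)

div = doc.find("./div")

# "./a": direct <a> children of the current node only
children = [a.text for a in div.findall("./a")]

# ".//a": <a> elements at any depth below the current node
descendants = [a.text for a in div.findall(".//a")]

# "./*": any child element, whatever its tag
any_child_tags = [el.tag for el in div.findall("./*")]

# ".//a/..": step back up to the parent of each matched <a>
parent_tags = [el.tag for el in doc.findall(".//a/..")]

print(children)        # ['shallow link']
print(descendants)     # ['deep link', 'shallow link']
print(any_child_tags)  # ['p', 'a']
print(parent_tags)     # ['p', 'div', 'section']
```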

Navigation

XPath also allows you to navigate up and down the hierarchy of the DOM, just like with folder navigation.

Selectors can be chained and can include some limited logic. They match on several kinds of criteria, similar to regex:

relationship (child, sibling, preceding, self)
attributes (id, class name, href)
order (first, last)
content (contains string “xyz”)

Selector examples

Example	Context
`//ul/li/a`	Relationship selector, matches a direct child relationship
`//input[@type="submit"]`	Attribute selector
`//ul/li[2]`	Order selector, selects the second `<li>` child of each `<ul>`. Note: positions start at 1, not 0.
`//button[contains(text(),"Go")]`	Contains text, in this case matching a substring
`//a[@name or @href]`	Or logic
`//ul/li/../../.`	Selects the parent of the `<ul>` (for example, a `<div><ul><li> </li></ul></div>` structure)
`//h1[not(@id)]`	Not selector. This example selects any `<h1>` without an id
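`./a[1][@href='/']`	An example of chaining. Here we select the first `<a>` child of the current context, but only if its `href` is `/`. Predicates apply left to right, so `./a[@href='/'][1]` would instead select the first `<a href="/">`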
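
A few of the selectors above can be tried out with Python's standard-library `xml.etree.ElementTree`. This is only a sketch under some assumptions: ElementTree covers a subset of XPath (relationship, attribute, and position predicates work, while `or`, `not()`, and `contains()` need a full XPath 1.0 engine), the queries are written with `.//` because ElementTree paths are relative, and the sample form is invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample form, invented for this illustration.
doc = ET.fromstring(
    "<form>"
    "<ul>"
    "<li><a href='/one'>One</a></li>"
    "<li><a href='/two'>Two</a></li>"
    "<li>Three</li>"
    "</ul>"
    "<input type='text' name='q'/>"
    "<input type='submit' name='go'/>"
    "</form>"
)

# Relationship: <a> directly inside <li> directly inside <ul>
hrefs = [a.get("href") for a in doc.findall(".//ul/li/a")]

# Attribute: <input> elements whose type is "submit"
submit_names = [i.get("name") for i in doc.findall(".//input[@type='submit']")]

# Order: the second <li> of the <ul> (positions start at 1, not 0)
second_li = doc.find(".//ul/li[2]")
second_text = second_li.find("./a").text

print(hrefs)         # ['/one', '/two']
print(submit_names)  # ['go']
print(second_text)   # Two
```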

A note about `contains()`: this function matches any string that contains the argument as a substring, which can produce unexpected results. In the example above, any button whose text includes Go is selected, so Go Home and Go to Next Page would both match. Combining `contains()` with other selectors can narrow the result to what you actually want.
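
The sketch below shows how loose a substring match is compared with an exact match. One caveat: Python's standard-library `xml.etree.ElementTree` does not implement `contains()`, so the substring test here is done with Python's `in` operator as a stand-in for the same semantics. The exact match uses the `[.='text']` predicate, which ElementTree supports from Python 3.7. The buttons are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample buttons, invented for this illustration.
doc = ET.fromstring(
    "<nav>"
    "<button>Go</button>"
    "<button>Go Home</button>"
    "<button>Go to Next Page</button>"
    "<button>Cancel</button>"
    "</nav>"
)

buttons = doc.findall(".//button")

# Substring match, standing in for contains(text(),"Go"):
# every button whose text includes "Go" is picked up.
loose = [b.text for b in buttons if "Go" in (b.text or "")]

# Exact match: only the button whose full text is "Go".
exact = [b.text for b in doc.findall(".//button[.='Go']")]

print(loose)  # ['Go', 'Go Home', 'Go to Next Page']
print(exact)  # ['Go']
```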