Jay Abhani
Senior Web Development Instructor at almaBetter
Master XPath with this ultimate XPath Cheat Sheet! Learn syntax, locators, axes, queries, and examples for Selenium, XML, and web scraping, all in one place.
XPath (XML Path Language) is a versatile and powerful tool used to navigate through and extract data from XML and HTML documents. This cheat sheet covers everything from basic XPath syntax to advanced techniques for Selenium automation, web scraping, and XML parsing.
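XPath is a query language for locating and extracting information within an XML or HTML document. It is supported by many automation and web scraping tools, including Selenium, Puppeteer, Scrapy, and lxml. BeautifulSoup, by contrast, has no XPath support of its own and relies on CSS selectors and its own search methods.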
XPath expressions define paths to navigate through a document's nodes. The two primary types of paths are:
Absolute XPath: Starts from the root node.
/html/body/div
Relative XPath: Starts from the current node or anywhere in the document.
//div[@class='example']
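As a rough illustration of the difference, the sketch below uses Python's standard-library `xml.etree.ElementTree`. That module supports only a limited subset of XPath and evaluates paths relative to the element it is called on, so the absolute path `/html/body/div` is written as `./body/div` from the `html` root. The sample markup is invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for illustration.
DOC = """
<html>
  <body>
    <div class="example">top-level</div>
    <section>
      <div class="example">nested</div>
    </section>
  </body>
</html>
"""

root = ET.fromstring(DOC)  # root is the <html> element

# Absolute-style path: every step from the root is spelled out,
# so only the div directly under <body> matches.
absolute_hits = [d.text for d in root.findall("./body/div")]

# Relative path: ".//" searches at any depth below the root.
relative_hits = [d.text for d in root.findall(".//div[@class='example']")]

print(absolute_hits)
print(relative_hits)
```

The absolute form breaks as soon as the page layout changes (for instance, if the div is wrapped in another element), which is why relative paths are usually the more robust choice.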
Select all nodes of a type:
//tagname
Select by attribute:
//tagname[@attribute='value']
Select by partial text:
//*[contains(text(), 'partial')]
Select using functions:
//*[starts-with(@attribute, 'value')]
Examples:
Select all div elements:
//div
Select a elements that have an href attribute:
//a[@href]
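The tag and attribute patterns above can be tried out with Python's standard-library ElementTree. This is only a sketch under assumptions: the markup is made up, and ElementTree implements just part of XPath, so `contains()` and `starts-with()` are not available there and need a full XPath 1.0 engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, invented for illustration.
DOC = """
<body>
  <div id="main"><a href="/home">Home</a></div>
  <div><a>No link target</a></div>
  <p lang="en">Hello</p>
</body>
"""
root = ET.fromstring(DOC)

# //div : every div at any depth
all_divs = root.findall(".//div")

# //a[@href] : only a elements that carry an href attribute
links_with_href = root.findall(".//a[@href]")

# //tagname[@attribute='value'] : match on an exact attribute value
english = root.findall(".//p[@lang='en']")

print(len(all_divs))
print([a.get("href") for a in links_with_href])
print([p.text for p in english])
```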
Locators help identify elements in a document. XPath locators are essential for automation frameworks like Selenium.
Locate by ID:
//*[@id='elementID']
Locate by class:
//*[@class='className']
Locate by exact text:
//*[text()='Sample Text']
Examples:
Locate a button by text:
//button[text()='Submit']
Locate an element with multiple conditions:
//input[@type='text' and @name='username']
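A minimal sketch of these locators against an invented form, again using the standard-library ElementTree. Two substitutions are assumptions forced by its limited XPath subset: `[.='Submit']` stands in for `text()='Submit'`, and chained predicates `[A][B]` stand in for `[A and B]`, which selects the same elements here.

```python
import xml.etree.ElementTree as ET

# Hypothetical form markup, invented for illustration.
FORM = """
<form>
  <input type="text" name="username" id="user"/>
  <input type="password" name="password"/>
  <input type="text" name="email"/>
  <button>Submit</button>
  <button>Cancel</button>
</form>
"""
form = ET.fromstring(FORM)

# //*[@id='user'] : any element with that id
by_id = form.find(".//*[@id='user']")

# //button[text()='Submit'] : button whose text is exactly "Submit"
submit = form.find(".//button[.='Submit']")

# //input[@type='text' and @name='username'] : both conditions must hold
username = form.findall(".//input[@type='text'][@name='username']")

print(by_id.get("name"))
print(submit.text)
print(len(username))
```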
Axes define relationships between nodes. Common axes include:
child::tagname
parent::tagname
following-sibling::tagname
Examples:
Select the first p child of a div:
/html/body/div/child::p[1]
Select all ancestors of the current node:
ancestor::*
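To make the relationships concrete, here is a sketch that walks the same axes in Python. The standard-library ElementTree has no axis syntax beyond `..` for the parent, so `ancestor` and `following-sibling` are emulated with a parent map. The helper names `ancestors` and `following_siblings` are hypothetical, and a full XPath engine would evaluate the axis expressions directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for illustration.
DOC = """
<html>
  <body>
    <div>
      <h2>Title</h2>
      <p>First</p>
      <p>Second</p>
    </div>
  </body>
</html>
"""
root = ET.fromstring(DOC)

# Elements do not store their parent, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

def ancestors(elem):
    """Rough equivalent of ancestor::* (nearest ancestor first)."""
    found = []
    while elem in parent_of:
        elem = parent_of[elem]
        found.append(elem)
    return found

def following_siblings(elem, tag):
    """Rough equivalent of following-sibling::tag."""
    siblings = list(parent_of[elem])
    later = siblings[siblings.index(elem) + 1:]
    return [s for s in later if s.tag == tag]

h2 = root.find(".//h2")
children = root.findall("./body/div/p")  # child::p under /html/body/div
parents = root.findall(".//h2/..")       # parent::* of the h2

print([a.tag for a in ancestors(h2)])
print([s.text for s in following_siblings(h2, "p")])
```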
Feature | XPath | CSS Selectors
Navigate backward | Yes | No
Complex conditions | Yes | Limited
Examples:
XPath:
//div[@id='example']
CSS:
div#example
Use contains for dynamic content:
//*[contains(@class, 'dynamic')]
Extract product prices:
//span[@class='price']
Select all links:
//a
Select text input fields:
//input[@type='text']
Combine multiple conditions:
//div[@class='class1' and contains(@id, 'partial')]
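# Selenium 4 style; the find_element_by_xpath helpers were removed in Selenium 4.3.
from selenium.webdriver.common.by import By
driver.find_element(By.XPATH, "//div[@id='example']")
driver.find_elements(By.XPATH, "//a")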
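XPath:
//div[@class='example']
CSS:
div.example
These two are not exact equivalents: the CSS form matches any div whose class list includes example, while the XPath form matches only when the class attribute is exactly 'example'.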
Use local-name() for namespace-agnostic queries:
//*[local-name()='tagname']
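The same idea can be sketched in Python's standard-library ElementTree, which does not implement `local-name()` but, from Python 3.8 onward, accepts `{*}` as a wildcard for any namespace. The feed markup and namespace URIs below are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed with two namespaces, invented for illustration.
FEED = """
<feed xmlns="http://example.com/ns/a" xmlns:b="http://example.com/ns/b">
  <item>one</item>
  <b:item>two</b:item>
</feed>
"""
root = ET.fromstring(FEED)

# A bare tag name only matches elements that are in no namespace,
# so this finds nothing in a fully namespaced document.
plain = root.findall(".//item")

# "{*}item" matches item in any namespace, similar in spirit to
# //*[local-name()='item'].
any_ns = root.findall(".//{*}item")

print(len(plain))
print([e.text for e in any_ns])
```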
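XPath injection occurs when unsanitized user input is inserted directly into an XPath query. For example, if the two input values below come straight from a login form, an entry such as ' or '1'='1 changes the logic of the query instead of being compared as data:
//user[username='input' and password='input']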
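One way to reduce this risk is to quote user input as an XPath string literal before building the query. The sketch below is an assumption-laden example, not a complete defense: `xpath_literal` and `build_query` are hypothetical helper names, and where an XPath library supports bound variables, those are usually the better option.

```python
def xpath_literal(value: str) -> str:
    """Return value as an XPath 1.0 string literal.

    XPath 1.0 has no escape character, so a value containing both
    quote types has to be assembled with concat().
    """
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    pieces = []
    for i, part in enumerate(value.split("'")):
        if i > 0:
            pieces.append('"\'"')  # a single quote, wrapped in double quotes
        if part:
            pieces.append(f"'{part}'")
    return "concat(" + ", ".join(pieces) + ")"


def build_query(username: str, password: str) -> str:
    """Build the login query with both inputs quoted as literals."""
    return (f"//user[username={xpath_literal(username)} "
            f"and password={xpath_literal(password)}]")


attack = "' or '1'='1"

# Naive string formatting: the input rewrites the predicate.
unsafe = f"//user[username='{attack}' and password='x']"

# Quoted input: the whole attack string is compared as plain data.
safe = build_query(attack, "x")

print(unsafe)
print(safe)
```

In the unsafe query the attacker's quotes close the string early and add an `or` clause, while in the safe query the same characters stay inside a single double-quoted literal.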
This XPath cheat sheet serves as a comprehensive guide for navigating and querying XML/HTML documents effectively. Whether you're using XPath for Selenium automation, web scraping, or XML processing, the concepts, examples, and techniques discussed here will help you master XPath quickly.