Is there a possibility to address elements on a website which have no ID?
Solution 1:
You will need to use some type of selector.
The GetElementById method works best because, if the HTML is well formed, there should only be one element with that unique ID.
The GetElementFromPoint method returns the element at a given X,Y coordinate of the document; it is best used in the document's Click event.
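As a rough sketch of the click-based approach: the following assumes a Windows Forms form (Form1) with a designer-created WebBrowser control named WebBrowser1. The handler name Document_Click is made up for illustration.

```vbnet
Imports System.Diagnostics
Imports System.Windows.Forms

Public Class Form1
    'Hook the document's Click event once the page has finished loading.
    'Assumption: WebBrowser1 is declared WithEvents by the form designer.
    Private Sub WebBrowser1_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        AddHandler WebBrowser1.Document.Click, AddressOf Document_Click
    End Sub

    Private Sub Document_Click(sender As Object, e As HtmlElementEventArgs)
        'ClientMousePosition is the mouse position within the document's client area.
        Dim clicked As HtmlElement = WebBrowser1.Document.GetElementFromPoint(e.ClientMousePosition)
        If clicked IsNot Nothing Then
            Debug.WriteLine(clicked.TagName & ": " & clicked.InnerText)
        End If
    End Sub
End Class
```

Note that DocumentCompleted can fire more than once on pages with frames, so a real application may want to guard against attaching the handler repeatedly.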
The GetElementsByTagName method returns a collection of elements and works if you know the tag type of the element, such as <button>...</button> or <p>...</p>. To narrow down which element you want, you then need to iterate through the returned collection and compare either the element's attributes, if you know their values, or the element's content via the InnerHtml property.
The last and least effective option is the All property, which returns every element in the document. It is the least effective because GetElementsByTagName at least lets you narrow down the collection based on the tag's name.
However, let's assume that you have the following markup:
<body>
<p>Look at my super complex HTML markup.</p>
<button>Click Me</button>
<button>No, click me!</button>
</body>
You could then get the button tag that says "Click Me" by using the following:
Dim click_me As HtmlElement = WebBrowser1.Document.GetElementsByTagName("button").Cast(Of HtmlElement)().SingleOrDefault(Function(el) el.InnerHtml = "Click Me")
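Keep in mind that SingleOrDefault throws an InvalidOperationException when more than one element matches. If that can happen on your page, a plain loop that returns the first match may be a safer sketch. The function name FindByTagAndText is invented for this example, and the text is trimmed on the assumption that the element's content may carry surrounding whitespace.

```vbnet
Imports System.Windows.Forms

Module ElementSearch
    'Hypothetical helper: returns the first element with the given tag whose
    'trimmed text equals 'wantedText', or Nothing if there is no such element.
    Public Function FindByTagAndText(doc As HtmlDocument, tagName As String, wantedText As String) As HtmlElement
        For Each candidate As HtmlElement In doc.GetElementsByTagName(tagName)
            Dim content As String = candidate.InnerText
            If content IsNot Nothing AndAlso content.Trim() = wantedText Then
                Return candidate
            End If
        Next
        Return Nothing
    End Function
End Module
```

It could then be called as FindByTagAndText(WebBrowser1.Document, "button", "Click Me").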
Solution 2:
Seeing as this question is asked every now and then I'll see if I can't try to tackle this once and for all. Here's a more extensive guide on how to find elements that don't have an ID:
- The basics -
There are plenty of built-in properties and methods you can use in order to identify an element. The most common ones include:
- HtmlElement.GetElementsByTagName() Method. Returns a collection of all elements in the document/element having the specified HTML tag. This can be called on an HtmlElement as well as on the HtmlDocument itself.
- HtmlElement.GetAttribute() Method. Returns the value of a specific attribute on the specified HtmlElement.
- HtmlElement.InnerHtml Property. Returns all HTML code located inside the specified element (but not the code for the element itself).
- HtmlElement.InnerText Property. Returns all text (stripped from HTML code) located inside the specified element.
- HtmlElement.OuterHtml Property. Returns the HTML code located inside the specified element, including the code for the element itself.
These methods and properties can all be used in different ways to identify an element, as illustrated by the examples below.
NOTE: I omitted HtmlElement.OuterText because its behaviour is a bit odd, and I'm not 100% sure what it actually does.
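To see how these properties differ in practice, a quick sketch like the following can be run once the document has loaded. It assumes the page contains at least one <p> element; the helper name DumpFirstParagraph is made up. The exact markup returned by InnerHtml and OuterHtml depends on the underlying browser engine, which may normalise things such as tag case, so no expected output is shown.

```vbnet
Imports System.Diagnostics
Imports System.Windows.Forms

Module PropertyDemo
    'Hypothetical helper: prints the properties of the first <p> element, if any.
    Public Sub DumpFirstParagraph(doc As HtmlDocument)
        Dim paragraphs As HtmlElementCollection = doc.GetElementsByTagName("p")
        If paragraphs.Count = 0 Then Return

        Dim first As HtmlElement = paragraphs(0)
        Debug.WriteLine("InnerText: " & first.InnerText)
        Debug.WriteLine("InnerHtml: " & first.InnerHtml)
        Debug.WriteLine("OuterHtml: " & first.OuterHtml)
        Debug.WriteLine("class:     " & first.GetAttribute("className"))
    End Sub
End Module
```

Calling DumpFirstParagraph(WebBrowser1.Document) and comparing the lines in the debug output is usually the quickest way to decide which property to match against.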
- Examples of finding elements with no ID -
Following are a set of examples of how you can use the previously mentioned methods and properties in order to find the element you're looking for.
Finding an element by its class(-name)
To find an element based on its class attribute you have to iterate all elements and check GetAttribute("className") on each. If you know the element type (tag name) beforehand you can narrow the search by first getting a collection of all the elements of that type using HtmlDocument.GetElementsByTagName() instead of HtmlDocument.All.
HTML code:
<div class="header">
    <div id="title" class="centerHelper">
        <img id="logo" src="img/logo.png"/>
    </div>
    <p class="headerContent">
        Hello World!
    </p>
</div>
Element to locate:
<p class="headerContent">
VB.NET code:
'Iterate all elements.
For Each Element As HtmlElement In WebBrowser1.Document.All
    If Element.GetAttribute("className") = "headerContent" Then
        'Found. Do something with 'Element'...
        Exit For 'Stop looping.
    End If
Next
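One caveat with the equality check above: GetAttribute("className") returns the entire class attribute, so an element declared as class="headerContent wide" would not match. A possible workaround, sketched here with a made-up helper named HasClass, is to split the value on whitespace and compare each token.

```vbnet
Imports System
Imports System.Linq
Imports System.Windows.Forms

Module ClassSearch
    'Hypothetical helper: True if 'className' is one of the whitespace-separated
    'tokens in the element's class attribute (case-sensitive comparison).
    Public Function HasClass(element As HtmlElement, className As String) As Boolean
        Dim classValue As String = element.GetAttribute("className")
        If String.IsNullOrEmpty(classValue) Then Return False

        Dim separators As Char() = {" "c, ControlChars.Tab, ControlChars.Cr, ControlChars.Lf}
        Dim tokens As String() = classValue.Split(separators, StringSplitOptions.RemoveEmptyEntries)
        Return tokens.Contains(className)
    End Function
End Module
```

Inside the loop, the condition would then read If HasClass(Element, "headerContent") Then.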
Finding an element based on an attribute, located inside another element (with ID)
In order to find a child element based on one of its attributes, where the child is located inside a parent element (that has an ID) you simply need to get the parent element by its ID and then iterate all its children.
HTML code:
<select id="items" class="itemsList">
    <option value="2">Apple</option>
    <option value="3">Orange</option>
    <option value="5">Banana</option>
</select>
Element to locate:
<option value="5">Banana</option>
VB.NET code:
'Iterate all children of the element with ID "items".
For Each Element As HtmlElement In WebBrowser1.Document.GetElementById("items").Children
    If Element.GetAttribute("value") = "5" Then
        'Found. Do something with 'Element'...
        Exit For 'Stop looping.
    End If
Next
Finding an element based on an attribute, located inside another element (without ID)
To find a child element based on one of its attributes, where the child is located inside a parent element (that doesn't have an ID) you first have to create an outer loop that looks for the parent element. Then, when found, you can start iterating the children.
HTML code:
<select class="itemsList">
    <option value="2">Apple</option>
    <option value="3">Orange</option>
    <option value="5">Banana</option>
</select>
Element to locate:
<option value="5">Banana</option>
VB.NET code:
'Variable keeping track of whether we found the element we're looking for or not.
Dim ElementFound As Boolean = False

'Outer loop, looking for the parent object (<select class="itemsList">).
For Each Element As HtmlElement In WebBrowser1.Document.GetElementsByTagName("select") 'Iterate all <select> tags. You can use Document.All here as well.
    If Element.GetAttribute("className") = "itemsList" Then
        'Parent found.

        'Inner loop, looking for the child element we want (<option value="5">Banana</option>).
        For Each OptionElement As HtmlElement In Element.GetElementsByTagName("option")
            If OptionElement.GetAttribute("value") = "5" Then
                'Found. Do something with 'OptionElement'...
                ElementFound = True
                Exit For 'Exit the inner loop.
            End If
        Next

        'Exit the outer loop if we found the element we're looking for.
        If ElementFound Then Exit For
    End If
Next
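If the flag variable feels clumsy, the same nested search can also be expressed as a LINQ query. This is only a sketch: HtmlElementCollection is not generic, so it has to go through Cast(Of HtmlElement)() first, and the function name FindOption and its parameters are invented for the example.

```vbnet
Imports System.Linq
Imports System.Windows.Forms

Module NestedSearch
    'Hypothetical helper: finds the first <option> with the given value inside any
    '<select> whose class attribute equals 'listClass'. Returns Nothing if not found.
    Public Function FindOption(doc As HtmlDocument, listClass As String, optionValue As String) As HtmlElement
        Return (From parent In doc.GetElementsByTagName("select").Cast(Of HtmlElement)()
                Where parent.GetAttribute("className") = listClass
                From child In parent.GetElementsByTagName("option").Cast(Of HtmlElement)()
                Where child.GetAttribute("value") = optionValue
                Select child).FirstOrDefault()
    End Function
End Module
```

For the markup above it would be called as FindOption(WebBrowser1.Document, "itemsList", "5"). The explicit loops remain the easier choice when you need to do something at each level rather than just return the match.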
Finding an element based on its InnerText
In some cases the element you want to locate doesn't have any attributes or is simply too similar to a lot of other elements on the site. In this case, if its contents are always the same you can identify it via its InnerText or InnerHtml properties.
HTML code:
<h1>Important information</h1>
<p>Please read this information through <b>carefully</b> before continuing.</p>
<h2>Copyrighted material</h2>
<p>All material (text, images, video, etc.) on this site are <b>copyrighted</b> to COMPANY NAME.</p>
Element to locate:
<h2>Copyrighted material</h2>
VB.NET code:
For Each Element As HtmlElement In WebBrowser1.Document.All
    If Element.InnerText = "Copyrighted material" Then
        'Found. Do something with 'Element'...
        Exit For 'Stop looping.
    End If
Next
Finding an element based on its InnerHtml
Finding an element based on its InnerHtml works exactly the same way as when you look based on its InnerText, apart from that the string you're checking now also includes HTML code.
HTML code:
<h1>Important information</h1>
<p>Please read this information through <b>carefully</b> before continuing.</p>
<h2>Copyrighted material</h2>
<p>All material (text, images, video, etc.) on this site are <b>copyrighted</b> to COMPANY NAME.</p>
Element to locate:
<p>All material (text, images, video, etc.) on this site are <b>copyrighted</b> to COMPANY NAME.</p>
VB.NET code:
'Iterate all <p> tags.
For Each Element As HtmlElement In WebBrowser1.Document.GetElementsByTagName("p")
    If Element.InnerHtml.Contains("<b>copyrighted</b>") Then
        'Found. Do something with 'Element'...
        Exit For 'Stop looping.
    End If
Next
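Matching on InnerHtml can be brittle, because the string comes back from the browser engine rather than from the original source, and the engine may, for instance, report tag names in upper case. It is also possible for InnerHtml to be Nothing on some elements. A more forgiving comparison could look like the sketch below; the helper name InnerHtmlContains is made up for illustration.

```vbnet
Imports System
Imports System.Windows.Forms

Module HtmlSearch
    'Hypothetical helper: case-insensitive search inside an element's InnerHtml.
    'Returns False when the element has no inner HTML at all.
    Public Function InnerHtmlContains(element As HtmlElement, fragment As String) As Boolean
        Dim html As String = element.InnerHtml
        If html Is Nothing Then Return False
        Return html.IndexOf(fragment, StringComparison.OrdinalIgnoreCase) >= 0
    End Function
End Module
```

In the loop above, the condition would then become If InnerHtmlContains(Element, "<b>copyrighted</b>") Then.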