How to get all input elements in a form with HtmlAgilityPack without getting a null reference error
You can do the following:
HtmlNode.ElementsFlags.Remove("form");
HtmlDocument doc = new HtmlDocument();
doc.Load(@"D:\test.html");
HtmlNode secondForm = doc.GetElementbyId("form2");
foreach (HtmlNode node in secondForm.Elements("input"))
{
HtmlAttribute valueAttribute = node.Attributes["value"];
if (valueAttribute != null)
{
Console.WriteLine(valueAttribute.Value);
}
}
By default HTML Agility Pack parses forms as empty node because they are allowed to overlap other HTML elements. The first line, (HtmlNode.ElementsFlags.Remove("form");
) disables this behavior allowing you to get the input elements inside the second form.
Update: Example of form elements overlap:
<table>
<form>
<!-- Other elements -->
</table>
</form>
The element begins inside a table but is closed outside the table element. This is allowed in the HTML specification and HTML Agility Pack has to deal with it.