What HTML parsing libraries do you recommend in Java [closed]

I want to parse some HTML to find the values of certain tags and attributes.

What HTML parsers do you recommend? Any pros and cons?


Solution 1:

NekoHTML, TagSoup, and JTidy will let you parse HTML, including markup that is not well-formed, and then process the result with XML tools such as XPath.
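
To illustrate the "process with XML tools" half of that workflow, here is a minimal sketch that uses only the JDK's built-in DOM and XPath APIs. It assumes the markup has already been cleaned into well-formed XML by one of the libraries above; that cleaning step is not shown, and the class name `XPathOverCleanedHtml` and the method `linkTargets` are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: not part of NekoHTML, TagSoup, or JTidy.
class XPathOverCleanedHtml {

    // Returns the href value of every <a> element, in document order.
    // Assumption: the input is already well-formed (for example, the output
    // of an HTML-cleaning parser) and carries no XML namespace or DOCTYPE.
    static List<String> linkTargets(String wellFormedMarkup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc =
                builder.parse(new InputSource(new StringReader(wellFormedMarkup)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes =
                (NodeList) xpath.evaluate("//a/@href", doc, XPathConstants.NODESET);

        List<String> hrefs = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            hrefs.add(nodes.item(i).getNodeValue());
        }
        return hrefs;
    }

    public static void main(String[] args) throws Exception {
        String markup = "<html><body><a href=\"/one\">One</a>"
                + "<p><a href=\"/two\">Two</a></p></body></html>";
        System.out.println(linkTargets(markup));
    }
}
```

With a real page you would feed the cleaning parser's output (a DOM tree or a SAX stream, depending on the library) into the XPath step instead of a hand-written string.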

Solution 2:

I have tried HTML Parser, which is dead simple.

Solution 3:

Do you need to do a full parse of the HTML? If you're just looking for specific values within the contents (a specific tag/param), then a simple regular expression might be enough, and could very well be faster.
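
As a rough sketch of the regular-expression route, the following pulls the `src` value out of every `<img>` tag using `java.util.regex`. The class name `RegexAttributeScan`, the method `imageSources`, and the pattern itself are assumptions made for this example. The pattern only handles quoted attribute values and will also match tags inside comments or scripts, so it suits quick extraction from markup you control rather than arbitrary pages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class RegexAttributeScan {

    // Group 1 captures the opening quote (single or double);
    // group 2 captures the attribute value up to the matching quote.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\bsrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    // Returns the src value of each <img> tag, in order of appearance.
    static List<String> imageSources(String html) {
        List<String> sources = new ArrayList<>();
        Matcher matcher = IMG_SRC.matcher(html);
        while (matcher.find()) {
            sources.add(matcher.group(2));
        }
        return sources;
    }

    public static void main(String[] args) {
        String html = "<p>Logo: <IMG alt=\"logo\" src=\"/img/logo.png\">"
                + "<img src='a.gif' ></p>";
        System.out.println(imageSources(html));
    }
}
```

If the pattern starts accumulating special cases (unquoted values, nested markup, entities), that is usually the point where a real parser becomes the simpler option.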