What HTML parsing libraries do you recommend in Java [closed]
I want to parse some HTML in order to find the values of some attributes/tags etc.
What HTML parsers do you recommend? Any pros and cons?
Solution 1:
NekoHTML, TagSoup, and JTidy all let you parse HTML and then process the result with standard XML tools such as XPath.
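To give an idea of the XPath half of that workflow, here is a minimal sketch that pulls every `href` out of a document. It makes one big assumption: the JDK's own XML parser stands in for NekoHTML/TagSoup/JTidy, so the input must already be well-formed markup. With real-world HTML you would obtain the DOM from one of those libraries instead. The class and method names (`XPathLinkExtractor`, `hrefs`) are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any of the libraries named above.
class XPathLinkExtractor {

    // Returns the href value of every <a> element, in document order.
    // ASSUMPTION: the markup is well-formed, because the JDK parser is
    // used here as a stand-in for an HTML-tolerant parser.
    static List<String> hrefs(String wellFormedMarkup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new InputSource(new StringReader(wellFormedMarkup)));

            // Select the href attribute nodes of all <a> elements.
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "//a/@href", doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getNodeValue());
            }
            return result;
        } catch (Exception e) {
            // Parsing or XPath evaluation failed (e.g. malformed input).
            throw new IllegalArgumentException("Could not process markup", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html><body><a href=\"/one\">One</a>"
                    + "<p><a href=\"/two\">Two</a></p></body></html>";
        System.out.println(hrefs(page));
    }
}
```

Depending on which parser builds the DOM, element names may come back in a different case or inside a namespace, so the XPath expression may need adjusting.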
Solution 2:
I have tried HTML Parser, which is dead simple to use.
Solution 3:
Do you need a full parse of the HTML? If you are only looking for specific values in the content (a particular tag or attribute), a simple regular expression might be enough, and could very well be faster.
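As a rough illustration of the regex route, the sketch below collects the quoted `href` values of `<a>` tags using only `java.util.regex`. The class name `RegexAttributeExtractor` and the pattern itself are just assumptions for the example. It only copes with simple, predictable markup: unquoted attribute values, tags inside comments or scripts, and similar edge cases will trip it up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper showing the "simple regular expression" approach.
class RegexAttributeExtractor {

    // Matches href="..." or href='...' inside an <a ...> tag.
    // Group 1 is the quote character, group 2 is the attribute value.
    private static final Pattern HREF = Pattern.compile(
        "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
        Pattern.CASE_INSENSITIVE);

    // Returns the quoted href values found in the given HTML text.
    static List<String> hrefs(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            result.add(m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<A HREF='/x'>x</A><p><a class=\"c\" href=\"/y\">y</a>";
        System.out.println(hrefs(html));
    }
}
```

If the pages you are scraping are not under your control, a real parser is usually the safer choice, since small changes in the markup can silently break a pattern like this.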