How to sanitize HTML code to prevent XSS attacks in Java or JSP?

I'd recommend using Jsoup for this. Here's the relevant extract from its site.

Sanitize untrusted HTML

Problem

You want to allow untrusted users to supply HTML for output on your website (e.g. as comment submission). You need to clean this HTML to avoid cross-site scripting (XSS) attacks.

Solution

Use the jsoup HTML Cleaner with a configuration specified by a Whitelist.

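import org.jsoup.Jsoup;
import org.jsoup.safety.Whitelist; // newer jsoup releases rename this class to Safelist

String unsafe =
      "<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";
// Whitelist.basic() keeps simple text formatting and links; the onclick handler is dropped
String safe = Jsoup.clean(unsafe, Whitelist.basic());
      // now: <p><a href="http://example.com/" rel="nofollow">Link</a></p>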

Jsoup offers more advantages than that as well. See also Pros and Cons of HTML parsers in Java.


You should use OWASP AntiSamy. (That's what I did.)


If none of the ready-made options seem like enough, there is an excellent series of articles on XSS and attack prevention at Google Code. It should provide plenty of information to work with, if you end up going down that path.
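If you do end up writing it by hand, the simplest case is when users don't need to submit any markup at all: escape everything on output instead of sanitizing. Below is a minimal sketch using only the JDK. The class name HtmlEscaper and the method escapeHtml are hypothetical names chosen for illustration, not part of any library, and the sketch only covers output into HTML element content and quoted attribute values, not JavaScript, CSS, or URL contexts.

```java
// Hypothetical helper for illustration; not part of any library.
final class HtmlEscaper {
    private HtmlEscaper() {}

    /**
     * Replaces the five characters that are significant in HTML text
     * and quoted attribute values with their entity equivalents.
     * Assumption of this sketch: null input is rendered as an empty string.
     */
    static String escapeHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String unsafe = "<a href='http://example.com/' onclick='stealCookies()'>Link</a>";
        // The markup is printed as inert text rather than as a live link.
        System.out.println(escapeHtml(unsafe));
    }
}
```

In a JSP you usually don't need a helper like this: JSTL's `<c:out value="${...}"/>` and `${fn:escapeXml(...)}` perform the same kind of escaping. Escaping only fits when no user-supplied markup should survive; if users must be able to submit some HTML, a sanitizer such as the ones mentioned above is the better fit.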