htmlspecialchars vs htmlentities when concerned with XSS
Solution 1:
htmlspecialchars() will NOT protect you against UTF-7 XSS exploits, which still plague Internet Explorer, even IE 9: http://securethoughts.com/2009/05/exploiting-ie8-utf-7-xss-vulnerability-using-local-redirection/
For instance:
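<?php
$_GET['password'] = 'asdf&ddddd"fancy˝quotes˝';

// htmlspecialchars() escapes only &, ", < and > (plus ' when ENT_QUOTES is set).
echo htmlspecialchars($_GET['password'], ENT_COMPAT | ENT_HTML401, 'UTF-8') . "\n";
// Output (as reported; depends on the source file's encoding): asdf&amp;ddddd&quot;fancyË
// htmlentities() converts every character that has a named HTML entity.
echo htmlentities($_GET['password'], ENT_COMPAT | ENT_HTML401, 'UTF-8') . "\n";
// Output (as reported; depends on the source file's encoding): asdf&amp;ddddd&quot;fancy&Euml;quotes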
You should always use htmlentities() and only rarely use htmlspecialchars() when sanitizing user input. Also, you should always strip tags first. And for really important and secure sites, you should NEVER trust strip_tags() on its own. Use HTMLPurifier for PHP.
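One reason not to rely on strip_tags() can be shown with a small sketch. The input string below is made up for illustration; the point is only that strip_tags() leaves the attributes of any allowed tag untouched, so an event handler can survive it.

```php
<?php
// Illustrative input (an assumption, not from the original answer).
$input = '<a href="#" onclick="alert(1)">click</a><script>alert(2)</script>';

// Allow only <a> tags; every other tag is removed, but its text content stays.
$stripped = strip_tags($input, '<a>');

// The onclick attribute on the allowed <a> tag is still present.
echo $stripped . "\n";
```

A dedicated HTML sanitizer such as HTMLPurifier filters attributes as well as tags, which is why it is the safer choice when some markup has to be allowed.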
Solution 2:
If PHP's header() function is used to set the charset:
header('Content-Type: text/html; charset=utf-8');
then htmlspecialchars() and htmlentities() should both be safe for output of HTML, because XSS cannot then be achieved using UTF-7 encodings.
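A minimal sketch of that combination might look like the following. The helper name escape_html() and the choice of ENT_QUOTES | ENT_SUBSTITUTE are assumptions made for illustration, not something the answer prescribes.

```php
<?php
// Declare the charset explicitly so the browser never has to guess the encoding.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

/**
 * Hypothetical helper: escape a value for HTML element content or a
 * quoted attribute value.
 */
function escape_html(string $value): string
{
    // ENT_QUOTES also escapes single quotes; ENT_SUBSTITUTE replaces invalid
    // UTF-8 sequences with U+FFFD instead of returning an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$comment = '<script>alert("xss")</script>';
echo '<p>' . escape_html($comment) . '</p>' . "\n";
// Prints: <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>
```

Passing the same charset to the escaping function as the one declared in the header keeps the two in agreement, which is the assumption the answer relies on.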
Please note that these functions should not be used to output values into JavaScript or CSS, because an attacker could enter characters that break out of the JavaScript or CSS context and put your site at risk. Please see the OWASP XSS Prevention Cheat Sheet on how to appropriately handle these situations.
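For the JavaScript case, one commonly used approach is to emit the value with json_encode() and the JSON_HEX_* flags rather than an HTML escaping function. The sketch below assumes PHP 7.3 or later (for JSON_THROW_ON_ERROR), and the helper name escape_js() is hypothetical.

```php
<?php
/**
 * Hypothetical helper: encode a PHP value as a JavaScript literal that can
 * be placed inside an inline <script> block.
 */
function escape_js($value): string
{
    // The JSON_HEX_* flags turn <, >, &, ' and " into \uXXXX escapes, so the
    // value cannot close the <script> element or the surrounding string.
    $flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT
        | JSON_THROW_ON_ERROR;
    return json_encode($value, $flags);
}

// Illustrative attacker-controlled value (an assumption for this example).
$name = '</script><script>alert(1)</script>';
echo '<script>var name = ' . escape_js($name) . ';</script>' . "\n";
```

This covers only values placed where a JavaScript literal is expected; CSS and URL contexts need their own encoders, as the cheat sheet describes.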