How to remove html special chars?
Either decode them using html_entity_decode
or remove them using preg_replace
:
$Content = preg_replace("/&#?[a-z0-9]+;/i","",$Content);
(From here)
EDIT: Alternative according to Jacco's comment
might be nice to replace the '+' with {2,8} or something. This will limit the chance of replacing entire sentences when an unencoded '&' is present.
$Content = preg_replace("/&#?[a-z0-9]{2,8};/i","",$Content);
Use html_entity_decode
to convert HTML entities.
You'll need to set charset to make it work correctly.
In addition to the good answers above, PHP also has a built-in filter function that is quite useful: filter-var.
To remove HMTL characters, use:
$cleanString = filter_var($dirtyString, FILTER_SANITIZE_STRING);
More info:
- function.filter-var
- filter_sanitize_string
You may want take a look at htmlentities() and html_entity_decode() here
$orig = "I'll \"walk\" the <b>dog</b> now";
$a = htmlentities($orig);
$b = html_entity_decode($a);
echo $a; // I'll "walk" the <b>dog</b> now
echo $b; // I'll "walk" the <b>dog</b> now