PHP messing with HTML Charset Encoding
I have this very strange problem. I have a site that contains some German letters. When it's only HTML without PHP, the symbols are displayed properly, but when I change the encoding to UTF-8 they don't display, and instead of Ö I get �. When I put the HTML inside PHP and run it from Zend Studio on WAMP with the charset=iso-8859-1 encoding, I get � instead of Ö (I want to add that this same Ö is the value of a radio button). When it's in a tag it displays properly. Can you tell me how to fix this issue? I have looked at other sites, and they use UTF-8 encoding and display the same symbol properly. I tried to change the PHP editor encoding, but I suppose it doesn't matter, since everything displays properly inside Zend Studio's editor. Thank you in advance.
Solution 1:
You have probably mixed encodings. For example, a page that is sent as iso-8859-1 but receives UTF-8 text from MySQL or XML will typically fail.
To solve this problem, you must keep track of the encoding of every input relative to the encoding you have chosen to use internally.
If you send the page as iso-8859-1, the input from the user is also iso-8859-1.
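To see why a mix produces �, it can help to look at the raw bytes. The following is only a minimal sketch, assuming the mbstring extension is loaded; the variable names are illustrative. It shows that the single byte used for Ö in iso-8859-1 is not valid UTF-8, which is why a browser reading the page as UTF-8 shows a replacement character.

```php
<?php
// "Ö" encoded two different ways.
$latin1 = "\xD6";       // iso-8859-1: one byte
$utf8   = "\xC3\x96";   // UTF-8: two bytes

// The iso-8859-1 byte on its own is not a valid UTF-8 sequence.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)

// Converting explicitly yields the UTF-8 byte sequence.
$converted = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
var_dump($converted === $utf8);                // bool(true)
```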
header("Content-Type: text/html; charset=iso-8859-1");
And if MySQL sends latin1, you do not have to do anything.
But if your input is not iso-8859-1, you must convert it before sending it to the user, or adapt it to MySQL's encoding before storing it.
$text = mb_convert_encoding($text, mb_internal_encoding(), 'UTF-8'); // from UTF-8 to the internal encoding
In short, you must always convert input to fit the internal encoding and convert output to match the external encoding.
This is the internal encoding I have chosen to use.
mb_internal_encoding('iso-8859-1'); // Internal encoding
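The rule of converting at the boundaries could be sketched roughly like this. The helper names to_internal and to_external are hypothetical and exist only for illustration, and the sketch assumes the mbstring extension is available and that the incoming text really is UTF-8.

```php
<?php
mb_internal_encoding('ISO-8859-1');

// Hypothetical helper: convert incoming text to the internal encoding.
function to_internal(string $text, string $from): string
{
    return mb_convert_encoding($text, mb_internal_encoding(), $from);
}

// Hypothetical helper: convert internal text to the encoding the page is sent in.
function to_external(string $text, string $to): string
{
    return mb_convert_encoding($text, $to, mb_internal_encoding());
}

$fromDatabase = "\xC3\x96l";                      // "Öl" as UTF-8 bytes (assumed input)
$internal     = to_internal($fromDatabase, 'UTF-8');

echo bin2hex($internal), "\n";                        // d66c
echo bin2hex(to_external($internal, 'UTF-8')), "\n";  // c3966c
```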
This is the code I use.
mb_language('uni'); // Mail encoding (UTF-8)
mb_internal_encoding('iso-8859-1'); // Internal encoding
mb_http_output('pass'); // Do not convert the HTTP output
// Convert $text from $from_code (detected if empty) to $to_code (internal encoding if empty).
function convert_encoding($text, $from_code='', $to_code='')
{
    if (empty($from_code))
    {
        $from_code = mb_detect_encoding($text, 'auto');
        // Plain ASCII is a subset of iso-8859-1, so treat it as such.
        if ($from_code == 'ASCII')
        {
            $from_code = 'iso-8859-1';
        }
    }
    if (empty($to_code))
    {
        return mb_convert_encoding($text, mb_internal_encoding(), $from_code);
    }
    return mb_convert_encoding($text, $to_code, $from_code);
}
// Replace special characters with HTML entities; $code is the encoding of $text.
function encoding_html($text, $code='')
{
    if (empty($code))
    {
        return htmlentities($text, ENT_NOQUOTES, mb_internal_encoding());
    }
    return mb_convert_encoding(htmlentities($text, ENT_NOQUOTES, $code), mb_internal_encoding(), $code);
}
// Turn HTML entities back into characters; $code is the encoding of $text.
function decoding_html($text, $code='')
{
    if (empty($code))
    {
        return html_entity_decode($text, ENT_NOQUOTES, mb_internal_encoding());
    }
    return mb_convert_encoding(html_entity_decode($text, ENT_NOQUOTES, $code), mb_internal_encoding(), $code);
}
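The encoding_html helper above relies on htmlentities, which is also one way to handle the radio button value from the question: once Ö is written as an entity, the markup is plain ASCII and no longer depends on the page charset. Here is a minimal sketch, assuming the value arrives as iso-8859-1 bytes; the function name radio_button is hypothetical.

```php
<?php
// Hypothetical helper: render a radio button whose value is given in $encoding.
function radio_button(string $name, string $value, string $encoding = 'ISO-8859-1'): string
{
    // Entities such as &Ouml; are pure ASCII, so they survive any page charset.
    $safeName  = htmlentities($name, ENT_QUOTES, $encoding);
    $safeValue = htmlentities($value, ENT_QUOTES, $encoding);

    return '<input type="radio" name="' . $safeName . '" value="' . $safeValue . '"> ' . $safeValue;
}

echo radio_button('letter', "\xD6"), "\n";
// <input type="radio" name="letter" value="&Ouml;"> &Ouml;
```

Unlike the ENT_NOQUOTES flag used in the answer's helpers, ENT_QUOTES is chosen here because the value ends up inside a quoted attribute.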
Solution 2:
Can you check the charset value in the Content-Type response header? The information I have is old (2009) and I don't know if it still holds for your setup, but the default charset in PHP is UTF-8 if you don't provide a Content-Type header with a charset (as of PHP 5.6, the default_charset setting defaults to UTF-8). Source
Hence set the header explicitly:
header("Content-Type: text/html; charset=iso-8859-1");
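As a rough sketch of declaring the charset explicitly and consistently, something like the following could go at the very top of the script. It assumes the files and data really are iso-8859-1; note that the charset parameter is written with an equals sign. The $charset variable is only illustrative.

```php
<?php
$charset = 'ISO-8859-1'; // assumption: files and data are iso-8859-1

// Used by PHP for the Content-Type header it sends automatically.
ini_set('default_charset', $charset);

// header() only has an effect before any output has been sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=' . $charset);
}

// Keep the <meta> declaration in the markup consistent with the header.
echo '<meta http-equiv="Content-Type" content="text/html; charset=' . $charset . '">', "\n";
```

The response header can then be checked in the browser's developer tools to confirm that it matches the encoding the file is actually saved in.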