PHP json encode - Malformed UTF-8 characters, possibly incorrectly encoded

I'm using json_encode($data) to an data array and there's a field contains Russian characters.

I used this mb_detect_encoding() to display what encoding it is for that field and it displays UTF-8.

I think the json encode failed due to some bad characters in it like "ра▒". I tried alot of things utf8_encode on the data and it will by pass that error but then the data doesn't look correct anymore.

What can be done with this issue?


The issue happens if there are some non-utf8 characters inside even though most of them are utf8 chars. This will remove any non-utf8 characters and now it works.

$data['name'] = mb_convert_encoding($data['name'], 'UTF-8', 'UTF-8');

If you have a multidimensional array to encode in JSON format then you can use below function:

If JSON_ERROR_UTF8 occurred :

$encoded = json_encode( utf8ize( $responseForJS ) );

Below function is used to encode Array data recursively

/* Use it for json_encode some corrupt UTF-8 chars
 * useful for = malformed utf-8 characters possibly incorrectly encoded by json_encode
 */
function utf8ize( $mixed ) {
    if (is_array($mixed)) {
        foreach ($mixed as $key => $value) {
            $mixed[$key] = utf8ize($value);
        }
    } elseif (is_string($mixed)) {
        return mb_convert_encoding($mixed, "UTF-8", "UTF-8");
    }
    return $mixed;
}

Please, make sure to initiate your Pdo object with the charset iso as utf8. This should fix this problem avoiding any re-utf8izing dance.

$pdo = new PDO("mysql:host=localhost;dbname=mybase;charset=utf8", 'user', 'password');