Why is this PHP call to json_encode silently failing - inability to handle single quotes?

I have a stdClass object called $post that, when dumped via print_r(), returns the following:

stdClass Object (
    [ID] => 12981
    [post_title] => Alumnus' Dinner Coming Soon
    [post_parent] => 0
    [post_date] => 2012-01-31 12:00:51
)

Echoing the result from calling json_encode() on this object results in the following:

{
    "ID": "12981",
    "post_title": null,
    "post_parent": "0",
    "post_date": "2012-01-31 12:00:51"
}

I'm assuming that something with the single quote is causing json_encode to choke, but I don't know what format is needed to escape that. Any ideas?

EDIT: Fixed mismatch in code examples. I'm running PHP version 5.3.8

EDIT2: Directly after encoding the object, I did this:

echo json_last_error() == JSON_ERROR_UTF8;

This printed 1, which means that the following error occurred: "Malformed UTF-8 characters, possibly incorrectly encoded" (see the documentation for json_last_error()).

EDIT3: Calling utf8_decode() on the post title resulted in the following: "Alumnus? Dinner Coming Soon". This data is being pulled from a MySQL database - the post title in particular is a text field, UTF-8 encoded. Maybe this single-quote is improperly encoded? The thing is, I have a SQL GUI app, and it appears correctly in that.


Solution 1:

You need to set the connection encoding before executing queries. How this is done depends on the API you are using to connect:

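  • call mysql_set_charset("utf8") if you use the old, deprecated mysql API.
  • call mysqli_set_charset($link, "utf8"), where $link is your connection (or $mysqli->set_charset("utf8") in object style), if you use mysqli.
  • add the charset parameter to the connection string (for example charset=utf8 in the DSN) if you use PDO and PHP >= 5.3.6. In earlier versions you need to execute SET NAMES utf8.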
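
As a rough sketch of the mysqli and PDO options above: the helper names and the connection details passed to them are hypothetical, and nothing here connects until you call the helpers with your own credentials.

```php
<?php
// Hypothetical helper: builds a PDO DSN that carries the charset parameter
// (honored by PDO's MySQL driver on PHP >= 5.3.6).
function build_mysql_dsn($host, $dbname, $charset = 'utf8')
{
    return "mysql:host={$host};dbname={$dbname};charset={$charset}";
}

// Hypothetical helper: opens a PDO connection whose client encoding is UTF-8.
function connect_pdo_utf8($host, $dbname, $user, $password)
{
    return new PDO(build_mysql_dsn($host, $dbname), $user, $password);
}

// Hypothetical helper: opens a mysqli connection and switches it to UTF-8
// before any query is run.
function connect_mysqli_utf8($host, $dbname, $user, $password)
{
    $link = mysqli_connect($host, $user, $password, $dbname);
    if (!$link) {
        throw new RuntimeException('Connect failed: ' . mysqli_connect_error());
    }
    if (!mysqli_set_charset($link, 'utf8')) {
        throw new RuntimeException('Could not set charset: ' . mysqli_error($link));
    }
    return $link;
}
```

The point in both cases is that the charset is fixed before the first query, so every string the server returns already arrives as UTF-8.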

When you obtain data from MySQL, any text will be encoded in the "client encoding", which is likely windows-1252 if you don't configure it otherwise. The character that is causing your problem is the "curly quote" (the right single quotation mark, U+2019), seen as 92 in the hex dump. 0x92 is that character's byte in windows-1252, which confirms that the mysql client is encoding text in windows-1252.

Another thing you might consider is passing all text through utf8_encode, but in this case it wouldn't produce the correct result. PHP's utf8_encode assumes its input is ISO-8859-1. In that encoding \x92 is a non-printable control character, which would be converted into the same non-printable control character (U+0092) in UTF-8. You could use str_replace("\x92", "'", $input) to fix the problem for this particular character, but if there's any chance of other non-ASCII characters in the database, you'll want to have the client use UTF-8.
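
To make the difference concrete, here is a small sketch that converts the same bytes twice, once as ISO-8859-1 and once as windows-1252. It assumes the mbstring extension is available and uses mb_convert_encoding() in place of utf8_encode(); the helper name cp1252_to_utf8 is made up for illustration.

```php
<?php
// Hypothetical helper: re-encode a windows-1252 byte string as UTF-8.
// Requires the mbstring extension.
function cp1252_to_utf8($text)
{
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1252');
}

// Bytes as a windows-1252 connection would deliver them.
$raw = "Alumnus\x92 Dinner Coming Soon";

// Treating the input as ISO-8859-1 (what utf8_encode does) maps 0x92 to the
// control character U+0092, encoded as C2 92.
$asLatin1 = mb_convert_encoding($raw, 'UTF-8', 'ISO-8859-1');

// Treating it as windows-1252 maps 0x92 to U+2019, encoded as E2 80 99.
$asCp1252 = cp1252_to_utf8($raw);

echo bin2hex(substr($asLatin1, 7, 2)), "\n"; // c292
echo bin2hex(substr($asCp1252, 7, 3)), "\n"; // e28099
echo json_encode($asCp1252), "\n";
```

This only helps if the data really is windows-1252; setting the connection charset remains the more general fix.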

Solution 2:

What I've had to do in the past to json_encode text containing non-ASCII characters is:

json_encode( utf8_encode( $s ) );

and in some cases

json_encode( htmlspecialchars( utf8_encode( $s ) ) );

Here:

  • utf8_encode() handles the special characters (note: that's Encode, not Decode).
  • htmlspecialchars() depends on how you mean to use the JSON string; you may be able to leave it out.
  • json_encode() finally produces your JSON packet.

Since you want to json_encode an object, you would need to call utf8_encode() on each text part first, or write a simple recursive utf8_encode(). For your example case, this would do:

function myEncode($o) {
    // The object in the question stores its title in post_title.
    $o->post_title = utf8_encode($o->post_title);
    return json_encode($o);
}

Solution 3:

I suggest you use a json_encode wrapper like this:

function safe_json_encode($value){
    if (version_compare(PHP_VERSION, '5.4.0') >= 0) {
        $encoded = json_encode($value, JSON_PRETTY_PRINT);
    } else {
        $encoded = json_encode($value);
    }
    switch (json_last_error()) {
        case JSON_ERROR_NONE:
            return $encoded;
        case JSON_ERROR_DEPTH:
            return 'Maximum stack depth exceeded'; // or trigger_error() or throw new Exception()
        case JSON_ERROR_STATE_MISMATCH:
            return 'Underflow or the modes mismatch'; // or trigger_error() or throw new Exception()
        case JSON_ERROR_CTRL_CHAR:
            return 'Unexpected control character found';
        case JSON_ERROR_SYNTAX:
            return 'Syntax error, malformed JSON'; // or trigger_error() or throw new Exception()
        case JSON_ERROR_UTF8:
            $clean = utf8ize($value);
            return safe_json_encode($clean);
        default:
            return 'Unknown error'; // or trigger_error() or throw new Exception()
    }
}


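function utf8ize($mixed) {
    if (is_array($mixed)) {
        foreach ($mixed as $key => $value) {
            $mixed[$key] = utf8ize($value);
        }
    } elseif (is_object($mixed)) {
        // Objects (such as the stdClass in the question) must be converted too,
        // otherwise safe_json_encode() would keep retrying the same bad data.
        $mixed = clone $mixed; // work on a copy, leave the caller's object alone
        foreach (get_object_vars($mixed) as $key => $value) {
            $mixed->$key = utf8ize($value);
        }
    } elseif (is_string($mixed)) {
        return utf8_encode($mixed);
    }
    return $mixed;
}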

Once these functions are defined, you can use them directly:

echo safe_json_encode($response);
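
One caveat with the wrapper: it returns its error messages as ordinary strings, which a caller cannot tell apart from valid JSON. The comments in it already hint at throwing instead. A minimal sketch of that variant, assuming PHP >= 5.5 for json_last_error_msg() and using a made-up function name:

```php
<?php
// Hypothetical variant: throw on any encoding error instead of returning
// a message string. Assumes PHP >= 5.5 for json_last_error_msg().
function json_encode_or_throw($value)
{
    $encoded = json_encode($value);
    if ($encoded === false || json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('json_encode failed: ' . json_last_error_msg());
    }
    return $encoded;
}

try {
    // The \x92 byte is not valid UTF-8, so this call is expected to throw.
    echo json_encode_or_throw(array('post_title' => "Alumnus\x92 Dinner")), "\n";
} catch (RuntimeException $e) {
    echo $e->getMessage(), "\n";
}
```

This does not repair the data; it only makes the failure loud, so the encoding problem gets fixed at the source.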

Solution 4:

I was having the same issue while JSON-encoding a PHP array built from ODBC query results. My server's ODBC is configured with 'en_US.819', and it is a production server, so there is no way I can even touch that.

When I tried:

echo json_encode($GLOBALS['response']);

where 'response' is an array with the results, it worked as intended as long as no unusual character was present. If one was, json_encode failed and returned an empty result.

The solution? UTF-8 encode the results while fetching the rows from the query:

$result = odbc_exec($conn, $sql_query);
$response = array();
while ($row = odbc_fetch_array($result)) {
    $json = array(); // start each row with a fresh array
    $json['pers_identificador'] = $row['pers_identificador'];
    $json['nombre_persona'] = utf8_encode($row['nombre_persona']);
    $json['nombre_1'] = utf8_encode($row['nombre_1']);
    $json['nombre_2'] = utf8_encode($row['nombre_2']);
    $response[] = $json;
}
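
If there are many text columns, the per-column calls can be folded into one helper. The following is only a sketch: row_to_utf8 is a hypothetical name, it assumes the mbstring extension, and it assumes that 'en_US.819' means the driver delivers ISO-8859-1 (code page 819). The sample row stands in for what odbc_fetch_array() might return.

```php
<?php
// Hypothetical helper: convert every string column of a fetched row to UTF-8.
// Assumes the driver delivers ISO-8859-1 and that mbstring is available.
function row_to_utf8(array $row, $from = 'ISO-8859-1')
{
    foreach ($row as $column => $value) {
        if (is_string($value)) {
            $row[$column] = mb_convert_encoding($value, 'UTF-8', $from);
        }
    }
    return $row;
}

// Sample row standing in for what odbc_fetch_array() might return.
$row = array('pers_identificador' => '301', 'nombre_1' => "JOS\xC9");
echo json_encode(row_to_utf8($row)), "\n";
```

Inside the fetch loop this would reduce the body to $response[] = row_to_utf8($row);.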

Now json_encode works! The resulting string is something like this:

{"page_id":300,"max_rows":"100","cant_rows":"12897","datos":
  [{"pers_identificador":"301","cedula":"15250068","interno_1":"178202","interno_2":"","nombre_persona":"JOSE JUAN PANDOLFO ZAGORODKO","nombre_1":"JOSE","nombre_2":"JUAN",....

That fixed my issue.