UTF-8 character decoding in Objective-C
I am trying to parse a feed from a JSON web service on my iPhone, but the UTF-8 conversion is not working the way it should. Am I doing something wrong?
Here is a part of the feed: est un r\u00c3\u00aave en noir
Here is the code I have written to convert the above:
NSString* str1 = @"est un r \u00c3\u00aa ve en noir";
NSString* str = [NSString stringWithUTF8String:[str1 cStringUsingEncoding:NSUTF8StringEncoding]];
NSLog(@"Converted UTF %@",str);
This is the output that I am getting: Converted UTF est un r Ãª ve en noir
The expected output is: Converted UTF est un rêve en noir
When I checked the UTF table and used an online UTF converter, the same code produced the correct output with the string @"est un r\u00EAve en noir",
but I think that is the UTF-16 representation, if I am not mistaken. Now I am really stuck on how to parse this feed.
Solution 1:
After much grief I have figured this out. Use this.
NSString *correctString = [NSString stringWithCString:[utf8String cStringUsingEncoding:NSISOLatin1StringEncoding] encoding:NSUTF8StringEncoding];
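A likely reason this works: the feed looks double-encoded. The two UTF-8 bytes for ê (C3 AA) were at some point read as two separate Latin-1 characters (Ã and ª) and escaped individually. Converting the string back to ISO Latin-1 recovers the original bytes, which can then be decoded as UTF-8. Below is a minimal sketch of the same idea with nil checks added, assuming ARC; the helper name RepairDoubleEncodedString is hypothetical and not part of any library.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: undo one round of "UTF-8 bytes read as Latin-1".
// Returns the input unchanged when the round trip is not possible.
static NSString *RepairDoubleEncodedString(NSString *input) {
    // Map every character back to a single Latin-1 byte.
    // This yields nil if the string contains characters outside Latin-1.
    NSData *bytes = [input dataUsingEncoding:NSISOLatin1StringEncoding];
    if (bytes == nil) {
        return input;
    }
    // Reinterpret those bytes as UTF-8.
    // This yields nil if the bytes are not valid UTF-8.
    NSString *repaired = [[NSString alloc] initWithData:bytes
                                               encoding:NSUTF8StringEncoding];
    return repaired != nil ? repaired : input;
}
```

Calling RepairDoubleEncodedString(@"est un r\u00c3\u00aave en noir") should return est un rêve en noir. A string that is already correct, such as @"est un r\u00eave en noir", does not form valid UTF-8 after the Latin-1 step, so it is returned unchanged.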
Solution 2:
I think the test case is broken; the following:
NSString* str1 = @"est un r \u00c3\u00aa ve en noir";
NSLog(@"%@", str1);
Also outputs 'est un r Ãª ve en noir'. However, this:
NSString* str1 = @"est un rêve en noir";
NSLog(@"%@", str1);
Outputs 'est un rêve en noir', as does:
NSString* str1 = @"est un rêve en noir";
NSString* str = [NSString stringWithUTF8String:[str1 cStringUsingEncoding:NSUTF8StringEncoding]];
NSLog(@"%@", str);
And ditto for the slightly shorter version:
NSString* str1 = @"est un rêve en noir";
NSString* str = [NSString stringWithUTF8String:[str1 UTF8String]];
NSLog(@"%@", str);
And, indeed:
char *str1 = "est un r\xc3\xaave en noir";
NSString* str = [NSString stringWithUTF8String:str1];
NSLog(@"%@", str);
I think it's a question of JSON, not UTF-8. The \u followed by four hexadecimal digits is JSON's way of escaping a UTF-16 code unit; it is not an inherent part of UTF-8, so NSString will not decode it from raw feed text for you. Your JSON parser needs to parse escape sequences properly.
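To illustrate the point about escape sequences, here is a sketch that lets Foundation's NSJSONSerialization (iOS 5 and later) decode the \uXXXX escapes rather than handling them by hand. The helper name StringFromJSON and the "title" key are made up for illustration, since the question does not show the feed's actual structure.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: parse a JSON object and return the string stored
// under `key`, or nil if parsing fails or the value is not a string.
static NSString *StringFromJSON(NSData *jsonData, NSString *key) {
    NSError *error = nil;
    // The parser decodes \uXXXX escapes while building the NSString values.
    id object = [NSJSONSerialization JSONObjectWithData:jsonData
                                                options:0
                                                  error:&error];
    if (![object isKindOfClass:[NSDictionary class]]) {
        return nil; // parse failure, or the top level is not an object
    }
    id value = [(NSDictionary *)object objectForKey:key];
    return [value isKindOfClass:[NSString class]] ? value : nil;
}
```

For a payload such as {"title":"est un r\u00eave en noir"}, passed in as UTF-8 data, this should return est un rêve en noir. If the decoded value still contains Ã and ª, as it would for the feed in the question, the escapes were decoded correctly and the feed itself carries double-encoded text, which is the case Solution 1 addresses.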