How do I create an MD5 Hash of a string in Cocoa?

I know SHA-1 is preferred, but this project requires I use MD5.

#include <openssl/md5.h>

- (NSString*) MD5Hasher: (NSString*) query {
    NSData* hashed = [query dataUsingEncoding:NSUTF8StringEncoding];
    unsigned char *digest = MD5([hashed bytes], [hashed length], NULL);
    NSString *final = [NSString stringWithUTF8String: (char *)digest];
    return final;
}

I got this code from an answer to another similar question on StackOverflow, but I get the following error from GDB when the program breaks at return final;

(gdb) p digest
$1 = (unsigned char *) 0xa06310e4 "\0206b\260/\336\316^\021\b\a/9\310\225\204"
(gdb) po final
Cannot access memory at address 0x0
(gdb) po digest

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0xb0623630
0x98531ed7 in objc_msgSend ()
The program being debugged was signaled while in a function called from GDB.
GDB has restored the context to what it was before the call.
To change this behavior use "set unwindonsignal off"
Evaluation of the expression containing the function
(_NSPrintForDebugger) will be abandoned.

I can't make any sense of it.


This is the category I use:

NSString+MD5.h

@interface NSString (MD5)

- (NSString *)MD5String;

@end

NSString+MD5.m

#import <CommonCrypto/CommonDigest.h>

@implementation NSString (MD5)

- (NSString *)MD5String {
    const char *cStr = [self UTF8String];
    unsigned char result[CC_MD5_DIGEST_LENGTH];
    CC_MD5( cStr, (CC_LONG)strlen(cStr), result );

    return [NSString stringWithFormat:
        @"%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X%02X",
        result[0], result[1], result[2], result[3], 
        result[4], result[5], result[6], result[7],
        result[8], result[9], result[10], result[11],
        result[12], result[13], result[14], result[15]
    ];  
}

@end

Usage

NSString *myString = @"test";
NSString *md5 = [myString MD5String]; // returns NSString of the MD5 of test

cdespinosa and irsk have already shown you your actual problem, so let me go through your GDB transcript:

(gdb) p digest
$1 = (unsigned char *) 0xa06310e4 "\0206b\260/\336\316^\021\b\a/9\310\225\204"

You've printed digest as a C string. You can see here that this string is raw bytes; hence all the octal escapes (e.g., \020, \225) and the couple of punctuation characters (/ and ^). It is not the printable ASCII hexadecimal representation you were expecting. You're lucky that there were no zero bytes in it; otherwise, you would not have printed the entire hash.

(gdb) po final
Cannot access memory at address 0x0

final is nil. This makes sense, as your string above isn't valid UTF-8; again, it's just raw data bytes. stringWithUTF8String: requires a UTF-8-encoded text string; you didn't give it one, so it returned nil.

For passing raw data around, you'd use NSData. In this case, I think you want the hex representation, so you'll need to create that yourself the way irsk showed you.

Finally, consider how lucky you are that your input didn't hash to a valid UTF-8 string. If it had, you wouldn't have noticed this problem. You may want to construct a unit test for this hash method with this input.

(gdb) po digest

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0xb0623630
0x98531ed7 in objc_msgSend ()

Your program crashed (specific problem: “bad access”, “invalid address”) in objc_msgSend. This is because digest either is not a Cocoa/CF object at all or was one but was freed. In this case, it's because digest is not a Cocoa object; it is a C array of bytes, as shown by your p digest line above.

Remember, Objective-C is a superset of C. All of C exists unchanged in it. That means there are C arrays (e.g., char []) and Cocoa's NSArrays side by side. Moreover, since NSArray comes from the Cocoa framework, not the Objective-C language, there's no way to make NSArray objects interchangeable with C arrays: You can't use the subscript operator on Cocoa arrays, and you can't send Objective-C messages to C arrays.


Facebook uses this

#import <CommonCrypto/CommonDigest.h>

+ (NSString*)md5HexDigest:(NSString*)input {
    const char* str = [input UTF8String];
    unsigned char result[CC_MD5_DIGEST_LENGTH];
    CC_MD5(str, strlen(str), result);

    NSMutableString *ret = [NSMutableString stringWithCapacity:CC_MD5_DIGEST_LENGTH*2];
    for(int i = 0; i<CC_MD5_DIGEST_LENGTH; i++) {
        [ret appendFormat:@"%02x",result[i]];
    }
    return ret;
}

Or instance method

- (NSString *)md5 {
    const char* str = [self UTF8String];
    unsigned char result[CC_MD5_DIGEST_LENGTH];
    CC_MD5(str, (CC_LONG)strlen(str), result);

    NSMutableString *ret = [NSMutableString stringWithCapacity:CC_MD5_DIGEST_LENGTH*2];
    for(int i = 0; i<CC_MD5_DIGEST_LENGTH; i++) {
        [ret appendFormat:@"%02x",result[i]];
    }
    return ret;
}

I had used this method:

NSString+MD5.h

@interface NSString (MD5)

- (NSString *)MD5;

@end

NSString+MD5.m

#import "NSString+MD5.h"
#import <CommonCrypto/CommonDigest.h>

@implementation NSString (MD5)

- (NSString *)MD5 {

    const char * pointer = self.UTF8String;
    unsigned char md5Buffer[CC_MD5_DIGEST_LENGTH];

    CC_MD5(pointer, (CC_LONG)strlen(pointer), md5Buffer);

    NSMutableString * string = [NSMutableString stringWithCapacity:CC_MD5_DIGEST_LENGTH * 2];
    for (int i = 0; i < CC_MD5_DIGEST_LENGTH; i++) {
        [string appendFormat:@"%02x", md5Buffer[i]];
    }

    return string;
}

@end

Usage:

NSString * myString = @"test";
NSString * md5 = [myString MD5];

I believe that digest is a pointer to a raw binary hash. In the next line you're attempting to interpret it as a UTF-8 string, but it is most likely not to contain legal UTF-8-encoded character sequences.

I expect what you want is to convert the 16-byte static array of unsigned char into 32 ASCII hexadecimal characters [0-9a-f] using whatever algorithm you see fit.