PHP best way to MD5 multi-dimensional array?
What is the best way to generate an MD5 (or any other hash) of a multi-dimensional array?
I could easily write a loop which would traverse through each level of the array, concatenating each value into a string, and simply performing the MD5 on the string.
However, this seems cumbersome at best and I wondered if there was a funky function which would take a multi-dimensional array, and hash it.
Solution 1:
(Copy-n-paste-able function at the bottom)
As mentioned prior, the following will work.
md5(serialize($array));
However, it's worth noting that (ironically) json_encode performs noticeably faster:
md5(json_encode($array));
In fact, the speed increase is two-fold here as (1) json_encode alone performs faster than serialize, and (2) json_encode produces a smaller string and therefore less for md5 to handle.
Edit: Here is evidence to support this claim:
<?php //this is the array I'm using -- it's multidimensional.
$array = unserialize('a:6:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}i:5;a:5:{i:0;a:0:{}i:1;a:4:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}i:3;a:6:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}i:5;a:5:{i:0;a:0:{}i:1;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:3:{i:0;a:0:{}i:1;a:0:{}i:2;a:0:{}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}}}}i:2;s:5:"hello";i:3;a:2:{i:0;a:0:{}i:1;a:0:{}}i:4;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:1:{i:0;a:0:{}}}}}}}}}');
//The serialize test
$b4_s = microtime(1);
for ($i=0;$i<10000;$i++) {
$serial = md5(serialize($array));
}
echo 'serialize() w/ md5() took: '.($sTime = microtime(1)-$b4_s).' sec<br/>';
//The json test
$b4_j = microtime(1);
for ($i=0;$i<10000;$i++) {
$serial = md5(json_encode($array));
}
echo 'json_encode() w/ md5() took: '.($jTime = microtime(1)-$b4_j).' sec<br/><br/>';
echo 'json_encode is <strong>'.( round(($sTime/$jTime)*100,1) ).'%</strong> faster with a difference of <strong>'.($sTime-$jTime).' seconds</strong>';
JSON_ENCODE is consistently over 250% (2.5x) faster (often over 300%) -- this is not a trivial difference. You may see the results of the test with this live script here:
- http://nathanbrauer.com/playground/serialize-vs-json.php
- http://nathanbrauer.com/playground/plain-text/serialize-vs-json.php
Now, one thing to note is array(1,2,3) will produce a different MD5 as array(3,2,1). If this is NOT what you want. Try the following code:
//Optionally make a copy of the array (if you want to preserve the original order)
$original = $array;
array_multisort($array);
$hash = md5(json_encode($array));
Edit: There's been some question as to whether reversing the order would produce the same results. So, I've done that (correctly) here:
- http://nathanbrauer.com/playground/json-vs-serialize.php
- http://nathanbrauer.com/playground/plain-text/json-vs-serialize.php
As you can see, the results are exactly the same. Here's the (corrected) test originally created by someone related to Drupal:
- http://nathanjbrauer.com/playground/drupal-calculation.php
- http://nathanjbrauer.com/playground/plain-text/drupal-calculation.php
And for good measure, here's a function/method you can copy and paste (tested in 5.3.3-1ubuntu9.5):
function array_md5(Array $array) {
//since we're inside a function (which uses a copied array, not
//a referenced array), you shouldn't need to copy the array
array_multisort($array);
return md5(json_encode($array));
}
Solution 2:
md5(serialize($array));