What is the correct fast way to compare two `Uint8List`s' equality?
Solution 1:
I think that you should be able to get some slight speed-up by comparing bytes 4 or 8 bytes at a time (which also should have alignment benefits). This should not need to copy the byte data and therefore would not have a significant memory penalty. I wrote a quick implementation to try it:
import 'dart:typed_data';
import 'dart:math' show Random;
/// Naive [List] equality implementation.
bool listEquals<E>(List<E> list1, List<E> list2) {
if (identical(list1, list2)) {
return true;
}
if (list1.length != list2.length) {
return false;
}
for (var i = 0; i < list1.length; i += 1) {
if (list1[i] != list2[i]) {
return false;
}
}
return true;
}
/// Compares two [Uint8List]s by comparing 8 bytes at a time.
bool memEquals(Uint8List bytes1, Uint8List bytes2) {
if (identical(bytes1, bytes2)) {
return true;
}
if (bytes1.lengthInBytes != bytes2.lengthInBytes) {
return false;
}
// Treat the original byte lists as lists of 8-byte words.
var numWords = bytes1.lengthInBytes ~/ 8;
var words1 = bytes1.buffer.asUint64List(0, numWords);
var words2 = bytes2.buffer.asUint64List(0, numWords);
for (var i = 0; i < words1.length; i += 1) {
if (words1[i] != words2[i]) {
return false;
}
}
// Compare any remaining bytes.
for (var i = words1.lengthInBytes; i < bytes1.lengthInBytes; i += 1) {
if (bytes1[i] != bytes2[i]) {
return false;
}
}
return true;
}
void main() {
var random = Random();
// Generate random data.
//
// 100 MB minus a few bytes to avoid being an exact multiple of 8 bytes.
const numBytes = 100 * 1000 * 1000 - 3;
var data = Uint8List.fromList([
for (var i = 0; i < numBytes; i += 1) random.nextInt(256),
]);
var dataCopy = Uint8List.fromList(data);
var stopwatch = Stopwatch()..start();
var result = listEquals(data, dataCopy);
print('Naive: $result ${stopwatch.elapsed}');
stopwatch
..reset()
..start();
result = memEquals(data, dataCopy);
print('memEquals: $result ${stopwatch.elapsed}');
My empirical results from running it as a Dart console application on my 64-bit Linux machine (dart mem_equals.dart
):
Naive: true 0:00:00.152984
memEquals: true 0:00:00.038664
and from compiling it (dart compile exe mem_equals.dart && mem_equals.exe
):
Naive: true 0:00:00.093478
memEquals: true 0:00:00.033560
I haven't compared with using dart:ffi
, but as a baseline, a pure C program calling memcmp
on an identically sized byte array (clang -O3 memcmp_test.c && a.out
) on the same system takes about 0.011s.
Solution 2:
Dart's listEquals
does exactly what your code does (plus a few shortcut checks) so I would use that instead of your own code, as it's cleaner. One possible alternative is to convert both lists to String
and do a String
equality comparison. I doubt that it is faster (because it creates a new String
), but your could easily benchmark that and decide!
To convert a UInt8List
to a String
use
String s = String.fromCharCodes(inputAsUint8List);