What is the correct fast way to compare two `Uint8List`s' equality?

Solution 1:

I think that you should be able to get some slight speed-up by comparing bytes 4 or 8 bytes at a time (which also should have alignment benefits). This should not need to copy the byte data and therefore would not have a significant memory penalty. I wrote a quick implementation to try it:

import 'dart:typed_data';
import 'dart:math' show Random;

/// Naive [List] equality implementation.
bool listEquals<E>(List<E> list1, List<E> list2) {
  if (identical(list1, list2)) {
    return true;
  }

  if (list1.length != list2.length) {
    return false;
  }

  for (var i = 0; i < list1.length; i += 1) {
    if (list1[i] != list2[i]) {
      return false;
    }
  }

  return true;
}

/// Compares two [Uint8List]s by comparing 8 bytes at a time.
bool memEquals(Uint8List bytes1, Uint8List bytes2) {
  if (identical(bytes1, bytes2)) {
    return true;
  }

  if (bytes1.lengthInBytes != bytes2.lengthInBytes) {
    return false;
  }

  // Treat the original byte lists as lists of 8-byte words.
  var numWords = bytes1.lengthInBytes ~/ 8;
  var words1 = bytes1.buffer.asUint64List(0, numWords);
  var words2 = bytes2.buffer.asUint64List(0, numWords);

  for (var i = 0; i < words1.length; i += 1) {
    if (words1[i] != words2[i]) {
      return false;
    }
  }

  // Compare any remaining bytes.
  for (var i = words1.lengthInBytes; i < bytes1.lengthInBytes; i += 1) {
    if (bytes1[i] != bytes2[i]) {
      return false;
    }
  }

  return true;
}

void main() {
  var random = Random();

  // Generate random data.
  //
  // 100 MB minus a few bytes to avoid being an exact multiple of 8 bytes.
  const numBytes = 100 * 1000 * 1000 - 3;
  var data = Uint8List.fromList([
    for (var i = 0; i < numBytes; i += 1) random.nextInt(256),
  ]);

  var dataCopy = Uint8List.fromList(data);

  var stopwatch = Stopwatch()..start();
  var result = listEquals(data, dataCopy);
  print('Naive:     $result ${stopwatch.elapsed}');

  stopwatch
    ..reset()
    ..start();
  result = memEquals(data, dataCopy);
  print('memEquals: $result ${stopwatch.elapsed}');

My empirical results from running it as a Dart console application on my 64-bit Linux machine (dart mem_equals.dart):

Naive:     true 0:00:00.152984
memEquals: true 0:00:00.038664

and from compiling it (dart compile exe mem_equals.dart && mem_equals.exe):

Naive:     true 0:00:00.093478
memEquals: true 0:00:00.033560

I haven't compared with using dart:ffi, but as a baseline, a pure C program calling memcmp on an identically sized byte array (clang -O3 memcmp_test.c && a.out) on the same system takes about 0.011s.

Solution 2:

Dart's listEquals does exactly what your code does (plus a few shortcut checks) so I would use that instead of your own code, as it's cleaner. One possible alternative is to convert both lists to String and do a String equality comparison. I doubt that it is faster (because it creates a new String), but your could easily benchmark that and decide!

To convert a UInt8List to a String use

String s = String.fromCharCodes(inputAsUint8List);