How do I represent a UUID in a protobuf message?

Solution 1:

You should probably use string or bytes to represent a UUID. Use string if it is most convenient to keep the UUID in human-readable format (e.g. "de305d54-75b4-431b-adb2-eb6b9e546014") or use bytes if you are storing the 128-bit value raw. (If you aren't sure, you probably want string.)

Wrapping the value in a message type called UUID can be helpful to make the code more self-documenting but will have some performance overhead and isn't strictly required. If you want to do this, define the type like:

message UUID {
  required string value = 1;
}

or:

message UUID {
  required bytes value = 1;
}
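
As a rough sketch of what each choice means on the application side, the Python snippet below converts a UUID to and from the value you would put into a string field or a bytes field. The helper names (uuid_to_string_field and so on) are hypothetical and not part of any protobuf API; only the standard uuid module is used.

```python
import uuid


def uuid_to_string_field(u: uuid.UUID) -> str:
    # Canonical 36-character text form, e.g. "de305d54-75b4-...".
    return str(u)


def uuid_from_string_field(value: str) -> uuid.UUID:
    return uuid.UUID(value)


def uuid_to_bytes_field(u: uuid.UUID) -> bytes:
    # Raw 128-bit value in big-endian (network) byte order.
    return u.bytes


def uuid_from_bytes_field(value: bytes) -> uuid.UUID:
    if len(value) != 16:
        raise ValueError("a raw UUID must be exactly 16 bytes")
    return uuid.UUID(bytes=value)


u = uuid.UUID("de305d54-75b4-431b-adb2-eb6b9e546014")
print(len(uuid_to_string_field(u)))  # 36 characters as text
print(len(uuid_to_bytes_field(u)))   # 16 bytes raw
```

The string form costs 20 extra bytes per value but is unambiguous; the bytes form only round-trips if both sides agree on the byte order.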

Solution 2:

Prefer string to avoid problems with endianness. A UUID and a Microsoft GUID that have the same string representation (and are therefore the same "id") nevertheless have different byte orders: a standard UUID is serialized big-endian, while a GUID stores its first three fields little-endian. If you use bytes in the protocol to communicate between Java using UUID and C# using System.Guid, you could end up with flipped IDs.
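
The flip can be reproduced in a few lines of Python. This is only a simulation under an assumption: uuid.UUID.bytes_le is used as a stand-in for the byte layout a .NET writer would produce, and uuid.UUID(bytes=...) stands in for a reader that expects big-endian bytes.

```python
import uuid

original = uuid.UUID("de305d54-75b4-431b-adb2-eb6b9e546014")

# Writer side: serialize with the first three fields little-endian
# (assumed here to match the layout System.Guid produces).
wire = original.bytes_le

# Reader side: interpret the same 16 bytes as plain big-endian.
misread = uuid.UUID(bytes=wire)

print(original)  # de305d54-75b4-431b-adb2-eb6b9e546014
print(misread)   # 545d30de-b475-1b43-adb2-eb6b9e546014
```

Only the first three groups are scrambled; the last two survive, which can make the bug easy to miss when eyeballing IDs.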

Solution 3:

Use a string, not a byte array, contrary to what some of the other answers suggest. According to Microsoft (https://docs.microsoft.com/en-us/dotnet/architecture/grpc-for-wcf-developers/protobuf-data-types), "Don't use a bytes field for Guid values. Problems with endianness (Wikipedia definition) can result in erratic behavior when Protobuf is interacting with other platforms, such as Java."

Solution 4:

I would suggest using the string encoding rather than the bytes encoding if you want to ensure straightforward interoperability:

message UUID {
  required string value = 1;
}

The problem with the bytes encoding is that different UUID libraries use different schemes to encode and decode bytes, while they all agree on how to encode and decode strings.

For example, C#'s System.Guid.ToByteArray returns a mixed-endian format: the first three components are little-endian encoded, while the last two are big-endian encoded.

In Java, the Apache Commons library's Uuid.toRawBytes returns the UUID in big-endian encoding:

"String": 35918bc9-196d-40ea-9779-889d79b753f0
"C#"    : C9 8B 91 35 6D 19 EA 40 97 79 88 9D 79 B7 53 F0
"Java"  : 35 91 8B C9 19 6D 40 EA 97 79 88 9D 79 B7 53 F0

As a side note, Python 3's uuid.UUID provides both encodings: bytes for the big-endian encoding and bytes_le for the mixed-endian encoding.
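
For reference, the two byte sequences in the table above can be reproduced with the standard uuid module. The hex_bytes helper is a hypothetical formatting function added only for this illustration.

```python
import uuid


def hex_bytes(data: bytes) -> str:
    # Format bytes as space-separated uppercase hex pairs.
    return " ".join(f"{b:02X}" for b in data)


u = uuid.UUID("35918bc9-196d-40ea-9779-889d79b753f0")

big = hex_bytes(u.bytes)        # big-endian layout
mixed = hex_bytes(u.bytes_le)   # mixed-endian layout

print("bytes   :", big)    # 35 91 8B C9 19 6D 40 EA 97 79 88 9D 79 B7 53 F0
print("bytes_le:", mixed)  # C9 8B 91 35 6D 19 EA 40 97 79 88 9D 79 B7 53 F0
```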