round trip Swift number types to/from Data

With Swift 3 leaning towards Data instead of [UInt8], I'm trying to ferret out what the most efficient/idiomatic way to encode/decode swifts various number types (UInt8, Double, Float, Int64, etc) as Data objects.

There's this answer for using [UInt8], but it seems to be using various pointer APIs that I can't find on Data.

I'd like to basically some custom extensions that look something like:

let input = 42.13 // implicit Double
let bytes = input.data
let roundtrip = bytes.to(Double) // --> 42.13

The part that really eludes me, I've looked through a bunch of the docs, is how I can get some sort of pointer thing (OpaquePointer or BufferPointer or UnsafePointer?) from any basic struct (which all of the numbers are). In C, I would just slap an ampersand in front of it, and there ya go.


Note: The code has been updated for Swift 5 (Xcode 10.2) now. (Swift 3 and Swift 4.2 versions can be found in the edit history.) Also possibly unaligned data is now correctly handled.

How to create Data from a value

As of Swift 4.2, data can be created from a value simply with

let value = 42.13
let data = withUnsafeBytes(of: value) { Data($0) }

print(data as NSData) // <713d0ad7 a3104540>

Explanation:

  • withUnsafeBytes(of: value) invokes the closure with a buffer pointer covering the raw bytes of the value.
  • A raw buffer pointer is a sequence of bytes, therefore Data($0) can be used to create the data.

How to retrieve a value from Data

As of Swift 5, the withUnsafeBytes(_:) of Data invokes the closure with an “untyped” UnsafeMutableRawBufferPointer to the bytes. The load(fromByteOffset:as:) method the reads the value from the memory:

let data = Data([0x71, 0x3d, 0x0a, 0xd7, 0xa3, 0x10, 0x45, 0x40])
let value = data.withUnsafeBytes {
    $0.load(as: Double.self)
}
print(value) // 42.13

There is one problem with this approach: It requires that the memory is property aligned for the type (here: aligned to a 8-byte address). But that is not guaranteed, e.g. if the data was obtained as a slice of another Data value.

It is therefore safer to copy the bytes to the value:

let data = Data([0x71, 0x3d, 0x0a, 0xd7, 0xa3, 0x10, 0x45, 0x40])
var value = 0.0
let bytesCopied = withUnsafeMutableBytes(of: &value, { data.copyBytes(to: $0)} )
assert(bytesCopied == MemoryLayout.size(ofValue: value))
print(value) // 42.13

Explanation:

  • withUnsafeMutableBytes(of:_:) invokes the closure with a mutable buffer pointer covering the raw bytes of the value.
  • The copyBytes(to:) method of DataProtocol (to which Data conforms) copies bytes from the data to that buffer.

The return value of copyBytes() is the number of bytes copied. It is equal to the size of the destination buffer, or less if the data does not contain enough bytes.

Generic solution #1

The above conversions can now easily be implemented as generic methods of struct Data:

extension Data {

    init<T>(from value: T) {
        self = Swift.withUnsafeBytes(of: value) { Data($0) }
    }

    func to<T>(type: T.Type) -> T? where T: ExpressibleByIntegerLiteral {
        var value: T = 0
        guard count >= MemoryLayout.size(ofValue: value) else { return nil }
        _ = Swift.withUnsafeMutableBytes(of: &value, { copyBytes(to: $0)} )
        return value
    }
}

The constraint T: ExpressibleByIntegerLiteral is added here so that we can easily initialize the value to “zero” – that is not really a restriction because this method can be used with “trival” (integer and floating point) types anyway, see below.

Example:

let value = 42.13 // implicit Double
let data = Data(from: value)
print(data as NSData) // <713d0ad7 a3104540>

if let roundtrip = data.to(type: Double.self) {
    print(roundtrip) // 42.13
} else {
    print("not enough data")
}

Similarly, you can convert arrays to Data and back:

extension Data {

    init<T>(fromArray values: [T]) {
        self = values.withUnsafeBytes { Data($0) }
    }

    func toArray<T>(type: T.Type) -> [T] where T: ExpressibleByIntegerLiteral {
        var array = Array<T>(repeating: 0, count: self.count/MemoryLayout<T>.stride)
        _ = array.withUnsafeMutableBytes { copyBytes(to: $0) }
        return array
    }
}

Example:

let value: [Int16] = [1, Int16.max, Int16.min]
let data = Data(fromArray: value)
print(data as NSData) // <0100ff7f 0080>

let roundtrip = data.toArray(type: Int16.self)
print(roundtrip) // [1, 32767, -32768]

Generic solution #2

The above approach has one disadvantage: It actually works only with "trivial" types like integers and floating point types. "Complex" types like Array and String have (hidden) pointers to the underlying storage and cannot be passed around by just copying the struct itself. It also would not work with reference types which are just pointers to the real object storage.

So solve that problem, one can

  • Define a protocol which defines the methods for converting to Data and back:

    protocol DataConvertible {
        init?(data: Data)
        var data: Data { get }
    }
    
  • Implement the conversions as default methods in a protocol extension:

    extension DataConvertible where Self: ExpressibleByIntegerLiteral{
    
        init?(data: Data) {
            var value: Self = 0
            guard data.count == MemoryLayout.size(ofValue: value) else { return nil }
            _ = withUnsafeMutableBytes(of: &value, { data.copyBytes(to: $0)} )
            self = value
        }
    
        var data: Data {
            return withUnsafeBytes(of: self) { Data($0) }
        }
    }
    

    I have chosen a failable initializer here which checks that the number of bytes provided matches the size of the type.

  • And finally declare conformance to all types which can safely be converted to Data and back:

    extension Int : DataConvertible { }
    extension Float : DataConvertible { }
    extension Double : DataConvertible { }
    // add more types here ...
    

This makes the conversion even more elegant:

let value = 42.13
let data = value.data
print(data as NSData) // <713d0ad7 a3104540>

if let roundtrip = Double(data: data) {
    print(roundtrip) // 42.13
}

The advantage of the second approach is that you cannot inadvertently do unsafe conversions. The disadvantage is that you have to list all "safe" types explicitly.

You could also implement the protocol for other types which require a non-trivial conversion, such as:

extension String: DataConvertible {
    init?(data: Data) {
        self.init(data: data, encoding: .utf8)
    }
    var data: Data {
        // Note: a conversion to UTF-8 cannot fail.
        return Data(self.utf8)
    }
}

or implement the conversion methods in your own types to do whatever is necessary so serialize and deserialize a value.

Byte order

No byte order conversion is done in the above methods, the data is always in the host byte order. For a platform independent representation (e.g. “big endian” aka “network” byte order), use the corresponding integer properties resp. initializers. For example:

let value = 1000
let data = value.bigEndian.data
print(data as NSData) // <00000000 000003e8>

if let roundtrip = Int(data: data) {
    print(Int(bigEndian: roundtrip)) // 1000
}

Of course this conversion can also be done generally, in the generic conversion method.


You can get an unsafe pointer to mutable objects by using withUnsafePointer:

withUnsafePointer(&input) { /* $0 is your pointer */ }

I don't know of a way to get one for immutable objects, because the inout operator only works on mutable objects.

This is demonstrated in the answer that you've linked to.


In my case, Martin R's answer helped but the result was inverted. So I did a small change in his code:

extension UInt16 : DataConvertible {

    init?(data: Data) {
        guard data.count == MemoryLayout<UInt16>.size else { 
          return nil 
        }
    self = data.withUnsafeBytes { $0.pointee }
    }

    var data: Data {
         var value = CFSwapInt16HostToBig(self)//Acho que o padrao do IOS 'e LittleEndian, pois os bytes estavao ao contrario
         return Data(buffer: UnsafeBufferPointer(start: &value, count: 1))
    }
}

The problem is related with LittleEndian and BigEndian.