How to convert an object to a byte array in C#

I have a collection of objects that I need to write to a binary file.

I need the bytes in the file to be compact, so I can't use BinaryFormatter. BinaryFormatter throws in all sorts of info for deserialization needs.

If I try

byte[] myBytes = (byte[]) myObject;

I get a runtime exception.

I need this to be fast so I'd rather not be copying arrays of bytes around. I'd just like the cast byte[] myBytes = (byte[]) myObject to work!

OK just to be clear, I cannot have any metadata in the output file. Just the object bytes. Packed object-to-object. Based on answers received, it looks like I'll be writing low-level Buffer.BlockCopy code. Perhaps using unsafe code.


Solution 1:

To convert an object to a byte array:

// Requires: using System.IO; using System.Runtime.Serialization.Formatters.Binary;
// Convert an object to a byte array
public static byte[] ObjectToByteArray(Object obj)
{
    BinaryFormatter bf = new BinaryFormatter();
    using (var ms = new MemoryStream())
    {
        bf.Serialize(ms, obj);
        return ms.ToArray();
    }
}

Just copy this function into your code and pass it the object you want to convert to a byte array. To convert the byte array back to an object, use the function below:

// Convert a byte array to an Object
public static Object ByteArrayToObject(byte[] arrBytes)
{
    // Wrap the existing array rather than copying it into a new stream
    using (var memStream = new MemoryStream(arrBytes))
    {
        var binForm = new BinaryFormatter();
        return binForm.Deserialize(memStream);
    }
}

You can use these functions with custom classes. You just need to add the [Serializable] attribute to your class to enable serialization.

Solution 2:

If you want the serialized data to be really compact, you can write serialization methods yourself. That way you will have a minimum of overhead.

Example:

public class MyClass {

   public int Id { get; set; }
   public string Name { get; set; }

   public byte[] Serialize() {
      using (MemoryStream m = new MemoryStream()) {
         using (BinaryWriter writer = new BinaryWriter(m)) {
            writer.Write(Id);
            writer.Write(Name);
         }
         return m.ToArray();
      }
   }

   public static MyClass Deserialize(byte[] data) {
      MyClass result = new MyClass();
      using (MemoryStream m = new MemoryStream(data)) {
         using (BinaryReader reader = new BinaryReader(m)) {
            result.Id = reader.ReadInt32();
            result.Name = reader.ReadString();
         }
      }
      return result;
   }

}
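Since the question is about a collection of objects, here is a minimal sketch of how the same hand-written approach might extend to a list: write a count first, then each record back to back. The Item and ItemFile names are hypothetical and exist only for this illustration; the layout (an int32 count, then an int32 Id and a length-prefixed UTF-8 Name per item) is one possible choice, not the only one.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical record type, used only for this illustration.
public class Item
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ItemFile
{
    // Layout: [int32 count], then per item [int32 Id][length-prefixed UTF-8 Name].
    public static byte[] Serialize(IReadOnlyList<Item> items)
    {
        using (var m = new MemoryStream())
        {
            using (var writer = new BinaryWriter(m))
            {
                writer.Write(items.Count);
                foreach (var item in items)
                {
                    writer.Write(item.Id);
                    // BinaryWriter.Write(string) throws on null, so store null as "".
                    writer.Write(item.Name ?? string.Empty);
                }
            }
            // ToArray still works after the writer has closed the stream.
            return m.ToArray();
        }
    }

    public static List<Item> Deserialize(byte[] data)
    {
        using (var m = new MemoryStream(data))
        using (var reader = new BinaryReader(m))
        {
            int count = reader.ReadInt32();
            var items = new List<Item>(count);
            for (int i = 0; i < count; i++)
            {
                items.Add(new Item { Id = reader.ReadInt32(), Name = reader.ReadString() });
            }
            return items;
        }
    }
}
```

The only per-record overhead in this layout is the string length prefix, which BinaryWriter writes as a 7-bit encoded integer (one byte for short strings). The count prefix is what lets the reader know where the collection ends without any type metadata.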

Solution 3:

Well, a cast from myObject to byte[] is never going to work unless you've got an explicit conversion or myObject actually is a byte[]. You need a serialization framework of some kind. There are plenty out there, including Protocol Buffers, which is near and dear to me. It's pretty "lean and mean" in terms of both space and time.
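To make the point about the cast concrete, here is a small sketch. CastDemo and its two methods are hypothetical names for illustration. A reference cast only checks what the object already is; it never converts the object's contents into bytes.

```csharp
using System;

public static class CastDemo
{
    // Safe form: 'as' yields null instead of throwing when the object is not a byte[].
    public static bool TryAsBytes(object value, out byte[] bytes)
    {
        bytes = value as byte[];
        return bytes != null;
    }

    // Direct cast: returns false when the runtime rejects it with InvalidCastException.
    public static bool DirectCastSucceeds(object value)
    {
        try
        {
            _ = (byte[])value;
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }
}
```

Passing an object that really is a byte[] makes both methods return true. Passing a boxed int or an instance of your own class makes both return false, which is the runtime exception described in the question.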

You'll find that almost all serialization frameworks have significant restrictions on what you can serialize, however - Protocol Buffers more than some, due to being cross-platform.

If you can give more requirements, we can help you out more - but it's never going to be as simple as casting...

EDIT: Just to respond to this:

"I need my binary file to contain the object's bytes. Only the bytes, no metadata whatsoever. Packed object-to-object. So I'll be implementing custom serialization."

Please bear in mind that the bytes in your objects are quite often references... so you'll need to work out what to do with them.
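For contrast, here is a sketch of the one case where "just the bytes" is meaningful: a struct with no reference fields. It assumes a runtime where System.Runtime.InteropServices.MemoryMarshal is available (.NET Core 2.1 or later, or the System.Memory package). Sample and RawBytes are hypothetical names. If Sample contained a string or any other reference, MemoryMarshal.AsBytes would throw an ArgumentException, which is exactly the problem described above.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical value type with no reference fields; Pack = 1 removes padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct Sample
{
    public int Id;
    public double Value;
}

public static class RawBytes
{
    // Views the array's memory as bytes, then copies once into the result.
    public static byte[] ToBytes(Sample[] items)
    {
        return MemoryMarshal.AsBytes(items.AsSpan()).ToArray();
    }

    // Reinterprets the bytes as Sample values and copies them into a new array.
    public static Sample[] FromBytes(byte[] data)
    {
        return MemoryMarshal.Cast<byte, Sample>(data.AsSpan()).ToArray();
    }
}
```

The output follows the byte order of the machine that wrote it, so this only suits files that are written and read on compatible platforms.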

I suspect you'll find that designing and implementing your own custom serialization framework is harder than you imagine.

I would personally recommend that if you only need to do this for a few specific types, you don't bother trying to come up with a general serialization framework. Just implement an instance method and a static method in all the types you need:

public void WriteTo(Stream stream)
public static WhateverType ReadFrom(Stream stream)

One thing to bear in mind: everything becomes more tricky if you've got inheritance involved. Without inheritance, if you know what type you're starting with, you don't need to include any type information. Of course, there's also the matter of versioning - do you need to worry about backward and forward compatibility with different versions of your types?
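As a rough sketch of the WriteTo/ReadFrom pattern, assuming a hypothetical sealed Customer type with two fields (the type and its members are made up for illustration):

```csharp
using System.IO;
using System.Text;

// Hypothetical type; sealed so no type information needs to be written.
public sealed class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }

    public void WriteTo(Stream stream)
    {
        // leaveOpen: true lets the caller keep writing more objects to the same stream.
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(Id);
            writer.Write(Name ?? string.Empty); // Write(string) throws on null
        }
    }

    public static Customer ReadFrom(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            // Fields must be read in the same order they were written.
            return new Customer { Id = reader.ReadInt32(), Name = reader.ReadString() };
        }
    }
}
```

Because both methods take a Stream and leave it open, several objects can be written one after another to a single FileStream and read back in the same order. Nothing in this format records a version, so adding or reordering fields later would break existing files unless you add your own version marker.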

Solution 4:

I took Crystalonics' answer and turned it into extension methods. I hope someone else will find them useful:

public static byte[] SerializeToByteArray(this object obj)
{
    if (obj == null)
    {
        return null;
    }
    var bf = new BinaryFormatter();
    using (var ms = new MemoryStream())
    {
        bf.Serialize(ms, obj);
        return ms.ToArray();
    }
}

public static T Deserialize<T>(this byte[] byteArray) where T : class
{
    if (byteArray == null)
    {
        return null;
    }
    // Wrap the existing array rather than copying it into a new stream
    using (var memStream = new MemoryStream(byteArray))
    {
        var binForm = new BinaryFormatter();
        return (T)binForm.Deserialize(memStream);
    }
}

Solution 5:

You are really talking about serialization, which can take many forms. Since you want small and binary, protocol buffers may be a viable option - giving version tolerance and portability as well. Unlike BinaryFormatter, the protocol buffers wire format doesn't include all the type metadata; just very terse markers to identify data.

In .NET there are a few implementations; in particular:

  • protobuf-net
  • dotnet-protobufs

I'd humbly argue that protobuf-net (which I wrote) allows more .NET-idiomatic usage with typical C# classes ("regular" protocol-buffers tends to demand code-generation); for example:

[ProtoContract]
public class Person {
   [ProtoMember(1)]
   public int Id {get;set;}
   [ProtoMember(2)]
   public string Name {get;set;}
}
....
Person person = new Person { Id = 123, Name = "abc" };
Serializer.Serialize(destStream, person);
...
Person anotherPerson = Serializer.Deserialize<Person>(sourceStream);