C# performance - Using unsafe pointers instead of IntPtr and Marshal

Solution 1:

This is a fairly old thread, but I recently ran extensive performance tests on marshaling in C#. I need to unmarshal lots of data from a serial port over many days. It was important to me to have no memory leaks (even the smallest leak becomes significant after a couple of million calls), and I also ran a lot of statistical timing tests with very big structs (>10 KB) just for the sake of it (and no, you should never have a 10 KB struct :-) ).

I tested the following three unmarshalling strategies (I also tested marshalling). In nearly all cases the first one (MarshalMatters) outperformed the other two. The Marshal.Copy approach was always the slowest by far; the other two were mostly very close together.

Using unsafe code can pose a significant security risk.

First:

public class MarshalMatters
{
    public static T ReadUsingMarshalUnsafe<T>(byte[] data) where T : struct
    {
        unsafe
        {
            fixed (byte* p = &data[0])
            {
                return (T)Marshal.PtrToStructure(new IntPtr(p), typeof(T));
            }
        }
    }

    public unsafe static byte[] WriteUsingMarshalUnsafe<T>(T structure) where T : struct
    {
        byte[] byteArray = new byte[Marshal.SizeOf(structure)];
        fixed (byte* byteArrayPtr = byteArray)
        {
            // fDeleteOld is false: the new array holds no previous structure to destroy.
            Marshal.StructureToPtr(structure, (IntPtr)byteArrayPtr, false);
        }
        return byteArray;
    }
}
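
One thing the first variant does not do is check that the byte array is at least as long as the struct, so a short buffer would let PtrToStructure read past the end of the array. The following is only a minimal sketch of a length-checked version, not the code that was benchmarked; SensorPacket and CheckedMarshal are hypothetical names made up for the example, and it has to be compiled with /unsafe.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example struct; Pack = 1 removes padding, so it marshals to 7 bytes.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SensorPacket
{
    public byte Id;
    public ushort Sequence;
    public int Value;
}

public static class CheckedMarshal
{
    public static unsafe T Read<T>(byte[] data) where T : struct
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        int size = Marshal.SizeOf(typeof(T));
        // Refuse short buffers instead of reading past the end of the array.
        if (data.Length < size)
            throw new ArgumentException($"Need {size} bytes, got {data.Length}.", nameof(data));

        fixed (byte* p = data)
        {
            return (T)Marshal.PtrToStructure(new IntPtr(p), typeof(T));
        }
    }

    public static unsafe byte[] Write<T>(T value) where T : struct
    {
        byte[] bytes = new byte[Marshal.SizeOf(typeof(T))];
        fixed (byte* p = bytes)
        {
            // false: the fresh array contains no old structure to destroy.
            Marshal.StructureToPtr(value, new IntPtr(p), false);
        }
        return bytes;
    }

    public static void Main()
    {
        var packet = new SensorPacket { Id = 1, Sequence = 42, Value = -7 };
        byte[] bytes = Write(packet);
        SensorPacket copy = Read<SensorPacket>(bytes);
        Console.WriteLine($"{bytes.Length} bytes, Sequence={copy.Sequence}, Value={copy.Value}");
    }
}
```

The check costs one comparison per call, which should be negligible next to the marshaling itself.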

Second:

public class Adam_Robinson
{

    public static T BytesToStruct<T>(byte[] rawData) where T : struct
    {
        T result = default(T);
        GCHandle handle = GCHandle.Alloc(rawData, GCHandleType.Pinned);
        try
        {
            IntPtr rawDataPtr = handle.AddrOfPinnedObject();
            result = (T)Marshal.PtrToStructure(rawDataPtr, typeof(T));
        }
        finally
        {
            handle.Free();
        }
        return result;
    }

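    /// <summary>
    /// No copy, no unsafe code. Pins the array with GCHandle.Alloc and marshals directly into it.
    /// </summary>
    /// <typeparam name="T">The struct type to marshal.</typeparam>
    /// <param name="structure">The struct instance to convert to bytes.</param>
    /// <returns>A byte array containing the marshaled struct.</returns>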
    public static byte[] StructToBytes<T>(T structure) where T : struct
    {
        int size = Marshal.SizeOf(structure);
        byte[] rawData = new byte[size];
        GCHandle handle = GCHandle.Alloc(rawData, GCHandleType.Pinned);
        try
        {
            IntPtr rawDataPtr = handle.AddrOfPinnedObject();
            Marshal.StructureToPtr(structure, rawDataPtr, false);
        }
        finally
        {
            handle.Free();
        }
        return rawData;
    }
}

Third:

/// <summary>
/// http://stackoverflow.com/questions/2623761/marshal-ptrtostructure-and-back-again-and-generic-solution-for-endianness-swap
/// </summary>
public class DanB
{
    /// <summary>
    /// Uses Marshal.Copy. Does not need unsafe. Allocates new unmanaged memory with AllocHGlobal and copies.
    /// </summary>
    public static byte[] GetBytes<T>(T structure) where T : struct
    {
        var size = Marshal.SizeOf(structure); // or Marshal.SizeOf<T>() in .NET 4.5.1 and later
        byte[] rawData = new byte[size];
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            // fDeleteOld must be false: the freshly allocated block holds no previous structure to destroy.
            Marshal.StructureToPtr(structure, ptr, false);
            Marshal.Copy(ptr, rawData, 0, size);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr); // release the block even if marshaling throws
        }
        return rawData;
    }

    public static T FromBytes<T>(byte[] bytes) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T)); // or Marshal.SizeOf<T>() in .NET 4.5.1 and later
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.Copy(bytes, 0, ptr, size);
            return (T)Marshal.PtrToStructure(ptr, typeof(T));
        }
        finally
        {
            Marshal.FreeHGlobal(ptr); // release the block even if the copy throws
        }
    }
}
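
For anyone who wants to repeat the comparison, here is a rough sketch of how such a timing test could be set up with Stopwatch. It is not the harness used for the results above; Sample, MarshalBenchmark and Measure are hypothetical names, the two read methods are condensed copies of the second and third strategies, and the numbers will depend heavily on struct size, runtime and hardware.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct Sample   // hypothetical payload used only for timing
{
    public int A;
    public long B;
    public double C;
}

public static class MarshalBenchmark
{
    // Second strategy: pin the managed array and marshal straight from it.
    public static T ReadPinned<T>(byte[] data) where T : struct
    {
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try { return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T)); }
        finally { handle.Free(); }
    }

    // Third strategy: copy into freshly allocated unmanaged memory first.
    public static T ReadCopied<T>(byte[] data) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.Copy(data, 0, ptr, size);
            return (T)Marshal.PtrToStructure(ptr, typeof(T));
        }
        finally { Marshal.FreeHGlobal(ptr); }
    }

    public static TimeSpan Measure(int iterations, Action action)
    {
        action();                       // warm-up so JIT compilation is not timed
        GC.Collect();                   // start each run from a similar heap state
        GC.WaitForPendingFinalizers();
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) action();
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        byte[] data = new byte[Marshal.SizeOf(typeof(Sample))];
        const int iterations = 1000000;
        TimeSpan pinned = Measure(iterations, () => ReadPinned<Sample>(data));
        TimeSpan copied = Measure(iterations, () => ReadCopied<Sample>(data));
        Console.WriteLine($"pinned: {pinned.TotalMilliseconds:F1} ms");
        Console.WriteLine($"copied: {copied.TotalMilliseconds:F1} ms");
    }
}
```

Run it several times in a Release build and compare distributions rather than single runs; a single measurement is easily dominated by GC activity.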

Solution 2:

Considerations in Interoperability explains why and when Marshaling is required and at what cost. Quote:

  1. Marshaling occurs when a caller and a callee cannot operate on the same instance of data.
  2. repeated marshaling can negatively affect the performance of your application.

Therefore, to answer your question of whether

... using pointers for P/Invoking really faster than using marshaling ...

first ask yourself whether the managed code can operate directly on the instance returned by the unmanaged method. If the answer is yes, then marshaling and its associated performance cost are not required. The time saved is roughly O(n), where n is the size of the marshaled instance. In addition, not keeping both the managed and the unmanaged block of data in memory for the duration of the method (as in the "IntPtr and Marshal" example) eliminates extra overhead and memory pressure.
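
To make the difference concrete, here is a small sketch that sums an unmanaged int buffer both ways. It assumes a block from Marshal.AllocHGlobal as a stand-in for whatever the native method would return; NativeBufferDemo and its method names are made up for the illustration, and the pointer version needs /unsafe.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeBufferDemo
{
    // "IntPtr and Marshal" style: copy the unmanaged block into a managed array first.
    // This costs O(n) extra time and keeps two copies of the data alive.
    public static long SumWithCopy(IntPtr buffer, int count)
    {
        int[] managed = new int[count];
        Marshal.Copy(buffer, managed, 0, count);
        long sum = 0;
        foreach (int value in managed) sum += value;
        return sum;
    }

    // Pointer style: read the unmanaged block in place, no second copy.
    // The caller is responsible for passing a correct count.
    public static unsafe long SumInPlace(IntPtr buffer, int count)
    {
        int* p = (int*)buffer.ToPointer();
        long sum = 0;
        for (int i = 0; i < count; i++) sum += p[i];
        return sum;
    }

    public static void Main()
    {
        const int count = 1000;
        // Stand-in for memory that a native function would hand back.
        IntPtr buffer = Marshal.AllocHGlobal(count * sizeof(int));
        try
        {
            for (int i = 0; i < count; i++)
                Marshal.WriteInt32(buffer, i * sizeof(int), i);

            Console.WriteLine(SumWithCopy(buffer, count));
            Console.WriteLine(SumInPlace(buffer, count));
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

Both calls return the same sum; the second one simply skips the intermediate array.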

What are the drawbacks of using unsafe code and pointers ...

The drawback is the risk associated with accessing memory directly through pointers. It is no less safe than using pointers in C or C++. Use it when it is needed and makes sense. More details are here.

There is one "safety" concern with the presented examples: the allocated unmanaged memory is not guaranteed to be released if the managed code throws. The best practice is to release it in a finally block:

CreateMyData(out myData1, length);

if (myData1 != IntPtr.Zero)
{
    try
    {
        // -> use myData1
        ...
        // <-
    }
    finally
    {
        DestroyMyData(myData1);
    }
}
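
If the same allocation pattern shows up in many places, one option is to wrap the pointer in a SafeHandle so that the release also happens through Dispose or finalization. The sketch below is only an illustration of that idea: HGlobalHandle and SafeHandleDemo are hypothetical names, and Marshal.AllocHGlobal / Marshal.FreeHGlobal stand in for the CreateMyData / DestroyMyData pair.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical wrapper: owns one AllocHGlobal block and frees it exactly once.
public sealed class HGlobalHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    public HGlobalHandle(int size) : base(true) // true: this object owns the handle
    {
        SetHandle(Marshal.AllocHGlobal(size));
    }

    protected override bool ReleaseHandle()
    {
        // Called by Dispose or the finalizer, only when the handle is valid.
        Marshal.FreeHGlobal(handle);
        return true;
    }
}

public static class SafeHandleDemo
{
    public static int WriteAndRead(int value)
    {
        // The using block plays the role of try/finally.
        using (var buffer = new HGlobalHandle(sizeof(int)))
        {
            IntPtr ptr = buffer.DangerousGetHandle();
            Marshal.WriteInt32(ptr, value);
            return Marshal.ReadInt32(ptr);
        }
    }

    public static void Main()
    {
        Console.WriteLine(WriteAndRead(123));
    }
}
```

For a single call site the plain try/finally is just as good; the wrapper mainly helps when the pointer has to outlive one method.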

Solution 3:

For anyone still reading:

Something I don't think I saw in any of the answers: unsafe code does present something of a security risk. It's not a huge risk, and it would be quite challenging to exploit. However, if, like me, you work in a PCI-compliant organization, unsafe code is disallowed by policy for this reason.

Managed code is normally very secure because the CLR takes care of memory location and allocation, preventing you from accessing or writing any memory you're not supposed to.

When you use the unsafe keyword, compile with '/unsafe', and use pointers, you bypass these checks and create the potential for someone to use your application to gain some level of unauthorized access to the machine it is running on. Using something like a buffer-overrun attack, your code could be tricked into writing instructions into an area of memory that the program counter might later reach (i.e. code injection), or it could just crash the machine.
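
As a small illustration of what "bypassing these checks" means: a managed array write past the end throws, whereas a pointer write has no such check, so the length test has to be written by hand. This is only a sketch with made-up names (BoundsDemo, CopyChecked); it deliberately never performs an actual overrun.

```csharp
using System;

public static class BoundsDemo
{
    // Managed code: the CLR checks every index, so a write past the end throws.
    public static bool ManagedWriteIsChecked()
    {
        byte[] buffer = new byte[8];
        int index = buffer.Length; // one past the last valid index
        try
        {
            buffer[index] = 0xFF;
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    // Unsafe code: nothing stops destination[i] from running past the buffer,
    // so the length check below is the only protection.
    public static unsafe void CopyChecked(byte[] source, byte* destination, int destinationLength)
    {
        if (source.Length > destinationLength)
            throw new ArgumentException("Packet is longer than the destination buffer.", nameof(source));

        for (int i = 0; i < source.Length; i++)
            destination[i] = source[i];
    }

    public static unsafe void Main()
    {
        Console.WriteLine(ManagedWriteIsChecked());

        byte* local = stackalloc byte[8];
        try
        {
            CopyChecked(new byte[16], local, 8); // oversized "packet" is rejected
        }
        catch (ArgumentException e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```

Leave out the if statement in CopyChecked and the 16-byte input would silently overwrite whatever follows the 8-byte stack buffer, which is exactly the kind of bug the attack described here relies on.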

Many years ago, SQL Server actually fell prey to malicious code delivered in a TDS packet that was far longer than it was supposed to be. The method reading the packet didn't check the length and kept writing the contents past the reserved address space. The extra length and content were carefully crafted so that an entire program was written into memory at the address of the next method. The attacker then had their own code executed by SQL Server within a context that had the highest level of access. The attack didn't even need to break the encryption, as the vulnerability sat below that point in the transport layer stack.