C# unsafe performance vs unmanaged PInvoke call

  c++, computer-vision, pinvoke, unmanaged

I am working on an application that processes bitmap images, and I am looking for a fast way to swap the "Red" and "Blue" values of a "Format24bppRgb" bitmap. My first attempt in C# used an unsafe code fragment:

    var bmpData = bmp.LockBits(new Rectangle(0, 0, bmp.Width, bmp.Height),
        ImageLockMode.ReadWrite, bmp.PixelFormat);
    unsafe
    {
        byte* array = (byte*)bmpData.Scan0.ToPointer();
        byte temp;
        // Swap the first and third byte of every 3-byte pixel.
        for (int x = 0; x < bmp.Width * bmp.Height * 3; x = x + 3)
        {
            temp = *(array + x + 2);
            *(array + x + 2) = *(array + x);
            *(array + x) = temp;
        }
    }

For the bitmap sizes I use, this takes around 50-70 ms. I then tried doing the same work in an external C++ library through a P/Invoke call:

    public static extern IntPtr ChangeRB(IntPtr data, int width, int height);

    data = ChangeRB(bmpData.Scan0, bmp.Width, bmp.Height);

The native function is defined as follows:

    extern "C" __declspec(dllexport) void* ChangeRB(void* xArray, int xHeight, int xWidth);

    void* ChangeRB(void* array, int height, int width)
    {
        unsigned char* _array = (unsigned char*)array;
        unsigned char temp;
        for (int x = 0; x < height * width * 3; x = x + 3)
        {
            temp = _array[x + 2];
            _array[x + 2] = _array[x];
            _array[x] = temp;
        }
        return _array;
    }

This call takes around 1 ms! I cannot explain the huge performance difference. Is an unmanaged P/Invoke call really that much faster than the "unsafe" code fragment?
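
Separately from the timing question, both loops above treat the pixel data as `width * height * 3` contiguous bytes. That only holds when a row needs no padding, since rows of a locked 24bpp bitmap are padded to a multiple of four bytes. Below is a minimal sketch of a row-by-row variant. The name `SwapRedBlueRows` and the extra `stride` parameter are my own assumptions and not part of the library above; on the C# side the value would presumably come from `bmpData.Stride`, and the sketch assumes it is positive.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical variant of ChangeRB (name and signature are assumptions).
// Swaps the first and third byte of every 3-byte pixel, one row at a time.
// `stride` is the number of bytes per row including padding and is assumed
// to be positive and at least width * 3.
void SwapRedBlueRows(unsigned char* data, int width, int height, int stride)
{
    for (int y = 0; y < height; ++y)
    {
        // Start of row y; padding bytes at the end of the row are skipped.
        unsigned char* row = data + static_cast<std::ptrdiff_t>(y) * stride;
        for (int x = 0; x < width; ++x)
        {
            unsigned char* px = row + x * 3;
            std::swap(px[0], px[2]);
        }
    }
}
```

For a 2x2 image with a stride of 8, the two padding bytes at the end of each row stay untouched, whereas a flat `width * height * 3` loop would treat them as pixel data.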

Source: Windows Questions C++