Friday, October 13, 2017

Memory alignment in C++ and C# and probably in every other language that can integrate with C++

I've learned something new today. It all starts with an innocuous question: given the following struct, what is its size?
public struct MyStruct
{
    public int i1;
    public char c1;
    public long l1;
    public char c2;
    public short s1;
    public char c3;
}
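Let's assume that a char takes 1 byte and a long takes 8 bytes. That is what Marshal.SizeOf works with for this struct in C#; in C++ you would have to declare l1 as long long, since a plain long is only 4 bytes on many platforms.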

The first answer is 4+1+8+1+2+1 = 17. Nope! It's 24.

The reason is called memory alignment and it has to do with the way CPUs work. They have registers of fixed size, various caches with different sizes and speeds, etc. Basically, when you ask for a 4 byte int, it needs to be "aligned" so that all 4 bytes can be fetched from memory into a single register in one go. Otherwise the CPU needs two reads (let's say 1 byte from one and 3 bytes from the other), then has to mask and shift both and combine them into another register. That is unbelievably expensive at that level.
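To make the "1 byte in one and 3 bytes in another" case concrete, here is a minimal C++ sketch. It assumes a toy model in which memory is fetched in fixed-size words; real CPUs are more complicated, and the function names are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Toy model (an assumption for illustration): memory is read in fixed-size
// words of `word` bytes, and a value occupies bytes [address, address + size).
// Returns how many words a single read of that value has to touch.
std::size_t words_touched(std::size_t address, std::size_t size, std::size_t word) {
    assert(size > 0 && word > 0);
    std::size_t first = address / word;              // word holding the first byte
    std::size_t last = (address + size - 1) / word;  // word holding the last byte
    return last - first + 1;
}

// True when `address` is a multiple of `alignment`.
bool is_aligned(std::size_t address, std::size_t alignment) {
    assert(alignment > 0);
    return address % alignment == 0;
}
```

With 4 byte words, words_touched(4, 4, 4) is 1, while words_touched(3, 4, 4) is 2: the int at address 3 has one byte in the first word and three in the next.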

So, why 24? i1 is an int, so it needs to be aligned on a position that is a multiple of 4. 0 qualifies, so it takes the first 4 bytes. Then there is a char. Chars are one byte and can be put anywhere, so the size becomes 5 bytes. However, a long is 8 bytes, so it needs to be on a position that is a multiple of 8. That is why we add 3 bytes of padding, then we add the long in. Now the size is 16. One more char → 17. Shorts are 2 bytes, so we add one more padding byte to get to 18, then the short is added. The size is 20. And in the end you get the last char in, getting to 21. But now the struct as a whole needs to be aligned too: its size must be a multiple of the alignment of the largest primitive used inside it, in our case the long with 8 bytes, so that every element in an array of these structs stays aligned. That is why we add 3 more bytes, so that the struct has a size that is a multiple of 8: 24.
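The walk-through above can be sketched as a small calculator in C++. This is only an illustration under the same assumptions (each primitive is aligned to its own size); Field, align_up, layout_size and my_struct_size are names I made up, not part of any library. The real struct at the end uses fixed-width types so that its fields have the sizes assumed in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper type: one field, described by its size and alignment.
struct Field {
    std::size_t size;
    std::size_t align;
};

// Round `offset` up to the next multiple of `align`.
std::size_t align_up(std::size_t offset, std::size_t align) {
    assert(align > 0);
    return (offset + align - 1) / align * align;
}

// Place the fields in declaration order, padding before each one as needed,
// then pad the end so the total is a multiple of the largest alignment.
std::size_t layout_size(const std::vector<Field>& fields) {
    std::size_t offset = 0;
    std::size_t max_align = 1;
    for (const Field& f : fields) {
        offset = align_up(offset, f.align) + f.size;
        if (f.align > max_align) max_align = f.align;
    }
    return align_up(offset, max_align);
}

// The struct from the post: int, char, long, char, short, char.
std::size_t my_struct_size() {
    return layout_size({{4, 4}, {1, 1}, {8, 8}, {1, 1}, {2, 2}, {1, 1}});
}

// The same fields as a real C++ struct, with fixed-width types so that the
// sizes match the ones assumed above.
struct MyStructCpp {
    std::int32_t i1;
    char c1;
    std::int64_t l1;
    char c2;
    std::int16_t s1;
    char c3;
};
```

my_struct_size() returns 24, and on the usual desktop compilers sizeof(MyStructCpp) agrees, with l1 sitting at offsetof(MyStructCpp, l1) == 8.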

Can we do something about it? What if I want to trade speed for memory or disk space? We can use attributes such as StructLayout. It takes a LayoutKind, which for structs defaults to Sequential but can also be Auto or Explicit, and a numeric Pack parameter. Auto lets the runtime rearrange the members of the struct, usually so that it takes less space. However, this has some side effects, like getting errors when you want to use Marshal.SizeOf. With Explicit, each field needs to be adorned with a FieldOffset attribute that sets its exact position in memory; that also means you can place several fields at the same position, like in:
[StructLayout(LayoutKind.Explicit)]
public struct MyStruct
{
    [FieldOffset(0)]
    public int i1;
    [FieldOffset(4)]
    public int i2;
    [FieldOffset(0)]
    public long l1;
}
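For comparison, the closest thing in C++ to fields sharing a position is a union. The sketch below is my own analogue of the Explicit layout above, not something the .NET attribute produces; TwoInts, Overlay and combine are hypothetical names, and the value returned by combine depends on the byte order of the machine (the comments assume little-endian, like x86).

```cpp
#include <cstdint>
#include <cstring>

// Two 32-bit fields stored back to back, like i1 at offset 0 and i2 at offset 4.
struct TwoInts {
    std::int32_t i1;
    std::int32_t i2;
};

// All members of a union start at offset 0, so l1 covers the same 8 bytes
// as i1 and i2 together.
union Overlay {
    TwoInts ints;
    std::int64_t l1;
};

static_assert(sizeof(Overlay) == 8, "both views share the same 8 bytes");

// Write through the two-int view, then read the shared bytes as one 64-bit
// value. std::memcpy is used for the read because reading an inactive union
// member directly is not allowed by the C++ standard.
std::int64_t combine(std::int32_t i1, std::int32_t i2) {
    Overlay o{};          // zero-initializes the first member (ints)
    o.ints.i1 = i1;
    o.ints.i2 = i2;
    std::int64_t result;
    std::memcpy(&result, &o, sizeof result);
    return result;        // little-endian: i1 is the low half, i2 the high half
}
```

On a little-endian machine combine(1, 0) gives 1 and combine(0, 1) gives 4294967296, which shows that the long really does sit on top of the two ints.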
The Pack parameter tells the system how to align the fields. 0 is the default and means the default packing for the current platform, but 1 will make the size of the first struct above actually be 17:
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct MyStruct
{
    public int i1;
    public char c1;
    public long l1;
    public char c2;
    public short s1;
    public char c3;
}
Other valid values are 2, 4, 8, 16, 32, 64 and 128. As an exercise, you can test how performance is affected by them.
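On the C++ side, the rough counterpart of Pack is #pragma pack, a compiler extension supported by MSVC, GCC and Clang rather than standard C++. Here is a small sketch with the same fields under packing 1 and 2; Packed1 and Packed2 are names I picked for the example, and the fixed-width types are an assumption to keep the sizes equal to the ones used in this post.

```cpp
#include <cstddef>
#include <cstdint>

// Packing 1: no padding at all, every field follows the previous one.
#pragma pack(push, 1)
struct Packed1 {
    std::int32_t i1;  // offset 0
    char c1;          // offset 4
    std::int64_t l1;  // offset 5
    char c2;          // offset 13
    std::int16_t s1;  // offset 14
    char c3;          // offset 16
};
#pragma pack(pop)

// Packing 2: no field is aligned to more than 2 bytes.
#pragma pack(push, 2)
struct Packed2 {
    std::int32_t i1;  // offset 0
    char c1;          // offset 4
    std::int64_t l1;  // offset 6 (one padding byte before it)
    char c2;          // offset 14
    std::int16_t s1;  // offset 16 (one padding byte before it)
    char c3;          // offset 18
};
#pragma pack(pop)

static_assert(sizeof(Packed1) == 17, "4+1+8+1+2+1 with no padding");
static_assert(offsetof(Packed1, l1) == 5, "the long is no longer 8-byte aligned");
static_assert(sizeof(Packed2) == 20, "19 bytes rounded up to a multiple of 2");
```

Packing 4 or 8 brings this struct back to 24 bytes, since the long then lands on offset 8 again and the total of 21 bytes gets rounded up.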

More information here: Advanced c# programming 6: Everything about memory allocation in .NET
