--- [ e-on software research ] ---

1. Alignment theory

a. Definition

b. How processors fetch memory

c. Data structure padding

2. C++ examples

a. What the c++ specification says

b. GCC and Visual c++ x86 / x86-64 implementation

c. Benchmarks

d. Controlling alignment and padding

e. Common data type size and alignment

3. References

1. Alignment theory

This post is a synthesis of material available on the web. The sources used are listed in the References section.

a. Definition

The alignment of a variable is the largest power of two that evenly divides its address, that is:

address modulo alignment = 0

A variable whose alignment is N is said to be N-byte aligned.

Note

– Different types can have different alignment requirements
– If x > y, and both x and y are powers of two, a variable that is x-byte aligned is also y-byte aligned

Example

Address (bytes)	Alignment
0x00	any (0 is a multiple of every power of two)
0x01	1-byte
0x02	2-byte
0x03	1-byte
0x04	4-byte (so also 2-byte)
0x05	1-byte
0x06	2-byte
0x07	1-byte
0x08	8-byte (so also 4-byte and 2-byte)
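
The table can be reproduced with a bit of arithmetic: the alignment of a non-zero address is simply its lowest set bit. Here is a minimal C++ sketch; the names alignment_of_address and is_aligned are ours (not library functions), and returning 0 for address 0 is an arbitrary convention.

```cpp
#include <cstdint>

// Largest power of two that divides `address`, i.e. its lowest set bit.
// Address 0 is a multiple of every power of two; this sketch returns 0
// for it as a stand-in for "no upper bound".
constexpr std::uintptr_t alignment_of_address(std::uintptr_t address)
{
    return address & (~address + 1); // isolates the lowest set bit
}

// True when `address` is a multiple of `alignment` (assumed non-zero).
constexpr bool is_aligned(std::uintptr_t address, std::uintptr_t alignment)
{
    return address % alignment == 0;
}

static_assert(alignment_of_address(0x04) == 4, "0x04 is 4-byte aligned");
static_assert(alignment_of_address(0x06) == 2, "0x06 is 2-byte aligned");
static_assert(alignment_of_address(0x07) == 1, "0x07 is 1-byte aligned");
static_assert(alignment_of_address(0x08) == 8, "0x08 is 8-byte aligned");

// An 8-byte aligned address is also 4-byte and 2-byte aligned.
static_assert(is_aligned(0x08, 4) && is_aligned(0x08, 2), "x > y implies y-byte aligned");
```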

b. How processors fetch memory

Aligned address

– read the chunk and place it into the register

Unaligned address

– read the first chunk containing the unaligned address
– shift out the "unwanted" bytes from the first chunk
– read the second chunk
– shift out the unwanted bytes from the second chunk
– merge the two chunks and place the result into the register

Compared to reading a single chunk, that is a lot of work!
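
The steps above can be simulated in software. The sketch below is only an illustration, not a description of any real processor: it assumes memory is made of 4-byte chunks with little-endian byte order, and the names kMemory and read_u32 are ours. It needs C++14 or later for the constexpr function body.

```cpp
#include <cstddef>
#include <cstdint>

// Simulated memory made of two 4-byte chunks. Byte k of a chunk occupies
// bits 8k..8k+7 (little-endian), so the bytes at addresses 0..7 are
// 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88.
constexpr std::uint32_t kMemory[2] = { 0x44332211u, 0x88776655u };

// Reads a 32-bit value starting at any byte address inside the first chunk.
constexpr std::uint32_t read_u32(const std::uint32_t* chunks, std::size_t byte_address)
{
    const std::size_t index = byte_address / 4;                        // chunk holding the first byte
    const unsigned offset = static_cast<unsigned>(byte_address % 4);   // position inside that chunk
    if (offset == 0)
        return chunks[index];                                          // aligned: one read is enough

    // Unaligned: read two chunks, shift out the unwanted bytes, merge.
    const std::uint32_t first  = chunks[index] >> (8 * offset);
    const std::uint32_t second = chunks[index + 1] << (8 * (4 - offset));
    return first | second;
}

static_assert(read_u32(kMemory, 0) == 0x44332211u, "aligned read");
static_assert(read_u32(kMemory, 1) == 0x55443322u, "bytes 1..4 span both chunks");
static_assert(read_u32(kMemory, 3) == 0x77665544u, "bytes 3..6 span both chunks");
```

The aligned path touches one chunk; every other offset costs two reads, two shifts and a merge.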

Some processors just aren't willing to do all of that work for you. Given an unaligned address, they may:
– raise an exception (68000)
– do nothing
– silently do something wrong (Altivec, Itanium)

c. Data structure padding

Compilers insert unnamed padding members into structures:
– between members, to keep each member aligned on its required alignment
– after the last member, to keep every element aligned when the structure is used in an array

Note

To satisfy both constraints, the alignment requirement of a structure is that of its most strictly aligned member.

Example

We assume:
– char: 1-byte aligned, occupies 1 byte
– int: 2-byte aligned, occupies 2 bytes

struct S // must be 2-byte aligned
{
   char c1; // can be placed at any address
   int i;   // must be 2-byte aligned
   char c2; // can be placed at any address
};
S s[2]; // sizeof(s) == 12 bytes (4 of them padding)

Address	Variable
0x0	s[0].c1
0x1	unnamed member (padding)
0x2	s[0].i
0x3	s[0].i
0x4	s[0].c2
0x5	unnamed member (padding)
0x6	s[1].c1
0x7	unnamed member (padding)
0x8	s[1].i
0x9	s[1].i
0xA	s[1].c2
0xB	unnamed member (padding)
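
Applying the two padding rules step by step gives the member offsets and the total size. The sketch below does this for the hypothetical 1-byte char and 2-byte int of this example (not for your real compiler); align_up and the k-prefixed constants are names we made up. It yields offsets 0, 2 and 4, a 6-byte structure, and therefore 12 bytes for a two-element array.

```cpp
#include <cstddef>

// Round `offset` up to the next multiple of `alignment` (a power of two).
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

// Hypothetical platform of the example.
constexpr std::size_t kCharSize = 1, kCharAlign = 1;
constexpr std::size_t kIntSize  = 2, kIntAlign  = 2;

// Rule 1: each member starts at the next offset that satisfies its alignment.
constexpr std::size_t kOffsetC1 = align_up(0, kCharAlign);
constexpr std::size_t kOffsetI  = align_up(kOffsetC1 + kCharSize, kIntAlign);
constexpr std::size_t kOffsetC2 = align_up(kOffsetI + kIntSize, kCharAlign);

// Rule 2: the structure takes the strictest member alignment, and its size is
// rounded up to that alignment so that array elements stay aligned.
constexpr std::size_t kStructAlign = kIntAlign;
constexpr std::size_t kStructSize  = align_up(kOffsetC2 + kCharSize, kStructAlign);

static_assert(kOffsetC1 == 0 && kOffsetI == 2 && kOffsetC2 == 4, "member offsets");
static_assert(kStructSize == 6, "one padding byte after c1, one after c2");
static_assert(2 * kStructSize == 12, "size of a two-element array");
```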

Tips

– We could have saved the padding bytes (2 per structure) by placing c2 just before i
– When all alignments are powers of two, declaring members in ascending or descending order of size yields the smallest structure; writing readable code should nonetheless remain your primary goal
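
The first tip is easy to check on a real compiler. In this sketch the struct names Interleaved and Grouped are ours, and the sizes depend on your platform: with a 4-byte int the usual values are 12 and 8 bytes, but only the relations asserted below are assumed.

```cpp
#include <cstddef>

struct Interleaved // same member order as S above
{
    char c1;
    int  i;
    char c2;
};

struct Grouped // both chars share the slot in front of the int
{
    char c1;
    char c2;
    int  i;
};

// Grouping the small members never costs space here, and usually saves some.
static_assert(sizeof(Grouped) <= sizeof(Interleaved), "reordering removes padding");

// The int stays correctly aligned in both layouts.
static_assert(offsetof(Interleaved, i) % alignof(int) == 0, "i aligned in Interleaved");
static_assert(offsetof(Grouped, i) % alignof(int) == 0, "i aligned in Grouped");
```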

2. C++ examples

a. What the c++ specification says

The C++ memory model [intro.memory] (1.7 §1)

The fundamental storage unit in the C++ memory model is the byte. A byte […] is composed of a contiguous sequence of bits, the number of which is implementation-defined.

Types (3.9 §5)

[…] The alignment of a complete object type is an implementation-defined integer value representing a number of bytes; an object is allocated at an address that meets the alignment requirements of its object type.

Sizeof (5.3.3 §2)

When applied to a class, the result is the number of bytes in an object of that class including any padding required for placing objects of that type in an array.
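
These guarantees can be turned into compile-time checks. Note the assumption: alignof and static_assert come from C++11, not from the 2003 edition quoted here, and the struct Record is just an illustrative type of ours.

```cpp
#include <cstddef>

struct Record
{
    double value;
    char   tag;
};

// sizeof covers every member plus any padding.
static_assert(sizeof(Record) >= sizeof(double) + sizeof(char), "members fit inside the object");

// The size is a multiple of the alignment, so the element that follows
// in an array is aligned as well.
static_assert(sizeof(Record) % alignof(Record) == 0, "tail padding included in sizeof");

// Array elements are contiguous: no extra gap is added between them.
static_assert(sizeof(Record[4]) == 4 * sizeof(Record), "array size is n * sizeof(element)");
```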

b. GCC and Visual c++ x86/x86-64 implementation

For performance reasons, all types are aligned on their natural length, except items longer than 8 bytes. It is recommended that structures larger than 16 bytes be aligned on 16-byte boundaries.

In general, for the best performance, align data as follows:
– align 8-bit data at any address
– align 16-bit data to be contained within an aligned four-byte word
– align 32-bit data so that its base address is a multiple of four
– align 64-bit data so that its base address is a multiple of eight
– align 80-bit data so that its base address is a multiple of sixteen
– align 128-bit data so that its base address is a multiple of sixteen

Most SSE2 instructions on x86 CPUs require their 128-bit memory operands to be 16-byte aligned (the explicitly unaligned moves such as MOVDQU are the exception), and there can be substantial performance advantages in using aligned data on these architectures.
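
One portable way to obtain such 16-byte alignment is the C++11 alignas specifier. This is a sketch under that assumption (the sources of this post predate C++11), and Vec4 is a name of ours.

```cpp
#include <cstddef>

// Four floats, the shape of data held by a 128-bit SSE register.
struct alignas(16) Vec4
{
    float x, y, z, w;
};

// Every Vec4 object starts on a 16-byte boundary...
static_assert(alignof(Vec4) == 16, "Vec4 is 16-byte aligned");

// ...and so does every element of a Vec4 array.
static_assert(sizeof(Vec4) % 16 == 0, "array elements stay 16-byte aligned");
```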

c. Benchmarks

Copy of 9,000,000 doubles from source[i] to dest[i].

Unaligned / aligned access time ratio:
– Pentium III (731 MHz): 3.25 times slower
– Pentium 4 (2.53 GHz): 2 times slower
– Itanium 2 (900 MHz): 459 times slower

d. Controlling alignment and padding

Visual C++ 2008

#pragma pack(4) // cap member alignment at 4 bytes
struct S
{
   char c;   // 1-byte aligned
   double d; // 4-byte aligned instead of 8-byte aligned
             // causes warning C4121
};
#pragma pack() // reset to default

The #pragma pack(N) directive caps the alignment of structure members at N bytes.
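
The effect of packing can be observed with sizeof and offsetof. The sketch below uses the push/pop form of the pragma, which Visual C++, GCC and Clang all accept; the struct names are ours. Keep the benchmarks above in mind: members of a packed structure may be unaligned.

```cpp
#include <cstddef>

struct DefaultLayout
{
    char   c;
    double d; // the compiler normally inserts padding before d
};

#pragma pack(push, 1) // save the current setting, then cap member alignment at 1 byte
struct PackedLayout
{
    char   c;
    double d; // placed right after c, with no padding
};
#pragma pack(pop)     // restore the previous setting

static_assert(offsetof(PackedLayout, d) == sizeof(char), "d immediately follows c");
static_assert(sizeof(PackedLayout) == sizeof(char) + sizeof(double), "no padding at all");
static_assert(sizeof(PackedLayout) <= sizeof(DefaultLayout), "packing never grows the struct");
```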

GCC 4

GCC understands #pragma pack the same way Visual C++ does, and also offers finer-grained attributes:

struct foo
{
   int x[2] __attribute__ ((aligned (8))); // at least 8-byte aligned
};

struct bar
{
   char a;
   int x[2] __attribute__ ((packed)); // pack this member right behind a
};
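
What the two attributes do can be verified at compile time. This sketch is guarded so that it only compiles the checks on GCC-compatible compilers, since the attributes are extensions; AlignedMember and PackedMember are names of ours mirroring the two structures above.

```cpp
#include <cstddef>

#if defined(__GNUC__) // GCC and Clang only

struct AlignedMember
{
    int x[2] __attribute__ ((aligned (8))); // at least 8-byte aligned
};

struct PackedMember
{
    char a;
    int x[2] __attribute__ ((packed)); // no padding between a and x
};

// The member's stricter alignment propagates to the whole structure.
static_assert(alignof(AlignedMember) >= 8, "struct inherits the member alignment");

// The packed member starts at offset 1, right behind the char.
static_assert(offsetof(PackedMember, x) == 1, "x immediately follows a");

#endif
```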

e. Common data type size and alignment

Visual C++ / GCC (Win32)

type	size (bytes)	alignment (bytes)
void *	4	4
bool	1	1
char	1	1
short	2	2
int	4	4
long	4	4
float	4	4
double	8	8

Visual C++ (Win64)

type	size (bytes)	alignment (bytes)
void *	8	8
bool	1	1
char	1	1
short	2	2
int	4	4
long	4	4
float	4	4
double	8	8

Mac OS X 10.6 (32-bit)

type	size (bytes)	alignment (bytes)
void *	4	4
bool	1	1
char	1	1
short	2	2
int	4	4
long	4	4
float	4	4
double	8	4

Mac OS X 10.6 (64-bit)

type	size (bytes)	alignment (bytes)
void *	8	8
bool	1	1
char	1	1
short	2	2
int	4	4
long	8	8
float	4	4
double	8	8
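
To get the same table for your own compiler and platform, sizeof and alignof can be queried directly. This is a sketch assuming C++11; TypeLayout, layout_of, print_row and print_layout_table are names of ours. Be aware that alignof reports what the compiler considers the type's alignment, which on some 32-bit ABIs can differ from the alignment actually used for a structure member.

```cpp
#include <cstddef>
#include <ostream>

struct TypeLayout
{
    std::size_t size;
    std::size_t alignment;
};

// Size and alignment of T, as seen by the current compiler.
template <typename T>
constexpr TypeLayout layout_of()
{
    return TypeLayout{ sizeof(T), alignof(T) };
}

// Writes one tab-separated row: type name, size, alignment.
template <typename T>
void print_row(std::ostream& out, const char* name)
{
    const TypeLayout layout = layout_of<T>();
    out << name << '\t' << layout.size << '\t' << layout.alignment << '\n';
}

// Writes the whole table for the types listed in this section.
inline void print_layout_table(std::ostream& out)
{
    out << "type\tsize (bytes)\talignment (bytes)\n";
    print_row<void*>(out, "void *");
    print_row<bool>(out, "bool");
    print_row<char>(out, "char");
    print_row<short>(out, "short");
    print_row<int>(out, "int");
    print_row<long>(out, "long");
    print_row<float>(out, "float");
    print_row<double>(out, "double");
}
```

Calling print_layout_table(std::cout) from main (with <iostream> included) prints the table for the platform you compile on.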

3. References

C++ specification

INTERNATIONAL STANDARD ISO/IEC 14882 Second edition 2003-10-15

WIKIPEDIA

http://en.wikipedia.org/wiki/Data_structure_alignment

IBM

http://www.ibm.com/developerworks/library/pa-dalign

Microsoft MSDN

http://msdn.microsoft.com/en-us/library/aa290049%28VS.71%29.aspx

Intel

http://software.intel.com/en-us/articles/data-alignment-when-migrating-to-64-bit-intel-architecture

GPU Oriented Programming

Thursday, 30 September 2010

Memory alignment : theory and c++ examples
