How to tell GCC that a pointer argument is always double-word-aligned?

Solution 1:

If the attributes don't work, or aren't an option ....

I'm not sure, but try this:

void vecadd (float * restrict a, float * restrict b, float * restrict c)
{
   a = __builtin_assume_aligned (a, 8);
   b = __builtin_assume_aligned (b, 8);
   c = __builtin_assume_aligned (c, 8);

   for ....

That should tell GCC that the pointers are aligned. Whether that gets you what you want depends on whether the compiler can use the information effectively; it might not be smart enough, because these optimizations aren't easy.
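
For reference, here is a minimal complete sketch of the idea above. The loop body (a[i] = b[i] + c[i]) and the length parameter n are my assumptions, since the snippet elides the loop. The asserts are an optional debug check: if you promise an alignment the pointer doesn't actually have, the compiler is free to generate code that misbehaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the elided loop computes a[i] = b[i] + c[i];
   n is a hypothetical length parameter not present in the original. */
void vecadd(float *restrict a, const float *restrict b,
            const float *restrict c, size_t n)
{
    /* Debug-build sanity check of the promise made below. */
    assert((uintptr_t)a % 8 == 0);
    assert((uintptr_t)b % 8 == 0);
    assert((uintptr_t)c % 8 == 0);

    /* The built-in returns its first argument; only the returned
       pointer carries the alignment assumption, so reassign it. */
    a = __builtin_assume_aligned(a, 8);
    b = __builtin_assume_aligned(b, 8);
    c = __builtin_assume_aligned(c, 8);

    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}
```

Call it with arrays declared with __attribute__((aligned(8))), for example vecadd(a, b, c, 16). Whether the generated code actually drops the unaligned path still depends on the compiler version and target.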

Another option might be to wrap the float inside a union that must be 8-byte aligned:

typedef union {
  float f;
  long long dummy;
} aligned_float;

void vecadd (aligned_float * a, ......

I think that should enforce 8-byte alignment, but again, I don't know if the compiler is smart enough to use it.
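
If you want to see what the union really does on your target, something like the sketch below might help. The two helper functions are hypothetical names of mine. Note that the union takes on the size and alignment of long long, so the alignment is only 8 where the ABI aligns long long to 8, and an array of aligned_float no longer has the same layout as a plain float array.

```c
#include <stddef.h>

typedef union {
    float f;
    long long dummy;   /* present only to raise the union's alignment */
} aligned_float;

/* Hypothetical helpers: report what the compiler actually chose. */
size_t aligned_float_alignment(void)
{
    return _Alignof(aligned_float);
}

size_t aligned_float_size(void)
{
    /* Each element is padded to the size of long long, so an array of
       aligned_float is not interchangeable with an array of float. */
    return sizeof(aligned_float);
}
```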

Solution 2:

Following a piece of example code I found on my system, I tried the following solution, which incorporates ideas from a few of the earlier answers. Basically, create a union of a small array of floats with a 64-bit type (in this case a SIMD vector of floats) and call the function with the operand float arrays cast to that type:

typedef float f2 __attribute__((vector_size(8)));
typedef union { f2 v; float f[2]; } simdfu;

void vecadd(f2 * restrict a, f2 * restrict b, f2 * restrict c);

float a[16] __attribute__((aligned(8)));
float b[16] __attribute__((aligned(8)));
float c[16] __attribute__((aligned(8)));

int main()
{
    vecadd((f2 *) a, (f2 *) b, (f2 *) c);
    return 0;
}

Now the compiler does not generate the 4-aligned branch.
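
The snippet above only declares vecadd. A possible definition, as a rough sketch: the element count N_PAIRS and the direction of the operation (a = b + c) are assumptions on my part, and it relies on the GCC vector extension allowing + directly on vector types.

```c
#include <stddef.h>

typedef float f2 __attribute__((vector_size(8)));

/* Hypothetical constant: 16 floats are 8 two-float vectors. */
#define N_PAIRS 8

/* Assumption: the function computes a = b + c element-wise. */
void vecadd(f2 *restrict a, f2 *restrict b, f2 *restrict c)
{
    for (size_t i = 0; i < N_PAIRS; i++)
        a[i] = b[i] + c[i];   /* one 8-byte vector add per iteration */
}
```

Since f2 is an 8-byte type, the pointers carry the 8-byte alignment in their type, which is presumably why the 4-aligned branch goes away.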

However, __builtin_assume_aligned() would be the preferable solution, since it avoids the cast and its possible side effects, if only it worked...

EDIT: I noticed that the builtin function is actually buggy on our implementation (i.e., not only does it not work, it also causes calculation errors later in the code).

Solution 3:

It looks like newer versions of GCC have __builtin_assume_aligned:

Built-in Function: void * __builtin_assume_aligned (const void *exp, size_t align, ...)

This function returns its first argument, and allows the compiler to assume that the returned pointer is at least align bytes aligned. This built-in can have either two or three arguments, if it has three, the third argument should have integer type, and if it is nonzero means misalignment offset. For example:

void *x = __builtin_assume_aligned (arg, 16);

means that the compiler can assume x, set to arg, is at least 16-byte aligned, while:

void *x = __builtin_assume_aligned (arg, 32, 8);

means that the compiler can assume for x, set to arg, that (char *) x - 8 is 32-byte aligned.
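
To make the three-argument form a bit more concrete, here is a small sketch. The function name and the scenario (data starting 8 bytes past a 32-byte boundary, e.g. after a small header) are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: sums n floats that start 8 bytes past a
   32-byte boundary. */
float sum_after_header(const void *arg, size_t n)
{
    /* Promise: (char *) x - 8 is 32-byte aligned. */
    const float *x = __builtin_assume_aligned(arg, 32, 8);
    float total = 0.0f;

    for (size_t i = 0; i < n; i++)
        total += x[i];
    return total;
}
```

For example, with a float buffer declared __attribute__((aligned(32))), passing buf + 2 satisfies the promise, because two floats are exactly 8 bytes.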

Based on some other questions and answers on Stack Overflow circa 2010, it appears the built-in was not available in GCC 3 and early GCC 4. But I do not know where the cut-off point is.
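
If you need code that builds on compilers with and without the built-in, one option is a wrapper macro along these lines. ASSUME_ALIGNED and scale are hypothetical names, and the 4.7 threshold in the version check is my assumption about when the built-in appeared, so please verify it against the GCC release notes before relying on it.

```c
#include <stddef.h>

/* Hypothetical wrapper macro: uses the built-in where it is known to
   exist and degrades to a plain pass-through otherwise. */
#if defined(__has_builtin)
#  if __has_builtin(__builtin_assume_aligned)
#    define ASSUME_ALIGNED(p, n) __builtin_assume_aligned((p), (n))
#  endif
#elif defined(__GNUC__) && \
      (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7))
   /* Assumption: the built-in is available from GCC 4.7 onwards. */
#  define ASSUME_ALIGNED(p, n) __builtin_assume_aligned((p), (n))
#endif

#ifndef ASSUME_ALIGNED
#  define ASSUME_ALIGNED(p, n) (p)   /* no hint, same behaviour */
#endif

/* Hypothetical example user of the macro. */
void scale(float *data, size_t n, float k)
{
    float *p = ASSUME_ALIGNED(data, 8);

    for (size_t i = 0; i < n; i++)
        p[i] *= k;
}
```

The fallback loses the optimization hint but keeps the code correct, which seems like the right trade-off for older toolchains.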

Solution 4:

GCC versions have been dodgy about the aligned attribute on simple typedefs and arrays. Typically, to do what you want, you would have to wrap the float in a struct and put the alignment restriction on the contained float.

With operator overloading you can make this almost painless, but it does assume you can use C++ syntax.

#include <stdio.h>
#include <string.h>

#define restrict __restrict__

typedef float oldfloat8 __attribute__ ((aligned(8)));

struct float8
{
    float f __attribute__ ((aligned(8)));

    float8 &operator=(float _f) { f = _f; return *this; }
    float8 &operator=(double _f) { f = _f; return *this; }
    float8 &operator=(int _f) { f = _f; return *this; }

    operator float() { return f; }
};

int MyFunc(float8 * restrict a, float8 * restrict b, float8 * restrict c);

int MyFunc(float8 * restrict a, float8 * restrict b, float8 * restrict c)
{
    return *c = *a* *b;
}

int main(int argc, char **argv)
{
    float8 a, b, c;

    float8 p[4];

    printf("sizeof(oldfloat8) == %d\n", (int)sizeof(oldfloat8));
    printf("sizeof(float8) == %d\n", (int)sizeof(float8));

    printf("addr p[0] == %p\n", (void *)&p[0] );
    printf("addr p[1] == %p\n", (void *)&p[1] );

    a = 2.0;
    b = 7.0;
    MyFunc( &a, &b, &c );
    return 0;
}
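
If C++ isn't available, the same wrapper idea can be sketched in plain C, just without the operator sugar. The function name mul8 is hypothetical; the struct mirrors the one above.

```c
#include <stddef.h>

/* The aligned member raises the alignment of the whole struct,
   and the struct is padded to a multiple of that alignment. */
struct float8 {
    float f __attribute__((aligned(8)));
};

/* Hypothetical C counterpart of MyFunc: *c = *a * *b. */
float mul8(const struct float8 *restrict a,
           const struct float8 *restrict b,
           struct float8 *restrict c)
{
    c->f = a->f * b->f;
    return c->f;
}
```

As with the C++ version, every element now occupies 8 bytes, so an array of struct float8 is twice the size of the equivalent float array.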

Solution 5:

Alignment specifications usually only work for alignments that are smaller than the base type of a pointer, not larger.

I think the easiest approach is to declare your whole array with an alignment specification, something like

typedef float myvector[16];
typedef myvector alignedVector __attribute__((aligned (8)));

(The syntax might not be correct; I always have difficulty knowing where to put these __attribute__s.)

And use that type throughout your code. For your function definition I'd try

void vecadd(alignedVector * restrict a, alignedVector * restrict b, alignedVector * restrict c);

This gives you an additional level of indirection, but only syntactically. Something like *a is a no-op at run time: it just reinterprets the pointer as a pointer to the first element.
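
Put together, the approach might look like the sketch below. The loop body (a = b + c) is my assumption, since the original only gives the prototype.

```c
#include <stddef.h>

typedef float myvector[16];
typedef myvector alignedVector __attribute__((aligned(8)));

/* Assumption: the function computes a = b + c element-wise. */
void vecadd(alignedVector *restrict a, alignedVector *restrict b,
            alignedVector *restrict c)
{
    /* (*a) is the array itself; indexing it costs nothing extra. */
    for (size_t i = 0; i < sizeof *a / sizeof (*a)[0]; i++)
        (*a)[i] = (*b)[i] + (*c)[i];
}
```

Callers declare their arrays as alignedVector and pass their addresses, as in vecadd(&a, &b, &c). A side benefit is that the element count travels with the type.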