what is the purpose and return type of the __builtin_offsetof operator?

What is the purpose of __builtin_offsetof operator (or _FOFF operator in Symbian) in C++?

In addition what does it return? Pointer? Number of bytes?


Solution 1:

It's a builtin provided by the GCC compiler to implement the offsetof macro that is specified by the C and C++ Standard:

GCC - offsetof

It returns the offset in bytes that a member of a POD struct/union is at.

Sample:

struct abc1 { int a, b, c; };
union abc2 { int a, b, c; };
struct abc3 { abc3() { } int a, b, c; }; // non-POD
union abc4 { abc4() { } int a, b, c; };  // non-POD

assert(offsetof(abc1, a) == 0); // always, because there's no padding before a.
assert(offsetof(abc1, b) == 4); // here, on my system
assert(offsetof(abc2, a) == offsetof(abc2, b)); // (members overlap)
assert(offsetof(abc3, c) == 8); // undefined behavior. GCC outputs warnings
assert(offsetof(abc4, a) == 0); // undefined behavior. GCC outputs warnings

@Jonathan provides a nice example of where you can use it. I remember having seen it used to implement intrusive lists (lists whose data items include next and prev pointers itself), but i can't remember where it was helpful in implementing it, sadly.

Solution 2:

As @litb points out and @JesperE shows, offsetof() provides an integer offset in bytes (as a size_t value).

When might you use it?

One case where it might be relevant is a table-driven operation for reading an enormous number of diverse configuration parameters from a file and stuffing the values into an equally enormous data structure. Reducing enormous down to SO trivial (and ignoring a wide variety of necessary real-world practices, such as defining structure types in headers), I mean that some parameters could be integers and others strings, and the code might look faintly like:

#include <stddef.h>

typedef stuct config_info config_info;
struct config_info
{
   int parameter1;
   int parameter2;
   int parameter3;
   char *string1;
   char *string2;
   char *string3;
   int parameter4;
} main_configuration;

typedef struct config_desc config_desc;
static const struct config_desc
{
   char *name;
   enum paramtype { PT_INT, PT_STR } type;
   size_t offset;
   int   min_val;
   int   max_val;
   int   max_len;
} desc_configuration[] =
{
    { "GIZMOTRON_RATING", PT_INT, offsetof(config_info, parameter1), 0, 100, 0 },
    { "NECROSIS_FACTOR",  PT_INT, offsetof(config_info, parameter2), -20, +20, 0 },
    { "GILLYWEED_LEAVES", PT_INT, offsetof(config_info, parameter3), 1, 3, 0 },
    { "INFLATION_FACTOR", PT_INT, offsetof(config_info, parameter4), 1000, 10000, 0 },
    { "EXTRA_CONFIG",     PT_STR, offsetof(config_info, string1), 0, 0, 64 },
    { "USER_NAME",        PT_STR, offsetof(config_info, string2), 0, 0, 16 },
    { "GIZMOTRON_LABEL",  PT_STR, offsetof(config_info, string3), 0, 0, 32 },
};

You can now write a general function that reads lines from the config file, discarding comments and blank lines. It then isolates the parameter name, and looks that up in the desc_configuration table (which you might sort so that you can do a binary search - multiple SO questions address that). When it finds the correct config_desc record, it can pass the value it found and the config_desc entry to one of two routines - one for processing strings, the other for processing integers.

The key part of those functions is:

static int validate_set_int_config(const config_desc *desc, char *value)
{
    int *data = (int *)((char *)&main_configuration + desc->offset);
    ...
    *data = atoi(value);
    ...
}

static int validate_set_str_config(const config_desc *desc, char *value)
{
    char **data = (char **)((char *)&main_configuration + desc->offset);
    ...
    *data = strdup(value);
    ...
}

This avoids having to write a separate function for each separate member of the structure.