How to create type-safe enums?

Achieving type safety with enums in C is problematic, since they are essentially just integers. In fact, the standard defines enumeration constants to be of type int.

To achieve a bit of type safety I do tricks with pointers like this:

typedef enum
{
  BLUE,
  RED
} color_t;

void color_assign (color_t* var, color_t val) 
{ 
  *var = val; 
}

Pointers have stricter type rules than values, so this prevents code such as this:

int x; 
color_assign(&x, BLUE); // compiler error

But it doesn't prevent code like this:

color_t color;
color_assign(&color, 123); // garbage value

This is because an integer literal such as 123, just like an enumeration constant, is essentially just an int and can be implicitly assigned to an enumeration variable.
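To make the hole concrete, here is a minimal sketch that compiles cleanly and stores an out-of-range value. The helper name store_garbage is hypothetical and only exists for this illustration:

```c
typedef enum
{
  BLUE,
  RED
} color_t;

void color_assign (color_t* var, color_t val)
{
  *var = val;
}

/* Hypothetical helper: shows that a plain integer slips through. */
int store_garbage (void)
{
  color_t color = BLUE;
  color_assign(&color, 123); /* accepted: 123 converts implicitly to color_t */
  return (int)color;         /* the variable now holds 123, not BLUE or RED */
}
```

The pointer parameter guards the destination, but nothing guards the value argument.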

Is there a way to write a function or macro color_assign that achieves complete type safety, even for enumeration constants?


Solution 1:

It is possible to achieve this with a few tricks. Given

typedef enum
{
  BLUE,
  RED
} color_t;

Then define a dummy union which won't be used by the caller, but contains members with the same names as the enumeration constants:

typedef union
{
  color_t BLUE;
  color_t RED;
} typesafe_color_t;

This is possible because enumeration constants and member/variable names reside in different namespaces.
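As a small sketch of that namespace rule, the same identifier can be used as a union member and as an enumeration constant in a single expression. The function name namespaces_demo is an assumption made for this example:

```c
typedef enum
{
  BLUE,
  RED
} color_t;

typedef union
{
  color_t BLUE; /* member names live in the union's own name space */
  color_t RED;
} typesafe_color_t;

/* Hypothetical demo function. */
int namespaces_demo (void)
{
  typesafe_color_t tmp;
  tmp.RED = RED;         /* left: member RED, right: enumeration constant RED */
  return tmp.RED == RED; /* no clash between the two uses of the name */
}
```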

Then make some function-like macros:

#define c_assign(var, val) (var) = (typesafe_color_t){ .val = val }.val
#define color_assign(var, val) _Generic((var), color_t: c_assign(var, val))

These macros are then called like this:

color_t color;
color_assign(color, BLUE); 

Explanation:

  • The C11 _Generic keyword ensures that the enumeration variable is of the correct type. However, this can't be used on the enumeration constant BLUE because it is of type int.
  • Therefore the helper macro c_assign creates a temporary instance of the dummy union, where the designated initializer syntax is used to assign the value BLUE to a union member named BLUE. If no such member exists, the code won't compile.
  • The union member of the corresponding type is then copied into the enum variable.
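The steps above can be checked by writing the expansion out by hand. The sketch below repeats the definitions so that it compiles on its own (C11 assumed); the two function names are hypothetical:

```c
typedef enum
{
  BLUE,
  RED
} color_t;

typedef union
{
  color_t BLUE;
  color_t RED;
} typesafe_color_t;

#define c_assign(var, val) (var) = (typesafe_color_t){ .val = val }.val
#define color_assign(var, val) _Generic((var), color_t: c_assign(var, val))

/* Hypothetical: assignment through the macro. */
color_t assign_via_macro (void)
{
  color_t color = BLUE;
  color_assign(color, RED);
  return color;
}

/* Hypothetical: what the macro amounts to once _Generic has
   selected the color_t branch. */
color_t assign_by_hand (void)
{
  color_t color = BLUE;
  (color) = (typesafe_color_t){ .RED = RED }.RED;
  return color;
}
```

Passing 0 instead of RED would expand to a designator named .0, which is not valid syntax, and that is why plain integers are rejected.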

We don't actually need the helper macro; I just split the expression for readability. It works just as well to write

#define color_assign(var, val) _Generic((var), \
color_t: (var) = (typesafe_color_t){ .val = val }.val )

Examples:

color_t color; 
color_assign(color, BLUE);// ok
color_assign(color, RED); // ok

color_assign(color, 0);   // compiler error 

int x;
color_assign(x, BLUE);    // compiler error

typedef enum { foo } bar;
color_assign(color, foo); // compiler error
color_assign(bar, BLUE);  // compiler error

EDIT

Obviously, the above doesn't prevent the caller from simply writing color = garbage; directly. If you wish to block such direct assignment to the enum entirely, you can put it in a struct and use the standard procedure of private encapsulation with an "opaque type":

color.h

#include <stdlib.h>

typedef enum
{
  BLUE,
  RED
} color_t;

typedef union
{
  color_t BLUE;
  color_t RED;
} typesafe_color_t;

typedef struct col_t col_t; // opaque type

col_t* col_alloc (void);
void   col_free (col_t* col);

void col_assign (col_t* col, color_t color);

#define color_assign(var, val)   \
  _Generic( (var),               \
    col_t*: col_assign((var), (typesafe_color_t){ .val = val }.val) \
  )

color.c

#include "color.h"

struct col_t
{
  color_t color;
};

col_t* col_alloc (void) 
{ 
  return malloc(sizeof(col_t)); // (needs proper error handling)
}

void col_free (col_t* col)
{
  free(col);
}

void col_assign (col_t* col, color_t color)
{
  col->color = color;
}

main.c

#include "color.h"

int main (void)
{
  col_t* color = col_alloc();
  if (color == NULL) return 1; // allocation failed

  color_assign(color, BLUE);
  col_free(color);
  return 0;
}
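The interface above has no way to read the stored value back. Below is a single-file sketch of the same design with a getter added. col_get and opaque_demo are hypothetical names that are not part of the original interface, and calloc is used here instead of malloc so that a new object starts out as BLUE:

```c
#include <stdlib.h>

typedef enum { BLUE, RED } color_t;

typedef union
{
  color_t BLUE;
  color_t RED;
} typesafe_color_t;

typedef struct col_t col_t; /* opaque to callers in the multi-file layout */

struct col_t
{
  color_t color;
};

col_t* col_alloc (void)
{
  return calloc(1, sizeof(col_t)); /* may return NULL; the caller must check */
}

void col_free (col_t* col)
{
  free(col);
}

void col_assign (col_t* col, color_t color)
{
  col->color = color;
}

/* Hypothetical getter, an assumption added for this sketch. */
color_t col_get (const col_t* col)
{
  return col->color;
}

#define color_assign(var, val)   \
  _Generic( (var),               \
    col_t*: col_assign((var), (typesafe_color_t){ .val = val }.val) \
  )

/* Hypothetical demo: returns 1 if RED was stored and read back, else 0. */
int opaque_demo (void)
{
  col_t* color = col_alloc();
  int ok;
  if (color == NULL)
  {
    return 0;
  }
  color_assign(color, RED);
  ok = (col_get(color) == RED);
  col_free(color);
  return ok;
}
```

In the real multi-file layout, only the declaration of col_get would go in color.h, so callers still could not touch the struct member directly.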

Solution 2:

The top answer's pretty good, but it has two downsides. It needs C99 compound literals and designated initializers plus C11 _Generic in order to compile, and it makes assignment pretty unnatural: you have to use a magic color_assign() function or macro to move data around instead of the standard = operator.

(Admittedly, the question explicitly asked about how to write color_assign(), but if you look at the question more broadly, it's really about how to change your code to get type-safety with some form of enumerated constants, and I'd consider not needing color_assign() in the first place to get type-safety to be fair game for the answer.)

Pointers are among the few kinds of values that C treats as type-safe, so they make a natural candidate for solving this problem. So I'd attack it this way: rather than using an enum, I'd sacrifice a little memory to get unique, predictable pointer values, and then use some really hokey, funky #define statements to construct my "enum" (yes, I know macros pollute the macro namespace, but enum pollutes the compiler's global namespace, so I consider it close to an even trade):

color.h:

typedef struct color_struct_t *color_t;

struct color_struct_t { char dummy; };

extern struct color_struct_t color_dummy_array[];

#define UNIQUE_COLOR(value) \
    (&color_dummy_array[value])

#define RED    UNIQUE_COLOR(0)
#define GREEN  UNIQUE_COLOR(1)
#define BLUE   UNIQUE_COLOR(2)

enum { MAX_COLOR_VALUE = 2 };

This does, of course, require that you have just enough memory reserved somewhere to ensure nothing else can ever take on those pointer values:

color.c:

#include "color.h"

/* This never actually gets used, but we need to declare enough space in the
 * BSS so that the pointer values can be unique and not accidentally reused
 * by anything else. */
struct color_struct_t color_dummy_array[MAX_COLOR_VALUE + 1];

But from the consumer's perspective, this is all hidden: color_t is very nearly an opaque object. You can't assign anything to it other than valid color_t values and NULL:

user.c:

#include <stddef.h>
#include "color.h"

void foo(void)
{
    color_t color = RED;    /* OK */
    color = GREEN;          /* OK */
    color = NULL;           /* OK */
    color = 27;             /* Error/warning */
}
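For a quick check that the pointer constants behave like distinct enum values, here is a single-file sketch that merges color.h and color.c. The two helper functions are hypothetical and only serve the illustration:

```c
#include <stddef.h>

typedef struct color_struct_t *color_t;

struct color_struct_t { char dummy; };

enum { MAX_COLOR_VALUE = 2 };

/* Storage that gives each color a unique address. */
struct color_struct_t color_dummy_array[MAX_COLOR_VALUE + 1];

#define UNIQUE_COLOR(value) \
    (&color_dummy_array[value])

#define RED    UNIQUE_COLOR(0)
#define GREEN  UNIQUE_COLOR(1)
#define BLUE   UNIQUE_COLOR(2)

/* Hypothetical helper: every color compares unequal to the others. */
int colors_are_distinct(void)
{
    return RED != GREEN && GREEN != BLUE && RED != BLUE;
}

/* Hypothetical helper: plain == comparison works as it would for an enum. */
int is_known_color(color_t c)
{
    return c == RED || c == GREEN || c == BLUE;
}
```

Comparison with == and != works naturally, so if/else chains need no extra macros.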

This works well in most cases, but it does have the problem of not working in switch statements; you can't switch on a pointer (which is a shame). But if you're willing to add one more macro to make switching possible, you can arrive at something that's "good enough":

color.h:

...

#define COLOR_NUMBER(c) \
    ((c) - color_dummy_array)

user.c:

...

void bar(color_t c)
{
    switch (COLOR_NUMBER(c)) {
        case COLOR_NUMBER(RED):
            break;
        case COLOR_NUMBER(GREEN):
            break;
        case COLOR_NUMBER(BLUE):
            break;
    }
}
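One caveat: the C standard requires case labels to be integer constant expressions, and a pointer difference such as COLOR_NUMBER(RED) does not strictly qualify, so some compilers may reject the switch above. A possible workaround, sketched here with hypothetical names (RED_INDEX, GREEN_INDEX, BLUE_INDEX, COLOR_COUNT, color_code), is to name the indices in an enum and use those names as the case labels:

```c
#include <stddef.h>

typedef struct color_struct_t *color_t;

struct color_struct_t { char dummy; };

/* Hypothetical index names; each is a true integer constant expression. */
enum { RED_INDEX, GREEN_INDEX, BLUE_INDEX, COLOR_COUNT };

struct color_struct_t color_dummy_array[COLOR_COUNT];

#define UNIQUE_COLOR(value) \
    (&color_dummy_array[value])

#define RED    UNIQUE_COLOR(RED_INDEX)
#define GREEN  UNIQUE_COLOR(GREEN_INDEX)
#define BLUE   UNIQUE_COLOR(BLUE_INDEX)

#define COLOR_NUMBER(c) \
    ((c) - color_dummy_array)

/* Hypothetical function: maps a color to an arbitrary code, -1 if unknown. */
int color_code(color_t c)
{
    if (c == NULL) {
        return -1; /* subtracting from a null pointer would be undefined */
    }
    switch (COLOR_NUMBER(c)) {
        case RED_INDEX:   return 10;
        case GREEN_INDEX: return 20;
        case BLUE_INDEX:  return 30;
        default:          return -1;
    }
}
```

This keeps the pointer-based type checking for assignment while the case labels stay ordinary enumeration constants. The trade-off is a second set of names to maintain.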

Is this a good solution? I wouldn't call it great: it wastes some memory, pollutes the macro namespace, and doesn't let you use enum to automatically assign your color values. But it is another way to solve the problem that results in somewhat more natural usage, and unlike the top answer, it works all the way back to C89.