Opaque C structs: various ways to declare them
Solution 1:
My vote is for the third option that mouviciel posted then deleted:
I have seen a third way:
// foo.h
struct foo;
void doStuff(struct foo *f);
// foo.c
struct foo
{
    int x;
    int y;
};
If you really can't stand typing the struct keyword, typedef struct foo foo; (note: get rid of the useless and problematic underscore) is acceptable. But whatever you do, never use typedef to define names for pointer types. It hides the extremely important piece of information that variables of this type reference an object which could be modified whenever you pass them to functions, and it makes dealing with differently-qualified (for instance, const-qualified) versions of the pointer a major pain.
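A minimal single-file sketch of this option follows. The helper names (foo_create, foo_sum, foo_scale, foo_destroy) are hypothetical and not part of the original answer; in a real project the part marked foo.h would live in the header and the rest in foo.c. Because the API takes struct foo * directly, a read-only function can simply take const struct foo *.

```c
#include <stdlib.h>

/* ---- foo.h (public): the struct is declared but not defined ---- */
struct foo;
struct foo *foo_create(int x, int y);  /* hypothetical constructor */
int foo_sum(const struct foo *f);      /* read-only: the pointee is const */
void foo_scale(struct foo *f, int k);  /* may modify the object */
void foo_destroy(struct foo *f);

/* ---- foo.c (private): only this file knows the layout ---- */
struct foo {
    int x;
    int y;
};

struct foo *foo_create(int x, int y)
{
    struct foo *f = malloc(sizeof *f);
    if (f != NULL) {
        f->x = x;
        f->y = y;
    }
    return f;  /* NULL on allocation failure */
}

int foo_sum(const struct foo *f)
{
    return f->x + f->y;
}

void foo_scale(struct foo *f, int k)
{
    f->x *= k;
    f->y *= k;
}

void foo_destroy(struct foo *f)
{
    free(f);  /* free(NULL) is a no-op */
}
```

A caller that only sees the foo.h part can hold and pass around a struct foo *, but cannot read the fields, take sizeof(struct foo), or allocate one on the stack.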
Solution 2:
Option 1.5 ("Object-based" C Architecture):
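I am accustomed to using Option 1, except where you name your reference with _h to signify it is a "handle" to a C-style "object" of this given C "class". Then, you ensure your function prototypes use const wherever the content of this object "handle" is an input only, and cannot be changed, and don't use const wherever the content can be changed. Note that const my_module_h applies the const to the handle itself (struct my_module_s *const), not to the struct it points to, so the compiler will not stop such a function from writing through the handle; here const documents intent only. So, do this style: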
// -------------
// my_module.h
// -------------
// An opaque pointer (handle) to a C-style "object" of "class" type
// "my_module" (struct my_module_s *, or my_module_h):
typedef struct my_module_s *my_module_h;
void doStuff1(my_module_h my_module);
void doStuff2(const my_module_h my_module);
// -------------
// my_module.c
// -------------
// Definition of the opaque struct "object" of C-style "class" "my_module".
struct my_module_s
{
    int int1;
    int int2;
    float f1;
    // etc. etc--add more "private" member variables as you see fit
};
Here's a full example using opaque pointers in C to create objects. The following architecture might be called "object-based C":
//==============================================================================================
// my_module.h
//==============================================================================================
// An opaque pointer (handle) to a C-style "object" of "class" type "my_module" (struct
// my_module_s *, or my_module_h):
typedef struct my_module_s *my_module_h;
// Create a new "object" of "class" "my_module": A function that takes a *pointer to* an
// "object" handle, `malloc`s memory for a new copy of the opaque `struct my_module_s`, then
// points the user's input handle (via its passed-in pointer) to this newly-created "object" of
// "class" "my_module".
void my_module_open(my_module_h * my_module_h_p);
// A function that takes this "object" (via its handle) as an input only and cannot modify it
void my_module_do_stuff1(const my_module_h my_module);
// A function that can modify the private content of this "object" (via its handle) (but still
// cannot modify the handle itself)
void my_module_do_stuff2(my_module_h my_module);
// Destroy the passed-in "object" of "class" type "my_module": A function that can close this
// object by stopping all operations, as required, and `free`ing its memory.
void my_module_close(my_module_h my_module);
//==============================================================================================
// my_module.c
//==============================================================================================
#include "my_module.h"
#include <stdlib.h> // malloc, free
#include <string.h> // memset
// Definition of the opaque struct "object" of C-style "class" "my_module".
// - NB: Since this is an opaque struct (declared in the header but not defined until the source
// file), it has the following 2 important properties:
// 1) It permits data hiding, wherein you end up with the equivalent of a C++ "class" with only
// *private* member variables.
// 2) Objects of this "class" can only be dynamically allocated. No static allocation is
// possible since any module including the header file does not know the contents of *nor the
// size of* (this is the critical part) this "class" (ie: C struct).
struct my_module_s
{
    int my_private_int1;
    int my_private_int2;
    float my_private_float;
    // etc. etc--add more "private" member variables as you see fit
};
void my_module_open(my_module_h * my_module_h_p)
{
    // Ensure the passed-in pointer is not NULL (since it is a core dump/segmentation fault to
    // try to dereference a NULL pointer)
    if (!my_module_h_p)
    {
        // Print some error or store some error code here, and return it at the end of the
        // function instead of returning void.
        goto done;
    }
    // Now allocate the actual memory for a new my_module C object from the heap, thereby
    // dynamically creating this C-style "object".
    my_module_h my_module; // Create a local object handle (pointer to a struct)
    // Dynamically allocate memory for the full contents of the struct "object"
    my_module = malloc(sizeof(*my_module));
    if (!my_module)
    {
        // Malloc failed due to out-of-memory. Print some error or store some error code here,
        // and return it at the end of the function instead of returning void.
        goto done;
    }
    // Initialize all memory to zero (OR just use `calloc()` instead of `malloc()` above!)
    memset(my_module, 0, sizeof(*my_module));
    // Now pass out this object to the user, and exit.
    *my_module_h_p = my_module;
done:
    return; // before C23 a label must be followed by a statement, so return explicitly
}
void my_module_do_stuff1(const my_module_h my_module)
{
    // Ensure my_module is not a NULL pointer.
    if (!my_module)
    {
        goto done;
    }
    // Do stuff where you use my_module private "member" variables.
    // Ex: use `my_module->my_private_int1` here, or `my_module->my_private_float`, etc.
done:
    return; // before C23 a label must be followed by a statement
}
void my_module_do_stuff2(my_module_h my_module)
{
    // Ensure my_module is not a NULL pointer.
    if (!my_module)
    {
        goto done;
    }
    // Do stuff where you use AND UPDATE my_module private "member" variables.
    // Ex:
    my_module->my_private_int1 = 7;
    my_module->my_private_float = 3.14159f;
    // Etc.
done:
    return; // before C23 a label must be followed by a statement
}
void my_module_close(my_module_h my_module)
{
    // Ensure my_module is not a NULL pointer.
    if (!my_module)
    {
        goto done;
    }
    free(my_module);
done:
    return; // before C23 a label must be followed by a statement
}
Simplified example usage:
#include "my_module.h"
#include <stdbool.h>
#include <stdio.h>
int main(void)
{
    printf("Hello World\n");
    bool exit_now = false;
    // setup/initialization
    my_module_h my_module = NULL;
    // For safety-critical and real-time embedded systems, it is **critical** that you ONLY call
    // the `_open()` functions during **initialization**, but NOT during normal run-time,
    // so that once the system is initialized and up-and-running, you can safely know that
    // no more dynamic-memory allocation, which is non-deterministic and can lead to crashes,
    // will occur.
    my_module_open(&my_module);
    // Ensure initialization was successful and `my_module` is no longer NULL.
    if (!my_module)
    {
        // await connection of debugger, or automatic system power reset by watchdog
        // (placeholder function: declare and define it for your platform)
        log_errors_and_enter_infinite_loop();
    }
    // run the program in this infinite main loop
    while (exit_now == false)
    {
        my_module_do_stuff1(my_module);
        my_module_do_stuff2(my_module);
    }
    // program clean-up; will only be reached in this case in the event of a major system
    // problem, which triggers the infinite main loop above to `break` or exit via the
    // `exit_now` variable
    my_module_close(my_module);
    // for microcontrollers or other low-level embedded systems, we can never return,
    // so enter infinite loop instead
    while (true) {}; // await reset by watchdog
    return 0;
}
The only improvements beyond this would be to:
- Implement full error handling and return the error instead of void. Ex:
/// @brief my_module error codes
typedef enum my_module_error_e
{
    /// No error
    MY_MODULE_ERROR_OK = 0,
    /// Invalid Arguments (ex: NULL pointer passed in where a valid pointer is required)
    MY_MODULE_ERROR_INVARG,
    /// Out of memory
    MY_MODULE_ERROR_NOMEM,
    /// etc. etc.
    MY_MODULE_ERROR_PROBLEM1,
} my_module_error_t;
Now, instead of returning a void type in all of the functions above and below, return a my_module_error_t error type instead!
- Add a configuration struct called my_module_config_t to the .h file, and pass it in to the open function to update internal variables when you create a new object. This helps encapsulate all configuration variables in a single struct for cleanliness when calling _open(). Example:
//--------------------
// my_module.h
//--------------------
// my_module configuration struct
typedef struct my_module_config_s
{
    int my_config_param_int;
    float my_config_param_float;
} my_module_config_t;
my_module_error_t my_module_open(my_module_h * my_module_h_p, const my_module_config_t *config);
//--------------------
// my_module.c
//--------------------
my_module_error_t my_module_open(my_module_h * my_module_h_p, const my_module_config_t *config)
{
    my_module_error_t err = MY_MODULE_ERROR_OK;
    // Ensure the passed-in pointers are not NULL (since it is a core dump/segmentation fault
    // to try to dereference a NULL pointer)
    if (!my_module_h_p || !config)
    {
        // Print some error or store some error code here, and return it at the end of the
        // function instead of returning void. Ex:
        err = MY_MODULE_ERROR_INVARG;
        goto done;
    }
    // Now allocate the actual memory for a new my_module C object from the heap, thereby
    // dynamically creating this C-style "object".
    my_module_h my_module; // Create a local object handle (pointer to a struct)
    // Dynamically allocate memory for the full contents of the struct "object"
    my_module = malloc(sizeof(*my_module));
    if (!my_module)
    {
        // Malloc failed due to out-of-memory. Print some error or store some error code
        // here, and return it at the end of the function instead of returning void. Ex:
        err = MY_MODULE_ERROR_NOMEM;
        goto done;
    }
    // Initialize all memory to zero (OR just use `calloc()` instead of `malloc()` above!)
    memset(my_module, 0, sizeof(*my_module));
    // Now initialize the object with values per the config struct passed in. Set these
    // private variables inside `my_module` to whatever they need to be. You get the idea...
    my_module->my_private_int1 = config->my_config_param_int;
    my_module->my_private_int2 = config->my_config_param_int*3/2;
    my_module->my_private_float = config->my_config_param_float;
    // etc etc
    // Now pass out this object handle to the user, and exit.
    *my_module_h_p = my_module;
done:
    return err;
}
And usage:
my_module_error_t err = MY_MODULE_ERROR_OK;
my_module_h my_module = NULL;
my_module_config_t my_module_config =
{
    .my_config_param_int = 7,
    .my_config_param_float = 13.1278,
};
err = my_module_open(&my_module, &my_module_config);
if (err != MY_MODULE_ERROR_OK)
{
    switch (err)
    {
    case MY_MODULE_ERROR_INVARG:
        printf("MY_MODULE_ERROR_INVARG\n");
        break;
    case MY_MODULE_ERROR_NOMEM:
        printf("MY_MODULE_ERROR_NOMEM\n");
        break;
    case MY_MODULE_ERROR_PROBLEM1:
        printf("MY_MODULE_ERROR_PROBLEM1\n");
        break;
    case MY_MODULE_ERROR_OK:
        // not reachable, but included so that when you compile with
        // `-Wall -Wextra -Werror`, the compiler will fail to build if you forget to handle
        // any of the error codes in this switch statement.
        break;
    }
    // Do whatever else you need to in the event of an error, here. Ex:
    // await connection of debugger, or automatic system power reset by watchdog
    while (true) {};
}
// ...continue other module initialization, and enter main loop
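If several call sites need to print error codes, the switch in the usage above could be folded into one small helper. This is only a sketch: my_module_error_str and the "MY_MODULE_ERROR_UNKNOWN" fallback string are hypothetical names, not part of the original answer. The enum is repeated so the block compiles on its own.

```c
/* Error codes, repeated from the header so this sketch is self-contained. */
typedef enum my_module_error_e {
    MY_MODULE_ERROR_OK = 0,
    MY_MODULE_ERROR_INVARG,
    MY_MODULE_ERROR_NOMEM,
    MY_MODULE_ERROR_PROBLEM1,
} my_module_error_t;

/* Hypothetical helper: map an error code to its name.
 * There is deliberately no `default:` case, so a compiler that warns about
 * unhandled enum values in a switch can flag a newly added code that was
 * forgotten here. */
const char *my_module_error_str(my_module_error_t err)
{
    switch (err) {
    case MY_MODULE_ERROR_OK:       return "MY_MODULE_ERROR_OK";
    case MY_MODULE_ERROR_INVARG:   return "MY_MODULE_ERROR_INVARG";
    case MY_MODULE_ERROR_NOMEM:    return "MY_MODULE_ERROR_NOMEM";
    case MY_MODULE_ERROR_PROBLEM1: return "MY_MODULE_ERROR_PROBLEM1";
    }
    /* Reached only if err holds a value outside the enum. */
    return "MY_MODULE_ERROR_UNKNOWN";
}
```

The caller's error branch then reduces to something like printf("%s\n", my_module_error_str(err)); followed by whatever recovery the system needs.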
See also:
- [another answer of mine which references my answer above] Architectural considerations and approaches to opaque structs and data hiding in C
Additional reading on object-based C architecture:
- Providing helper functions when rolling out own structures
Additional reading and justification for valid usage of goto in error handling for professional code:
- An argument in favor of the use of goto in C for error handling: https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles/blob/master/Research_General/goto_for_error_handling_in_C/readme.md
- *****EXCELLENT ARTICLE showing the virtues of using goto in error handling in C: "Using goto for error handling in C" - https://eli.thegreenplace.net/2009/04/27/using-goto-for-error-handling-in-c
- Valid use of goto for error management in C?
- Error handling in C code
Search terms to make more googlable: opaque pointer in C, opaque struct in C, typedef enum in C, error handling in C, c architecture, object-based c architecture, dynamic memory allocation at initialization architecture in c
Solution 3:
bar(const fooRef) declares an immutable address as argument. bar(const foo *) declares an address of an immutable foo as argument.
For this reason, I tend to prefer option 2. I.e., the presented interface type is one where cv-ness can be specified at each level of indirection. Of course one can sidestep the option 1 library writer and just use foo, opening yourself to all sorts of horror when the library writer changes the implementation. (I.e., the option 1 library writer only perceives that fooRef is part of the invariant interface and that foo can come, go, be altered, whatever. The option 2 library writer perceives that foo is part of the invariant interface.)
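The difference between the two declarations can be seen in a short sketch. The struct is defined in full here only so the example compiles on its own, and the function names bar_ref and bar_ptr are made up for illustration.

```c
typedef struct foo { int x; } foo;
typedef foo *fooRef;  /* option 1 style: a typedef for the pointer type */

/* `const fooRef` is `foo *const`: the pointer is const, the foo is not. */
void bar_ref(const fooRef f)
{
    f->x = 42;  /* compiles: the pointee is still writable */
}

/* `const foo *` is a pointer to a const foo: read-only access. */
int bar_ptr(const foo *f)
{
    /* f->x = 42;  would be rejected by the compiler here */
    return f->x;
}
```

With only fooRef available, there is no way to spell "pointer to const foo" without naming foo itself, which is the point made above about cv-qualification at each level of indirection.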
I'm more surprised that no one's suggested combined typedef/struct constructions: typedef struct { ... } foo;
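A small sketch of the combined form, for comparison. One consequence worth noting: an untagged struct cannot be forward-declared, so the full definition has to sit in the header and the type is not opaque. The members x and y and the function foo_sum are assumptions added for illustration.

```c
/* foo.h: the whole definition is visible to every caller */
typedef struct {
    int x;
    int y;
} foo;

/* Hypothetical function using the type through a const-qualified pointer. */
int foo_sum(const foo *f)
{
    return f->x + f->y;
}
```

Callers can declare a foo on the stack and use sizeof(foo), but they can also read and write the members directly, so any layout change becomes an interface change.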