C state-machine design [closed]

I am crafting a small project in mixed C and C++. I am building one small-ish state-machine at the heart of one of my worker thread.

I was wondering if you gurus on SO would share your state-machine design techniques.

NOTE: I am primarily after tried & tested implementation techniques.

UPDATED: Based on all the great input gathered on SO, I've settled on this architecture:

An event pump points to an event integrator which points to a dispatcher. The dispatcher points to 1 through n actions which point back to the event integrator. A transition table with wildcards points to the dispatcher.


State machines that I've designed before (C, not C++) have all come down to a struct array and a loop. The structure basically consists of a state and event (for look-up) and a function that returns the new state, something like:

typedef struct {
    int st;
    int ev;
    int (*fn)(void);
} tTransition;

Then you define your states and events with simple defines (the ANY ones are special markers, see below):

#define ST_ANY              -1
#define ST_INIT              0
#define ST_ERROR             1
#define ST_TERM              2
: :
#define EV_ANY              -1
#define EV_KEYPRESS       5000
#define EV_MOUSEMOVE      5001

Then you define all the functions that are called by the transitions:

static int GotKey (void) { ... };
static int FsmError (void) { ... };

All these function are written to take no variables and return the new state for the state machine. In this example global variables are used for passing any information into the state functions where necessary.

Using globals isn't as bad as it sounds since the FSM is usually locked up inside a single compilation unit and all variables are static to that unit (which is why I used quotes around "global" above - they're more shared within the FSM, than truly global). As with all globals, it requires care.

The transitions array then defines all possible transitions and the functions that get called for those transitions (including the catch-all last one):

tTransition trans[] = {
    { ST_INIT, EV_KEYPRESS, &GotKey},
    : :
    { ST_ANY, EV_ANY, &FsmError}
};
#define TRANS_COUNT (sizeof(trans)/sizeof(*trans))

What that means is: if you're in the ST_INIT state and you receive the EV_KEYPRESS event, make a call to GotKey.

The workings of the FSM then become a relatively simple loop:

state = ST_INIT;
while (state != ST_TERM) {
    event = GetNextEvent();
    for (i = 0; i < TRANS_COUNT; i++) {
        if ((state == trans[i].st) || (ST_ANY == trans[i].st)) {
            if ((event == trans[i].ev) || (EV_ANY == trans[i].ev)) {
                state = (trans[i].fn)();
                break;
            }
        }
    }
}

As alluded to above, note the use of ST_ANY as wild-cards, allowing an event to call a function no matter the current state. EV_ANY also works similarly, allowing any event at a specific state to call a function.

It can also guarantee that, if you reach the end of the transitions array, you get an error stating your FSM hasn't been built correctly (by using the ST_ANY/EV_ANY combination.

I've used code similar for this on a great many communications projects, such as an early implementation of communications stacks and protocols for embedded systems. The big advantage was its simplicity and relative ease in changing the transitions array.

I've no doubt there will be higher-level abstractions which may be more suitable nowadays but I suspect they'll all boil down to this same sort of structure.


And, as ldog states in a comment, you can avoid the globals altogether by passing a structure pointer to all functions (and using that in the event loop). This will allow multiple state machines to run side-by-side without interference.

Just create a structure type which holds the machine-specific data (state at a bare minimum) and use that instead of the globals.

The reason I've rarely done that is simply because most of the state machines I've written have been singleton types (one-off, at-process-start, configuration file reading for example), not needing to run more than one instance. But it has value if you need to run more than one.


The other answers are good, but a very "lightweight" implementation I've used when the state machine is very simple looks like:

enum state { ST_NEW, ST_OPEN, ST_SHIFT, ST_END };

enum state current_state = ST_NEW;

while (current_state != ST_END)
{
    input = get_input();

    switch (current_state)
    {
        case ST_NEW:
        /* Do something with input and set current_state */
        break;

        case ST_OPEN:
        /* Do something different and set current_state */
        break;

        /* ... etc ... */
    }
}

I would use this when the state machine is simple enough that the function pointer & state transition table approach is overkill. This is often useful for character-by-character or word-by-word parsing.


Pardon me for breaking every rule in computer science, but a state machine is one of the few (I can count only two off hand) places where a goto statement is not only more efficient, but also makes your code cleaner and easier to read. Because goto statements are based on labels, you can name your states instead of having to keep track of a mess of numbers or use an enum. It also makes for much cleaner code since you don't need all the extra cruft of function pointers or huge switch statements and while loops. Did I mention it's more efficient too?

Here's what a state machine might look like:

void state_machine() {
first_state:
    // Do some stuff here
    switch(some_var) {
    case 0:
        goto first_state;
    case 1:
        goto second_state;
    default:
        return;
    }

second_state:
    // Do some stuff here
    switch(some_var) {
    case 0:
        goto first_state;
    case 1:
        goto second_state;
    default:
        return;
    }
}

You get the general idea. The point is that you can implement the state machine in an efficient way and one that is relatively easy to read and screams at the reader that they are looking at a state machine. Note that if you are using goto statements, you must still be careful as it is very easy to shoot yourself in the foot while doing so.


You might consider the State Machine Compiler http://smc.sourceforge.net/

This splendid open source utility accepts a description of a state machine in a simple language and compiles it to any one of a dozen or so languages - including C and C++. The utility itself is written in Java, and can be included as part of a build.

The reason to do this, rather than hand coding using GoF State pattern or any other approach, is that once your state machine is expressed as code, the underlying structure tends to disappear under the weight of boilerplate that needs to be generated to support it. Using this approach gives you an excellent separation of concerns, and you keep the structure of your state machine 'visible'. The auto-generated code goes into modules that you don't need to touch, so that you can go back and fiddle with the state machine's structure without impacting the supporting code that you have written.

Sorry, I am being over-enthusiastic, and doubtless putting everyone off. But it is a top notch utility, and well-documented too.


Be sure to check the work of Miro Samek (blog State Space, website State Machines & Tools), whose articles at the C/C++ Users Journal were great.

The website contains a complete (C/C++) implementation in both open source and commercial license of a state machine framework (QP Framework), an event handler (QEP), a basic modeling tool (QM) and a tracing tool (QSpy) which allow to draw state machines, create code and debug them.

The book contains an extensive explanation on the what/why of the implementation and how to use it and is also great material to gain understanding of the fundamentals of hierachical and finite state machines.

The website also contains links to several board support packages for use of the software with embedded platforms.