Do you really understand what your software does?

Apr 3, 2022
5 min read

A motivation for declarative state machines - centralize your state transition logic!

I’d like to think I’ve been incredibly lucky the first place I worked. For a small startup they had a lot of stuff done ‘right’. Unit tests, CI, fast release cycles. KANBAN with retros and planning as necessary. A focus on quality and testing ideas in the market. But from an algorithmic perspective, I think the biggest thing I learned there was to make explicit state machines.

One of the constraints due to the platform that the code was running on was that all functions needed to return within 0.5s, otherwise a watchdog would reset the system. We still had shared memory which could save state in between, and some functions such as network and file IO could operate asynchronously, but as a result many of the systems were designed as state machines. As the state machines, they would often be structured as something like:

enum State
{
    Startup, MainProcessing, Cleanup
};

void process()
{
    switch(_state)
    {
        case Startup:
            // do the startup stuff;
            _state = MainProcessing;
            break;
        case MainProcessing:
            // start some tasks
            // maybe resume some stuff from last time
            // maybe finalize some stuff from last time

            if (mainProcessingDone())
            {
                _state = Cleanup;
            }
            break;
        case Cleanup:
            // reset everything back to 0
            _state = Startup;
            break;
    }
}

Then just by calling process(); repeatedly you can get all your code to run in a nice, structured manner. You get to resume your code where you were last time. You know if it’s busy, finished or ready to start again. And overall, as a design pattern, this worked really well.

Of course, once you start to have a bunch of these in your code, there are things you want to make better. One of these is having some central place that makes it easy to view the state transitions. Having to read through all the code, and the switch cases, to figure out which states could follow each other, feels a bit clumsy. Introducing the concept of a ‘transition’ makes this a lot simpler. It means that we can structure our state changing logic into configuration instead: a table listing the valid transition from each state to each other state can go from being documentation, to actually being something that drives the state machine behavior.

enum Transition
{
    Repeat, Continue, Error, NoData, Timeout, YourCustomTransitionNameHere
};

std::unordered_map<std::pair<State, Transition>, State> _transitionTable;

void setNextState(Transition transition)
{
    if (transition == Repeat)
        return;

    auto iter = _transitionTable.find(std::make_pair(_state, transition));
    if (iter != _transitionTable.end())
    {
        _state = iter->second;
    }
    else
    {
        throw std::logic_error("Illegal state transition attempted");
    }
}

Then we just need a big list of all the initial state + transition pairs to the final state.

void initTransitions()
{
    _transitionTable[std::make_pair(Startup, Continue)] = MainProcessing;
    _transitionTable[std::make_pair(MainProcessing, Continue)] = Cleanup;
    _transitionTable[std::make_pair(Cleanup, Continue)] = Startup;
}

While this is of course complete overkill for such a simple state machine, but if we were getting towards a much more complicated state machine then this would be way easier to manage the transitions.

What else can we do with our centralized state machine logic? Well, we can clean up all that transition table and make_pair nonsense with a macro. Once it’s nice and structured, we can also easily parse it and make some pretty state transition diagrams.

Now that I’ve bogged you down with all of the code, lets completely overwhelm you with some motivation for why explicit state machines with centralized transition logic are good.

It lets you quit jobs partway though. By dropping up to a top-level of processing occasionally, you don’t need pre-emptive or some other type of nasty mechanism to quit your processing. Just check whether you’re meant to be quitting each time you tick over your state machine.
Logging. You can easily put a toString() on your state machine, and each time you begin to execute a state you can log the state you’re running. This means you can easily trace your program state which makes debugging really easy.
Avoiding spaghetti. Sure, a state machine is always a bit spaghetti-ish, but at least you can avoid tying yourself in knots. Having the transitions all laid out simply and easily in a table means that the complication and the difficulty of comprehension is all in the actual problem you’re solving, and not the implementation.
Making it easy to add more states. From experience, the main reasons state machines get out of hand is, counter-intuitively, they don’t have enough states. Putting too much logic into a single state can cause serious issues, and you end up not being able to trace what happened. Rough rule of thumb, if you have an if/else in a state with substantial code in either, rather use the if/else to return two different transitions and handle the different code paths with the state machine, not in the if/else.
Having explicit names for your state transitions. Again from experience, this means people spend a little more time thinking about what they want their code to do. Better upfront decisions means less refactoring in the future.

If you want to use my 4th generation state machine, be warned that it’s under 100 lines of code. Mostly what it does it set you up on the right path to have a maintainable state machine design. Copy-paste the tests so that you can get started on the right track. You can find it here: https://github.com/jkflying/usm , it’s available under a BSD-3 license so go wild.