How to prevent the arrowhead anti-pattern

I'm a bit confused about how to best refactor my code into something more readable.

Consider this piece of code:

var foo = getfoo();
if(foo!=null)
{
    var bar = getbar(foo);
    if(bar!=null)
    {
        var moo = getmoo(bar);
        if(moo!=null)
        {
            var cow = getcow(moo);
            ...
        }
    }
}
return;

As you can see, lots of nested if blocks, needed because each nested if is dependent on the previous values.

Now I was wondering how to make my code a bit cleaner in this regard.

Some options I have thought of myself would be to:

return after each if clause, meaning I'd have multiple places where I leave my method
throw ArgumentNullExceptions, after which I'd catch them in the end and place the return statement in my finally clause (or outside the try/catch block)
work with a label and goto:

Most of these options seem a bit "dirty" to me, so I was wondering if there was a good way to clean up this mess that I've created.

Solution 1:

I'd go for the multiple return statements. This makes the code easy to read and understand.

Don't use goto for obvious reasons.

Don't use exceptions because the check you are doing isn't exceptional, it's something you can expect so you should just take that into account. Programming against exceptions is also an anti-pattern.

Solution 2:

Consider inverting the null checks to:

var foo = getfoo();
if (foo == null)
{
    return;
}
var bar = getbar(foo);
if (bar == null)
{
    return;
}
...etc

Solution 3:

You can chain expressions. An assignment returns the assigned value, so you can check its result. Also, you can use the assigned variable in the next expression.

As soon as an expression returns false, the others are not longer executed, because the entire expression would return false already (because of the and operation).

So, something like this should work:

Foo foo; Bar bar; Moo moo; Cow cow;

if( (foo = getfoo()) != null &&
    (bar = getbar(foo)) != null &&
    (moo = getmoo(bar)) != null &&
    (cow = getcow(moo)) != null )
{
  ..
}

Solution 4:

This is one of few scenarios where it is perfectly acceptable (if not desirable) to use `goto`.

In functions like this, there will often be resources that are allocated or state changes that are made mid-way which need to be undone before the function exits.

The usual problem with return-based solutions (e.g., rexcfnghk's and Gerrie Schenck's) is that you need to remember to undo those state changes before every return. This leads to code duplication and opens the door to subtle errors, especially in larger functions. Do not do this.

CERT actually recommends a structural approach based on `goto`.

In particular, note their example code taken from copy_process in kernel/fork.c of the Linux kernel. A simplified version of the concept is as follows:

    if (!modify_state1(true))
        goto cleanup_none;
    if (!modify_state2(true))
        goto cleanup_state1;
    if (!modify_state3(true))
        goto cleanup_state2;

    // ...

cleanup_state3:
    modify_state3(false);
cleanup_state2:
    modify_state2(false);
cleanup_state1:
    modify_state1(false);
cleanup_none:
    return;

Essentially, this is just a more readable version of the "arrowhead" code that doesn't use unnecessary indentation or duplicate code. This concept can easily be extended to whatever best suits your situation.

As a final note, especially regarding CERT's first compliant example, I just want to add that, whenever possible, it is simpler to design your code so that the cleanup can be handled all at once. That way, you can write code like this:

    FILE *f1 = null;
    FILE *f2 = null;
    void *mem = null;

    if ((f1 = fopen(FILE1, "r")) == null)
        goto cleanup;
    if ((f2 = fopen(FILE2, "r")) == null)
        goto cleanup;
    if ((mem = malloc(OBJSIZE)) == null)
        goto cleanup;

    // ...

cleanup:
    free(mem); // These functions gracefully exit given null input
    close(f2);
    close(f1);
    return;

Solution 5:

First your suggestion (return after each if clause) is quite a good way out:

  // Contract (first check all the input)
  var foo = getfoo();

  if (Object.ReferenceEquals(null, foo))
    return; // <- Or throw exception, put assert etc.

  var bar = getbar(foo);

  if (Object.ReferenceEquals(null, bar))
    return; // <- Or throw exception, put assert etc.

  var moo = getmoo(bar);

  if (Object.ReferenceEquals(null, moo))
    return; // <- Or throw exception, put assert etc.

  // Routine: all instances (foo, bar, moo) are correct (not null) and we can work with them
  ...

The second possibility (in your very case) is to slightly modify your getbar() as well as getmoo() functions such that they return null on null input, so you'll have

  var foo = getfoo();
  var bar = getbar(foo); // return null if foo is null
  var moo = getmoo(bar); // return null if bar is null

  if ((foo == null) || (bar == null) || (moo == null))
    return; // <- Or throw exception, put assert(s) etc.    

  // Routine: all instances (foo, bar, moo) are correct (not null)
  ...

The third possibility is that in complicated cases you can use Null Object Desing Patteren

http://en.wikipedia.org/wiki/Null_Object_pattern

How to prevent the arrowhead anti-pattern

Solution 1:

Solution 2:

Solution 3:

Solution 4:

This is one of few scenarios where it is perfectly acceptable (if not desirable) to use `goto`.

CERT actually recommends a structural approach based on `goto`.

Solution 5:

Related

Recent Posts

How to prevent the arrowhead anti-pattern

Solution 1:

Solution 2:

Solution 3:

Solution 4:

This is one of few scenarios where it is perfectly acceptable (if not desirable) to use goto.

CERT actually recommends a structural approach based on goto.

Solution 5:

Related

Recent Posts

This is one of few scenarios where it is perfectly acceptable (if not desirable) to use `goto`.

CERT actually recommends a structural approach based on `goto`.