Switch statement fallthrough in C#?
Switch statement fallthrough is one of my personal major reasons for loving switch
vs. if/else if
constructs. An example is in order here:
static string NumberToWords(int number)
{
string[] numbers = new string[]
{ "", "one", "two", "three", "four", "five",
"six", "seven", "eight", "nine" };
string[] tens = new string[]
{ "", "", "twenty", "thirty", "forty", "fifty",
"sixty", "seventy", "eighty", "ninety" };
string[] teens = new string[]
{ "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
"sixteen", "seventeen", "eighteen", "nineteen" };
string ans = "";
switch (number.ToString().Length)
{
case 3:
ans += string.Format("{0} hundred and ", numbers[number / 100]);
case 2:
int t = (number / 10) % 10;
if (t == 1)
{
ans += teens[number % 10];
break;
}
else if (t > 1)
ans += string.Format("{0}-", tens[t]);
case 1:
int o = number % 10;
ans += numbers[o];
break;
default:
throw new ArgumentException("number");
}
return ans;
}
The smart people are cringing because the string[]
s should be declared outside the function: well, they are, this is just an example.
The compiler fails with the following error:
Control cannot fall through from one case label ('case 3:') to another Control cannot fall through from one case label ('case 2:') to another
Why? And is there any way to get this sort of behaviour without having three if
s?
Solution 1:
(Copy/paste of an answer I provided elsewhere)
Falling through switch
-case
s can be achieved by having no code in a case
(see case 0
), or using the special goto case
(see case 1
) or goto default
(see case 2
) forms:
switch (/*...*/) {
case 0: // shares the exact same code as case 1
case 1:
// do something
goto case 2;
case 2:
// do something else
goto default;
default:
// do something entirely different
break;
}
Solution 2:
The "why" is to avoid accidental fall-through, for which I'm grateful. This is a not uncommon source of bugs in C and Java.
The workaround is to use goto, e.g.
switch (number.ToString().Length)
{
case 3:
ans += string.Format("{0} hundred and ", numbers[number / 100]);
goto case 2;
case 2:
// Etc
}
The general design of switch/case is a little bit unfortunate in my view. It stuck too close to C - there are some useful changes which could be made in terms of scoping etc. Arguably a smarter switch which could do pattern matching etc would be helpful, but that's really changing from switch to "check a sequence of conditions" - at which point a different name would perhaps be called for.
Solution 3:
To add to the answers here, I think it's worth considering the opposite question in conjunction with this, viz. why did C allow fall-through in the first place?
Any programming language of course serves two goals:
- Provide instructions to the computer.
- Leave a record of the intentions of the programmer.
The creation of any programming language is therefore a balance between how to best serve these two goals. On the one hand, the easier it is to turn into computer instructions (whether those are machine code, bytecode like IL, or the instructions are interpreted on execution) then more able that process of compilation or interpretation will be to be efficient, reliable and compact in output. Taken to its extreme, this goal results in our just writing in assembly, IL, or even raw op-codes, because the easiest compilation is where there is no compilation at all.
Conversely, the more the language expresses the intention of the programmer, rather than the means taken to that end, the more understandable the program both when writing and during maintenance.
Now, switch
could always have been compiled by converting it into the equivalent chain of if-else
blocks or similar, but it was designed as allowing compilation into a particular common assembly pattern where one takes a value, computes an offset from it (whether by looking up a table indexed by a perfect hash of the value, or by actual arithmetic on the value*). It's worth noting at this point that today, C# compilation will sometimes turn switch
into the equivalent if-else
, and sometimes use a hash-based jump approach (and likewise with C, C++, and other languages with comparable syntax).
In this case there are two good reasons for allowing fall-through:
-
It just happens naturally anyway: if you build a jump table into a set of instructions, and one of the earlier batches of instructions doesn't contain some sort of jump or return, then execution will just naturally progress into the next batch. Allowing fall-through was what would "just happen" if you turned the
switch
-using C into jump-table–using machine code. -
Coders who wrote in assembly were already used to the equivalent: when writing a jump table by hand in assembly, they would have to consider whether a given block of code would end with a return, a jump outside of the table, or just continue on to the next block. As such, having the coder add an explicit
break
when necessary was "natural" for the coder too.
At the time therefore, it was a reasonable attempt to balance the two goals of a computer language as it relates to both the produced machine code, and the expressiveness of the source code.
Four decades later though, things are not quite the same, for a few reasons:
- Coders in C today may have little or no assembly experience. Coders in many other C-style languages are even less likely to (especially Javascript!). Any concept of "what people are used to from assembly" is no longer relevant.
- Improvements in optimisations mean that the likelihood of
switch
either being turned intoif-else
because it was deemed the approach likely to be most efficient, or else turned into a particularly esoteric variant of the jump-table approach are higher. The mapping between the higher- and lower-level approaches is not as strong as it once was. - Experience has shown that fall-through tends to be the minority case rather than the norm (a study of Sun's compiler found 3% of
switch
blocks used a fall-through other than multiple labels on the same block, and it was thought that the use-case here meant that this 3% was in fact much higher than normal). So the language as studied make the unusual more readily catered-to than the common. - Experience has shown that fall-through tends to be the source of problems both in cases where it is accidentally done, and also in cases where correct fall-through is missed by someone maintaining the code. This latter is a subtle addition to the bugs associated with fall-through, because even if your code is perfectly bug-free, your fall-through can still cause problems.
Related to those last two points, consider the following quote from the current edition of K&R:
Falling through from one case to another is not robust, being prone to disintegration when the program is modified. With the exception of multiple labels for a single computation, fall-throughs should be used sparingly, and commented.
As a matter of good form, put a break after the last case (the default here) even though it's logically unnecessary. Some day when another case gets added at the end, this bit of defensive programming will save you.
So, from the horse's mouth, fall-through in C is problematic. It's considered good practice to always document fall-throughs with comments, which is an application of the general principle that one should document where one does something unusual, because that's what will trip later examination of the code and/or make your code look like it has a novice's bug in it when it is in fact correct.
And when you think about it, code like this:
switch(x)
{
case 1:
foo();
/* FALLTHRU */
case 2:
bar();
break;
}
Is adding something to make the fall-through explicit in the code, it's just not something that can be detected (or whose absence can be detected) by the compiler.
As such, the fact that on has to be explicit with fall-through in C# doesn't add any penalty to people who wrote well in other C-style languages anyway, since they would already be explicit in their fall-throughs.†
Finally, the use of goto
here is already a norm from C and other such languages:
switch(x)
{
case 0:
case 1:
case 2:
foo();
goto below_six;
case 3:
bar();
goto below_six;
case 4:
baz();
/* FALLTHRU */
case 5:
below_six:
qux();
break;
default:
quux();
}
In this sort of case where we want a block to be included in the code executed for a value other than just that which brings one to the preceding block, then we're already having to use goto
. (Of course, there are means and ways of avoiding this with different conditionals but that's true of just about everything relating to this question). As such C# built on the already normal way to deal with one situation where we want to hit more than one block of code in a switch
, and just generalised it to cover fall-through as well. It also made both cases more convenient and self-documenting, since we have to add a new label in C but can use the case
as a label in C#. In C# we can get rid of the below_six
label and use goto case 5
which is clearer as to what we are doing. (We'd also have to add break
for the default
, which I left out just to make the above C code clearly not C# code).
In summary therefore:
- C# no longer relates to unoptimised compiler output as directly as C code did 40 years ago (nor does C these days), which makes one of the inspirations of fall-through irrelevant.
- C# remains compatible with C in not just having implicit
break
, for easier learning of the language by those familiar with similar languages, and easier porting. - C# removes a possible source of bugs or misunderstood code that has been well-documented as causing problems for the last four decades.
- C# makes existing best-practice with C (document fall through) enforceable by the compiler.
- C# makes the unusual case the one with more explicit code, the usual case the one with the code one just writes automatically.
- C# uses the same
goto
-based approach for hitting the same block from differentcase
labels as is used in C. It just generalises it to some other cases. - C# makes that
goto
-based approach more convenient, and clearer, than it is in C, by allowingcase
statements to act as labels.
All in all, a pretty reasonable design decision
*Some forms of BASIC would allow one to do the likes of GOTO (x AND 7) * 50 + 240
which while brittle and hence a particularly persuasive case for banning goto
, does serve to show a higher-language equivalent of the sort of way that lower-level code can make a jump based on arithmetic upon a value, which is much more reasonable when it's the result of compilation rather than something that has to be maintained manually. Implementations of Duff's Device in particular lend themselves well to the equivalent machine code or IL because each block of instructions will often be the same length without needing the addition of nop
fillers.
†Duff's Device comes up here again, as a reasonable exception. The fact that with that and similar patterns there's a repetition of operations serves to make the use of fall-through relatively clear even without an explicit comment to that effect.