Is using a labeled break a good practice in Java?
I'm staring at some old code from 2001 and came across this statement:
else {
do {
int c = XMLDocumentFragmentScannerImpl.this.scanContent();
if (c == 60) {
XMLDocumentFragmentScannerImpl.this.fEntityScanner.scanChar();
XMLDocumentFragmentScannerImpl.this.setScannerState(1);
break label913;
}
I had never seeen this before, and discovered labeled breaks here:
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/branch.html
Doesn't this essentially function like goto
? Is it even good practice to use it? It makes me uneasy.
No, it's not like a goto, since you can't "go" to another part of the control flow.
From the page you linked:
The break statement terminates the labeled statement; it does not transfer the flow of control to the label. Control flow is transferred to the statement immediately following the labeled (terminated) statement.
This means you can only break loops that are currently being executed.
Consider this example:
first:
for( int i = 0; i < 10; i++) {
second:
for(int j = 0; j < 5; j ++ )
{
break xxx;
}
}
third:
for( int a = 0; a < 10; a++) {
}
You could replace xxx
with first
or second
(to break the outer or inner loop), since both loops are being executed, when you hit the break
statement, but replacing xxx
with third
won't compile.
It's not as terrible as goto
because it only sends control to the end of the labeled statement (typically a loop construct). The thing that makes goto
unlikeable is that it is an arbitrary branch to anywhere, including labels found higher up in the method source, so that you can have home-grown looping behavior. The label-break functionality in Java doesn't allow that kind of craziness because control only goes forward.
I've only used it once about 12 years ago, in a situation where I was needing to break out of nested loops, the more-structured alternative would've made for more complicated tests in the loops. I wouldn't advise using it a lot but I wouldn't flag it as an automatic bad-code smell.
It's always possible to replace break
with new method.
Consider the code checking for any common elements in two lists:
List list1, list2;
boolean foundCommonElement = false;
for (Object el : list1) {
if (list2.contains(el)) {
foundCommonElement = true;
break;
}
}
You can rewrite it like this:
boolean haveCommonElement(List list1, List list2) {
for (Object el : list1) {
if (list2.contains(el)) {
return true;
}
}
return false;
}
Of course, to check for common elements between two lists it's better to use list1.retainAll(new HashSet<>(list2))
method to do that at O(n)
with O(n)
extra memory, or sort both lists at O(n * log n)
and then find common elements at O(n)
.
Before reading the rest of this answer, please read Go To Statement Considered Harmful. If you don't want to read it in full, here is what I consider the key point:
The unbridled use of the go to statement has an immediate consequence that it becomes terribly hard to find a meaningful set of coordinates in which to describe the process progress.
Or to rephrase, the problem with goto
is that the program can arrive in the middle of a block of code, without the programmer understanding the program state at that point. The standard block-oriented constructs are designed to clearly delineate state transitions, a labeled break
is intended to take the program to a specific known state (outside of the containing labeled block).
In a real-world imperative program, state is not clearly delineated by block boundaries, so it is questionable whether the labeled break
is a good idea. If the block changes state that is visible from outside the block, and there are multiple points to exit the block, then a labeled break
is equivalent to the primitive goto
. The only difference is that, rather than the chance of landing in the middle of a block with indeterminate state, you start a new block with indeterminate state.
So, in general, I would consider a labeled break
as dangerous. In my opinion it is a sign that the block should be converted into a function, with limited access to enclosing scope.
However, this example code was clearly the product of a parser generator (the OP commented that it was Xerces source code). And parser generators (or code generators in general) often take liberties with the code they generate, because they have perfect knowledge of state and humans don't need to understand it.