Why is the alphabet split into multiple ranges in this C code?

In a custom library I saw an implementation:

inline int is_upper_alpha(char chValue)
{
    if (((chValue >= 'A') && (chValue <= 'I')) ||
        ((chValue >= 'J') && (chValue <= 'R')) ||
        ((chValue >= 'S') && (chValue <= 'Z')))
        return 1;
    return 0;
}

Is that an Easter egg, or does it have some advantage over the usual C/C++ method?

inline int is_upper_alpha(char chValue)
{
    return ((chValue >= 'A') && (chValue <= 'Z'));
}

The author of this code presumably had to support EBCDIC at some point, where the numeric values of the letters are non-contiguous (there are gaps between I and J, and between R and S, as you may have guessed).

It is worth noting that the C and C++ standards only guarantee contiguous numeric values for the characters 0 to 9, for precisely this reason, so neither of these methods is guaranteed by the standard to work on every character set.

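It looks like the code is meant to work on both EBCDIC and ASCII. Your alternative method doesn't work on EBCDIC: it accepts the non-letter characters that fall in the gaps (false positives), although it never rejects a real letter (no false negatives).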
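To make the false positives concrete, here is a rough sketch that hard-codes the EBCDIC code points (taken from code page 037, which is an assumption about the variant in use) so it can be tried on an ASCII machine. The names single_range, three_ranges and the EBCDIC_* constants are made up for this illustration and are not from the library in the question.

```c
/* EBCDIC code points (code page 037 assumed), hard-coded so the
   comparison can be run on any host, including ASCII machines. */
enum {
    EBCDIC_A = 0xC1, EBCDIC_I = 0xC9,
    EBCDIC_J = 0xD1, EBCDIC_R = 0xD9,
    EBCDIC_S = 0xE2, EBCDIC_Z = 0xE9,
    EBCDIC_CLOSE_BRACE = 0xD0, /* '}' lies in the gap between I and J */
    EBCDIC_BACKSLASH   = 0xE0  /* backslash lies in the gap between R and S */
};

/* Single-range test, mirroring the shorter version from the question. */
int single_range(unsigned code)
{
    return code >= EBCDIC_A && code <= EBCDIC_Z;
}

/* Three-range test, mirroring the library version from the question. */
int three_ranges(unsigned code)
{
    return (code >= EBCDIC_A && code <= EBCDIC_I) ||
           (code >= EBCDIC_J && code <= EBCDIC_R) ||
           (code >= EBCDIC_S && code <= EBCDIC_Z);
}
```

With these values, single_range(EBCDIC_CLOSE_BRACE) is true even though '}' is not a letter, while three_ranges(EBCDIC_CLOSE_BRACE) is false. Both functions agree on every actual letter.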

C and C++ do require that '0'-'9' are contiguous.
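That digit guarantee is why the familiar subtraction trick is portable, whereas the same trick on letters is not. A minimal sketch (digit_value is a hypothetical helper name, not part of any library):

```c
/* Returns the numeric value of a decimal digit character, or -1 if the
   character is not a digit. Relies only on '0'..'9' being contiguous,
   which the C and C++ standards guarantee. */
int digit_value(char ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    return -1;
}
```

The analogous ch - 'A' for letters would give wrong results on EBCDIC because of the gaps.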

Note that the standard library functions (such as isupper from <ctype.h>) do know whether they run on ASCII, EBCDIC or other systems, so they're more portable and possibly more efficient.
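For example, the function from the question could be sketched on top of isupper like this. The name is_upper_alpha_portable is made up here, and note that isupper depends on the current locale: in the default "C" locale it is true only for the 26 uppercase letters, but other locales may accept more characters.

```c
#include <ctype.h>

/* Wraps the standard isupper so the character-set details are left to
   the C library instead of hand-written range checks. */
int is_upper_alpha_portable(char chValue)
{
    /* Cast to unsigned char first: passing a negative value (other than
       EOF) to isupper is undefined behavior. */
    return isupper((unsigned char)chValue) != 0;
}
```

The != 0 normalizes the result to 0 or 1, matching the original functions, since isupper is only required to return some nonzero value for a match.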
