Case sensitivity of Java class names

If one writes two public Java classes whose names differ only in case, in different directories, then the two classes cannot both be used at runtime. (I tested this on Windows, Mac and Linux with several versions of the HotSpot JVM. I would not be surprised if there are other JVMs where they are usable simultaneously.) For example, if I create a class named a and one named A like so:

// lowercase/src/testcase/a.java
package testcase;
public class a {
    public static String myCase() {
        return "lower";
    }
}

// uppercase/src/testcase/A.java 
package testcase;
public class A {
    public static String myCase() {
        return "upper";
    }
}

Three Eclipse projects containing the code above are available from my website.

If I try calling myCase on both classes like so:

System.out.println(A.myCase());
System.out.println(a.myCase());

The typechecker succeeds, but when I run the class file generated by the code directly above I get:

Exception in thread "main" java.lang.NoClassDefFoundError: testcase/A (wrong name: testcase/a)

In Java, names are in general case sensitive. Some file systems (e.g. the default ones on Windows) are case insensitive, so I'm not surprised the above behavior happens, but it seems wrong. Unfortunately the Java specifications are oddly noncommittal about which classes are visible. The Java Language Specification (JLS), Java SE 7 Edition (Section 6.6.1, page 166) says:

If a class or interface type is declared public, then it may be accessed by any code, provided that the compilation unit (§7.3) in which it is declared is observable.

In Section 7.3, the JLS defines observability of a compilation unit in extremely vague terms:

All the compilation units of the predefined package java and its subpackages lang and io are always observable. For all other packages, the host system determines which compilation units are observable.

The Java Virtual Machine Specification is similarly vague (Section 5.3.1):

The following steps are used to load and thereby create the nonarray class or interface C denoted by [binary name] N using the bootstrap class loader [...] Otherwise, the Java virtual machine passes the argument N to an invocation of a method on the bootstrap class loader to search for a purported representation of C in a platform-dependent manner.

All of this leads to three questions in descending order of importance:

  1. Are there any guarantees about which classes are loadable by the default class loader(s) in every JVM? In other words, can I implement a valid, but degenerate JVM, that won't load any classes except those in java.lang and java.io?
  2. If there are any guarantees, does the behavior in the example above violate the guarantee (i.e. is the behavior a bug)?
  3. Is there any way to make HotSpot load a and A simultaneously? Would writing a custom class loader work?

Solution 1:

  • Are there any guarantees about which classes are loadable by the bootstrap class loader in every JVM?

Only the core bits and pieces of the language, plus supporting implementation classes. That is not guaranteed to include any class that you write. (The normal JVM loads your classes in a separate classloader from the bootstrap one, and in fact the normal bootstrap loader usually loads its classes out of a JAR, as this makes for more efficient deployment than a big old directory structure full of classes.)

  • If there are any guarantees, does the behavior in the example above violate the guarantee (i.e. is the behavior a bug)?
  • Is there any way to make "standard" JVMs load a and A simultaneously? Would writing a custom class loader work?

Java loads classes by mapping the full name of the class into a filename that is then searched for on the classpath. Thus testcase.a goes to testcase/a.class and testcase.A goes to testcase/A.class. Some filesystems mix these things up, and may serve the other up when one is asked for. Others get it right (in particular, the variant of the ZIP format used in JAR files is fully case-sensitive and portable). There is nothing that Java can do about this (an IDE could handle it for you by keeping the .class files away from the native FS, but I don't know if any actually do, and the JDK's javac most certainly isn't that smart).
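To make the mapping concrete, here is a minimal sketch. The class name CaseProbe and both helper methods are made up for illustration and are not part of any JDK API. It shows that the two resource paths are distinct strings, and then probes a temporary directory to see whether the underlying filesystem treats them as the same file. The probe result depends on the platform, so no output is claimed for it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CaseProbe {
    // Hypothetical helper mirroring the usual convention:
    // dots in the binary name become slashes, and ".class" is appended.
    public static String toResourcePath(String binaryName) {
        return binaryName.replace('.', '/') + ".class";
    }

    // Returns true if a file written as "A.class" can also be found as
    // "a.class", i.e. the directory lives on a case-insensitive filesystem.
    public static boolean isCaseInsensitive(Path dir) {
        try {
            Path upper = dir.resolve("A.class");
            Path lower = dir.resolve("a.class");
            Files.write(upper, new byte[] {1}); // placeholder content, not real bytecode
            try {
                return Files.exists(lower);
            } finally {
                Files.delete(upper);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(toResourcePath("testcase.a")); // testcase/a.class
        System.out.println(toResourcePath("testcase.A")); // testcase/A.class

        Path dir = Files.createTempDirectory("caseprobe");
        try {
            // Platform-dependent: typically true on default Windows setups,
            // false on most Linux filesystems.
            System.out.println("case-insensitive: " + isCaseInsensitive(dir));
        } finally {
            Files.delete(dir);
        }
    }
}
```

If the probe reports true, a lookup for testcase/A.class in that directory can be answered with the file that was written as testcase/a.class, which is the situation behind the "wrong name" error in the question.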

However, that's not the only point to note here: class files know internally what class they are talking about. The absence of the expected class from the file just means that the load fails, leading to the NoClassDefFoundError you received. What you got was a problem (a mis-deployment in at least some sense) that was detected and dealt with robustly. Theoretically, you could build a classloader that could handle such things by continuing to search, but why bother? Putting the class files inside a JAR will fix things far more robustly; those are handled correctly.
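As a rough illustration of why a JAR sidesteps the problem, the sketch below writes two entries whose names differ only in case into an in-memory ZIP archive (the container format that JAR files use) and reads the names back. The class name JarCaseDemo and the method roundTrip are invented for this example, and the entry payloads are placeholder bytes rather than real bytecode, so this only demonstrates the naming behavior and not class loading itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class JarCaseDemo {
    // Writes the given entry names into an in-memory ZIP archive and
    // returns the entry names read back from it, in archive order.
    public static List<String> roundTrip(String... entryNames) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
                for (String name : entryNames) {
                    zip.putNextEntry(new ZipEntry(name));
                    // Placeholder payload, not real bytecode.
                    zip.write(name.getBytes(StandardCharsets.UTF_8));
                    zip.closeEntry();
                }
            }

            List<String> names = new ArrayList<>();
            try (ZipInputStream zip =
                     new ZipInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                    names.add(entry.getName());
                }
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Both entries survive as separate archive members.
        System.out.println(roundTrip("testcase/a.class", "testcase/A.class"));
        // prints [testcase/a.class, testcase/A.class]
    }
}
```

Because entry names inside the archive are compared exactly, the two entries never collide, regardless of the filesystem the archive itself is stored on.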

More generally, if you're running into this problem for real a lot, switch to doing production builds on a Unix with a case-sensitive filesystem (a CI system like Jenkins is recommended). Then find which developers are naming classes with just case differences and make them stop, as it is very confusing!

Solution 2:

Donal's fine explanation leaves little to add, but let me briefly muse on this phrase:

... Java classes with the same case-insensitive name ...

Names and Strings in general are never case-insensitive in themselves; it's only their interpretation that can be. And secondly, Java doesn't do such an interpretation.

So, a correct phrasing of what you had in mind would be:

... Java classes whose file representations in a case-insensitive file-system have identical names ...