What is the difference between canonical name, simple name and class name in Java Class?

In Java, what is the difference between these:

Object o1 = ....
o1.getClass().getSimpleName();
o1.getClass().getName();
o1.getClass().getCanonicalName();

I have checked the Javadoc multiple times and yet this never explains it well. I also ran a test and that didn't reflect any real meaning behind the way these methods are called.

If you're unsure about something, try writing a test first.

I did this:

class ClassNameTest {
    public static void main(final String... arguments) {
        printNamesForClass(
            int.class,
            "int.class (primitive)");
        printNamesForClass(
            String.class,
            "String.class (ordinary class)");
        printNamesForClass(
            java.util.HashMap.SimpleEntry.class,
            "java.util.HashMap.SimpleEntry.class (nested class)");
        printNamesForClass(
            new java.io.Serializable(){}.getClass(),
            "new java.io.Serializable(){}.getClass() (anonymous inner class)");
    }

    private static void printNamesForClass(final Class<?> clazz, final String label) {
        System.out.println(label + ":");
        System.out.println("    getName():          " + clazz.getName());
        System.out.println("    getCanonicalName(): " + clazz.getCanonicalName());
        System.out.println("    getSimpleName():    " + clazz.getSimpleName());
        System.out.println("    getTypeName():      " + clazz.getTypeName()); // added in Java 8
        System.out.println();
    }
}

Prints:

int.class (primitive):
    getName():          int
    getCanonicalName(): int
    getSimpleName():    int
    getTypeName():      int

String.class (ordinary class):
    getName():          java.lang.String
    getCanonicalName(): java.lang.String
    getSimpleName():    String
    getTypeName():      java.lang.String

java.util.HashMap.SimpleEntry.class (nested class):
    getName():          java.util.AbstractMap$SimpleEntry
    getCanonicalName(): java.util.AbstractMap.SimpleEntry
    getSimpleName():    SimpleEntry
    getTypeName():      java.util.AbstractMap$SimpleEntry

new java.io.Serializable(){}.getClass() (anonymous inner class):
    getName():          ClassNameTest$1
    getCanonicalName(): null
    getSimpleName():    
    getTypeName():      ClassNameTest$1

There's an empty entry in the last block where getSimpleName returns an empty string.

The upshot looking at this is:

the name is the name that you'd use to dynamically load the class with, for example, a call to Class.forName with the default ClassLoader. Within the scope of a certain ClassLoader, all classes have unique names.

the canonical name is the name that would be used in an import statement. It might be useful during toString or logging operations. When the javac compiler has complete view of a classpath, it enforces uniqueness of canonical names within it by clashing fully qualified class and package names at compile time. However JVMs must accept such name clashes, and thus canonical names do not uniquely identify classes within a ClassLoader. (In hindsight, a better name for this getter would have been getJavaName; but this method dates from a time when the JVM was used solely to run Java programs.)

the simple name loosely identifies the class, again might be useful during toString or logging operations but is not guaranteed to be unique.

the type name returns "an informative string for the name of this type", "It's like toString: it's purely informative and has no contract value". (as written by sir4ur0n)

Also you can commonly reference the Java Language Specification documentation for these types technical Java API details:

Here's the Java 11 Specification on this subject matter: https://docs.oracle.com/javase/specs/jls/se11/html/jls-6.html#jls-6.7

Example 6.7-2. and Example 6.7-2. goes over Fully Qualified Names and Fully Qualified Names v. Canonical Name respectively

Adding local classes, lambdas and the toString() method to complete the previous two answers. Further, I add arrays of lambdas and arrays of anonymous classes (which do not make any sense in practice though):

package com.example;

public final class TestClassNames {
    private static void showClass(Class<?> c) {
        System.out.println("getName():          " + c.getName());
        System.out.println("getCanonicalName(): " + c.getCanonicalName());
        System.out.println("getSimpleName():    " + c.getSimpleName());
        System.out.println("toString():         " + c.toString());
        System.out.println();
    }

    private static void x(Runnable r) {
        showClass(r.getClass());
        showClass(java.lang.reflect.Array.newInstance(r.getClass(), 1).getClass()); // Obtains an array class of a lambda base type.
    }

    public static class NestedClass {}

    public class InnerClass {}

    public static void main(String[] args) {
        class LocalClass {}
        showClass(void.class);
        showClass(int.class);
        showClass(String.class);
        showClass(Runnable.class);
        showClass(SomeEnum.class);
        showClass(SomeAnnotation.class);
        showClass(int[].class);
        showClass(String[].class);
        showClass(NestedClass.class);
        showClass(InnerClass.class);
        showClass(LocalClass.class);
        showClass(LocalClass[].class);
        Object anonymous = new java.io.Serializable() {};
        showClass(anonymous.getClass());
        showClass(java.lang.reflect.Array.newInstance(anonymous.getClass(), 1).getClass()); // Obtains an array class of an anonymous base type.
        x(() -> {});
    }
}

enum SomeEnum {
   BLUE, YELLOW, RED;
}

@interface SomeAnnotation {}

This is the full output:

getName():          void
getCanonicalName(): void
getSimpleName():    void
toString():         void

getName():          int
getCanonicalName(): int
getSimpleName():    int
toString():         int

getName():          java.lang.String
getCanonicalName(): java.lang.String
getSimpleName():    String
toString():         class java.lang.String

getName():          java.lang.Runnable
getCanonicalName(): java.lang.Runnable
getSimpleName():    Runnable
toString():         interface java.lang.Runnable

getName():          com.example.SomeEnum
getCanonicalName(): com.example.SomeEnum
getSimpleName():    SomeEnum
toString():         class com.example.SomeEnum

getName():          com.example.SomeAnnotation
getCanonicalName(): com.example.SomeAnnotation
getSimpleName():    SomeAnnotation
toString():         interface com.example.SomeAnnotation

getName():          [I
getCanonicalName(): int[]
getSimpleName():    int[]
toString():         class [I

getName():          [Ljava.lang.String;
getCanonicalName(): java.lang.String[]
getSimpleName():    String[]
toString():         class [Ljava.lang.String;

getName():          com.example.TestClassNames$NestedClass
getCanonicalName(): com.example.TestClassNames.NestedClass
getSimpleName():    NestedClass
toString():         class com.example.TestClassNames$NestedClass

getName():          com.example.TestClassNames$InnerClass
getCanonicalName(): com.example.TestClassNames.InnerClass
getSimpleName():    InnerClass
toString():         class com.example.TestClassNames$InnerClass

getName():          com.example.TestClassNames$1LocalClass
getCanonicalName(): null
getSimpleName():    LocalClass
toString():         class com.example.TestClassNames$1LocalClass

getName():          [Lcom.example.TestClassNames$1LocalClass;
getCanonicalName(): null
getSimpleName():    LocalClass[]
toString():         class [Lcom.example.TestClassNames$1LocalClass;

getName():          com.example.TestClassNames$1
getCanonicalName(): null
getSimpleName():    
toString():         class com.example.TestClassNames$1

getName():          [Lcom.example.TestClassNames$1;
getCanonicalName(): null
getSimpleName():    []
toString():         class [Lcom.example.TestClassNames$1;

getName():          com.example.TestClassNames$$Lambda$1/1175962212
getCanonicalName(): com.example.TestClassNames$$Lambda$1/1175962212
getSimpleName():    TestClassNames$$Lambda$1/1175962212
toString():         class com.example.TestClassNames$$Lambda$1/1175962212

getName():          [Lcom.example.TestClassNames$$Lambda$1;
getCanonicalName(): com.example.TestClassNames$$Lambda$1/1175962212[]
getSimpleName():    TestClassNames$$Lambda$1/1175962212[]
toString():         class [Lcom.example.TestClassNames$$Lambda$1;

So, here are the rules. First, lets start with primitive types and void:

If the class object represents a primitive type or void, all the four methods simply returns its name.

Now the rules for the getName() method:

Every non-lambda and non-array class or interface (i.e, top-level, nested, inner, local and anonymous) has a name (which is returned by getName()) that is the package name followed by a dot (if there is a package), followed by the name of its class-file as generated by the compiler (whithout the suffix .class). If there is no package, it is simply the name of the class-file. If the class is an inner, nested, local or anonymous class, the compiler should generate at least one $ in its class-file name. Note that for anonymous classes, the class name would end with a dollar-sign followed by a number.
Lambda class names are generally unpredictable, and you shouldn't care about they anyway. Exactly, their name is the name of the enclosing class, followed by $$Lambda$, followed by a number, followed by a slash, followed by another number.
The class descriptor of the primitives are Z for boolean, B for byte, S for short, C for char, I for int, J for long, F for float and D for double. For non-array classes and interfaces the class descriptor is L followed by what is given by getName() followed by ;. For array classes, the class descriptor is [ followed by the class descriptor of the component type (which may be itself another array class).
For array classes, the getName() method returns its class descriptor. This rule seems to fail only for array classes whose the component type is a lambda (which possibly is a bug), but hopefully this should not matter anyway because there is no point even on the existence of array classes whose component type is a lambda.

Now, the toString() method:

If the class instance represents an interface (or an annotation, which is a special type of interface), the toString() returns "interface " + getName(). If it is a primitive, it returns simply getName(). If it is something else (a class type, even if it is a pretty weird one), it returns "class " + getName().

The getCanonicalName() method:

For top-level classes and interfaces, the getCanonicalName() method returns just what the getName() method returns.
The getCanonicalName() method returns null for anonymous or local classes and for array classes of those.
For inner and nested classes and interfaces, the getCanonicalName() method returns what the getName() method would replacing the compiler-introduced dollar-signs by dots.
For array classes, the getCanonicalName() method returns null if the canonical name of the component type is null. Otherwise, it returns the canonical name of the component type followed by [].

The getSimpleName() method:

For top-level, nested, inner and local classes, the getSimpleName() returns the name of the class as written in the source file.
For anonymous classes the getSimpleName() returns an empty String.
For lambda classes the getSimpleName() just returns what the getName() would return without the package name. This do not makes much sense and looks like a bug for me, but there is no point in calling getSimpleName() on a lambda class to start with.
For array classes the getSimpleName() method returns the simple name of the component class followed by []. This have the funny/weird side-effect that array classes whose component type is an anonymous class have just [] as their simple names.

In addition to Nick Holt's observations, I ran a few cases for Array data type:

//primitive Array
int demo[] = new int[5];
Class<? extends int[]> clzz = demo.getClass();
System.out.println(clzz.getName());
System.out.println(clzz.getCanonicalName());
System.out.println(clzz.getSimpleName());       

System.out.println();


//Object Array
Integer demo[] = new Integer[5]; 
Class<? extends Integer[]> clzz = demo.getClass();
System.out.println(clzz.getName());
System.out.println(clzz.getCanonicalName());
System.out.println(clzz.getSimpleName());

Above code snippet prints:

[I
int[]
int[]

[Ljava.lang.Integer;
java.lang.Integer[]
Integer[]

I've been confused by the wide range of different naming schemes as well, and was just about to ask and answer my own question on this when I found this question here. I think my findings fit it well enough, and complement what's already here. My focus is looking for documentation on the various terms, and adding some more related terms that might crop up in other places.

Consider the following example:

package a.b;
class C {
  static class D extends C {
  }
  D d;
  D[] ds;
}

The simple name of D is D. That's just the part you wrote when declaring the class. Anonymous classes have no simple name. Class.getSimpleName() returns this name or the empty string. It is possible for the simple name to contain a $ if you write it like this, since $ is a valid part of an identifier as per JLS section 3.8 (even if it is somewhat discouraged).
According to the JLS section 6.7, both a.b.C.D and a.b.C.D.D.D would be fully qualified names, but only a.b.C.D would be the canonical name of D. So every canonical name is a fully qualified name, but the converse is not always true. Class.getCanonicalName() will return the canonical name or null.
Class.getName() is documented to return the binary name, as specified in JLS section 13.1. In this case it returns a.b.C$D for D and [La.b.C$D; for D[].
This answer demonstrates that it is possible for two classes loaded by the same class loader to have the same canonical name but distinct binary names. Neither name is sufficient to reliably deduce the other: if you have the canonical name, you don't know which parts of the name are packages and which are containing classes. If you have the binary name, you don't know which $ were introduced as separators and which were part of some simple name. (The class file stores the binary name of the class itself and its enclosing class, which allows the runtime to make this distinction.)
Anonymous classes and local classes have no fully qualified names but still have a binary name. The same holds for classes nested inside such classes. Every class has a binary name.
Running javap -v -private on a/b/C.class shows that the bytecode refers to the type of d as La/b/C$D; and that of the array ds as [La/b/C$D;. These are called descriptors, and they are specified in JVMS section 4.3.
The class name a/b/C$D used in both of these descriptors is what you get by replacing . by / in the binary name. The JVM spec apparently calls this the internal form of the binary name. JVMS section 4.2.1 describes it, and states that the difference from the binary name were for historical reasons.
The file name of a class in one of the typical filename-based class loaders is what you get if you interpret the / in the internal form of the binary name as a directory separator, and append the file name extension .class to it. It's resolved relative to the class path used by the class loader in question.

What is the difference between canonical name, simple name and class name in Java Class?

Related

Recent Posts