What do Public Identifier, System Identifier, and Base system identifier refer to in XML?

Solution 1:

If you look at the XML grammar, you will see that, for example, external entity declarations use the syntax:

ExternalID ::= 'SYSTEM' S SystemLiteral
  | 'PUBLIC' S PubidLiteral S SystemLiteral

Here's an example of this syntax in use:

<!ENTITY open-hatch
         PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
         "http://www.textuality.com/boilerplate/OpenHatch.xml">

References to DTDs work in the same way (in fact, an external DTD is technically one kind of entity).
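To see that a DOCTYPE declaration carries the same pair of identifiers, here is a minimal sketch using Python's standard xml.dom.minidom. The document and both of its identifiers are made up for illustration; the parser only records them and does not fetch books.dtd.

```python
from xml.dom import minidom

# Hypothetical document: the public and system identifiers are invented.
DOCUMENT = (
    '<!DOCTYPE books PUBLIC "-//Example//DTD Books 1.0//EN" "books.dtd">'
    '<books/>'
)

doc = minidom.parseString(DOCUMENT)
doctype = doc.doctype

print(doctype.name)      # name given in the DOCTYPE declaration
print(doctype.publicId)  # the public identifier
print(doctype.systemId)  # the system identifier (a relative URI reference)
```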

The "system identifier" is a URI that identifies where the text of an entity can be found. The "public identifier" (a hangover from SGML) is more like a name for the resource; it only helps you find the resource if you have some kind of index or catalog that tells you where to look.
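The difference between the two identifiers can be sketched as a lookup: the public identifier is useful only if some catalog knows it, and otherwise you fall back to the system identifier. This is an illustrative sketch, not how any particular parser or catalog format works; the CATALOG table, the locate_entity helper, and the local file path are all hypothetical.

```python
from typing import Optional

# Hypothetical catalog mapping public identifiers to local copies.
# The file path is invented for this example.
CATALOG = {
    "-//Textuality//TEXT Standard open-hatch boilerplate//EN":
        "file:///usr/share/xml/OpenHatch.xml",
}


def locate_entity(public_id: Optional[str], system_id: str) -> str:
    """Return the URI from which to read the entity's text."""
    # A public identifier helps only when the catalog has an entry for it.
    if public_id is not None and public_id in CATALOG:
        return CATALOG[public_id]
    # Otherwise the system identifier says where to look.
    return system_id


known = locate_entity(
    "-//Textuality//TEXT Standard open-hatch boilerplate//EN",
    "http://www.textuality.com/boilerplate/OpenHatch.xml",
)
unknown = locate_entity(
    "-//Example//TEXT Not in the catalog//EN",
    "http://www.textuality.com/boilerplate/OpenHatch.xml",
)
print(known)    # the local copy named in the catalog
print(unknown)  # falls back to the system identifier
```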

System identifiers are often given as relative URI references (for example "books.dtd"), which need to be resolved against a base URI. The base URI is generally the location where the containing resource (or entity) was found. For example, if an XML document is at http://my.com/lib/books.xml, then that URI serves as the base: resolution drops its final path segment (books.xml), so the relative reference books.dtd is expanded to http://my.com/lib/books.dtd.
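The resolution step can be tried out with urljoin from Python's standard library, which implements the usual relative-reference resolution rules. This is only a sketch of the arithmetic, using the URIs from the example above; the "../dtd/books.dtd" reference is an extra case added for illustration.

```python
from urllib.parse import urljoin

# Location where the containing document was found.
document_uri = "http://my.com/lib/books.xml"

# Resolving against the document's URI replaces its last path segment.
resolved = urljoin(document_uri, "books.dtd")
print(resolved)  # http://my.com/lib/books.dtd

# Resolving against the containing directory gives the same result.
same = urljoin("http://my.com/lib/", "books.dtd")

# A reference that climbs one directory (hypothetical extra case).
parent = urljoin(document_uri, "../dtd/books.dtd")
print(parent)  # http://my.com/dtd/books.dtd
```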

As for your question "is there any purpose to the public or system id": no, not if the document consists entirely of a single entity (which is often the case). But as soon as multiple entities come into play, you need identifiers to link them together.