PDF bleed detection
Solution 1:
Quoting from the PDF specification ISO 32000-1:2008 as published by Adobe:
14.11.2 Page Boundaries
14.11.2.1 General
A PDF page may be prepared either for a finished medium, such as a sheet of paper, or as part of a prepress process in which the content of the page is placed on an intermediate medium, such as film or an imposed reproduction plate. In the latter case, it is important to distinguish between the intermediate page and the finished page. The intermediate page may often include additional production-related content, such as bleeds or printer marks, that falls outside the boundaries of the finished page. To handle such cases, a PDF page may define as many as five separate boundaries to control various aspects of the imaging process:
The media box defines the boundaries of the physical medium on which the page is to be printed. It may include any extended area surrounding the finished page for bleed, printing marks, or other such purposes. It may also include areas close to the edges of the medium that cannot be marked because of physical limitations of the output device. Content falling outside this boundary may safely be discarded without affecting the meaning of the PDF file.
The crop box defines the region to which the contents of the page shall be clipped (cropped) when displayed or printed. Unlike the other boxes, the crop box has no defined meaning in terms of physical page geometry or intended use; it merely imposes clipping on the page contents. However, in the absence of additional information (such as imposition instructions specified in a JDF or PJTF job ticket), the crop box determines how the page’s contents shall be positioned on the output medium. The default value is the page’s media box.
The bleed box (PDF 1.3) defines the region to which the contents of the page shall be clipped when output in a production environment. This may include any extra bleed area needed to accommodate the physical limitations of cutting, folding, and trimming equipment. The actual printed page may include printing marks that fall outside the bleed box. The default value is the page’s crop box.
The trim box (PDF 1.3) defines the intended dimensions of the finished page after trimming. It may be smaller than the media box to allow for production-related content, such as printing instructions, cut marks, or colour bars. The default value is the page’s crop box.
The art box (PDF 1.3) defines the extent of the page’s meaningful content (including potential white space) as intended by the page’s creator. The default value is the page’s crop box.
The page object dictionary specifies these boundaries in the MediaBox, CropBox, BleedBox, TrimBox, and ArtBox entries, respectively (see Table 30). All of them are rectangles expressed in default user space units. The crop, bleed, trim, and art boxes shall not ordinarily extend beyond the boundaries of the media box. If they do, they are effectively reduced to their intersection with the media box. Figure 86 illustrates the relationships among these boundaries. (The crop box is not shown in the figure because it has no defined relationship with any of the other boundaries.)
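For the bleed detection in the title, the default and intersection rules quoted above can be turned into a small helper. The following is only a minimal sketch in plain Python, not tied to any PDF library: it assumes the page's box entries have already been read into a dictionary of (llx, lly, urx, ury) tuples (with inherited MediaBox and CropBox values already resolved), and the names effective_boxes and bleed_margins are made up for illustration.

```python
from typing import Dict, Tuple

# (llx, lly, urx, ury) in default user space units
Rect = Tuple[float, float, float, float]


def normalize(r: Rect) -> Rect:
    """Order the corners so that llx <= urx and lly <= ury."""
    x0, y0, x1, y1 = r
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


def intersect(a: Rect, b: Rect) -> Rect:
    """Intersection of two normalized rectangles.

    If the rectangles do not overlap, the result is degenerate
    (lower-left beyond upper-right); callers may want to check that.
    """
    return (max(a[0], b[0]), max(a[1], b[1]),
            min(a[2], b[2]), min(a[3], b[3]))


def effective_boxes(page: Dict[str, Rect]) -> Dict[str, Rect]:
    """Apply the defaults and the media box clipping from 14.11.2.1.

    `page` is a hypothetical, already-extracted mapping such as
    {"MediaBox": (0, 0, 612, 792), "TrimBox": (9, 9, 603, 783)}.
    """
    media = normalize(page["MediaBox"])
    # CropBox defaults to the media box.
    crop = intersect(normalize(page.get("CropBox", media)), media)
    boxes = {"MediaBox": media, "CropBox": crop}
    # BleedBox, TrimBox and ArtBox default to the crop box.
    for name in ("BleedBox", "TrimBox", "ArtBox"):
        raw = page.get(name)
        boxes[name] = crop if raw is None else intersect(normalize(raw), media)
    return boxes


def bleed_margins(boxes: Dict[str, Rect]) -> Rect:
    """Distance from trim box to bleed box as (left, bottom, right, top)."""
    b, t = boxes["BleedBox"], boxes["TrimBox"]
    return (t[0] - b[0], t[1] - b[1], b[2] - t[2], b[3] - t[3])


if __name__ == "__main__":
    page = {
        "MediaBox": (0, 0, 612, 792),
        "BleedBox": (0, 0, 612, 792),
        "TrimBox": (9, 9, 603, 783),
    }
    print(bleed_margins(effective_boxes(page)))
```

A page whose margins are all zero either has no bleed or never declared its boxes, which is the "only the media box is set" situation: the boxes alone then cannot tell you whether the content actually bleeds.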
The specification follows this with a nice graphic (Figure 86) showing those boxes in relation to each other.
There are two reasons why in many cases only the media box is set:
in PDFs meant for electronic consumption (i.e. reading on a computer) the other boxes hardly matter; and
even in the prepress context they aren't as necessary anymore as they used to be, cf. the article Pedro refers to in his comment.
Concerning your "bonus question": the user space unit is 1⁄72 inch by default. Since PDF 1.6 it can be changed, though, to any (not necessarily integer) multiple of that size using the UserUnit entry in the page dictionary. Changing it in an existing PDF essentially scales the page, because the user space unit is the basic unit of the page's device-independent coordinate system. Therefore, unless you want to update each and every coordinate-bearing command in the page descriptions to keep the page dimensions, you won't want to enforce a millimeter user space unit... ;)
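If all you need is the box sizes in millimeters, it is usually easier to convert the numbers after reading them than to touch the PDF. A minimal sketch, assuming you have already read a box value and the page's UserUnit (1.0 when the entry is absent); the function name user_units_to_mm is hypothetical:

```python
POINTS_PER_INCH = 72.0  # default user space unit is 1/72 inch
MM_PER_INCH = 25.4


def user_units_to_mm(value: float, user_unit: float = 1.0) -> float:
    """Convert a length in user space units to millimeters.

    `user_unit` is the page's UserUnit entry (PDF 1.6+), i.e. the size
    of one user space unit as a multiple of 1/72 inch.
    """
    return value * user_unit / POINTS_PER_INCH * MM_PER_INCH


if __name__ == "__main__":
    # Width of a 612 unit wide media box at the default UserUnit.
    print(round(user_units_to_mm(612), 2))
    # The same box on a page whose UserUnit is 2.0 is twice as wide.
    print(round(user_units_to_mm(612, user_unit=2.0), 2))
```

For comparison, a user space unit of exactly one millimeter would correspond to a UserUnit of 72 / 25.4, roughly 2.835, which is the value that would rescale every coordinate on an existing page.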