check XML syntax with xmllint

I am having a problem with some XML print files where the source system fails to convert some characters to their XML-escaped equivalents (e.g. & is not converted to &amp;).

Is there a way to catch this with xmllint? (I don't need to check the general tree structure using XSD).


xmllint --noout your_test_file.xml

Check the return code of this command (see the documentation). A value of 1 is returned when basic parsing errors are encountered. E.g.:

echo $?
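A minimal sketch of that check as a reusable shell function, assuming a POSIX shell with xmllint on the PATH. The function name `is_well_formed` and the sample file names are made up for illustration. An unescaped & in text content is a well-formedness error, so it should make xmllint exit with a non-zero status.

```shell
# Hypothetical helper: succeeds only if xmllint parses the file without errors.
is_well_formed() {
    # --noout suppresses the serialized document; parser errors still go to stderr.
    xmllint --noout "$1"
}

# Usage (file names are placeholders):
#   printf '<a>x &amp; y</a>\n' > good.xml   # properly escaped
#   printf '<a>x & y</a>\n'     > bad.xml    # bare ampersand
#   is_well_formed good.xml; echo $?
#   is_well_formed bad.xml;  echo $?
```

The second call should print a parser error pointing at the offending line, followed by a non-zero status.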

xmllint --valid --encode utf-8 TEST.xml

will validate TEST.xml against its DTD and output it in UTF-8

cat TEST.xml

<?xml version="1.0" encoding="utf-8"?>

<!DOCTYPE JM SYSTEM "mydtd">

<JM> . . . </JM>


I want to escalate @nathan-basanese's comment as the actual best answer to the OP's question:

// , An easy way to check the return code follows: $ xmllint --noout your_test_file.xml; echo $?. – Nathan Basanese Nov 19 '15 at 0:38

By default xmllint "checks to determine if the document is well-formed". So xmllint --noout --nonet goodfoo.xml will be completely silent, with an exit code of 0, for well-formed XML, while xmllint --noout --nonet badfoo.xml will emit an error message for each error and return an exit code between 1 and 9 depending on the specific error.

The --nonet option tells xmllint not to fetch DTDs since it sounds like the OP just wants to scan for XML well-formedness.

Here's what I think she was looking for:

xmllint --noout --nonet /path/to/xmlfiles/*.xml 2>&1

That will generate a grep-able list of all errors and an exit code between 1 and 9 if there are any errors.

It will exit silently with an exit code of 0 if there are no errors in any of the scanned files.
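If only the names of the offending files are needed rather than the full error text, the same check can be run file by file. This is a rough sketch, assuming a POSIX shell; `list_malformed` is a hypothetical helper name, not something xmllint provides.

```shell
# Hypothetical helper: print the name of every file that is not well-formed.
# Returns 0 if all files parse cleanly, 1 if at least one does not.
list_malformed() {
    rc=0
    for f in "$@"; do
        # Parser messages are discarded here; only the exit status is used.
        if ! xmllint --noout --nonet "$f" >/dev/null 2>&1; then
            printf '%s\n' "$f"
            rc=1
        fi
    done
    return "$rc"
}

# Usage (path is a placeholder):
#   list_malformed /path/to/xmlfiles/*.xml
```

Running xmllint once per file is slower than passing the whole glob in one call, but it ties each failure to a single file name without parsing the error output.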


If you just need to check whether an XML document is correct (well-formed) using xmllint, here is one more way.

if xmllint --noout /tmp/test.xml > /dev/null 2>&1;
then
    echo "correct"
else
    echo "incorrect"
fi