check XML syntax with xmllint
I am having a problem with some XML print files where the source system omits to convert some characters to their XML syntax equivalent (e.g. & is not converted to &amp;).
Is there a way to catch this with xmllint? (I don't need to check the general tree structure using XSD).
xmllint --noout your_test_file.xml
Check the return code of this command (see the documentation). A value of 1
is returned when basic parsing errors are encountered. E.g.:
echo $?
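To make the return-code check concrete for the unescaped-ampersand case from the question, here is a minimal sketch. The file names and sample contents are made up for illustration, and it assumes xmllint is on your PATH. It should report a non-zero status for the file with the raw & and 0 for the escaped one.

```shell
#!/bin/sh
# Sketch: compare xmllint's exit status for a raw "&" versus "&amp;".
# bad.xml / good.xml and their contents are hypothetical test data.
tmpdir=$(mktemp -d)

printf '<doc>Fish & Chips</doc>\n' > "$tmpdir/bad.xml"       # raw ampersand
printf '<doc>Fish &amp; Chips</doc>\n' > "$tmpdir/good.xml"  # escaped ampersand

# Capture each exit status; error messages are discarded here.
bad_status=0
xmllint --noout "$tmpdir/bad.xml" 2>/dev/null || bad_status=$?
good_status=0
xmllint --noout "$tmpdir/good.xml" 2>/dev/null || good_status=$?

echo "bad.xml exit status:  $bad_status"
echo "good.xml exit status: $good_status"

rm -rf "$tmpdir"
```

Drop the 2>/dev/null redirections if you also want to see the parser's error message and line number for the offending character.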
xmllint --valid --encode utf-8 TEST.xml
will validate TEST.xml against its DTD and output it in UTF-8
cat TEST.xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE JM SYSTEM "mydtd">
<JM> . . . </JM>
I want to escalate @nathan-basanese's comment as the actual best answer to the OP's question:
// , An easy way to check the return code follows:
$ xmllint --noout your_test_file.xml; echo $?
– Nathan Basanese Nov 19 '15 at 0:38
By default xmllint "checks to determine if the document is well-formed". So, xmllint --noout --nonet goodfoo.xml
will be completely silent with an exit code of 0 for well-formed XML, while xmllint --noout --nonet badfoo.xml
will emit an error message for each error and an exit code between 1 and 9 depending on the specific error.
The --nonet option tells xmllint not to fetch DTDs over the network, which fits here since it sounds like the OP just wants to scan for XML well-formedness.
Here's what I think she was looking for:
xmllint --noout --nonet /path/to/xmlfiles/*.xml 2>&1
That will generate a grep-able list of all errors and an exit code between 1 and 9 if there are any errors.
It will exit silently with an exit code of 0 if there are no errors in any of the scanned files.
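If you would rather get a plain list of the files that fail, one option is to check each file separately. The following is only a sketch: check_dir is a hypothetical helper name, not part of xmllint, and it assumes the files to scan end in .xml and sit directly in the given directory.

```shell
#!/bin/sh
# Sketch: run xmllint on each .xml file in a directory and list the failures.
# check_dir is a hypothetical helper; it returns 1 if any file fails, else 0.
check_dir() {
    failed=0
    for f in "$1"/*.xml; do
        [ -e "$f" ] || continue    # the glob matched nothing
        # Error details are discarded; only the file name is reported.
        if ! xmllint --noout --nonet "$f" 2>/dev/null; then
            echo "NOT WELL-FORMED: $f"
            failed=1
        fi
    done
    return "$failed"
}
```

Call it as check_dir /path/to/xmlfiles and test $? afterwards. Because the parser messages are suppressed, rerun xmllint on a reported file to see the actual errors.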
If you just need to check the well-formedness (correctness) of an XML document using xmllint, here is one more way.
if xmllint --noout /tmp/test.xml > /dev/null 2>&1;
then
echo "correct"
else
echo "incorrect"
fi