xmllint failing to properly query with xpath
I'm trying to query an XML file generated by Adium. xmlwf says that it's well formed. Using xmllint's debug option, I get the following:
$ xmllint --debug doc.xml
DOCUMENT
version=1.0
encoding=UTF-8
URL=doc.xml
standalone=true
ELEMENT chat
default namespace href=http://purl.org/net/ulf/ns/0.4-02
ATTRIBUTE account
TEXT
[email protected]
ATTRIBUTE service
TEXT compact
content=MSN
TEXT compact
content=
ELEMENT event
ATTRIBUTE type
Everything seems to parse just fine. However, when I try to query even the simplest things, I don't get anything:
$ xmllint --xpath '/chat' doc.xml
XPath set is empty
What's happening? Running the exact same query with the xpath command returns the correct results (though with no newline between results). Am I doing something wrong, or is xmllint just not working properly?
Here's a shorter, anonymized version of the XML that shows the same behavior:
<?xml version="1.0" encoding="UTF-8" ?>
<chat xmlns="http://purl.org/net/ulf/ns/0.4-02" account="[email protected]" service="MSN">
<event type="windowOpened" sender="[email protected]" time="2011-11-22T00:34:43-03:00"></event>
<message sender="[email protected]" time="2011-11-22T00:34:43-03:00" alias="foo"><div><span style="color: #000000; font-family: Helvetica; font-size: 12pt;">hi</span></div></message>
</chat>
Solution 1:
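I don't use xmllint, but I think your XPath isn't working because your doc.xml file uses a default namespace (http://purl.org/net/ulf/ns/0.4-02). In XPath 1.0, an unprefixed name such as chat only matches elements that are in no namespace, so /chat selects nothing here.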
From what I can see, you have 2 options.
A. Use xmllint in shell mode and declare the namespace with a prefix. You can then use that prefix in your XPath.
xmllint --shell doc.xml
/ > setns x=http://purl.org/net/ulf/ns/0.4-02
/ > xpath /x:chat
B. Use local-name() to match element names.
xmllint --xpath "/*[local-name()='chat']" doc.xml
You may also want to use namespace-uri()='http://purl.org/net/ulf/ns/0.4-02' along with local-name() so that you are sure to return exactly what you intend to return.
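Both options can also be run non-interactively. The following is a minimal sketch, assuming xmllint is on your PATH; sample.xml is a hypothetical file name holding a trimmed copy of the XML from the question, and the variables ns and count are only there for illustration.

```shell
# Trimmed copy of the sample from the question, written to a hypothetical sample.xml.
cat > sample.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8" ?>
<chat xmlns="http://purl.org/net/ulf/ns/0.4-02" service="MSN">
<event type="windowOpened" time="2011-11-22T00:34:43-03:00"></event>
</chat>
EOF

ns='http://purl.org/net/ulf/ns/0.4-02'

# Option A: feed the shell-mode commands on stdin instead of typing them.
xmllint --shell sample.xml <<EOF
setns x=$ns
xpath /x:chat
EOF

# Option B: match on local name and namespace URI together.
# Double quotes stop the shell from globbing /* and let $ns expand.
count=$(xmllint --xpath "count(/*[local-name()='chat' and namespace-uri()='$ns'])" sample.xml)
echo "chat elements matched: $count"
```

Quoting the expression matters: left unquoted, the shell tries to interpret the parentheses and the /* glob before xmllint ever sees them.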
Solution 2:
I realize this question is very old now, but in case it helps someone...
I had the same problem, and it was due to the XML having a namespace (sometimes declared again in various places in the XML). I found it easiest to just remove the namespace declarations before using xmllint:
sed -e 's/xmlns="[^"]*"//g' file.xml | xmllint --xpath "..." -
In my case the XML was UTF-16 so I had to convert to UTF-8 first (for sed):
iconv -f utf16 -t utf8 file.xml | sed -e 's/encoding="UTF-16"?>/encoding="UTF-8"?>/' | sed -e 's/xmlns="[^"]*"//g' | xmllint --xpath "..." -
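To see what the sed step actually does, here is a small sketch on a hypothetical one-line sample (the sample and stripped variables are made up for illustration). Note that the pattern only removes double-quoted default-namespace declarations; prefixed declarations such as xmlns:foo="..." are left in place, so elements using a prefix would still need one of the namespace-aware approaches.

```shell
# Hypothetical one-line sample standing in for file.xml.
sample='<chat xmlns="http://purl.org/net/ulf/ns/0.4-02" service="MSN"><event type="windowOpened"/></chat>'

# Remove every default-namespace declaration (xmlns="...") from the text.
stripped=$(printf '%s\n' "$sample" | sed -e 's/xmlns="[^"]*"//g')
printf '%s\n' "$stripped"

# With the declaration gone, a plain unprefixed path matches.
# Requires xmllint on PATH; "-" tells it to read the document from stdin.
printf '%s\n' "$stripped" | xmllint --xpath 'string(/chat/event/@type)' -
```

Since this edits the XML as plain text, it is best kept for quick ad hoc queries where losing the namespace information does not matter.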