Clojure XML Parsing
I cannot find any info on how to parse XML documents and access their elements.
I have found two ways to parse an XML document:
(clojure.zip/xml-zip (clojure.xml/parse file))
and
(parse-seq file)
but I can't seem to find any info on how to process the resulting structure.
The source file refers to zip-query.clj for how to query the result, but that seems to be missing too.
Suppose you have the following xml to parse in your file:
<high-node>
<low-node>my text</low-node>
</high-node>
You load clojure.xml:
user=> (use 'clojure.xml)
When parsed, the XML will have the following structure:
{:tag :high-node, :attrs nil, :content [{:tag :low-node, :attrs nil, :content ["my text"]}]}
You can then seq over the parsed tree to get the content of the low-node:
user=> (for [x (xml-seq
                 (parse (java.io.File. file)))
             :when (= :low-node (:tag x))]
         (first (:content x)))
("my text")
Similarly, if you wanted the entire low-node map rather than just its text, you would change the :when predicate to (= :high-node (:tag x)):
user=> (for [x (xml-seq
                 (parse (java.io.File. file)))
             :when (= :high-node (:tag x))]
         (first (:content x)))
({:tag :low-node, :attrs nil, :content ["my text"]})
This works because keywords can operate as functions that look themselves up in a map. See "Questions about lists and other stuff in Clojure" and "Data Structures: Keywords".
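To make the keyword-as-function point concrete, here is a minimal sketch that works on the parsed map shown earlier, written out as a literal so no file is needed. The names parsed and low-node-text are only illustrative. Note that xml-seq also yields the text strings, and looking up a keyword in a string returns nil, which is why the :when filter skips them safely.

```clojure
;; The parsed structure from above, written as a literal map.
(def parsed
  {:tag :high-node, :attrs nil,
   :content [{:tag :low-node, :attrs nil, :content ["my text"]}]})

;; A keyword used as a function looks itself up in the map.
(:tag parsed)                ; => :high-node

;; xml-seq walks every node depth-first, including text strings.
(map :tag (xml-seq parsed))  ; => (:high-node :low-node nil)

;; The same query as the `for` expression, written as a pipeline.
(def low-node-text
  (->> (xml-seq parsed)
       (filter #(= :low-node (:tag %)))
       (mapcat :content)))   ; => ("my text")
```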
The above answer works, but I find it a lot easier to use clojure.data.zip.xml (which used to be clojure.contrib.zip-filter.xml prior to Clojure 1.3).
file myfile.xml:
<songs>
<track id="t1"><name>Track one</name></track>
<track id="t2"><name>Track two</name></track>
</songs>
code:
; Clojure 1.3
(ns example
  (:use [clojure.data.zip.xml :only (attr text xml->)]) ; dep: see below
  (:require [clojure.xml :as xml]
            [clojure.zip :as zip]))
(def xml (xml/parse "myfile.xml"))
(def zipped (zip/xml-zip xml))
(xml-> zipped :track :name text) ; ("Track one" "Track two")
(xml-> zipped :track (attr :id)) ; ("t1" "t2")
Unfortunately, you need to pull in a dependency on data.zip to get this nice read/filter functionality. It's worth the dependency :) In lein it would be (as of 17-Aug-2013):
[org.clojure/data.zip "0.1.1"]
And as for docs for data.zip.xml, I just look at the relatively small source file here to see what is possible. Another good SO answer here, too.
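If you would rather avoid the extra dependency, the zipper from the question can also be walked with plain clojure.zip. This is only a rough sketch, not a replacement for xml->: parse-str is a hypothetical helper that feeds a string to clojure.xml/parse through an input stream so the example runs without myfile.xml on disk, and the other names are illustrative too.

```clojure
(require '[clojure.xml :as xml]
         '[clojure.zip :as zip])

;; Hypothetical helper: parse XML from a string instead of a file.
(defn parse-str [s]
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

(def songs
  (zip/xml-zip
    (parse-str (str "<songs>"
                    "<track id=\"t1\"><name>Track one</name></track>"
                    "<track id=\"t2\"><name>Track two</name></track>"
                    "</songs>"))))

;; zip/down moves to the first child, zip/right to the next sibling,
;; and zip/node returns the parsed map at the current location.
(def first-track-id
  (-> songs zip/down zip/node :attrs :id))            ; => "t1"

(def second-track-name
  (-> songs zip/down zip/right zip/down zip/node :content first))
                                                      ; => "Track two"

;; Step through the root's children until zip/right returns nil.
(def track-ids
  (->> (zip/down songs)
       (iterate zip/right)
       (take-while some?)
       (map #(-> % zip/node :attrs :id))))            ; => ("t1" "t2")
```

This gets verbose quickly once you need to filter by tag or attribute, which is the part data.zip handles for you.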