Fastest way to add a new node to the end of an XML file?
I have a large XML file (approx. 10 MB) with the following simple structure:
<Errors>
<Error>.......</Error>
<Error>.......</Error>
<Error>.......</Error>
<Error>.......</Error>
<Error>.......</Error>
</Errors>
I need to add a new <Error> node at the end, just before the </Errors> tag. What is the fastest way to achieve this in .NET?
Solution 1:
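One option is the XML inclusion technique: keep the rows in a separate file and pull them into a small stub document through an external entity.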
Your error.xml (doesn't change, just a stub. Used by XML parsers to read):
<?xml version="1.0"?>
<!DOCTYPE Errors [
<!ENTITY logrows SYSTEM "errorrows.txt">
]>
<Errors>
&logrows;
</Errors>
Your errorrows.txt file (changes; it is not a well-formed XML document on its own because it has no single root element):
<Error>....</Error>
<Error>....</Error>
<Error>....</Error>
Then, to add an entry to errorrows.txt:
using (StreamWriter sw = File.AppendText("errorrows.txt"))
using (XmlTextWriter xtw = new XmlTextWriter(sw))
{
    xtw.WriteStartElement("Error");
    // ... write the error message here
    xtw.WriteEndElement();
}
Or you can use XElement (.NET 3.5 and later) and append its text to the StreamWriter:
using (StreamWriter sw = File.AppendText("errorrows.txt"))
{
    XElement element = new XElement("Error");
    // ... write the error message here
    sw.WriteLine(element.ToString());
}
See also Microsoft's article Efficient Techniques for Modifying Large XML Files
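Reading the stub back needs a little care in .NET, because XmlReader prohibits DTDs by default. The following is a minimal sketch, assuming the stub and the rows file sit in the same directory; the class and method names (ErrorLogReader, CountErrors) are made up for illustration.

```csharp
using System.IO;
using System.Xml;

static class ErrorLogReader
{
    // Hypothetical helper: counts <Error> elements visible through the stub document.
    public static int CountErrors(string stubPath)
    {
        var settings = new XmlReaderSettings
        {
            // DTD processing is prohibited by default, so the entity would not be read.
            DtdProcessing = DtdProcessing.Parse,
            // A resolver is needed to open the external errorrows.txt file.
            XmlResolver = new XmlUrlResolver(),
            // 0 means no cap on expanded entity size; a default cap may be too small for a large log.
            MaxCharactersFromEntities = 0
        };

        int count = 0;
        // A full path lets the relative SYSTEM identifier resolve next to the stub.
        using (XmlReader reader = XmlReader.Create(Path.GetFullPath(stubPath), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Error")
                    count++;
            }
        }
        return count;
    }
}
```

Calling ErrorLogReader.CountErrors("error.xml") walks the stub and the included rows as a single document. Only enable DTD processing for files you control, since external entities are a well-known attack vector.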
Solution 2:
First, I would disqualify System.Xml.XmlDocument because it is a DOM, which requires parsing the file and building the entire tree in memory before anything can be appended. Your 10 MB of text will take up more than 10 MB in memory, so this approach is both memory intensive and time consuming.
Second, I would disqualify System.Xml.XmlReader because it has to parse the entire file before you reach the point where you can append. The reader is read-only, so you would have to copy it node by node into an XmlWriter. That streams rather than holding the whole document in memory, but it still reads and rewrites all of your XML for every single append.
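For comparison, here is a rough sketch of what that reader-to-writer copy could look like. It is not a recommendation; it just makes the cost visible, since every existing node is read and written again to add one entry. The names StreamingAppend and Append, and the use of a separate destination file, are assumptions for illustration.

```csharp
using System.Xml;

static class StreamingAppend
{
    // Hypothetical helper: copies source to dest and adds one <Error> before the end tag.
    public static void Append(string source, string dest, string message)
    {
        using (XmlReader reader = XmlReader.Create(source))
        using (XmlWriter writer = XmlWriter.Create(dest))
        {
            reader.MoveToContent();                    // positions on the root element
            writer.WriteStartElement(reader.LocalName);
            reader.Read();                             // step inside the root

            // Copy every child element until the root's end tag (or EOF for an empty root).
            while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
            {
                if (reader.NodeType == XmlNodeType.Element)
                    writer.WriteNode(reader, false);   // copies the element and advances the reader
                else
                    reader.Read();                     // skip whitespace and other nodes
            }

            writer.WriteElementString("Error", message);
            writer.WriteEndElement();                  // closes the root element
        }
    }
}
```

The work grows with the size of the log, which is why the approaches below avoid parsing altogether.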
A faster alternative to XmlDocument and XmlReader would be string manipulation (which has its own memory issues):
string xml = @"<Errors><Error />...<Error /></Errors>";
int idx = xml.LastIndexOf("</Errors>");
xml = xml.Substring(0, idx) + "<Error>new error</Error></Errors>";
Chop off the end tag, add in the new error, and add the end tag back.
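I suppose you could go crazy with this: seek back 9 characters from the end of the file (the length of "</Errors>") and overwrite from there. You wouldn't have to read in the file at all, and the OS could optimize page loading (it would only have to load the last block or so).
byte[] entry = System.Text.Encoding.UTF8.GetBytes("<Error>new error</Error></Errors>");
using (System.IO.FileStream fs = System.IO.File.Open("log.xml", System.IO.FileMode.Open, System.IO.FileAccess.ReadWrite))
{
    // Assumes the file ends with exactly "</Errors>", one byte per character.
    fs.Seek(-"</Errors>".Length, System.IO.SeekOrigin.End);
    fs.Write(entry, 0, entry.Length); // FileStream.Write takes bytes, not a string
}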
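That will hit a problem if your file is empty or does not end with exactly "</Errors>" (for example, a self-closing "<Errors />" or a trailing newline after the end tag). The empty case can easily be handled by checking the length; the others require looking at the last few bytes before overwriting them.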
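A slightly more defensive version of the same idea might look like the sketch below. It checks the length, confirms that the file really ends with the closing tag, and lets XElement escape the message. The names ErrorLog and AppendError are hypothetical, and the sketch assumes a UTF-8 (or ASCII) file with nothing after "</Errors>".

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

static class ErrorLog
{
    const string EndTag = "</Errors>";

    // Hypothetical helper: overwrites the closing tag with a new <Error> entry plus the tag.
    public static void AppendError(string path, string message)
    {
        byte[] endTag = Encoding.UTF8.GetBytes(EndTag);
        // XElement takes care of escaping characters such as & and < in the message.
        string element = new XElement("Error", message).ToString(SaveOptions.DisableFormatting);
        byte[] entry = Encoding.UTF8.GetBytes(element + EndTag);

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            if (fs.Length < endTag.Length)
                throw new InvalidDataException("File is too short to contain " + EndTag);

            // Read the last bytes and confirm they are the closing tag.
            fs.Seek(-endTag.Length, SeekOrigin.End);
            byte[] tail = new byte[endTag.Length];
            int read = 0;
            while (read < tail.Length)
            {
                int n = fs.Read(tail, read, tail.Length - read);
                if (n == 0) break;
                read += n;
            }
            if (read != tail.Length || Encoding.UTF8.GetString(tail) != EndTag)
                throw new InvalidDataException("File does not end with " + EndTag);

            // Step back over the closing tag and overwrite it.
            fs.Seek(-endTag.Length, SeekOrigin.End);
            fs.Write(entry, 0, entry.Length);
        }
    }
}
```

Throwing on an unexpected tail is a deliberate choice here: refusing to write is safer than silently producing a file that no XML parser can read.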
Solution 3:
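The fastest way would probably be direct file access.
using (FileStream fs = new FileStream("my.log", FileMode.Open, FileAccess.Write))
using (StreamWriter file = new StreamWriter(fs))
{
    // File.AppendText opens the file in FileMode.Append, which throws on a backward seek,
    // so open the stream explicitly and position it over the closing tag.
    fs.Seek(-"</Errors>".Length, SeekOrigin.End);
    file.Write(" <Error>New error message.</Error></Errors>");
}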
But you lose all the nice XML features and may easily corrupt the file.