Best way to read a large file into a byte array in C#?
I have a web server which will read large binary files (several megabytes) into byte arrays. The server could be reading several files at the same time (different page requests), so I am looking for the most optimized way for doing this without taxing the CPU too much. Is the code below good enough?
public byte[] FileToByteArray(string fileName)
{
    byte[] buff = null;
    FileStream fs = new FileStream(fileName,
                                   FileMode.Open,
                                   FileAccess.Read);
    BinaryReader br = new BinaryReader(fs);
    long numBytes = new FileInfo(fileName).Length;
    buff = br.ReadBytes((int) numBytes);
    return buff;
}
Simply replace the whole thing with:
return File.ReadAllBytes(fileName);
However, if you are concerned about memory consumption, you should not read the whole file into memory at once. Read it in chunks instead.
I might argue that the answer here generally is "don't". Unless you absolutely need all the data at once, consider using a Stream-based API (or some variant of reader / iterator). That is especially important when you have multiple parallel operations (as suggested by the question), to minimise system load and maximise throughput.
For example, if you are streaming data to a caller:
Stream dest = ...
using (Stream source = File.OpenRead(path)) {
    byte[] buffer = new byte[2048];
    int bytesRead;
    while ((bytesRead = source.Read(buffer, 0, buffer.Length)) > 0) {
        dest.Write(buffer, 0, bytesRead);
    }
}
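On .NET Framework 4 and later, the same chunked copy can be written with Stream.CopyTo, which runs an equivalent read/write loop internally. The sketch below is only an illustration: the class and method names are hypothetical, and the 81920-byte buffer is an arbitrary choice rather than a tuned value.

```csharp
using System.IO;

public static class StreamCopyExample
{
    // Hypothetical helper: copies the file at 'path' into 'dest' in chunks,
    // so the whole file is never held in memory at once.
    public static void CopyFileTo(string path, Stream dest)
    {
        using (Stream source = File.OpenRead(path))
        {
            // The second argument is the chunk size in bytes (assumed value).
            source.CopyTo(dest, 81920);
        }
    }
}
```

In a web server, dest would typically be the response's output stream, so each request holds only one buffer's worth of file data at a time.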
I would think this:
byte[] file = System.IO.File.ReadAllBytes(fileName);
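Your code can be refactored to this (in lieu of File.ReadAllBytes):
public byte[] ReadAllBytes(string fileName)
{
    using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
    {
        byte[] buffer = new byte[fs.Length];
        int offset = 0;
        // Read may return fewer bytes than requested, so loop until the buffer is full.
        while (offset < buffer.Length)
        {
            int read = fs.Read(buffer, offset, buffer.Length - offset);
            if (read == 0)
                throw new EndOfStreamException();
            offset += read;
        }
        return buffer;
    }
}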
Note the int.MaxValue file size limitation imposed by the Read method: its count parameter is an int, so a single call can read at most about 2 GB.
Also note that FileStream has constructor overloads whose last argument is a buffer size.
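As a rough sketch of passing a buffer size explicitly, the constructor overload that takes path, mode, access, share, buffer size and options could be used as below. The wrapper class, the method name and the choice of FileOptions.SequentialScan are assumptions made for illustration; whether a buffer larger than the 4096-byte default helps is something to measure.

```csharp
using System.IO;

public static class BufferedOpenExample
{
    // Hypothetical helper: opens a file for reading with an explicit
    // internal buffer size instead of FileStream's default.
    public static FileStream OpenForRead(string fileName, int bufferSize)
    {
        return new FileStream(
            fileName,
            FileMode.Open,
            FileAccess.Read,
            FileShare.Read,               // allow other readers, e.g. parallel requests
            bufferSize,                   // internal buffer size in bytes
            FileOptions.SequentialScan);  // hint that the file is read front to back
    }
}
```

The caller owns the returned stream and should dispose it, for example with a using block.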
I would also suggest reading about FileStream and BufferedStream.
As always, a simple sample program that profiles the alternatives is the most reliable way to find out which is fastest.
Also your underlying hardware will have a large effect on performance. Are you using server based hard disk drives with large caches and a RAID card with onboard memory cache? Or are you using a standard drive connected to the IDE port?
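A minimal sketch of such a profiling program, using Stopwatch, might look like the following. The class and method names are hypothetical, and a single timed call is only a starting point: the operating system caches file data, so the first read of a file is usually slower than later ones, and each candidate should be run several times.

```csharp
using System;
using System.Diagnostics;

public static class ReadTiming
{
    // Hypothetical helper: times one call of 'read' on 'fileName'
    // and returns the elapsed time in milliseconds.
    public static long TimeRead(Func<string, byte[]> read, string fileName)
    {
        Stopwatch sw = Stopwatch.StartNew();
        byte[] data = read(fileName);
        sw.Stop();
        GC.KeepAlive(data); // keep the result referenced until timing is done
        return sw.ElapsedMilliseconds;
    }
}
```

For example, ReadTiming.TimeRead(System.IO.File.ReadAllBytes, path) times the built-in method, and any other method that takes a file name and returns a byte array can be passed the same way for comparison.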