Determine file type of an image
I'm downloading some images from a service that doesn't always include a content-type and doesn't provide an extension for the file I'm downloading (ugh, don't ask).
What's the best way to determine the image format in .NET?
The application that is reading these downloaded images needs to have a proper file extension or all hell breaks loose.
Solution 1:
A probably easier approach would be to use Image.FromFile() and then use the RawFormat property, as it already knows about the magic bits in the headers for the most common formats, like this:
Image i = Image.FromFile("c:\\foo");
if (System.Drawing.Imaging.ImageFormat.Jpeg.Equals(i.RawFormat))
MessageBox.Show("JPEG");
else if (System.Drawing.Imaging.ImageFormat.Gif.Equals(i.RawFormat))
MessageBox.Show("GIF");
//Same for the rest of the formats
Solution 2:
You can use code below without reference of System.Drawing and unnecessary creation of object Image. Also you can use Alex solution even without stream and reference of System.IO.
public enum ImageFormat
{
bmp,
jpeg,
gif,
tiff,
png,
unknown
}
public static ImageFormat GetImageFormat(Stream stream)
{
// see http://www.mikekunz.com/image_file_header.html
var bmp = Encoding.ASCII.GetBytes("BM"); // BMP
var gif = Encoding.ASCII.GetBytes("GIF"); // GIF
var png = new byte[] { 137, 80, 78, 71 }; // PNG
var tiff = new byte[] { 73, 73, 42 }; // TIFF
var tiff2 = new byte[] { 77, 77, 42 }; // TIFF
var jpeg = new byte[] { 255, 216, 255, 224 }; // jpeg
var jpeg2 = new byte[] { 255, 216, 255, 225 }; // jpeg canon
var buffer = new byte[4];
stream.Read(buffer, 0, buffer.Length);
if (bmp.SequenceEqual(buffer.Take(bmp.Length)))
return ImageFormat.bmp;
if (gif.SequenceEqual(buffer.Take(gif.Length)))
return ImageFormat.gif;
if (png.SequenceEqual(buffer.Take(png.Length)))
return ImageFormat.png;
if (tiff.SequenceEqual(buffer.Take(tiff.Length)))
return ImageFormat.tiff;
if (tiff2.SequenceEqual(buffer.Take(tiff2.Length)))
return ImageFormat.tiff;
if (jpeg.SequenceEqual(buffer.Take(jpeg.Length)))
return ImageFormat.jpeg;
if (jpeg2.SequenceEqual(buffer.Take(jpeg2.Length)))
return ImageFormat.jpeg;
return ImageFormat.unknown;
}
Solution 3:
All the image formats set their initial bytes to a particular value:
- JPG: 0xFF 0xD8
- PNG: 0x89 0x50 0x4E 0x47 0x0D 0x0A 0x1A 0x0A
- GIF: 'G' 'I' 'F'
Search for "jpg file format" replacing jpg with the other file formats you need to identify.
As Garth recommends, there is a database of such 'magic numbers' showing the file type of many files. If you have to detect a lot of different file types it's worthwhile looking through it to find the information you need. If you do need to extend this to cover many, many file types, look at the associated file command which implements the engine to use the database correctly (it's non trivial for many file formats, and is almost a statistical process)
-Adam
Solution 4:
Adam is pointing in exactly the right direction.
If you want to find out how to sense almost any file, look at the database behind the file
command on a UNIX, Linux, or Mac OS X machine.
file
uses a database of “magic numbers” — those initial bytes Adam listed — to sense a file's type. man file
will tell you where to find the database on your machine, e.g. /usr/share/file/magic
. man magic
will tell you its format.
You can either write your own detection code based on what you see in the database, use pre-packaged libraries (e.g. python-magic), or — if you're really adventurous — implement a .NET version of libmagic
. I couldn't find one, and hope another member can point one out.
In case you don't have a UNIX machine handy, the database looks like this:
# PNG [Portable Network Graphics, or "PNG's Not GIF"] images # (Greg Roelofs, [email protected]) # (Albert Cahalan, [email protected]) # # 137 P N G \r \n ^Z \n [4-byte length] H E A D [HEAD data] [HEAD crc] ... # 0 string \x89PNG PNG image data, >4 belong !0x0d0a1a0a CORRUPTED, >4 belong 0x0d0a1a0a >>16 belong x %ld x >>20 belong x %ld, >>24 byte x %d-bit >>25 byte 0 grayscale, >>25 byte 2 \b/color RGB, >>25 byte 3 colormap, >>25 byte 4 gray+alpha, >>25 byte 6 \b/color RGBA, #>>26 byte 0 deflate/32K, >>28 byte 0 non-interlaced >>28 byte 1 interlaced 1 string PNG PNG image data, CORRUPTED # GIF 0 string GIF8 GIF image data >4 string 7a \b, version 8%s, >4 string 9a \b, version 8%s, >6 leshort >0 %hd x >8 leshort >0 %hd #>10 byte &0x80 color mapped, #>10 byte&0x07 =0x00 2 colors #>10 byte&0x07 =0x01 4 colors #>10 byte&0x07 =0x02 8 colors #>10 byte&0x07 =0x03 16 colors #>10 byte&0x07 =0x04 32 colors #>10 byte&0x07 =0x05 64 colors #>10 byte&0x07 =0x06 128 colors #>10 byte&0x07 =0x07 256 colors
Good luck!