How to check if System.Net.WebClient.DownloadData is downloading a binary file?
I am trying to use WebClient
to download a file from web using a WinForms application. However, I really only want to download HTML file. Any other type I will want to ignore.
I checked the WebResponse.ContentType
, but its value is always null
.
Anyone have any idea what could be the cause?
Solution 1:
Given your update, you can do this by changing the .Method in GetWebRequest:
using System;
using System.Net;
static class Program
{
static void Main()
{
using (MyClient client = new MyClient())
{
client.HeadOnly = true;
string uri = "http://www.google.com";
byte[] body = client.DownloadData(uri); // note should be 0-length
string type = client.ResponseHeaders["content-type"];
client.HeadOnly = false;
// check 'tis not binary... we'll use text/, but could
// check for text/html
if (type.StartsWith(@"text/"))
{
string text = client.DownloadString(uri);
Console.WriteLine(text);
}
}
}
}
class MyClient : WebClient
{
public bool HeadOnly { get; set; }
protected override WebRequest GetWebRequest(Uri address)
{
WebRequest req = base.GetWebRequest(address);
if (HeadOnly && req.Method == "GET")
{
req.Method = "HEAD";
}
return req;
}
}
Alternatively, you can check the header when overriding GetWebRespons(), perhaps throwing an exception if it isn't what you wanted:
protected override WebResponse GetWebResponse(WebRequest request)
{
WebResponse resp = base.GetWebResponse(request);
string type = resp.Headers["content-type"];
// do something with type
return resp;
}
Solution 2:
I'm not sure the cause, but perhaps you hadn't downloaded anything yet. This is the lazy way to get the content type of a remote file/page (I haven't checked if this is efficient on the wire. For all I know, it may download huge chunks of content)
Stream connection = new MemoryStream(""); // Just a placeholder
WebClient wc = new WebClient();
string contentType;
try
{
connection = wc.OpenRead(current.Url);
contentType = wc.ResponseHeaders["content-type"];
}
catch (Exception)
{
// 404 or what have you
}
finally
{
connection.Close();
}