How to know if urllib.urlretrieve succeeds?

Solution 1:

Consider using urllib2 if it possible in your case. It is more advanced and easy to use than urllib.

You can detect any HTTP errors easily:

>>> import urllib2
>>> resp = urllib2.urlopen("http://google.com/abc.jpg")
Traceback (most recent call last):
<<MANY LINES SKIPPED>>
urllib2.HTTPError: HTTP Error 404: Not Found

resp is actually HTTPResponse object that you can do a lot of useful things with:

>>> resp = urllib2.urlopen("http://google.com/")
>>> resp.code
200
>>> resp.headers["content-type"]
'text/html; charset=windows-1251'
>>> resp.read()
"<<ACTUAL HTML>>"

Solution 2:

I keep it simple:

# Simple downloading with progress indicator, by Cees Timmerman, 16mar12.

import urllib2

remote = r"http://some.big.file"
local = r"c:\downloads\bigfile.dat"

u = urllib2.urlopen(remote)
h = u.info()
totalSize = int(h["Content-Length"])

print "Downloading %s bytes..." % totalSize,
fp = open(local, 'wb')

blockSize = 8192 #100000 # urllib.urlretrieve uses 8192
count = 0
while True:
    chunk = u.read(blockSize)
    if not chunk: break
    fp.write(chunk)
    count += 1
    if totalSize > 0:
        percent = int(count * blockSize * 100 / totalSize)
        if percent > 100: percent = 100
        print "%2d%%" % percent,
        if percent < 100:
            print "\b\b\b\b\b",  # Erase "NN% "
        else:
            print "Done."

fp.flush()
fp.close()
if not totalSize:
    print

Solution 3:

According to the documentation is is undocumented

to get access to the message it looks like you do something like:

a, b=urllib.urlretrieve('http://google.com/abc.jpg', r'c:\abc.jpg')

b is the message instance

Since I have learned that Python it is always useful to use Python's ability to be introspective when I type

dir(b)

I see lots of methods or functions to play with

And then I started doing things with b

for example

b.items()

Lists lots of interesting things, I suspect that playing around with these things will allow you to get the attribute you want to manipulate.

Sorry this is such a beginner's answer but I am trying to master how to use the introspection abilities to improve my learning and your questions just popped up.

Well I tried something interesting related to this-I was wondering if I could automatically get the output from each of the things that showed up in the directory that did not need parameters so I wrote:

needparam=[]
for each in dir(b):
    x='b.'+each+'()'
    try:
        eval(x)
        print x
    except:
        needparam.append(x)

How to know if urllib.urlretrieve succeeds?

Solution 1:

Solution 2:

Solution 3:

Related

Recent Posts