Download and save PDF file with Python requests module

You should use response.content in this case:

with open('/tmp/metadata.pdf', 'wb') as f:
    f.write(response.content)

From the document:

You can also access the response body as bytes, for non-text requests:
>>> r.content
b'[{"repository":{"open_issues":0,"url":"https://github.com/...

So that means: response.text return the output as a string object, use it when you're downloading a text file. Such as HTML file, etc.

And response.content return the output as bytes object, use it when you're downloading a binary file. Such as PDF file, audio file, image, etc.

You can also use response.raw instead. However, use it when the file which you're about to download is large. Below is a basic example which you can also find in the document:

import requests

url = 'http://www.hrecos.org//images/Data/forweb/HRTVBSH.Metadata.pdf'
r = requests.get(url, stream=True)

with open('/tmp/metadata.pdf', 'wb') as fd:
    for chunk in r.iter_content(chunk_size):
        fd.write(chunk)

chunk_size is the chunk size which you want to use. If you set it as 2000, then requests will download that file the first 2000 bytes, write them into the file, and do this again, again and again, unless it finished.

So this can save your RAM. But I'd prefer use response.content instead in this case since your file is small. As you can see use response.raw is complex.

Relates:

How to download large file in python with requests.py?
How to download image using requests

In Python 3, I find pathlib is the easiest way to do this. Request's response.content marries up nicely with pathlib's write_bytes.

from pathlib import Path
import requests
filename = Path('metadata.pdf')
url = 'http://www.hrecos.org//images/Data/forweb/HRTVBSH.Metadata.pdf'
response = requests.get(url)
filename.write_bytes(response.content)

You can use urllib:

import urllib.request
urllib.request.urlretrieve(url, "filename.pdf")

How to update a claim in ASP.NET Identity?

In Windows 10, File explorer, how to make the left navigation pane stop following whatever folder we are in?

Ransomware hack usb controller [duplicate]

Using monitor in UHD resolution, but keeping all Windows UI approx. the same visual size as it was in lower resolution?

What diagnosis steps can I do if my emails send, but are not received, not even as spam?

Make Outlook run rules on non-inbox folders automatically

Why does git complain that no GPG agent is running?

Extending Laptop display to two monitors

Error 1075: The dependency service does not exist or has been marked for deletion

external displays receive input signals ONLY when I open a Powerpoint file and use slide show mode?

MySQL Workbench unable to restore workspace

Switch to this tab shortcut in Chrome

Download and save PDF file with Python requests module

Related

Recent Posts