How to download file with puppeteer using headless: true?

I spent hours poring through this thread and Stack Overflow yesterday, trying to figure out how to get Puppeteer to download a csv file by clicking a download link in headless mode in an authenticated session. The accepted answer here didn't work in my case because the download does not trigger targetcreated, and the next answer, for whatever reason, did not retain the authenticated session. This article saved the day. In short, fetch. Hopefully this helps someone else out.

const res = await this.page.evaluate(() =>
{
    return fetch('https://example.com/path/to/file.csv', {
        method: 'GET',
        credentials: 'include'
    }).then(r => r.text());
});

This page downloads a csv by creating a comma delimited string and forcing the browser to download it by setting the data type like so

let uri = "data:text/csv;charset=utf-8," + encodeURIComponent(content);
window.open(uri, "Some CSV");

This on chrome opens a new tab.

You can tap into this event and physically download the contents into a file. Not sure if this is the best way but works well.

const browser = await puppeteer.launch({
  headless: true
});
browser.on('targetcreated', async (target) => {
    let s = target.url();
    //the test opens an about:blank to start - ignore this
    if (s == 'about:blank') {
        return;
    }
    //unencode the characters after removing the content type
    s = s.replace("data:text/csv;charset=utf-8,", "");
    //clean up string by unencoding the %xx
    ...
    fs.writeFile("/tmp/download.csv", s, function(err) {
        if(err) {
            console.log(err);
            return;
        }
        console.log("The file was saved!");
    }); 
});

const page = await browser.newPage();
.. open link ...
.. click on download link ..

The problem is that the browser closes before download finished.

You can get the filesize and the name of the file from the response, and then use a watch script to check file size from downloaded file, in order to close the browser.

This is an example:

    const filename = "set this with some regex in response";
    const dir = "watch folder or file";
    
    // Download and wait for download
        await Promise.all([
            page.click('#DownloadFile'),
           // Event on all responses
            page.on('response', response => {
                // If response has a file on it
                if (response._headers['content-disposition'] === `attachment;filename=${filename}`) {
                   // Get the size
                    console.log('Size del header: ', response._headers['content-length']);
                    // Watch event on download folder or file
                     fs.watchFile(dir, function (curr, prev) {
                       // If current size eq to size from response then close
                        if (parseInt(curr.size) === parseInt(response._headers['content-length'])) {
                            browser.close();
                            this.close();
                        }
                    });
                }
            })
        ]);

Even that the way of searching in response can be improved though I hope you'll find this useful.


I found a way to wait for browser capability to download a file. The idea is to wait for response with predicate. In my case URL ends with '/data'.

I just didn't like to load file contents into buffer.

await page._client.send('Page.setDownloadBehavior', {
    behavior: 'allow',
    downloadPath: download_path,
});

await frame.focus(report_download_selector);
await Promise.all([
    page.waitForResponse(r => r.url().endsWith('/data')),
    page.keyboard.press('Enter'),
]);

I needed to download a file from behind a login, which was being handled by Puppeteer. targetcreated was not being triggered. In the end I downloaded with request, after copying the cookies over from the Puppeteer instance.

In this case, I'm streaming the file through, but you could just as easily save it.

    res.writeHead(200, {
        "Content-Type": 'application/octet-stream',
        "Content-Disposition": `attachment; filename=secretfile.jpg`
    });
    let cookies = await page.cookies();
    let jar = request.jar();
    for (let cookie of cookies) {
        jar.setCookie(`${cookie.name}=${cookie.value}`, "http://secretsite.com");
    }
    try {
        var response = await request({ url: "http://secretsite.com/secretfile.jpg", jar }).pipe(res);
    } catch(err) {
        console.trace(err);
        return res.send({ status: "error", message: err });
    }