How do I use wget/curl to download from a site I am logged into?

Some parts of Wikipedia appear differently when you're logged in. I would like to wget user pages so they appear as if I were logged in.

Is there a way I can wget user pages like this one?

http://en.wikipedia.org/wiki/User:A

This is the login page:

http://en.wikipedia.org/w/index.php?title=Special:UserLogin&returnto=Login&campaign=ACP3

The easy way: log in with your browser, and give the cookies to wget

Easiest method: in general, you need to provide wget or curl with the (logged-in) cookies from a particular website for them to fetch pages as if you were logged in.

If you are using Firefox, it's easy to do via the Export Cookies add-on. Install the add-on, and:

  1. Go to Tools...Export Cookies, and save the cookies.txt file (you can change the filename/destination).
  2. Open up a terminal, and use wget with the --load-cookies=FILENAME option, e.g.

    wget --load-cookies=cookies.txt http://en.wikipedia.org/wiki/User:A
    
    • For curl, it's curl --cookie cookies.txt ...
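
For example, to fetch a page with curl and the exported cookies and save it to a file (just a sketch; -o names the output file):

    curl --cookie cookies.txt -o User_A.html http://en.wikipedia.org/wiki/User:A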

(I will try to update this answer for Chrome/Chromium users)

The hard way: use curl (preferably) or wget to manage the entire session

  • A detailed how-to is beyond the scope of this answer, but you use curl with the --cookie-jar option or wget with the --save-cookies --keep-session-cookies options, along with an HTTP POST request, to log in to a site, save the login cookies, and then use them to simulate a browser (see the wget sketch just after this list).
  • Needless to say, this requires going through the HTML source for the login page (get input field names, etc.), and is often difficult to get to work for sites using anything beyond simple login/password authentication.
  • Tip: if you go this route, it is often much simpler to deal with the mobile version of a website (if available), at least for the authentication step.
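
For example, a rough wget sketch (the login URL and the username/password field names here are made up; you have to take the real ones from the site's login form):

    # Log in via POST and save all cookies, including session cookies
    wget --save-cookies=cookies.txt --keep-session-cookies \
         --post-data='username=USER&password=PASS' \
         --delete-after \
         http://example.com/login.php

    # Reuse the saved cookies to fetch pages as the logged-in user
    wget --load-cookies=cookies.txt http://example.com/members/page.html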

Another easy solution that worked for me without installing anything extra:

  • Open "Network" tab of "Web Developer" tool: Ctrl-Shift-E
  • Visit the page you want to save (e.g. a photo behind a login)
  • Right click the request and choose 'Copy'->'Copy as cURL'

This will give you a command that you can paste directly into your shell, with all your cookie credentials included, e.g.

    curl 'https://mysite.test/my-secure-dir/picture1.jpg' \
        -H 'User-Agent: Mozilla/5.0 ...' \
        -H 'Cookie: SESSIONID=abcdef1234567890'

You can then modify the URL in the command to fetch whatever you want.
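
For example, keeping the copied headers but pointing the command at a different file and saving the output (picture2.jpg is just a made-up path here):

    curl 'https://mysite.test/my-secure-dir/picture2.jpg' \
        -H 'User-Agent: Mozilla/5.0 ...' \
        -H 'Cookie: SESSIONID=abcdef1234567890' \
        -o picture2.jpg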


With cURL it is really easy to handle cookies in both directions.

curl www.target-url.com -c cookie.txt will save a file named cookie.txt. But first you need to log in, so you have to pass the login form fields with --data, like this: curl --data "var1=1&var2=2" www.target-url.com/login.php -c cookie.txt (with --data, curl sends a POST request, so no -X is needed). Once you have the logged-in cookie you can send it with: curl www.target-url.com/user-page.php -b cookie.txt

Just use -c (--cookie-jar) to save the cookies and -b (--cookie) to send them.

Note1: Using the cURL CLI is a lot easier than PHP and maybe faster ;)

To save the final content, you can simply add > filename.html to your cURL command and it will save the full HTML source.

Note2 about "full": you cannot render JavaScript with cURL; you just get the source code.
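
Putting those steps together, a minimal sketch (the domain, login path, and form field names are placeholders; check the site's actual login form for the real ones):

    # Log in: POST the form fields and write the cookies to cookie.txt
    curl --data "var1=1&var2=2" -c cookie.txt http://www.target-url.com/login.php

    # Fetch a protected page with the saved cookie and keep the HTML
    curl -b cookie.txt http://www.target-url.com/user-page.php > filename.html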


For those still interested in this question, there's a very useful Chrome extension called CurlWGet that allows you to generate a wget / curl request with authentication measures, etc. with one click. To install this extension, follow the steps below:

  1. Install the extension from the Chrome Webstore.
  2. Go to the web page that you would like to download.
  3. Start the download.
  4. The extension will generate the wget / curl command for you.

Enjoy!


The blog post Wget with Firefox Cookies shows how to access the sqlite data file in which Firefox stores its cookies. That way one doesn't need to manually export the cookies for use with wget. A comment suggests that it doesn't work with session cookies, but it worked fine for the sites I tried it with.
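
If you want to script that instead of following the post by hand, something along these lines should work. This is only a sketch: it assumes the usual moz_cookies table layout and a default profile path, both of which can differ between Firefox versions and machines.

    # Copy the database first; Firefox keeps it locked while running
    cp ~/.mozilla/firefox/*.default*/cookies.sqlite /tmp/cookies.sqlite

    # Dump the cookies in Netscape format, which wget --load-cookies understands
    printf '# Netscape HTTP Cookie File\n' > cookies.txt
    sqlite3 -separator $'\t' /tmp/cookies.sqlite \
      "SELECT host,
              CASE WHEN host LIKE '.%' THEN 'TRUE' ELSE 'FALSE' END,
              path,
              CASE isSecure WHEN 0 THEN 'FALSE' ELSE 'TRUE' END,
              expiry, name, value
       FROM moz_cookies;" >> cookies.txt

    wget --load-cookies=cookies.txt http://en.wikipedia.org/wiki/User:A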