How to download a website for offline use when it requires a username and password?

I have an account on the tutorial website testdriven.io, and I would like to download the tutorials for offline use so that my team members can learn from them without having to log in with my credentials.

So, I tried several ways without success.

First, I logged in to my account in the browser and then ran wget -r --mirror -p --convert-links -P . https://testdriven.io/courses/. However, the result was an offline copy of the site as seen without logging in, so the tutorial content was limited accordingly.

Second, I tried to pass the credentials as POST data, like this:

wget --save-cookies cookies.txt \
     --keep-session-cookies \
     --post-data '[email protected]&password=z9vi2gE82lO@sTN' \
     --delete-after \
     https://testdriven.io/courses/

Yet, it returned

--2019-12-18 02:01:22--  https://testdriven.io/courses/
Resolving testdriven.io (testdriven.io)... 104.27.143.239, 104.27.142.239, 2606:4700:30::681b:8eef, ...
Connecting to testdriven.io (testdriven.io)|104.27.143.239|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-12-18 02:01:23 ERROR 403: Forbidden.

So, how can I download the full tutorial for offline use while authenticating with my username and password? Thanks.


Solution 1:

The website stores your authentication state in a cookie once you have logged in.

You can find this in your browser's network inspector. Look under the request headers and grab the cookies for use with wget.

You will need to pass the cookie to wget and, in principle, also maintain a cookie jar with --save-cookies and --load-cookies.

For example:

wget -r --mirror -p --convert-links -P . \
  --header="Cookie: __cfduid=ddebc00435655a6a20430c65436f729851576611229; csrftoken=6QuufXScgoQkyEe18dAL9YmqhxlyJpegNtyMCr4LgAUuvBs3KUzQwqEYBvWZV4yg; sessionid=c5gbfxkhqwpblxlhatgfh3wtfgy0zgpp" \
  --save-cookies cookies.txt \
  --load-cookies cookies.txt \
  --accept-regex '/courses/' \
  https://testdriven.io/courses/auth-flask-react/
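
As an alternative to the --header approach above, the same cookies can be written into a cookie jar file that wget reads with --load-cookies. The following is only a sketch under assumptions: the function names write_cookie_jar and mirror_with_jar are hypothetical, the expiry value is a placeholder, and the cookie values must be copied from your own logged-in browser session. The file uses the tab-separated Netscape cookie format (domain, subdomain flag, path, secure flag, expiry, name, value).

```shell
#!/usr/bin/env bash
# Sketch: build a Netscape-format cookie jar for wget from values copied
# out of the browser. Both function names are hypothetical helpers.

# write_cookie_jar FILE DOMAIN NAME=VALUE...
# Writes one tab-separated line per cookie:
#   domain, include-subdomains flag, path, secure flag, expiry, name, value
write_cookie_jar() {
  local jar="$1" domain="$2"
  shift 2
  # 2147483647 is a placeholder far-future expiry (an assumption), chosen so
  # the entries are not treated as expired when the file is loaded.
  local expiry=2147483647
  printf '# Netscape HTTP Cookie File\n' > "$jar"
  local pair
  for pair in "$@"; do
    # Split each argument at the first '=' into cookie name and value.
    printf '%s\tFALSE\t/\tTRUE\t%s\t%s\t%s\n' \
      "$domain" "$expiry" "${pair%%=*}" "${pair#*=}" >> "$jar"
  done
  chmod 600 "$jar"   # the jar holds a live session, so keep it private
}

# mirror_with_jar FILE URL
# Same mirroring options as the command above, reading cookies from FILE.
mirror_with_jar() {
  wget --mirror -p --convert-links -P . \
    --load-cookies "$1" \
    --accept-regex '/courses/' \
    "$2"
}
```

A possible invocation would be write_cookie_jar cookies.txt testdriven.io "sessionid=<your value>" "csrftoken=<your value>", followed by mirror_with_jar cookies.txt https://testdriven.io/courses/auth-flask-react/. Keeping the values in a file rather than on the command line also keeps them out of your shell history. The session cookie will eventually expire, so the values need to be fresh.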

Solution 2:

Read man wget, especially the part that says:

 --user=user
 --password=password
     Specify the username user and password password for both FTP and HTTP file retrieval.  These parameters can be
     overridden using the --ftp-user and --ftp-password options for FTP connections and the --http-user and --http-password
     options for HTTP connections.
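
Note that these options answer an HTTP authentication challenge from the server (Basic, Digest or NTLM); they do not fill in an HTML login form. Whether testdriven.io accepts this kind of authentication is not established in this thread, and the session cookies shown in Solution 1 suggest a form-based login instead. As a rough illustration of what the Basic scheme puts on the wire, here is a small sketch; basic_auth_header is a hypothetical helper, and alice/secret are placeholder credentials.

```shell
#!/usr/bin/env bash
# Sketch: show the request header that HTTP Basic authentication sends.
# basic_auth_header is a hypothetical helper, not part of wget.

# basic_auth_header USER PASSWORD
# Prints "Authorization: Basic <base64 of USER:PASSWORD>".
basic_auth_header() {
  printf 'Authorization: Basic %s\n' \
    "$(printf '%s:%s' "$1" "$2" | base64)"
}

# Placeholder credentials, for illustration only.
basic_auth_header alice secret
```

If a site answers a request like this with 401 or 403, or simply returns its login page, it most likely relies on a cookie-based session and --user/--password will not help.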

Read through the rest of the wget options as well. Would this one help?

--metalink-over-http
     Issues HTTP HEAD request instead of GET and extracts Metalink metadata from response headers. Then it switches to
     Metalink download.  If no valid Metalink metadata is found, it falls back to ordinary HTTP download.