Download all files in a path on Jupyter notebook server
As a user in a class that runs Jupyter notebooks for assignments, I have access to the assignments via the web interface. I assume the assignments are stored somewhere in my personal space on the server, so I should be able to download them. How can I download all the files in my personal user space (e.g., with wget)?
Here's the path structure:
https://urltoserver/user/username
There are several directories: assignments, data, etc.
https://urltoserver/user/username/assignments
https://urltoserver/user/username/data
...
I want to download all the folders (recursively). Just enough that I can launch whatever I see online locally. If there are some forbidden folders, then ok, skip those and download the rest.
Please specify the exact command, as I couldn't figure it out myself (I tried wget).
Solution 1:
Try running this as a separate cell in one of your notebooks:
!tar chvfz notebook.tar.gz *
If you want to include folders higher up the tree, write ../ before the * once for every level you want to go up (for example, ../* for the parent folder). The file notebook.tar.gz will be saved in the same folder as your notebook.
Solution 2:
I am taking Prof. Andrew Ng's Deeplearning.ai program via Coursera. The curriculum uses Jupyter Notebooks online. Along with the notebooks are folders with large files. Here's what I used to successfully download all assignments with the associated files and folders to my local Windows 10 PC.
Start with the following line of code as suggested in the post by Serzan Akhmetov above:
!tar cvfz allfiles.tar.gz *
This produces a tarball which, if small enough, can be downloaded from the Jupyter notebook itself and unpacked using 7-Zip. However, this course has individual files that are hundreds of MB in size and folders with hundreds of sample images. The resulting tarball is too large to download via the browser.
So add one more line of code to split files into manageable chunk sizes as follows:
!split -b 50m allfiles.tar.gz allfiles.tar.gz.part.
This will split the archive into multiple parts of 50 MB each (or your preferred size). Each part gets a two-letter suffix, so the files are named allfiles.tar.gz.part.aa, allfiles.tar.gz.part.ab, and so on. Download each part as before.
The final task is to untar the multi-part archive. This is very simple with 7-Zip. Just select the first file in the series for extraction with 7-Zip. This is the file named allfiles.tar.gz.part.aa for the example used. It will pull all the necessary parts together as long as they are in the same folder.
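If 7-Zip is not available, the parts can also be joined and unpacked with Python's standard library. This is only a minimal sketch: the function name join_and_extract and its default arguments are my own, and it assumes all parts sit in the same folder and keep the names produced by the split command above. Only unpack archives you created yourself.

```python
import glob
import os
import shutil
import tarfile


def join_and_extract(prefix="allfiles.tar.gz.part.",
                     joined="allfiles.tar.gz",
                     dest="."):
    """Concatenate the split parts in name order, then unpack the archive."""
    # split names its parts ...aa, ...ab, ...; sorting restores the order.
    parts = sorted(glob.glob(glob.escape(prefix) + "*"))
    if not parts:
        raise FileNotFoundError(f"no files match {prefix}*")

    # Re-create the original tarball by appending each part in turn.
    with open(joined, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)

    # Unpack the rebuilt archive into dest.
    os.makedirs(dest, exist_ok=True)
    with tarfile.open(joined, "r:gz") as archive:
        archive.extractall(dest)
    return parts
```

Calling join_and_extract() in the folder that holds the downloaded parts should rebuild allfiles.tar.gz and extract it in place.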
Hope this helps add to Serzan's excellent answer above.
Solution 3:
You can create a new terminal from the "New" menu and call the command described on https://stackoverflow.com/a/47355754/8554972:
tar cvfz notebook.tar.gz *
The file notebook.tar.gz will be saved in the same folder as your notebook.
Solution 4:
The easiest way is to archive all the content using tar, but there is also an API for downloading files.
GET /files/_FILE_PATH_
To list all the files in a folder (here, a folder named work), you can use:
GET /api/contents/work
Example:
curl "https://server/api/contents?token=your_token"
curl "https://server/files/path/to/file.txt?token=your_token" --output some.file
Source: Jupyter Docs
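The two endpoints above can be combined into a recursive download. The following is a rough sketch rather than a tested client: the names http_get and download_tree are mine, and it assumes the server accepts the ?token= query parameter shown above and that a directory listing returns a JSON object whose "content" list holds entries with "type" and "path" fields. Entries that the server refuses are skipped, as the question asks.

```python
import json
import os
import urllib.error
import urllib.parse
import urllib.request


def http_get(url):
    """Return the raw response body for a URL."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def download_tree(base_url, token, path="", dest="download", get=http_get):
    """Recursively copy a server folder into dest; return the saved paths."""
    query = "?token=" + urllib.parse.quote(token)
    listing_url = f"{base_url}/api/contents/{urllib.parse.quote(path)}{query}"
    listing = json.loads(get(listing_url))

    saved = []
    for entry in listing["content"]:
        try:
            if entry["type"] == "directory":
                # Descend into sub-folders.
                saved += download_tree(base_url, token, entry["path"], dest, get)
            else:
                # Notebooks and plain files are both fetched via /files/.
                file_url = f"{base_url}/files/{urllib.parse.quote(entry['path'])}{query}"
                data = get(file_url)
                target = os.path.join(dest, *entry["path"].split("/"))
                os.makedirs(os.path.dirname(target), exist_ok=True)
                with open(target, "wb") as out:
                    out.write(data)
                saved.append(target)
        except urllib.error.URLError as err:
            # Forbidden or missing entries are skipped, not fatal.
            print(f"skipping {entry['path']}: {err}")
    return saved
```

For the layout in the question, the call would look something like download_tree("https://urltoserver/user/username", "your_token"). Each file is read fully into memory, so very large files are better handled with the tar approach.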
Solution 5:
Try first to get the directory by:
import os
os.getcwd()
And then use the snippet from "How to create a zip archive of a directory". You can download the complete directory by zipping it. Good luck!
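One way to do the zipping, as a minimal sketch: it uses shutil.make_archive from the standard library, which may differ from the snippet in the linked answer. The function name zip_directory and the archive name notebook_files are placeholders of mine. The archive is built in a temporary folder first so that it does not end up containing itself.

```python
import os
import shutil
import tempfile


def zip_directory(directory=None, archive_name="notebook_files"):
    """Zip a directory (default: the current one) and return the zip path."""
    directory = directory or os.getcwd()
    with tempfile.TemporaryDirectory() as staging:
        # Build the archive outside the directory being zipped.
        archive = shutil.make_archive(
            os.path.join(staging, archive_name), "zip", root_dir=directory
        )
        # Move it next to the notebooks so the file browser can download it.
        return shutil.move(archive, os.path.join(directory, archive_name + ".zip"))
```

Running zip_directory() in a notebook cell should leave notebook_files.zip in the notebook's folder, ready to download from the file browser.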