Where are git-lfs files stored?
I am trying to figure out how to use git-lfs. I use a gitlab EE server.
Maybe I missed something, but I failed to find any documentation on git-lfs beyond a very short tutorial introducing the "track" command and some cute one-minute videos.
For example, I add and track a 3.7GB tar file in a repo, and push it:
git lfs track "*.tar"
cp <a folder>/a.tar .
git add a.tar
git commit -m "add a.tar"
git push origin master
Question 1: at the end of this process, has a.tar been uploaded to the GitLab server? It is unclear, because the "add" and "commit" commands took some time (though maybe not long enough for 3.7GB to have been uploaded), but the push took no time at all (a fraction of a second).
Question 2: if the file was uploaded to the server, where is it? Obviously not in the same place as the repo (that is the point). I ask because my server is being backed up, and I need to know whether using git-lfs requires me to update the backup in any way.
Question 3: if the file was not uploaded, does this mean other users of the repo will get a link to the file on the original machine on which it was added? Is there a way to change this to a location on the server? (Back to question 2.)
Question 4: after cloning the repo, the full 3.7GB file is indeed not there, "just" a text file with the content:
version https://git-lfs.github.com/spec/v1
oid sha256:4bd049d85f06029d28bd94eae6da2b6eb69c0b2d25bac8c30ac1b156672c4082
size 3771098624
This is of course awesome and the whole point. But what if access to the full file is required? How do I download it?
I would be happy with either a direct answer to these questions or a link to proper documentation.
Solution 1:
Answer 1
As explained in this video (at 1:27), when you push a file tracked by git lfs, it is intercepted and placed on a different server, leaving a pointer in your git repository. As you can see from the pointer file you quote in Question 4, this worked for you.
Answer 2
This is a bit more tricky. Reading the documentation for git lfs smudge, we have:
Read a Git LFS pointer file from standard input and write the contents of the corresponding large file to standard output. If needed, download the file's contents from the Git LFS endpoint. The argument, if provided, is only used for a progress bar.
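To make the smudge step more concrete, here is a minimal sketch of how a pointer's oid maps to a file on disk. It assumes the git-lfs client caches downloaded objects under .git/lfs/objects, in two levels of subdirectories named after the first characters of the oid. That layout is what current clients use, but treat it as an assumption for your version. The function name lfs_cache_path is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: read a pointer file on stdin and print the path,
# relative to the repository root, where the git-lfs client is assumed
# to cache the object once it has been downloaded.
lfs_cache_path() {
    # Take the 64-hex-digit hash from the "oid sha256:..." line.
    oid=$(sed -n 's/^oid sha256:\([0-9a-f]\{64\}\)$/\1/p' | head -n 1)
    [ -n "$oid" ] || return 1
    a=$(printf '%s' "$oid" | cut -c1-2)   # first two hex digits
    b=$(printf '%s' "$oid" | cut -c3-4)   # next two hex digits
    printf '.git/lfs/objects/%s/%s/%s\n' "$a" "$b" "$oid"
}

# Usage inside a clone (the committed blob of a.tar is the pointer):
#   git show HEAD:a.tar | lfs_cache_path
```

If the printed path does not exist in your clone, the object has most likely not been downloaded yet, which is the case where smudge would go to the endpoint.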
The git lfs endpoint can be found in the output of git lfs env. My "endpoint" is a folder under (but not in) my repository, which makes me think that GitLab creates a git repository on the server in our account space to store binary files.
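As a rough sketch, the endpoint can be picked out of the git lfs env output like this. The helper name lfs_endpoint is hypothetical, and the "Endpoint=" line format is what the clients I have seen print, so it may differ in other versions. Only the default endpoint line is matched, not lines for additional named remotes.

```sh
#!/bin/sh
# Hypothetical helper: read `git lfs env` output on stdin and print the
# URL from the first line that starts with "Endpoint=".
lfs_endpoint() {
    # Keep everything after "Endpoint=" up to the first space, which
    # drops a trailing annotation such as "(auth=none)".
    sed -n 's/^Endpoint=\([^ ]*\).*/\1/p' | head -n 1
}

# Usage inside a repository, with git-lfs installed:
#   git lfs env | lfs_endpoint
```

The printed URL is where the client uploads to and downloads from. Where the server behind that URL keeps the objects on disk is a matter of server configuration, so that is the place to look when planning backups.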
That said, I don't know how you'd go about backing this up. GitHub provides a git lfs server that's "not in a production ready state," so it would take some work on your part to set it up so that your binary files are uploaded to a server you administer. If backing up these files is a priority and you don't want to use one of the implementations (Amazon S3, etc.), you might try another binary file storage system that works with git, such as git-media, git-annex, git-fat, or git-bigstore. I haven't looked into these options in depth, so I couldn't make a recommendation.
Answer 3
If the file had not been uploaded using git lfs, it would have been pushed using git, and you'd have a binary file in your git repository. But yours was uploaded using git lfs, as you say in Question 4.
Answer 4
Other users of your repository, after having installed git lfs on their local machines, can simply type git lfs pull to bring in the binary file(s) that you pushed using git lfs.
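For instance, a collaborator could check whether a checked-out file is still a pointer and fetch the real content only in that case. This is a sketch under assumptions: the helper names is_lfs_pointer and fetch_if_pointer are made up, the check relies on the version line shown in Question 4, and head -c is assumed to be available (it is on common Linux and macOS systems).

```sh
#!/bin/sh
# Hypothetical helper: succeed if the file still looks like an LFS
# pointer, i.e. it begins with the spec version line (42 characters).
is_lfs_pointer() {
    head -c 42 "$1" 2>/dev/null |
        grep -qxF 'version https://git-lfs.github.com/spec/v1'
}

# Hypothetical helper: download the real content only when needed.
# Assumes git-lfs is installed and the current directory is the clone.
fetch_if_pointer() {
    if is_lfs_pointer "$1"; then
        git lfs pull --include="$1"
    fi
}

# Usage inside a clone:
#   git lfs install        # one-time setup of the git-lfs filters
#   fetch_if_pointer a.tar
```

Running a plain git lfs pull does the same for every tracked file at once; the --include form is only useful when you want a single large file.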