What does tree-ish mean in Git?
I'm very confused about how to use git archive
.
I have a git repository with folder Foo, Bar and Baz at the top level. I need to export folder Foo in a SVN-ish sort of way for quick test deployment.
I learned that I could use git-archive
in an SVN-ish export sort of way.
But here's the thing, The following works fine:
git archive master | tar -x -C ~/destination
it results in Foo, Bar, Baz folders in the destination folder.
However, the following will error out with fatal not a valid object name
:
git archive master/foo | tar -x -C ~/destination
The Documentation
Looking as the synopsis for the git archive
program I see that it can take a <tree-ish> [path]
as a parameter (synopsis summarized to relevant parts):
git archive <tree-ish> [path...]
If master/foo
is not tree-ish
, then what is?
The Short Answer (TL;DR)
"Tree-ish" is a term that refers to any identifier (as specified in the Git revisions documentation) that ultimately leads to a (sub)directory tree (Git refers to directories as "trees" and "tree objects").
In the original poster's case, foo
is a directory that he wants to
specify. The correct way to specify a (sub)directory in Git is to use this
"tree-ish" syntax (item #15 from the Git revisions documentation):
<rev>:<path>
, e.g.HEAD:README
,:README
,master:./README
A suffix
:
followed by a path names the blob or tree at the given path in the tree-ish object named by the part before the colon.
So, in other words, master:foo
is the correct syntax, not master/foo
.
Other "Tree-ish" (Plus Commit-ish)
Here's a complete list of commit-ish and tree-ish identifiers (from the Git revisions documentation, thanks to LopSae for pointing it out):
----------------------------------------------------------------------
| Commit-ish/Tree-ish | Examples
----------------------------------------------------------------------
| 1. <sha1> | dae86e1950b1277e545cee180551750029cfe735
| 2. <describeOutput> | v1.7.4.2-679-g3bee7fb
| 3. <refname> | master, heads/master, refs/heads/master
| 4. <refname>@{<date>} | master@{yesterday}, HEAD@{5 minutes ago}
| 5. <refname>@{<n>} | master@{1}
| 6. @{<n>} | @{1}
| 7. @{-<n>} | @{-1}
| 8. <refname>@{upstream} | master@{upstream}, @{u}
| 9. <rev>^ | HEAD^, v1.5.1^0
| 10. <rev>~<n> | master~3
| 11. <rev>^{<type>} | v0.99.8^{commit}
| 12. <rev>^{} | v0.99.8^{}
| 13. <rev>^{/<text>} | HEAD^{/fix nasty bug}
| 14. :/<text> | :/fix nasty bug
----------------------------------------------------------------------
| Tree-ish only | Examples
----------------------------------------------------------------------
| 15. <rev>:<path> | HEAD:README, :README, master:./README
----------------------------------------------------------------------
| Tree-ish? | Examples
----------------------------------------------------------------------
| 16. :<n>:<path> | :0:README, :README
----------------------------------------------------------------------
Identifiers #1-14 are all "commit-ish", because they all lead to commits, but because commits also point to directory trees, they all ultimately lead to (sub)directory tree objects, and can therefore also be used as "tree-ish".
#15 can also be used as tree-ish when it refers to a (sub)directory, but it can also be used to identify specific files. When it refers to files, I'm not sure if it's still considered "tree-ish", or if acts more like "blob-ish" (Git refers to files as "blobs").
The Long Answer
At its lowest levels, Git keeps track of source code using four fundamental objects:
- Annotated tags, which point to commits.
- Commits, which point to the root directory tree of your project.
- Trees, which are directories and subdirectories.
- Blobs, which are files.
Each of these objects has its own sha1 hash ID, since Linus Torvalds designed Git like an content- addressable filesystem, i.e. files can be retrieved based on their content (sha1 IDs are generated from file content). The Pro Git book gives this example diagram:
Many Git commands can accept special identifiers for commits and (sub)directory trees:
-
"Commit-ish" are identifiers that ultimately lead to a commit object. For example,
tag -> commit
-
"Tree-ish" are identifiers that ultimately lead to tree (i.e. directory) objects.
tag -> commit -> project-root-directory
Because commit objects always point to a directory tree object (the root directory of your project), any identifier that is "commit-ish" is, by definition, also "tree-ish". In other words, any identifier that leads to a commit object can also be used to lead to a (sub)directory tree object.
But since directory tree objects never point to commits in Git's versioning system, not every identifier that points to a (sub)directory tree can also be used to point to a commit. In other words, the set of "commit-ish" identifiers is a strict subset of the set of "tree-ish" identifiers.
As explained in the documentation (thanks to Trebor for helping me find it):
<tree>
Indicates a tree object name.
<commit>
Indicates a commit object name.
<tree-ish>
Indicates a tree, commit or tag object name. A command that takes a
<tree-ish>
argument ultimately wants to operate on a<tree>
object but automatically dereferences<commit>
and<tag>
objects that point at a<tree>
.<commit-ish>
Indicates a commit or tag object name. A command that takes a
<commit-ish>
argument ultimately wants to operate on a<commit>
object but automatically dereferences<tag>
objects that point at a<commit>
.
The set of tree-ish identifiers that cannot be used as commit-ish are
<rev>:<path>
, which leads directly to directory trees, not commit objects. For example,HEAD:subdirectory
.Sha1 identifiers of directory tree objects.
A tree-ish is a way of naming a specific tree which can be one of the following:
- References like:
- HEAD
- Tags
- Branch names
- Branch names with remotes, like
origin/somebranch
- Hash
- Short hashes
On top of that, any of the above can be appended with ^
, ~
. References can also use the @{}
notation for some additional features:
-
HEAD^
orHEAD^1
will be resolved to the first parent of HEAD. -
HEAD^2
will resolve to the second parent -
HEAD^3
will resolve to the third parent and so on, which is more rare and product of merges with the octopus strategy. -
HEAD~
orHEAD~1
will resolve to the first parent of head -
HEAD~2
will resolve to the first parent of the first parent of HEAD. This would be the same asHEAD^^
-
HEAD@{0}
will resolve to the current HEAD -
HEAD@{1}
will resolve to the previous head. This can only be used by references since it makes use of the reference log. In the case ofHEAD
every commit, merge, checkout will change the value of HEAD and thus add it to the log.git reflog HEAD
will display the reference log where you can see all the movements of HEAD and properly what@{1}
and so on will resolve to.
Most of the above can be further combined as long as it makes sense in your repository, for example: HEAD@{2}~3
, somebranch^2~4
, c00e66e~4^2
, anotherbranch~^~^~^
.
So any of the described above, and its combinations, is what is meant in the documentation as a tree-ish, which is just a way to say what tree (or revision) is the one that should be used for most of git commands.
More info in Revision Selection in the Git book.
You probably want
git archive master foo | tar -x -C ~/destination
The expression master/foo
does not make sense: master
is a branch name and foo
is a directory name, as I presume.
Edit: (Removed broken link. See comments.)
For definitions of <tree-ish>
and <commit-ish>
see the git(1) man page. You'll have to search for the terms. In general <tree-ish>
means a reference to a git tree object, but if you pass a type of object that references a tree (such as a commit or branch), git will automatically use the referenced tree.
From Git Glossary tree-ish is "A tree object or an object that can be recursively dereferenced to a tree object." commit, HEAD and tag are examples of tree-ish objects.