What is the naming standard for path components?
I keep getting myself in knots when I am manipulating paths and file names because I don’t follow a naming standard for path components.
Consider the following toy problem (Windows example, but hopefully the answer should be platform independent). You have been given the path of a folder:
C:\users\OddThinking\Documents\My Source\
You want to walk the folders underneath and compile all the .src files to .obj files.
At some point you are looking at the following path:
C:\users\OddThinking\Documents\My Source\Widget\foo.src
How would you name the following path components?
A. foo
B. foo.src
C. src
D. .src
E. C:\users\OddThinking\Documents\My Source\ (i.e. the absolute path of the root)
F. Widget\foo.src (i.e. the relative path of the file)
G. Widget\
H. C:\users\OddThinking\Documents\My Source\Widget\
I. C:\users\OddThinking\Documents\My Source\Widget\foo.src
Here is my attempt:
A. Base name? Basename?
B. File name? Filename? The difference is important when choosing identifier names, and I am never consistent here.
C. Extension?
D. Extension? Wait, that is what I called C. Should I avoid storing the dot, and just put it in when required? What if there is no dot on a particular file?
E. ?
F. ?
G. Folder? But isn’t this a Windows-specific term?
H. Path name? Pathname? Path?
I. File name? Wait, that is what I called B. Path name? Wait, that is what I called H.
Solution 1:
I think your search for a "standard" naming convention will be in vain. Here are my proposals, based on existing, well-known programs:
A) foo
Vim calls it file root (:help filename-modifiers)
B) foo.src
file name or base name
C) src (without dot)
file/name extension
D) .src (with dot)
also file extension. Simply store it without the dot; if a file name has no dot, it has no extension
E) C:\users\OddThinking\Documents\My Source\
top of the tree
No convention; git calls it base directory
F) Widget\foo.src
path from top of the tree to the leaf
relative path
G) Widget
one node of the tree
no convention, maybe a simple directory
H) C:\users\OddThinking\Documents\My Source\Widget\
dir name
I) C:\users\OddThinking\Documents\My Source\Widget\foo.src
full/absolute path
Solution 2:
Good question, first of all; my +1. This thing bugged me once when I had to create a slew of functions in a Utility class. GetFileName or GetFullName? Does GetApplicationPath mean the full path or the directory name? And so on. I come from a .NET background, so I think I can add a little more to the otherwise excellent answer by @blinry.
Summary: (In italics is what I would not use as a programmer)
- Path: A path specifies a unique location in the file system (unless it is a relative path). Path name is used less often, but I would stick with path; it pretty much explains what it is. A path can point to a file, a folder, or even nothing (C:\). A path can be:
  - Relative Path: My Source\Widget\ is a relative path, as is Widget\foo.src. Self-explanatory.
  - Absolute Path or Full Path: The fully qualified path that points to the target. I tend to use the latter term more often. C:\users\OddThinking\Documents\My Source\Widget\foo.src is hence a full path. See at the end what I call a full path that points to a file and one that ends as a directory.
  The wiki page and the .NET naming for path are consistent.
- Root Path or Root Directory: The former is the .NET convention, while the latter is heard more in UNIX circles. Though I like both, I tend to use the former more. Windows, unlike UNIX, has many different root paths, one for each partition. Unix systems have one root directory, which holds information on other directories and files. E.g. C:\ is a root path.
- Folder or Folder Name: Widget, OddThinking, etc. in your case. This might be a Windows-only convention (in fact it's my own odd thinking :)); nevertheless I strongly object to blinry's answer "Directory". Though for a normal user directory means the same as folder (as in subfolders, subdirectories), I believe that from a technical angle "directory" should sound like a qualified address to the target and not the target itself. More below.
  - Sub Folders: With respect to users, OddThinking and Documents are sub folders.
  - Sub Directories: With respect to users, OddThinking\, OddThinking\Documents\ and OddThinking\Documents\My Source\Widget\ are sub directories. But we do not often need to bother about it, do we?
  - Child Folder: With respect to users, OddThinking is a child folder (as well as a sub folder).
  - Parent Folder: For OddThinking, users is its parent folder (just mentioning different terminologies, no big deal).
- Directory or Directory Name: Use the former generally in real life, the latter in code. This refers to the fully qualified path (or simply full path) up to the target's parent folder. In your case, C:\users\OddThinking\Documents\My Source\Widget (yes, a directory is never meant to point to a file). I use directory name in my code since Directory is a class in .NET and Directory Name is what the library itself calls it. It's quite consistent with dirname used in UNIX systems.
- File Name or Basename: Name of the file along with its extension. In your case: foo.src. I would say that for non-technical use I prefer file name (it is what it means for an end user), but for technical purposes I would strictly stick with basename. File Name is often used by MS, but I am surprised how inconsistent they are, not just in documentation but even in the library. There, filename could mean either the basename or the full path of the file. So I favour basename; that's what I call it in code. This page on wiki too says file name could mean either the full path or the basename. Surprisingly, even in .NET I can find basename used to mean the root name of the file.
- Extension or Filename Extension or File Extension: I like the last one. All refer to the same thing, but what exactly it is is again a matter of debate! Wiki says it is src, while back then I remember reading that many languages interpret it as .src. Note the dot. So once again my take is: for casual use it doesn't matter what it is, but as a programmer I always see the extension as .src.
OK, I have tried to fetch some standard usages, but here are two conventions of my own that I follow. Both are about full paths.
- I generally call a full path that points to a file a file path. To me file path is clear-cut; it tells me what it is. Though file name reads to me as the name of the file alone, in my code I call it file name. That's also consistent with "directory name". From the technical side, name refers to the fully qualified name! Frustratingly, .NET uses the term file name (so I have my case here) and sometimes file path for this.
- I call a full path that ends as a directory a directory. In fact one can call any piece of address that doesn't point to a file a directory. So C:\users\OddThinking\Documents\My Source\ is a directory, C:\users\OddThinking\ is a directory, and so is even OddThinking\Documents\My Source\ (better to call it a sub directory, or even better a relative path; it all depends on the context you are dealing with). Above I mentioned something different about directory, namely directory name. Here is my take on it, with a new path to avoid confusion. What is D:\Fruit\Apple\Pip\? A directory. But if the question is what is the directory, or even better the directory name, of D:\Fruit\Apple\Pip\, the answer is D:\Fruit\Apple\. Hope it's clear.
I would say it's better not to worry about the final two terms, as they are what create the most confusion (for me personally). Just use the term full path!
To answer you:
- With respect to the path you have given:
  A) No idea. Anyway, I never needed to get that one alone.
  B) basename
  C) I would just call it file extension for the time being; I am least worried since I never needed that alone to be named in my code.
  D) file extension, surely.
  E) I do not think this is a general-purpose requirement. No idea. In .NET base directory is the same as directory name.
  F) relative path
  G) folder (parent folder to basename foo.src)
  H) directory name
  I) full path (or even file name)
- In general (sorry for being a bit verbose, just to drive the point home), but assuming foo.src is indeed a file:
  A) NA
  B) basename
  C) NA
  D) extension
  E) directory or simply path
  F) relative path
  G) NA
  H) directory or simply path
  I) full path (or even file name)
Driving the point further with one example of my own:
- Consider the path C:\Documents and Settings\All Users\Application Data\s.sql.
  - C:\Documents and Settings\All Users\Application Data\s.sql is the full path (which is a file name)
  - C:\Documents and Settings\All Users\Application Data\ is the directory name.
- Now consider the path C:\Documents and Settings\All Users\Application Data
  - C:\Documents and Settings\All Users\Application Data is the full path (which happens to be a directory)
  - C:\Documents and Settings\All Users is the directory name.
Two tips of mine:
- I follow the rule of thumb that when it comes to a full address, irrespective of its type, I almost always call it "full path". This not only eliminates the need for two separate terms, file path and folder path, it also avoids the potential confusion of calling a file's full path file name (which for most users translates right away to basename). But yes, if you have to be specific about the type of path, it's better to name it file name or directory instead of the more generic "path".
- Whatever idea you have in mind, be consistent with it throughout. Reach a consensus among team members that this means this and not that.
All of this comes only from the circle in which I have some practice. A different set of terms may be in use on OS X and Android machines. And all of these are just about physical paths in a file system; a whole new set of terminology would arise for web addresses. I expect someone to fill that void in this same thread :) I would be glad to hear which convention you went ahead with.
Solution 3:
In C++, Boost.Filesystem has devised a nomenclature for the various parts of a path. See the path decomposition reference documentation for details, as well as this tutorial.
Here's a summary based on the tutorial. For:
- Windows path: c:\foo\bar\baa.txt
- Unix path: /foo/bar/baa.txt
you get:
Part            Windows          Posix
--------------  ---------------  ---------------
Root name       c:               <empty>
Root directory  \                /
Root path       c:\              /
Relative path   foo\bar\baa.txt  foo/bar/baa.txt
Parent path     c:\foo\bar       /foo/bar
Filename        baa.txt          baa.txt
Stem            baa              baa
Extension       .txt             .txt
C++ standard ISO/IEC 14882:2017
Moreover, the Boost.Filesystem terminology has been adopted by C++17; see std::filesystem.
Function name     Meaning
----------------  ---------------------------------------
root_name()       Root name of the path
root_directory()  Root directory of the path
root_path()       Root path of the path
relative_path()   Path relative to the root path
parent_path()     Path of the parent directory
filename()        Path without base directory (basename)
stem()            Filename without extension
extension()       Extension, including its leading dot (e.g. .txt)
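For comparison, here is a rough sketch of the same decomposition using Python's pathlib, which has near-equivalents under different names. The helper `decompose` and its dictionary keys are made up for this illustration; the keys simply reuse the C++ function names so the two can be compared side by side.

```python
from pathlib import PurePosixPath, PureWindowsPath


def decompose(path):
    """Map a pure path onto the Boost/std::filesystem part names."""
    return {
        "root_name": path.drive,        # 'c:' on Windows, '' on POSIX
        "root_directory": path.root,    # '\\' or '/'
        "root_path": path.anchor,       # drive and root combined
        "relative_path": str(path.relative_to(path.anchor)),
        "parent_path": str(path.parent),
        "filename": path.name,
        "stem": path.stem,
        "extension": path.suffix,       # keeps the leading dot
    }


# Pure paths do no file system access, so both flavours work on any OS.
win = decompose(PureWindowsPath(r"c:\foo\bar\baa.txt"))
posix = decompose(PurePosixPath("/foo/bar/baa.txt"))

for part in win:
    print(f"{part:15} {win[part]:16} {posix[part]}")
```

The pure path classes are used here only so that the Windows example can be run on a non-Windows machine.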
Solution 4:
The pathlib module in the Python standard library has a simple naming convention for path components. For the path /x/y/z/foo.tar.gz:
A. foo.tar > stem (only the last suffix is stripped).
B. foo.tar.gz > name.
C. gz (excluding dot) > N/A.
D. .gz (including dot) > suffix.
E. /x/y > grandparent path.
F. z/foo.tar.gz > relative path to the grandparent path.
G. z > parent name.
H. /x/y/z > parent path.
I. /x/y/z/foo.tar.gz > path.
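As a minimal sketch, the pathlib names can be applied to the path from the question like this. The variable names (`root_dir`, `src_path`, `bare_ext` and so on) are only illustrative choices, not a standard, and `PureWindowsPath` is used so the snippet behaves the same on any platform.

```python
from pathlib import PurePosixPath, PureWindowsPath

root_dir = PureWindowsPath(r"C:\users\OddThinking\Documents\My Source")  # E
src_path = root_dir / "Widget" / "foo.src"                               # I

stem = src_path.stem                       # A: 'foo'
name = src_path.name                       # B: 'foo.src'
suffix = src_path.suffix                   # D: '.src'
bare_ext = suffix[1:]                      # C: 'src' (pathlib has no name for it)
rel_path = src_path.relative_to(root_dir)  # F: Widget\foo.src
parent_name = src_path.parent.name         # G: 'Widget'
parent = src_path.parent                   # H: the containing directory

# The compile step from the question only needs the suffix swapped.
obj_path = src_path.with_suffix(".obj")

# With several dots, stem and suffix only split off the last suffix.
archive = PurePosixPath("/x/y/z/foo.tar.gz")
print(archive.stem, archive.suffix, archive.suffixes)
```

Storing the suffix with its dot, as pathlib does, means a file with no extension simply has an empty suffix, and name is always stem plus suffix.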
Solution 5:
No you're not crazy.
In Windows systems, sometimes the path of the directory containing the file is called path, which is how it was from the beginning. So, for example,
x:\dir1\dir2\myfile.txt
Windows:
--------
PATH: x:\dir1\dir2
FILE: myfile.txt
Unix/Linux:
-----------
PATH: /dir1/dir2/myfile.txt
FILE: myfile.txt
The Unix/Linux approach is a lot more logical, and that's what everyone mentioned above: path including the file name itself. However, if you type "call /?" in the Windows command line, you get this:
%~1 - expands %1 removing any surrounding quotes (")
%~f1 - expands %1 to a fully qualified path name
%~d1 - expands %1 to a drive letter only
%~p1 - expands %1 to a path only
%~n1 - expands %1 to a file name only
%~x1 - expands %1 to a file extension only
So there it is, "path only" and "file name only". At the same time, they refer to the whole string as "fully qualified path name" which is understood as drive letter plus path plus file name. So there's no real truth. It's futile. You've been betrayed.
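To make that split concrete, here is a rough Python sketch that breaks a Windows path into the same four pieces using the standard ntpath module. The function name `expand_argument` and the single-letter keys are invented here to echo the `%~d1`, `%~p1`, `%~n1` and `%~x1` modifiers; the exact strings cmd produces may differ in details such as trailing separators.

```python
import ntpath


def expand_argument(arg):
    """Split a Windows path into drive, path, name and extension."""
    drive, rest = ntpath.splitdrive(arg)            # 'x:' and the remainder
    path_only, file_name = ntpath.split(rest)       # directory part, last component
    name_only, extension = ntpath.splitext(file_name)
    return {"d": drive, "p": path_only, "n": name_only, "x": extension}


parts = expand_argument(r"x:\dir1\dir2\myfile.txt")
print(parts)
```

Note that "path" in this sense excludes both the drive and the file name, which is exactly the usage that clashes with "fully qualified path name".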
Anyway,
To answer your question
This is how I'd name your examples:
A: -
B: basename
C: extension
D: -
E: -
F: -
G: -
H: pathname (or dirname or containing path)
I: full name
A, D, E, F and G have no simple nicknames. And since PHP is probably the most widely known cross-platform language, everyone understands "basename" and "dirname", so I'd stick with that naming. Full name is also obvious; full path would be a bit ambiguous, but most of the time it means the very same thing.
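The same two names exist outside PHP as well. As a small sketch, Python's os.path family offers `basename` and `dirname`; ntpath and posixpath are used below only so that both path styles can be tried on any machine, and the variable names are illustrative.

```python
import ntpath
import posixpath

full_name = r"C:\users\OddThinking\Documents\My Source\Widget\foo.src"

basename = ntpath.basename(full_name)  # B: 'foo.src'
dirname = ntpath.dirname(full_name)    # H, without the trailing backslash

unix_basename = posixpath.basename("/dir1/dir2/myfile.txt")
unix_dirname = posixpath.dirname("/dir1/dir2/myfile.txt")

# Python's basename is empty when the path ends in a separator.
trailing = posixpath.basename("/dir1/dir2/")

print(basename, dirname, unix_basename, unix_dirname, repr(trailing))
```

Implementations do not all treat a trailing separator the same way, so it is worth checking that case in whichever language you settle on.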