How to check whether a directory is a sub directory of another directory
I like to write a template system in Python, which allows to include files.
e.g.
This is a template You can safely include files with safe_include`othertemplate.rst`
As you know, including files might be dangerous. For example, if I use the template system in a web application which allows users to create their own templates, they might do something like
I want your passwords: safe_include`/etc/password`
So therefore, I have to restrict the inclusion of files to files which are for example in a certain subdirectory (e.g. /home/user/templates
)
The question is now: How can I check, whether /home/user/templates/includes/inc1.rst
is in a subdirectory of /home/user/templates
?
Would the following code work and be secure?
import os.path
def in_directory(file, directory, allow_symlink = False):
#make both absolute
directory = os.path.abspath(directory)
file = os.path.abspath(file)
#check whether file is a symbolic link, if yes, return false if they are not allowed
if not allow_symlink and os.path.islink(file):
return False
#return true, if the common prefix of both is equal to directory
#e.g. /a/b/c/d.rst and directory is /a/b, the common prefix is /a/b
return os.path.commonprefix([file, directory]) == directory
As long, as allow_symlink
is False, it should be secure, I think. Allowing symlinks of course would make it insecure if the user is able to create such links.
UPDATE - Solution
The code above does not work, if intermediate directories are symbolic links.
To prevent this, you have to use realpath
instead of abspath
.
UPDATE: adding a trailing / to directory to solve the problem with commonprefix() Reorx pointed out.
This also makes allow_symlink
unnecessary as symlinks are expanded to their real destination
import os.path
def in_directory(file, directory):
#make both absolute
directory = os.path.join(os.path.realpath(directory), '')
file = os.path.realpath(file)
#return true, if the common prefix of both is equal to directory
#e.g. /a/b/c/d.rst and directory is /a/b, the common prefix is /a/b
return os.path.commonprefix([file, directory]) == directory
Python 3's pathlib
module makes this straightforward with its Path.parents attribute. For example:
from pathlib import Path
root = Path('/path/to/root')
child = root / 'some' / 'child' / 'dir'
other = Path('/some/other/path')
Then:
>>> root in child.parents
True
>>> other in child.parents
False
Problems with many of the suggested methods
If you're going to test for directory parentage with string comparison or os.path.commonprefix
methods, these are prone to errors with similarly-named paths or relative paths. For example:
-
/path/to/files/myfile
would be shown as a child path of/path/to/file
using many of the methods. -
/path/to/files/../../myfiles
would not be shown as a parent of/path/myfiles/myfile
by many of the methods. In fact, it is.
The previous answer by Rob Dennis provides a good way to compare path parentage without encountering these problems. Python 3.4 added the pathlib
module which can perform these kind of path operations in a more sophisticated way, optionally without referencing the underlying OS. jme has described in another previous answer how to use pathlib
for the purpose of accurately determining if one path is a child of another. If you prefer not to use pathlib
(not sure why, it's pretty great) then Python 3.5 introduced a new OS-based method in os.path
that allows you to do perform path parent-child checks in a similarly accurate and error-free manner with a lot less code.
New for Python 3.5
Python 3.5 introduced the function os.path.commonpath
. This is a method that is specific to the OS that the code is running on. You can use commonpath
in the following way to accurately determine path parentage:
def path_is_parent(parent_path, child_path):
# Smooth out relative path names, note: if you are concerned about symbolic links, you should use os.path.realpath too
parent_path = os.path.abspath(parent_path)
child_path = os.path.abspath(child_path)
# Compare the common path of the parent and child path with the common path of just the parent path. Using the commonpath method on just the parent path will regularise the path name in the same way as the comparison that deals with both paths, removing any trailing path separator
return os.path.commonpath([parent_path]) == os.path.commonpath([parent_path, child_path])
Accurate one-liner
You can combine the whole lot into a one-line if statement in Python 3.5. It's ugly, it includes unnecessary duplicate calls to os.path.abspath
and it definitely won't fit in the PEP 8 79-character line-length guidelines, but if you like that kind of thing, here goes:
if os.path.commonpath([os.path.abspath(parent_path_to_test)]) == os.path.commonpath([os.path.abspath(parent_path_to_test), os.path.abspath(child_path_to_test)]):
# Yes, the child path is under the parent path
New for Python 3.9
pathlib
has a new method on PurePath
called is_relative_to
which performs this function directly. You can read the python documentation on how is_relative_to
works if you need to see how to use it. Or you can see my other answer for a more full description of how to use it.