`os.symlink` vs `ln -s`

I need to create a symlink for every item of dir1 (file or directory) inside dir2. dir2 already exists and is not a symlink. In Bash I can easily achieve this by:

ln -s /home/guest/dir1/* /home/guest/dir2/

But in Python, using os.symlink, I get an error:

>>> os.symlink('/home/guest/dir1/*', '/home/guest/dir2/')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 17] File exists

I know I can use subprocess to run the ln command, but I don't want that solution.

I'm also aware that workarounds using os.walk or glob.glob are possible, but I want to know if it is possible to do this using os.symlink.


Solution 1:

os.symlink creates a single symlink.

ln -s creates multiple symlinks (if its last argument is a directory, and there's more than one source). The Python equivalent is something like:

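import os

dst = args[-1]  # args: the sources followed by the destination directory
for src in args[:-1]:
    # The link keeps the source's final path component as its name.
    os.symlink(src, os.path.join(dst, os.path.basename(src)))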

So, how does it work when you do ln -s /home/guest/dir1/* /home/guest/dir2/? Your shell makes that work, by turning the wildcard into multiple arguments. If you were to just exec the ln command with a wildcard, it would look for a single source literally named * in /home/guest/dir1/, not all files in that directory.

The Python equivalent is something like this, if you don't mind mixing two levels together and ignoring a lot of other things the shell can do (tilde expansion, environment variables, command substitution, and so on):

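import glob
import os

dst = args[-1]  # args: the source patterns followed by the destination directory
for srcglob in args[:-1]:
    for src in glob.glob(srcglob):  # expand the wildcard, as the shell would
        os.symlink(src, os.path.join(dst, os.path.basename(src)))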

You can't do either part of that with os.symlink alone: it creates exactly one link and never expands wildcards. It's like saying "I want to do the equivalent of find . -name foo using os.walk without filtering on the name." Or, for that matter, "I want to do the equivalent of ln -s /home/guest/dir1/* /home/guest/dir2/ without the shell globbing for me."

The right answer is to use glob, or fnmatch, or os.listdir plus a regex, or whatever you prefer.
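As a rough sketch of the os.listdir route (with fnmatch standing in for the regex), something like the following should work. The function name symlink_matching and its pattern parameter are made up for illustration, and the dotfile check only approximates the shell's default behaviour:

```python
import fnmatch
import os


def symlink_matching(src_dir, dst_dir, pattern="*"):
    """Link every entry of src_dir whose name matches pattern into dst_dir.

    Hypothetical helper, not part of the standard library.
    """
    created = []
    for name in sorted(os.listdir(src_dir)):
        # fnmatch lets "*" match dotfiles, the shell does not by default,
        # so hidden entries are skipped unless the pattern starts with ".".
        if name.startswith(".") and not pattern.startswith("."):
            continue
        if fnmatch.fnmatch(name, pattern):
            link = os.path.join(dst_dir, name)
            os.symlink(os.path.join(src_dir, name), link)
            created.append(link)
    return created
```

Calling symlink_matching('/home/guest/dir1', '/home/guest/dir2') would then play the role of the ln -s line from the question. On Python 3 it raises FileExistsError if a name is already taken in dir2.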

Do not use os.walk, because that does a recursive filesystem walk, so it's not even close to shell * expansion.
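To see the difference, here is a small experiment in a throwaway temporary directory (the file and directory names are invented for the demo):

```python
import glob
import os
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub", "deeper"))
open(os.path.join(root, "top.txt"), "w").close()
open(os.path.join(root, "sub", "nested.txt"), "w").close()

# One level only, like the shell's expansion of root/*.
# glob.escape guards against special characters in the temp path itself.
globbed = sorted(
    os.path.basename(p) for p in glob.glob(os.path.join(glob.escape(root), "*"))
)

# Every level below root, as paths relative to root.
walked = sorted(
    os.path.relpath(os.path.join(dirpath, name), root)
    for dirpath, dirnames, filenames in os.walk(root)
    for name in dirnames + filenames
)

print(globbed)  # ['sub', 'top.txt']
print(walked)   # additionally contains sub/deeper and sub/nested.txt
```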

Solution 2:

* is a shell expansion (glob) pattern, which in your case designates "every non-hidden entry directly inside /home/guest/dir1/".

It is your shell's job to expand this pattern into the files it matches, not the ln command's.

os.symlink is not a shell; it's an OS call, so it doesn't support shell expansion patterns. You'll have to do that work in your script.
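You can check this in a scratch directory. The sketch below assumes Python 3 on a POSIX system; the dir1/dir2 layout mirrors the question, but everything lives under a temporary directory and the link name "star" is arbitrary:

```python
import os
import tempfile

work = tempfile.mkdtemp()
src_dir = os.path.join(work, "dir1")
dst_dir = os.path.join(work, "dir2")
os.mkdir(src_dir)
os.mkdir(dst_dir)
open(os.path.join(src_dir, "a.txt"), "w").close()

# The "*" is stored verbatim as the link target; nothing is expanded.
link = os.path.join(dst_dir, "star")
os.symlink(os.path.join(src_dir, "*"), link)

print(os.readlink(link))     # the target path still ends in a literal "*"
print(os.path.exists(link))  # False: dir1 has no entry literally named "*"

# Using the existing directory itself as the link name fails,
# which is the error shown in the question.
try:
    os.symlink(os.path.join(src_dir, "*"), dst_dir)
except FileExistsError as exc:
    print("refused:", exc)
```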

To do so, you can use os.listdir, or os.walk if you actually want to recurse. As indicated in the other answer, the appropriate call depends on what you want to do; os.walk is not the equivalent of *.


To convince yourself, run this command in a terminal on a Unix machine: python -c "import sys; print(sys.argv)" *. You'll see that it's the shell that's doing the matching: the script receives the expanded file names, not a literal *.
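The same experiment can be run from Python itself. This is only a demonstration of who does the expansion, not a suggestion to shell out for the actual task, and it assumes a POSIX system with /bin/sh and Python 3.7 or later:

```python
import os
import shlex
import subprocess
import sys
import tempfile

workdir = tempfile.mkdtemp()
for name in ("a.txt", "b.txt"):
    open(os.path.join(workdir, name), "w").close()

child = "import sys; print(sys.argv[1:])"

# No shell involved: the child process receives the literal "*".
direct = subprocess.run(
    [sys.executable, "-c", child, "*"],
    cwd=workdir, capture_output=True, text=True, check=True,
).stdout.strip()

# Through the shell: "*" is expanded before the child process starts.
command = "{} -c {} *".format(shlex.quote(sys.executable), shlex.quote(child))
via_shell = subprocess.run(
    command, shell=True,
    cwd=workdir, capture_output=True, text=True, check=True,
).stdout.strip()

print(direct)     # ['*']
print(via_shell)  # ['a.txt', 'b.txt']
```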

Solution 3:

As suggested by @abarnert, it's the shell that recognizes * and replaces it with all the items inside dir1. Therefore I think using os.listdir is the best choice. Note that, unlike *, os.listdir also returns hidden entries (names starting with a dot):

import os

for item in os.listdir('/home/guest/dir1'):
    os.symlink(os.path.join('/home/guest/dir1', item), os.path.join('/home/guest/dir2', item))