Where does metadata go when you save a file?
Solution 1:
It's not stored in that file. It's stored in the filesystem, and all parameters are copied manually one-by-one (though some cannot be copied at all).
That is, most operating systems don't really have a "copy file with metadata" call. The file-copy program just creates a new file named foobar.py, copies the whole 0 bytes of data, then uses utime() or SetFileTime() to make its modification time look the same as the original's. Likewise, file permissions would be "copied" by setting them anew using chmod() or by copying the POSIX ACL attribute.
Some metadata isn't copied. Setting ownership requires root privileges, so copies of someone else's files belong to you and occupy your disk quota. The ctime (attribute change time) is impossible to set manually on Unixes; btime (birth/creation time) is usually not copied either.
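As a rough illustration of that "copy the data, then set each attribute again" sequence, here is a minimal Python sketch. The function name copy_with_metadata is made up for this example, and it only handles the mode bits, the two settable timestamps and (best effort) ownership; it does not touch ACLs or extended attributes, and it is not how cp itself is implemented.

```python
import os
import stat


def copy_with_metadata(src, dst):
    """Copy file data, then re-apply the metadata that can be set by hand.

    Hypothetical helper for illustration only.
    """
    st = os.stat(src)  # read the original's metadata first

    # Step 1: the data goes into a brand-new file with default metadata.
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while chunk := fin.read(131072):
            fout.write(chunk)

    # Step 2: permissions are set anew rather than "copied".
    os.chmod(dst, stat.S_IMODE(st.st_mode))

    # Step 3: access and modification times are set to match the original.
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))

    # Step 4: ownership usually needs root, so failure is tolerated here.
    try:
        os.chown(dst, st.st_uid, st.st_gid)
    except (PermissionError, AttributeError):
        pass  # not privileged, or os.chown is unavailable on this platform

    # ctime cannot be set at all; it simply reflects the calls above.
    return os.stat(dst)
```

Calling copy_with_metadata("foo", "bar") should leave bar with the same mode and mtime as foo, while its ctime will be the moment of the copy.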
Compare cp -a foo bar (which copies metadata) and cp foo bar (which doesn't):
    $ strace -v cp foo bar
    …
    open("foo", O_RDONLY) = 3
    open("bar", O_WRONLY|O_TRUNC) = 4
    read(3, "test\n", 131072) = 5
    write(4, "test\n", 5) = 5
    read(3, "", 131072) = 0
    close(4) = 0
    close(3) = 0
    …
    $ strace -v cp -a foo bar
    …
    -- original metadata is retrieved
    lstat("foo", {st_dev=makedev(254, 0), st_ino=60569468, st_mode=S_IFREG|0644, st_nlink=1, st_uid=1000, st_gid=1000, st_blksize=4096, st_blocks=8, st_size=5, st_atime=2016-12-28T09:16:59+0200.879714332, st_mtime=2016-12-28T09:16:55+0200.816363098, st_ctime=2016-12-28T09:16:55+0200.816363098}) = 0
    -- data is copied
    open("foo", O_RDONLY|O_NOFOLLOW) = 3
    open("bar", O_WRONLY|O_TRUNC) = 4
    read(3, "test\n", 131072) = 5
    write(4, "test\n", 5) = 5
    read(3, "", 131072) = 0
    -- access and modification times are copied
    utimensat(4, NULL, [{tv_sec=1482909419, tv_nsec=879714332}, {tv_sec=1482909415, tv_nsec=816363098}], 0) = 0
    -- ownership is copied (only with 'sudo [strace] cp')
    fchown(4, 1000, 1000) = 0
    -- extended attributes are copied (xdg.origin.url is set by browsers, wget)
    flistxattr(3, NULL, 0) = 0
    flistxattr(3, "user.xdg.origin.url\0", 20) = 20
    fgetxattr(3, "user.xdg.origin.url", "https://superuser.com/", 22) = 22
    fsetxattr(4, "user.xdg.origin.url", "https://superuser.com/", 22, 0) = 0
    -- POSIX ACLs are not present, so a basic ACL is built from st_mode
    -- (in this case, a simple fchmod() would work as well)
    fgetxattr(3, "system.posix_acl_access", 0x7ffc87a50be0, 132) = -1 ENODATA (No data available)
    fsetxattr(4, "system.posix_acl_access", "\2\0\0\0\1\0\6\0\377\377\377\377\4\0\4\0\377\377\377\377 \0\4\0\377\377\377\377", 28, 0) = 0
    close(4) = 0
    close(3) = 0
    …
Solution 2:
Where metadata is stored differs from filesystem to filesystem. On the ext2 family of filesystems, the metadata you mentioned (owner, group, permissions, timestamps) is stored in the inode. The inode also stores (pointers to) the blocks the file occupies on disk. The inode does not store the filename; the name lives in the directory entry that points to the inode.
You can access this data with the stat system call (man 2 stat), and use the stat tool to print it (man stat). A detailed description of the inode fields can be found in linux/include/linux/fs.h in the kernel source.
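For a quick look at these fields without writing C, Python's os.stat wraps the same system call. The sketch below is only an illustration (inode_summary is a made-up helper name): it returns a few of the inode fields, and comparing two hard links to one file shows that the name is not part of the inode, since both names report the same inode number.

```python
import os
import stat


def inode_summary(path):
    """Return a few of the fields that stat(2) reports for path.

    Hypothetical helper for illustration only.
    """
    st = os.stat(path)
    return {
        "inode": st.st_ino,                    # inode number
        "links": st.st_nlink,                  # how many names point at it
        "mode": stat.filemode(st.st_mode),     # type and permission bits
        "uid": st.st_uid,                      # owner
        "gid": st.st_gid,                      # group
        "size": st.st_size,                    # length in bytes
        "mtime_ns": st.st_mtime_ns,            # last data modification
    }
```

After os.link("foo", "bar"), inode_summary("foo") and inode_summary("bar") should report the same inode number and a link count of 2, assuming the filesystem supports hard links.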
There are other kinds of metadata (e.g. ACL permissions) that are stored in different places.
Metadata is not copied by default when you copy the file. Instead, a new file with default metadata values is created. There are various options to cp (-p, --preserve) which instruct cp to also copy metadata, by reading the old metadata with stat and modifying the new metadata accordingly.
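The same distinction can be seen from Python's standard library, which is used here only as an analogy to cp versus cp -p: shutil.copyfile copies data only, while shutil.copy2 also tries to carry over the metadata afterwards. The helper name compare_copies and the output file names are assumptions made for this sketch.

```python
import os
import shutil


def compare_copies(src, workdir):
    """Copy src twice and return the mtimes of (plain copy, preserving copy).

    Hypothetical helper for illustration only.
    """
    plain = os.path.join(workdir, "plain_copy")
    preserved = os.path.join(workdir, "preserved_copy")

    shutil.copyfile(src, plain)   # data only: new file, default metadata
    shutil.copy2(src, preserved)  # data, then metadata is set to match src

    return os.stat(plain).st_mtime, os.stat(preserved).st_mtime
```

If src has an old modification time, the first value should be roughly "now" and the second should match the original.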