How do I copy the contents of every file in a list into another file?

I have a list of filenames inside a file called list_of_files.txt.

I want to copy the contents of each file in that list into another file called all_compounds.sdf.

How should I do this from the command line?


Solution 1:

Don't use simple command substitution to get filenames (that could easily break with spaces and other special characters). Use something like xargs (the -d and -a options used here are GNU extensions):

xargs -d '\n' -a list_of_files.txt cat > all_compounds.sdf

Or a while read loop:

while IFS= read -r file; do cat "$file"; done < list_of_files.txt > all_compounds.sdf

To use command substitution safely, at least set IFS to just the newline and disable globbing (wildcard expansion):

(set -f; IFS=$'\n'; cat $(cat list_of_files.txt) > all_compounds.sdf)

The surrounding parentheses () are to run this in a subshell, so that your current shell isn't affected by these changes.
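To see why the quoting matters, here is a small sketch of the while read loop run against awkward names. The scratch directory, the file names and their contents are all made up for the demonstration; only list_of_files.txt and all_compounds.sdf come from the question.

```shell
# Hypothetical demo data in a scratch directory (names are invented).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'alpha\n' > 'plain.sdf'
printf 'beta\n'  > 'with space.sdf'
printf 'gamma\n' > 'star*.sdf'

# The list holds one file name per line.
printf '%s\n' 'plain.sdf' 'with space.sdf' 'star*.sdf' > list_of_files.txt

# IFS= and -r make read return each line verbatim; quoting "$file"
# prevents word splitting and globbing; -- guards against names
# that start with a dash.
while IFS= read -r file; do
    cat -- "$file"
done < list_of_files.txt > all_compounds.sdf

# Prints alpha, beta, gamma, one per line, in list order.
cat all_compounds.sdf
```

The same list fed to an unquoted cat $(cat list_of_files.txt) would split "with space.sdf" into two words and treat star*.sdf as a glob pattern.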

Solution 2:

Quick and dirty way...

cat $(cat list_of_files.txt) >> all_compounds.sdf

Please note: this only works if the filenames in your list are very well behaved. Things will go wrong if they contain spaces, newlines, or any characters that have special meaning to the shell; for reliable results, use one of the approaches in Solution 1 instead.

Notes

  • cat concatenates files. It also prints their contents.

  • Using command substitution, command2 $(command1), you can pass the output of command1 (cat list...) to command2 (cat), which concatenates the files.

  • Then use redirection >> to send the output to a file instead of printing to stdout. If you want to see the output, use tee instead:

      cat $(cat list_of_files.txt) | tee -a all_compounds.sdf
    

(I have used >> instead of >, and tee with the -a switch, so that the output is appended to the file rather than overwriting it if the file already exists.)
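The difference between >> and > is easiest to see by running the same command twice. This is only a sketch: the scratch directory, a.txt, b.txt and the two output names are invented for illustration, and the file names are deliberately simple so the unquoted substitution is safe here.

```shell
# Hypothetical demo data in a scratch directory (names are invented).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'one\n' > a.txt
printf 'two\n' > b.txt
printf '%s\n' a.txt b.txt > list_of_files.txt

# >> appends: after two runs the output holds the contents twice.
cat $(cat list_of_files.txt) >> appended.sdf
cat $(cat list_of_files.txt) >> appended.sdf

# > truncates first: after two runs the output holds the contents once.
cat $(cat list_of_files.txt) > overwritten.sdf
cat $(cat list_of_files.txt) > overwritten.sdf

wc -l < appended.sdf      # 4 lines
wc -l < overwritten.sdf   # 2 lines
```

So if all_compounds.sdf may be left over from an earlier run and you want a fresh result, > is the safer choice.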

Solution 3:

Although GNU awk is a text-processing utility, it can run external shell commands via its system() function. We can use that to our advantage like so:

$ awk '{cmd=sprintf("cat \"%s\"",$0); system(cmd)}' file_list.txt

The idea here is simple: we read the list line by line, and from each line we build a formatted string such as cat "File name.txt", which is then passed to system().

And here it is in action:

$ ls
file1.txt  file2.txt  file3 with space.txt  file_list.txt


$ awk '{cmd=sprintf("cat \"%s\"",$0); system(cmd)}' file_list.txt
Hi, I'm file2
Hi, I'm file1
Hi, I'm file3

So we've already done the big part of the task: we printed the contents of all the files on the list. The rest is simple: redirect the output into the summary file with the > operator.

awk '{cmd=sprintf("cat \"%s\"",$0); system(cmd)}' file_list.txt > output.txt
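One caveat with building a shell command string: a file name that itself contains a double quote, a dollar sign or a backtick will be reinterpreted by the shell that system() starts. As a possible variation (a different technique, not the command above), awk can read each listed file itself with getline, so no shell ever sees the name. The scratch directory and file names below are invented for the demonstration.

```shell
# Hypothetical demo data in a scratch directory (names are invented).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'first\n'  > 'file1.txt'
printf 'second\n' > 'has "quote" and $dollar.txt'
printf '%s\n' 'file1.txt' 'has "quote" and $dollar.txt' > file_list.txt

# For each name in the list, read that file line by line with getline
# and print it; close() releases the file before moving to the next.
awk '{
    f = $0
    while ((getline line < f) > 0) print line
    close(f)
}' file_list.txt > output.txt

# Prints first, then second.
cat output.txt
```

This sketch assumes text files: awk re-emits each line with a trailing newline, so it is not a byte-for-byte copy when a file's last line lacks one, and names awk treats specially (such as - in some implementations) are not handled.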