Print sub-folder name and content of result.txt to .csv

Solution 1:

I think find is the right choice:

find */ -name "result.txt" -exec bash -c 'printf "%s,%s\n" "${0%%/*}" "$(cat "$0")"' {} \;

Example run

$ mkdir -p a/b c
$ echo r1 >a/b/result.txt
$ echo r2 >c/result.txt
$ tree
.
├── a
│   └── b
│       └── result.txt
└── c
    └── result.txt
$ find */ -name "result.txt" -exec bash -c 'printf "%s,%s\n" "${0%%/*}" "$(cat "$0")"' {} \;
a,r1
c,r2

Explanations

This find command searches for every file named result.txt in or under the current directory and executes the printf command in a bash subshell for each match. The printf command prints the subdirectory's name, a comma and the file content, followed by a newline. If you want to write this output to a file, just append e.g. >final.csv to the command.
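
For example, to collect all of the output in a file named final.csv (the file name is just an example):

find */ -name "result.txt" -exec bash -c 'printf "%s,%s\n" "${0%%/*}" "$(cat "$0")"' {} \; >final.csv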

Even simpler

is the -printf approach suggested by steeldriver:

$ find */ -name 'result.txt' -printf '%H,' -exec cat {} \;
a/,r1
c/,r2

This prints an additional slash in the first column which you can easily remove by piping the output through e.g. sed 's|/,|,|'.
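
Put together with that sed clean-up, it looks like this:

$ find */ -name 'result.txt' -printf '%H,' -exec cat {} \; | sed 's|/,|,|'
a,r1
c,r2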

Merging multiline result.txt content into one cell

To replace newline characters with e.g. spaces, just replace cat with sed ":a;N;\$!ba;s/\n/ /g" in one of the above commands, e.g. (in the run below, a/b/result.txt contains a second r1 line so you can see the merge):

$ find */ -name "result.txt" -exec bash -c 'printf "%s,%s\n" "${0%%/*}" "$(sed ":a;N;\$!ba;s/\n/ /g" "$0")"' {} \;
a,r1 r1
c,r2

If you want some other string as the delimiter, replace the / / part with /your_delimiter/, but keep the slashes.
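
For example, a variant that joins the lines with a semicolon instead of a space (same files as above):

$ find */ -name "result.txt" -exec bash -c 'printf "%s,%s\n" "${0%%/*}" "$(sed ":a;N;\$!ba;s/\n/;/g" "$0")"' {} \;
a,r1;r1
c,r2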

Solution 2:

Well, here's a way (now edited to turn line breaks into spaces, thanks to this answer on Stack Overflow):

shopt -s globstar
n=0; for i in **/result.txt; do sed -e ":l;N;\$!bl;s/\n/ /g; s/.*/$((++n))\. "${i%%/*}"\t&/" "$i"; done

You can add a redirection to write to a file

n=0; for i in **/result.txt; do sed ":l;N;\$!bl;s/\n/ /g; s/.*/$((++n))\. "${i%%/*}"\t&/" "$i"; done > outfile

Notes

  • n=0 sets a variable to increment
  • shopt -s globstar turns on recursive globbing, so ** matches files in directories below this one (unset it with shopt -u globstar afterwards, or exit the shell and start a new one)
  • :l sets a label for this action
  • N appends the next line of input to the pattern space (this is what lets us work with \n)
  • \$! if this is not the last line of the file... we have to escape $ because the whole command is double quoted so that the shell can expand $i etc. But this $ needs to be passed intact to sed, where it means "the last line of the file". I recommend using single quotes for sed scripts unless you have to pass shell variables in them.
  • bl ...branch to label (do it again)
  • s/old/new replace old with new
  • s/\n/ /g for all the newline characters in the pattern space (all but the last one), replace each newline with a space (see the short demo after this list)
  • .* any number of any characters (anything in the file)
  • $((++n)) increment n with each iteration of the loop
  • \. literal dot (commas are not treated specially by sed; they will be printed literally)
  • "${i%%/*}" the name of the first subdirectory of the current one in the path of the file we are dealing with (strip all characters after the first /)
  • & the matched pattern from the search section (anything in the file)
  • -- do not interpret leading - in subsequent arguments as option flags. This prevents filenames beginning with - from being interpreted as options. It is unnecessary in this specific case, because we are matching explicitly on the name result.txt and only files with this exact name will be passed to the loop, but I have included it in case anyone needs to reuse this script with a broader glob.
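
To see the newline-joining sed logic in isolation, here is a small throwaway demo (the file sample.txt is made up just for this illustration):

$ printf 'one\ntwo\nthree\n' > sample.txt
$ sed ':l;N;$!bl;s/\n/ /g' sample.txt
one two three

Because the script is in single quotes here, the $ does not need to be escaped.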

Here's a more readable version, which is also more portable (it should work in all versions of sed) as it uses newlines instead of ; to separate commands:

#!/bin/bash

shopt -s globstar
n=0
for i in **/result.txt; do
    sed ":l
         N
         \$!bl
         s/\n/ /g
         s/.*/$((++n))\.,"${i%%/*}",&/" -- "$i"
done > outfile
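
With the example directories from Solution 1 (single-line files containing r1 and r2), outfile would then contain something like:

1.,a,r1
2.,c,r2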

Solution 3:

Bash script solution

#!/bin/bash
# If no directory argument is given, the current directory "." is used

# Print the file's contents on one line, replacing each newline with a backslash
print_file(){
    local inputfile="$1"
    while IFS= read -r line || [ -n "$line" ];do
        printf "%s\\" "$line"
    done < "$inputfile"
}

# Print "counter,parent-directory," for the given file path
get_file_info(){
    local filepath="$1"
    counter=$((counter+1))
    parent=${filepath%/*}
    if [ "$parent" = "$filepath" ]; then
        parent="."
    fi
    printf "%d,%s," "$counter" "$parent"
}

main(){
    if [ -z "$1"  ];then
        set "."
    fi

    find "$1" -type f -name "result.txt" -print0 |
    while IFS= read -r -d ''  path
    do
        get_file_info "$path"
        print_file "$path"
        printf "\n"
    done
}

main "$@"

To use this, save it as a file, for example result2csv.sh, and make it executable with chmod +x. You can then run it either by giving the full path to the script, or by placing it in your ~/bin folder, running source ~/.bashrc and calling the script by name.
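
For example, a sketch of those steps (assuming the script was saved as result2csv.sh in the current directory, and that ~/bin ends up on your PATH):

$ chmod +x result2csv.sh
$ mkdir -p ~/bin
$ mv result2csv.sh ~/bin/
$ source ~/.bashrc

After that you can call result2csv.sh by name from any directory.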

Here's how this script works:

$ ./result2csv.sh things                                                    
1,things/thing2,to be or not to be\that's Boolean logic\
2,things/thing1,one potato\two potato\

Give the script the top-level directory, and it will go through the subdirectories, find the files and print the path to each file according to how you specified that top-level directory. For example, if you passed ./things, the first line would contain ./things/thing2 as the path to the file. Newlines are replaced with backslashes so the file contents fit on one line. Note that it will assume the current working directory "." if no directory is specified.

$ cd things
$ ../result2csv.sh                                                          
1,./thing2,to be or not to be\that's Boolean logic\
2,./thing1,one potato\two potato\

All you have to do now is call result2csv.sh directory > output.csv to write the data to a file, and you're done.
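
For example, run from the directory containing the script as in the demo above (the order of the lines depends on how find traverses the directories):

$ ./result2csv.sh things > output.csv
$ cat output.csv
1,things/thing2,to be or not to be\that's Boolean logic\
2,things/thing1,one potato\two potato\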