Print sub-folder name and content of result.txt to .csv
Solution 1:
I think find is the right choice:
find */ -name "result.txt" -exec bash -c 'printf "%s,%s\n" "${0%%/*}" "$(cat "$0")"' {} \;
Example run
$ echo r1 >a/b/result.txt
$ echo r2 >c/result.txt
$ tree
.
├── a
│   └── b
│       └── result.txt
└── c
    └── result.txt
$ find */ -name "result.txt" -exec bash -c 'printf "%s,%s\n" "${0%%/*}" "$(cat "$0")"' {} \;
a,r1
c,r2
Explanations
This find command searches every subdirectory of the current directory for files named result.txt and, for each one found, executes the printf command in a new bash process. The printf command prints the name of the top-level subdirectory, a comma, and the file content, followed by a newline (\n). If you want to write this output to a file, just append e.g. >final.csv to the command.
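To make the two less obvious pieces concrete, here is a minimal sketch of how the first argument after the bash -c script string becomes $0, and how ${0%%/*} reduces it to the top-level directory. The path is a hypothetical sample matching the example tree above.

```shell
# Hypothetical sample path, in the form find would pass it for the tree above
path='a/b/result.txt'

# %%/* removes the longest suffix matching "/*", leaving the first path component
top=${path%%/*}
echo "$top"

# With bash -c, the first argument after the script string is available as $0
via_bash_c=$(bash -c 'printf "%s" "${0%%/*}"' "$path")
echo "$via_bash_c"
```

Both echo lines print the same top-level directory name.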
Even simpler is the -printf approach suggested by steeldriver:
$ find */ -name 'result.txt' -printf '%H,' -exec cat {} \;
a/,r1
c/,r2
This prints an additional slash in the first column, which you can easily remove by piping the output through e.g. sed 's|/,|,|'.
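As a quick check of that clean-up step, this sketch feeds the two sample lines shown above through the sed filter. The input is typed in by hand rather than produced by find, so it only illustrates the substitution itself.

```shell
# Sample lines as printed by the -printf variant above
cleaned=$(printf 'a/,r1\nc/,r2\n' | sed 's|/,|,|')
echo "$cleaned"
```

Because there is no g flag, only the first "/," on each line is replaced, so a "/," inside the file content is left alone.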
Merging multiline result.txt content into one cell
To replace newline characters with e.g. spaces, just replace cat with sed ":a;N;\$!ba;s/\n/ /g" in one of the above commands, e.g.
$ find */ -name "result.txt" -exec bash -c 'printf "%s,%s\n" "${0%%/*}" "$(sed ":a;N;\$!ba;s/\n/ /g" "$0")"' {} \;
a,r1 r1
c,r2
If you want some other string as the delimiter, replace the / / part with /your_delimiter/, but keep the slashes.
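For instance, a sketch with "; " as the delimiter could look like this. The temporary file and its two-line content are made up for the demonstration, and the script is split into separate -e expressions and single-quoted, so the $ needs no backslash here.

```shell
# Throwaway two-line file standing in for a multiline result.txt
tmp=$(mktemp -d)
printf 'r1\nr1\n' > "$tmp/result.txt"

# Same join loop as above, with "; " instead of a space as the delimiter
joined=$(sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/; /g' "$tmp/result.txt")
echo "$joined"

rm -r "$tmp"
```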
Solution 2:
Well, here's a way (now edited to turn line breaks into spaces, thanks to this answer on Stack Overflow):
shopt -s globstar
n=0; for i in **/result.txt; do sed -e ":l;N;\$!bl;s/\n/ /g; s/.*/$((++n))\. "${i%%/*}"\t&/" -- "$i"; done
You can add a redirection to write to a file:
n=0; for i in **/result.txt; do sed -e ":l;N;\$!bl;s/\n/ /g; s/.*/$((++n))\. "${i%%/*}"\t&/" -- "$i"; done > outfile
Notes
- n=0
  set a variable to increment
- shopt -s globstar
  turn on recursive globbing with ** to find all files in directories below this one (unset with shopt -u globstar afterwards, or exit the shell and start a new one)
- :l
  set a label for this action
- N
  append the next line of input to the pattern space (this allows us to use \n)
- \$!
  not if this is the last line of the file... we have to escape $ because the whole command is double quoted so that the shell can expand $i etc. But this $ needs to be passed intact to sed, where it means "the last line of the file". I recommend using single quotes for sed scripts unless you have to pass shell variables in them.
- bl
  ...branch to label (do it again)
- s/old/new/
  replace old with new
- s/\n/ /g
  for all the newline characters in the pattern space (all but the last one), replace the newline with a space
- .*
  any number of any characters (anything in the file)
- $((++n))
  increment n with each iteration of the loop
- \.
  literal dot (commas are not treated specially by sed; they will be printed literally)
- "${i%%/*}"
  the name of the first subdirectory of the current one in the path of the file we are dealing with (strip everything from the first / onward)
- &
  the matched pattern from the search section (anything in the file)
- --
  do not interpret a leading - in subsequent arguments as the start of an option flag. This prevents filenames beginning with - from being interpreted as options. This is unnecessary in this specific case, because we are searching explicitly for result.txt and only files with this exact name will be passed to the loop. However, I have included it in case anyone needs to reuse this script with a glob.
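The quoting advice in the notes can be sketched as follows: keep the parts of the sed script that contain $ in single quotes, and double-quote only the expression that needs a shell variable. The variable dir and the two-line input are hypothetical stand-ins for ${i%%/*} and a result.txt.

```shell
dir='a'   # hypothetical directory name, standing in for ${i%%/*}

# Single-quoted expressions need no backslash before $;
# only the last expression is double-quoted so the shell expands ${dir}
out=$(printf 'x\ny\n' |
      sed -e ':l' -e 'N' -e '$!bl' -e 's/\n/ /g' -e "s/.*/${dir},&/")
echo "$out"
```

A directory name containing / or & would still need escaping before being placed in the replacement text.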
Here's a more readable version, which is also more portable (should work in all versions of sed) as it uses newlines instead of ; to separate commands:
#!/bin/bash
shopt -s globstar
n=0
for i in **/result.txt; do
sed ":l
N
\$!bl
s/\n/ /g
s/.*/$((++n))\.,"${i%%/*}",&/" -- "$i"
done > outfile
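One caveat with the N-based loop: as far as I know, when N runs on the last line of input, GNU sed prints the pattern space and stops without executing the remaining commands, so a result.txt containing a single line would come out without its number and directory prefix. A possible variant, offered as a sketch rather than a drop-in replacement, guards the N with $! so the final substitutions always run. The helper name join_with_prefix and the sample files are made up for this example.

```shell
tmp=$(mktemp -d)
printf 'only line\n' > "$tmp/one.txt"    # single-line sample file
printf 'x\ny\nz\n'   > "$tmp/many.txt"   # multi-line sample file

# Hypothetical helper: $1 is the prefix to print, $2 is the file to flatten
join_with_prefix() {
  # Only run N (and loop) while the current line is not the last one
  sed -e ':l' -e '$!{' -e 'N' -e 'bl' -e '}' \
      -e 's/\n/ /g' -e "s/.*/$1,&/" -- "$2"
}

one=$(join_with_prefix 1 "$tmp/one.txt")
many=$(join_with_prefix 2 "$tmp/many.txt")
echo "$one"
echo "$many"

rm -r "$tmp"
```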
Solution 3:
Bash script solution
#!/bin/bash
# If $1 is not given, find will assume cwd

# Print all lines of the file on one line, each followed by a backslash
print_file(){
    local inputfile="$1"
    # The "|| [ -n ... ]" part also catches a last line without a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\\' "$line"
    done < "$inputfile"
}

# Print the running counter and the directory that contains the file
get_file_info(){
    local filepath="$1"
    counter=$((counter+1))
    parent=${filepath%/*}
    if [ "$parent" = "$filepath" ]; then
        parent="."
    fi
    printf "%d,%s," "$counter" "$parent"
}

main(){
    if [ -z "$1" ]; then
        set -- "."
    fi
    find "$1" -type f -name "result.txt" -print0 |
    while IFS= read -r -d '' path
    do
        get_file_info "$path"
        print_file "$path"
        printf "\n"
    done
}
main "$@"
To use it, save the script as a file, for example result2csv.sh, make it executable with chmod +x, and run it either by giving the full path to the script or by placing it in the ~/bin folder, running source ~/.bashrc, and calling the script by name.
Here's how this script works:
$ ./result2csv.sh things
1,things/thing2,to be or not to be\that's Boolean logic\
2,things/thing1,one potato\two potato\
Give the script the top-most directory, and it will go through the subdirectories, find the files, and print the path of the directory containing each one, written the same way you specified the top-most directory. For example, if you specified ./things as the top-most directory, the first line would show ./things/thing2 as the path. Newlines in the file contents are replaced with backslashes. If no directory is specified, the script assumes the current working directory (".").
$ cd things
$ ../result2csv.sh
1,./thing2,to be or not to be\that's Boolean logic\
2,./thing1,one potato\two potato\
All you have to do now is call result2csv.sh directory > output.csv to write the data to a file, and you're done.
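To see the newline handling in isolation, here is a small sketch that copies print_file from the script and runs it on a throwaway file whose last line has no trailing newline. The file content is borrowed from the sample output above; the temporary directory is only for the demonstration.

```shell
# Copy of print_file from the script above
print_file() {
    local inputfile="$1"
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\\' "$line"
    done < "$inputfile"
}

tmp=$(mktemp -d)
# Note: no trailing newline after the second line
printf 'one potato\ntwo potato' > "$tmp/result.txt"

flat=$(print_file "$tmp/result.txt")
echo "$flat"

rm -r "$tmp"
```

Without the || [ -n "$line" ] test, read would return non-zero on the unterminated last line and that line would be dropped.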