Is it possible with Gedit or the command line to modify every fourth line of a text file?
Solution 1:
You could use a command-line editor such as sed
sed 'N;N;N;s/\n/\t/g' file > file.tsv
or, more programmatically, by appending a backslash line-continuation character to each of the lines you want to join (selected with GNU sed's first~step address operator) and then applying the classic one-liner for joining continued lines:
sed '0~4! s/$/\t\\/' file | sed -e :a -e '/\\$/N; s/\\\n//; ta'
See, for example, Sed One-Liners Explained:
Append a line to the next if it ends with a backslash "\".
sed -e :a -e '/\\$/N; s/\\\n//; ta'
However, IMHO it would be easier with one of the other standard text-processing utilities, e.g.
paste - - - - < file > file.tsv
(the number of - arguments corresponds to the number of columns) or
pr -aT -s$'\t' -4 file > file.tsv
(you can omit the -s$'\t' if you don't mind the output being separated by multiple tabs).
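As a small worked example of the paste approach, the sketch below builds a throwaway sample file and joins it four lines at a time. The sample words and the variable names (tmpfile, joined) are only illustrative assumptions, not part of the original question.

```shell
# Build a small sample file: one word per line, eight lines in total.
# (The file contents and variable names are made up for illustration.)
tmpfile=$(mktemp)
printf '%s\n' Dog Cat Fish Lizard Wolf Lion Shark Gecko > "$tmpfile"

# Each "-" makes paste read one line from standard input, so four of
# them join every four consecutive lines into one tab-separated row.
joined=$(paste - - - - < "$tmpfile")
printf '%s\n' "$joined"

rm -f "$tmpfile"
```

With eight input lines this should give two rows of four tab-separated fields each.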
The strange re-import behavior that you are observing is almost certainly because the original file has Windows-style CRLF line endings. If you need to work with files from Windows, then you can roll the conversion into the command in various ways e.g.
tr -d '\r' < file.csv | paste - - - -
or
sed 'N;N;N;s/\r\n/\t/g' file.csv
The former will remove ALL carriage returns whereas the latter will preserve a CR at the end of each of the new lines (which may be what you want if the intended end user is on Windows).
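To check the effect of stripping carriage returns, here is a minimal sketch that simulates a CRLF file with printf and then counts the carriage returns left after the tr and paste pipeline. The sample data and the names crlf_file, clean and cr_count are assumptions made for the demonstration.

```shell
# Simulate a Windows-style file: every line ends in CR LF.
# (Sample data and variable names are illustrative only.)
crlf_file=$(mktemp)
printf '%s\r\n' Dog Cat Fish Lizard > "$crlf_file"

# Strip all carriage returns, then join four lines per row.
clean=$(tr -d '\r' < "$crlf_file" | paste - - - -)

# Count the carriage returns remaining in the result.
cr_count=$(printf '%s' "$clean" | tr -cd '\r' | wc -c)

printf '%s\n' "$clean"
rm -f "$crlf_file"
```

If cr_count is 0, the joined row contains no stray carriage returns that could confuse a later import.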
Solution 2:
You can use xargs to always group four lines into one, separated with a single space each:
xargs -d '\n' -n4 < inputfile.txt
- -d '\n' sets the input delimiter to a newline character; otherwise xargs would also split on spaces. If you only have one word per input line anyway, you can omit it.
- -n4 sets the argument number (the number of input items per output line) to 4.
Output:
Dog Cat Fish Lizard
Wolf Lion Shark Gecko
Coyote Puma Eel Iguana
Or if you want tabs as separators instead of a space, you can replace them afterwards. However, if you had spaces in your input lines, those would get replaced too:
xargs -d '\n' -n4 < inputfile.txt | tr ' ' '\t'
Output (the appearance depends on the tab width of your browser or terminal):
Dog Cat Fish Lizard
Wolf Lion Shark Gecko
Coyote Puma Eel Iguana
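The caveat about spaces in the input can be seen in a short sketch. It assumes GNU xargs (for the -d option) and uses a made-up input line, "Great Dane", that contains a space. The paste pipeline is shown only for comparison.

```shell
# Hypothetical input: the first line contains a space.
lines=$(printf '%s\n' 'Great Dane' Cat Fish Lizard)

# xargs joins the four lines with single spaces, so translating every
# space to a tab also splits "Great Dane" into two fields.
fields=$(printf '%s\n' "$lines" | xargs -d '\n' -n4 | tr ' ' '\t' |
  awk -F '\t' '{ print NF }')

# For comparison, paste inserts the tabs itself and leaves spaces alone.
safe_fields=$(printf '%s\n' "$lines" | paste - - - - |
  awk -F '\t' '{ print NF }')

printf 'xargs+tr: %s fields, paste: %s fields\n' "$fields" "$safe_fields"
```

The xargs and tr pipeline yields five tab-separated fields for four input lines, whereas paste keeps four.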
Solution 3:
You could also use:
awk -v ORS="" '{print $1; print NR%4==0?"\n":"\t"}' file > file.tsv
The two awk built-in variables are:
- ORS: the Output Record Separator (default: newline). It is appended after each print statement.
- NR: the Number of Records read so far, i.e. the number of the line awk is currently processing.
For each line, this command prints the content of the first (and here only) column. It then appends either a newline or a tab, depending on the remainder of dividing NR by 4.
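When the number of input lines is not a multiple of 4, the last row would otherwise end in a tab with no final newline. One possible variation, sketched here under the assumption that a short last row is acceptable, writes the separator before each field instead of after it and closes the last row in an END block. The seven single-letter input lines are just sample data.

```shell
# Seven input lines: the last group has only three entries.
out=$(printf '%s\n' a b c d e f g |
  awk '
    NR % 4 == 1 && NR > 1 { printf "\n" }   # start a new row after every 4th line
    NR % 4 != 1           { printf "\t" }   # separate fields within a row
                          { printf "%s", $0 }
    END                   { if (NR) printf "\n" }  # terminate the final row
  ')
printf '%s\n' "$out"
```

This gives one full row (a to d) and one shorter row (e to g), with no trailing tab.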
Solution 4:
Another short awk approach:
awk '{printf "%s%s", $0, (NR%4 ? "\t" : "\n")}' infile
This prints each line (the only column) followed by a Tab (\t) character, except that it prints a newline (\n) character instead whenever the record number is a multiple of 4. In that case NR%4 returns 0 (false), so the ternary operator condition ? when-true : when-false selects the newline. Passing $0 through a %s format, rather than using it as the format string itself, keeps any % characters in the input from being interpreted by printf.
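Because awk's printf treats its first argument as a format string, input lines that contain a % character are worth testing. The sketch below feeds made-up percentage values through a printf call that passes each line as data to a %s placeholder. The variable name pct is an assumption for the example.

```shell
# Hypothetical input lines that contain a percent sign.
pct=$(printf '%s\n' '50%' '25%' '10%' '15%' |
  awk '{ printf "%s%s", $0, (NR % 4 ? "\t" : "\n") }')
printf '%s\n' "$pct"
```

Each value should come through unchanged, joined by tabs on a single row.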