Is it possible with Gedit or the command line to modify every fourth line of a text file?

Solution 1:

You could use a command-line stream editor such as sed:

sed 'N;N;N;s/\n/\t/g' file > file.tsv

or, more programmatically, by appending a backslash line-continuation character to each line you want to join, using GNU sed's first~step address form (here 0~4, negated with !), and then applying the classic one-liner for joining continued lines:

sed '0~4! s/$/\t\\/' file | sed -e :a -e '/\\$/N; s/\\\n//; ta'

See, for example, Sed One-Liners Explained:

  1. Append a line to the next if it ends with a backslash "\".

    sed -e :a -e '/\\$/N; s/\\\n//; ta'
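
To see what each stage of the two-step pipeline above does, here is a minimal sketch on a made-up eight-line file. The file names sample.txt and joined.tsv are only illustrative, and the 0~4 address requires GNU sed.

```shell
# Hypothetical sample data: eight lines, i.e. two groups of four.
printf '%s\n' a b c d e f g h > sample.txt

# Stage 1: every line whose number is NOT a multiple of 4 gets a tab
# and a trailing backslash appended; lines 4 and 8 are left alone.
sed '0~4! s/$/\t\\/' sample.txt

# Stage 2: any line ending in a backslash is joined with the next line,
# and the backslash-newline pair is removed.
sed '0~4! s/$/\t\\/' sample.txt | sed -e :a -e '/\\$/N; s/\\\n//; ta' > joined.tsv
cat joined.tsv
```

Stage 1 on its own shows the marked-up lines, which makes it easier to check that the right lines were selected before joining them.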
    

However, IMHO it would be easier with one of the other standard text-processing utilities, e.g.

paste - - - - < file > file.tsv

(the number of - will correspond to the number of columns) or

pr -aT -s$'\t' -4 file > file.tsv

(you can omit the -s$'\t' if you don't mind the output columns being separated by multiple tabs).
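
Since the number of - operands determines the number of columns, the paste approach can be wrapped in a small helper when the column count varies. This is only a sketch: the function name join_every and the file nums.txt are made up for illustration.

```shell
# Hypothetical helper: join every N lines of FILE into one tab-separated row.
join_every() {
    n=$1
    file=$2
    # Build a list of N "-" operands; paste reads one input line per operand.
    set --
    i=0
    while [ "$i" -lt "$n" ]; do
        set -- "$@" -
        i=$((i + 1))
    done
    paste "$@" < "$file"
}

# Usage: six lines grouped three per row.
printf '%s\n' 1 2 3 4 5 6 > nums.txt
join_every 3 nums.txt
```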


The strange re-import behavior that you are observing is almost certainly because the original file has Windows-style CRLF line endings. If you need to work with files from Windows, then you can roll the conversion into the command in various ways e.g.

tr -d '\r' < file.csv | paste - - - -

or

sed 'N;N;N;s/\r\n/\t/g' file.csv

The former will remove ALL carriage returns whereas the latter will preserve a CR at the end of each of the new lines (which may be what you want if the intended end user is on Windows).
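
A quick way to confirm the CRLF diagnosis and compare the two variants is sketched below. The sample file and the output file names are made up; od -c is used only to make the line endings visible.

```shell
# Hypothetical sample file with Windows-style CRLF line endings.
printf 'a\r\nb\r\nc\r\nd\r\n' > crlf.csv

# Count lines that end in a carriage return; a non-zero count suggests CRLF.
grep -c "$(printf '\r')\$" crlf.csv

# Variant 1: strip every CR, then join. The result has plain LF endings.
tr -d '\r' < crlf.csv | paste - - - - > unix.tsv

# Variant 2: replace only the inner CRLF pairs; the final CR is preserved.
sed 'N;N;N;s/\r\n/\t/g' crlf.csv > windows.tsv

# Only windows.tsv should still show \r before the final \n.
od -c unix.tsv
od -c windows.tsv
```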

Solution 2:

You can use xargs to always group four lines into one, separated with a single space each:

xargs -d '\n' -n4 < inputfile.txt

-d '\n' sets the input delimiter to a newline character, otherwise it would also break on spaces. If you only have one word per input line anyway, you can even omit this.
-n4 sets the argument number (the number of input items per output line) to 4.

Output:

Dog Cat Fish Lizard
Wolf Lion Shark Gecko
Coyote Puma Eel Iguana

Or if you want tabs as separators instead of a space, you can replace them afterwards. However, if you had spaces in your input lines, those would get replaced too:

xargs -d '\n' -n4 < inputfile.txt | tr ' ' '\t'

Output (its appearance depends on the tab width of your browser or terminal):

Dog Cat Fish    Lizard
Wolf    Lion    Shark   Gecko
Coyote  Puma    Eel Iguana
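
The caveat about spaces inside input lines can be checked with a small experiment. This sketch uses a made-up file, animals.txt, whose first line contains a space, and counts the tab-separated fields each approach produces; paste (from Solution 1) is shown only for comparison.

```shell
# Hypothetical input: the first line contains a space.
printf '%s\n' 'Grey Wolf' Lion Shark Gecko > animals.txt

# xargs joins with spaces, so tr also converts the space inside "Grey Wolf".
xargs -d '\n' -n4 < animals.txt | tr ' ' '\t' > xargs.tsv

# paste inserts the tabs itself and leaves the original space alone.
paste - - - - < animals.txt > paste.tsv

# Count tab-separated fields per row: 5 for xargs+tr, 4 for paste.
awk -F '\t' '{print NF}' xargs.tsv
awk -F '\t' '{print NF}' paste.tsv
```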

Solution 3:

You could also use:

awk -v ORS="" '{print $1; print NR%4==0?"\n":"\t"}' file > file.tsv 

The two awk built-in variables are:

  • ORS: Output Record Separator (default: newline). It is appended to the output of each print statement.
  • NR: Number of Records read so far, i.e. the number of the line awk is currently processing.

For each line, this command prints the content of the first (and here only) column. It then adds either a newline or a tab, depending on the remainder of NR divided by 4.
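
To make the remainder test concrete, the following sketch prints NR, NR % 4 and the separator that would be chosen for each input line. The function name trace_nr is made up for illustration.

```shell
# Hypothetical helper: show which separator the NR % 4 test selects per line.
trace_nr() {
    awk '{ print NR, NR % 4, (NR % 4 == 0 ? "newline" : "tab") }'
}

printf '%s\n' Dog Cat Fish Lizard Wolf | trace_nr
# 1 1 tab
# 2 2 tab
# 3 3 tab
# 4 0 newline
# 5 1 tab
```

Only every fourth line yields a remainder of 0, which is where the output row is ended.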

Solution 4:

Another, shorter awk approach:

awk '{printf "%s%s", $0, (NR%4?"\t":"\n")}' infile

This uses printf to print each line followed by a Tab (\t) character, except that a newline (\n) is printed instead whenever the record number is a multiple of 4. In that case NR%4 evaluates to 0 (false), so the ternary operator condition ? when-true : when-false selects the newline.
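
One thing to keep in mind with printf is that its first argument is a format string, so input lines containing a % character are safest passed as data through an explicit "%s". A minimal sketch, using a made-up file percent.txt:

```shell
# Hypothetical input whose lines contain literal percent signs.
printf '%s\n' '50%' '25%' '15%' '10%' > percent.txt

# The input line is passed as an argument to "%s", never used as the format.
awk '{printf "%s%s", $0, (NR%4 ? "\t" : "\n")}' percent.txt > percent.tsv
cat percent.tsv
```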