Look through a file and print text from specific lines
Here's a sed approach:
$ sed -nE '1s/.{11}(.{8}).*/\1/p; 3s/.{3}(.{4}).*/\1/p' file
Ethernet
t6 a
Explanation
The -n suppresses normal output (normal is to print every input line) so that it only prints when told to. The -E enables extended regular expressions.
The sed script has two commands, both using the substitution operator (s/original/replacement/). The 1s/.{11}(.{8}).*/\1/p will only run on the 1st line (that's what the 1s does), and will match the first 11 characters of the line (.{11}), then it captures the next 8 ((.{8}); the parentheses are a "capture group") and then everything else until the end of the line (.*). All this is replaced with whatever was in the capture group (\1; if there were a second capture group, it would be \2, etc.). Finally, the p at the end (s/foo/bar/p) causes the line to be printed after the substitution has been made. This results in only the target 8 characters being output.
The second command is the same general idea except that it will only run on the 3rd line (3s) and will keep the 4 characters starting from the 4th.
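The capture-group substitution can also be tried outside sed. The following is a minimal Python sketch, not part of the sed solution: the two sample lines and the helper name keep_group are assumptions made for illustration, since the actual input file is not shown.

```python
import re

# Hypothetical sample lines; the real input file is not shown in the answer.
line1 = "Link encap:Ethernet  HWaddr 00:11:22:33:44:55"
line3 = "inet6 addr: fe80::1/64 Scope:Link"

def keep_group(pattern, line):
    # Mimic sed's s/pattern/\1/: replace the first match with capture group 1.
    # If the pattern does not match, the line is returned unchanged.
    return re.sub(pattern, r"\1", line, count=1)

# Skip 11 characters, keep the next 8, drop the rest.
first = keep_group(r".{11}(.{8}).*", line1)
# Skip 3 characters, keep the next 4, drop the rest.
third = keep_group(r".{3}(.{4}).*", line3)

print(first)
print(third)
```

Passing count=1 mirrors sed's default of replacing only the first match on a line.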
You could also do the same thing with perl:
$ perl -ne 'if($.==1){s/.{11}(.{8}).*/\1/}
elsif($.==3){s/.{3}(.{4}).*/\1/}
else{next}; print; ' file
Ethernet
t6 a
Explanation
The -ne means "read the input file line by line and apply the script given by -e to each line". The script is the same basic idea as before. The $. variable holds the current line number, so we check whether the line number is 1 or 3 and, if so, run the substitution; otherwise we skip the line with next. Therefore the print will only be run for those two lines since all others will be skipped.
Of course, this is Perl, so TIMTOWTDI:
$ perl -F"" -lane '$. == 1 && print @F[11..18]; $. == 3 && print @F[3..6]' file
Ethernet
t6 a
Explanation
Here, the -a means "split each input line on the pattern given by -F and save the result as the array @F". Since the pattern given is empty, this will save each character of the input line as an element in @F. Then, we print elements 11-18 (arrays start counting at 0, and both ends of the range are included) for the 1st line and 3-6 for the 3rd.
awk approach:
$ awk 'NR==1{print substr($0,12,8)};NR==3{print substr($0,4,4)}' input.txt
Ethernet
t6 a
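This uses NR to determine the line number (a "record" in awk terminology) and prints the corresponding substring of the line. The substr() function has the form
substr(string, start, length)
where start is the 1-based position of the first character to keep and length is the number of characters to return.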
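For comparison with the 0-based slicing used in the Python answer, awk's 1-based substr() can be approximated as below. This is a rough sketch: the function name awk_substr and the sample string are made up for illustration, and it only covers the simple case where start is at least 1.

```python
def awk_substr(s, start, length):
    # awk counts characters from 1, Python from 0, so shift the start by one.
    # Assumption: start >= 1 and length >= 0; other edge cases are not modelled.
    begin = start - 1
    return s[begin:begin + length]

# Hypothetical sample line; the real input file is not shown in the answer.
sample = "Link encap:Ethernet  HWaddr 00:11:22:33:44:55"
print(awk_substr(sample, 12, 8))
```

So substr($0,12,8) in awk and line[11:19] in Python select the same eight characters.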
Python
$ python3 -c 'import sys
> for index, line in enumerate(sys.stdin, 1):
>     if index == 1:
>         print(line[11:19])
>     if index == 3:
>         print(line[3:7])' < input.txt
Ethernet
t6 a
This uses the < shell operator to redirect the input file to the Python process's standard input. Note that Python strings are 0-indexed and slices exclude the end index, so characters 12-19 of a line are line[11:19] and characters 4-7 are line[3:7].
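If more lines are involved, the chain of if tests can be replaced by a lookup table. The following is one possible sketch rather than the method used above; the names WANTED and pick_substrings are hypothetical, the input is made up, and the positions are written 1-based with a length, as in the awk version.

```python
import io

# Hypothetical table: line number -> (1-based start, length).
WANTED = {1: (12, 8), 3: (4, 4)}

def pick_substrings(lines, wanted=WANTED):
    """Yield the requested substring for each line number listed in wanted."""
    for number, line in enumerate(lines, 1):
        if number in wanted:
            start, length = wanted[number]
            # Convert the 1-based start to a 0-based slice.
            yield line.rstrip("\n")[start - 1:start - 1 + length]

# Made-up input standing in for the real file.
demo = io.StringIO(
    "Link encap:Ethernet  HWaddr 00:11:22:33:44:55\n"
    "inet addr:192.0.2.10\n"
    "inet6 addr: fe80::1/64 Scope:Link\n"
)
for piece in pick_substrings(demo):
    print(piece)
```

Any iterable of lines works, so sys.stdin or an open file object can be passed instead of the demo buffer.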
portable shell way
This works in ksh, dash, and bash. It relies only on shell built-ins, nothing external.
#!/bin/sh
# Print $1 with its first $2 characters removed.
rsubstr(){
    rmcount=""
    i=0
    while [ "$i" -lt "$2" ]
    do
        rmcount="${rmcount}?"   # one ? glob per character to drop
        i=$((i+1))
    done
    printf '%s\n' "${1#$rmcount}"
}
# Print the first $2 characters of $1.
lsubstr(){
    printf "%.${2}s\n" "$1"
}
# $1 is the line text, $2 is its line number.
line_handler(){
    case $2 in
        1) lsubstr "$(rsubstr "$1" 11)" 8 ;;
        3) lsubstr "$(rsubstr "$1" 3)" 4 ;;
    esac
}
readlines(){
    line_count=1
    while IFS= read -r line
    do
        line_handler "$line" "$line_count"
        line_count=$((line_count+1))
    done < "$1"
}
readlines "$1"
And it works like so:
$ ./get_line_substrings.sh input.txt
Ethernet
t6 a