How to grab only the 2nd-level domains from a list of subdomains

What I need

I have a list of domains like so:

a.example.com
b.foo.com
a.b.bar.com

I want the output to contain only the second-level domains and nothing else, i.e., no 3rd-level or higher. This is what I'm looking for from my example list above:

example.com
foo.com
bar.com

What I tried

I've tried using sed, awk, and cut as follows:

sed

cat domains.txt | sed 's/\.$//g'
cat domains.txt | sed -r 's/^(.*)_/\1\\/; s/.$//g'  # this removes the last character for some reason

awk

awk '{ sub(/\.$/, ""); print $NF }' domains.txt
cat domains.txt | awk -F\. '{print $1,$2}' | tr ' ' '.' # won't work since there are 4th level domains

cut

cat domains.txt | cut -d '.' -f[field] # won't work since there are 4th level domains

When the match needs to start from the right, use the end anchor $ to pin the pattern to the end of the line.

grep:

grep -Po '[^.]+\.[^.]+$' domains.txt
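The pattern above uses no Perl-specific syntax, so extended regular expressions (-E) should be enough on systems whose grep has no -P (BSD and macOS grep, as far as I know). A minimal sketch, where the function name last_two_labels is made up for illustration and the input is the sample list from the question:

```shell
# Hypothetical helper: print the last two dot-separated labels of each stdin line.
# Same pattern as the -P version, but with extended regex (-E) instead of PCRE.
last_two_labels() {
    grep -Eo '[^.]+\.[^.]+$'
}

# Sample input taken from the question
printf '%s\n' a.example.com b.foo.com a.b.bar.com | last_two_labels
```

Because of the $ anchor, the only match grep can report on each line is the one ending at the end of the line, i.e. the last two labels.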

sed:

sed 's/.*\.\([^.]\+\.[^.]\+\)$/\1/' domains.txt

awk has a predefined variable, NF, that holds the number of fields in the current record. Combining NF with the field operator $ gives the field values themselves: $NF is the last field and $(NF-1) is the one before it.

awk:

awk -F . -vOFS=. '{print $(NF-1), $NF}' domains.txt
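The question's own attempts strip a trailing dot, so the input may contain fully qualified names such as "example.com.". Here is a sketch of the same awk idea that allows for that and skips lines with fewer than two labels. Both behaviours are assumptions about the input, and the function name second_level is made up for illustration:

```shell
# Hypothetical helper: print the last two labels of each domain on stdin.
second_level() {
    awk -F . -v OFS=. '
        { sub(/\.$/, "") }              # assumption: drop a trailing dot; awk re-splits $0, so NF is updated
        NF >= 2 { print $(NF-1), $NF }  # skip lines that have fewer than two labels
    '
}

# Sample input taken from the question
printf '%s\n' a.example.com b.foo.com a.b.bar.com | second_level
```

Setting OFS to a dot is what makes the comma in print join the two fields with "." rather than a space.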

You can also reverse each line for commands such as read or cut, which only work from left to right.

rev, cut:

rev domains.txt | cut -d . -f1,2 | rev
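To see why the double rev works, it may help to run the pipeline one stage at a time on a single name. This sketch uses one sample line from the question:

```shell
# Stage 1: reverse the characters, so the TLD comes first
printf '%s\n' a.b.bar.com | rev                          # moc.rab.b.a

# Stage 2: keep the first two dot-separated fields of the reversed line
printf '%s\n' a.b.bar.com | rev | cut -d . -f1,2         # moc.rab

# Stage 3: reverse again to restore the original character order
printf '%s\n' a.b.bar.com | rev | cut -d . -f1,2 | rev   # bar.com
```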

Bash-only example:

while read -r; do
    printf '%s\n' "${REPLY/#"${REPLY%.*.*}".}"
done < domains.txt
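The nested expansion above is dense, so here is the same idea split into two steps. The function and variable names are made up for illustration. This sketch uses only the % and # expansions, which should also work in a plain POSIX sh:

```shell
# Hypothetical helper: reduce one domain name to its last two labels.
strip_to_second_level() {
    line=$1
    # Step 1: remove the shortest suffix matching ".*.*" (the last two labels),
    # leaving only the leading subdomain part, e.g. a.b.bar.com -> a.b
    prefix=${line%.*.*}
    # Step 2: remove that prefix plus the following dot from the front.
    # If step 1 removed nothing (only two labels), this removes nothing either.
    printf '%s\n' "${line#"$prefix".}"
}

strip_to_second_level a.b.bar.com    # bar.com
strip_to_second_level example.com    # example.com
```

Quoting "$prefix" inside the expansion makes it match literally instead of as a glob pattern.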
