Grep and Regex: filter subdomains in a file

First of all, sorry: I should learn some grep and regex instead of asking this question, but I am short on time right now. I definitely intend to learn egrep eventually.

So here is the input:

202.125.132.76          host    av.google.com
202.147.187.10          host    cms1web.google.com
202.147.187.10          host    cms2web.google.com
    "autodiscover.google.com
    "cms1web.google.com
    "cms2web.google.com
    "dialin.google.com
 - afghanistan.google.com
  - autodiscover.google.com
  - bangladesh.google.com
  - bdbkashonline.google.com
  - cms1web.google.com
*.google.com
*.ibank.google.com
*.ibankintl.google.com
*.itrade.google.com
202.125.133.232 403     host    autodiscover.google.com
104.40.82.191 - EnterpriseEnrollment.google.com
107.154.104.16 - iTrade.google.com
107.154.108.2 - MIS.google.com
116.71.129.169  testpaymentapi.google.com
119.159.231.12          host    av.google.com

The output should be:

av.google.com
cms1web.google.com
cms2web.google.com
autodiscover.google.com
and so on.

I want only the *.google.com hostnames in the result and nothing else, one per line.

I don't want the quotation marks and hyphens at all, only the subdomains as shown above.

Thanks in advance for any help :)


$ grep -Po '^[^-*"]*?\K[[:alnum:]-]+\.google\.com$' input
av.google.com
cms1web.google.com
cms2web.google.com
autodiscover.google.com
testpaymentapi.google.com
av.google.com
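The pattern works in two steps:

  • ^[^-*"]*?\K non-greedily matches, from the start of the line, a sequence of characters not including -, *, or ", and \K then discards it from the reported match; a line that has one of those characters before the hostname therefore produces no output at all

then

  • [[:alnum:]-]+\.google\.com$ matches and outputs a sequence of alphanumeric characters and hyphens (none of the hostnames in your input contain a hyphen, but hyphens are legal in a domain name) followed by .google.com at the end of the line

-P (a GNU grep option) enables Perl-compatible regular expressions, which \K requires, and -o prints only the matched part of each line.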
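
If the intent was instead to keep every hostname and only strip the surrounding quotes, hyphens and asterisks, a variant along these lines might be closer. This is only a sketch under assumptions: the temporary file holds a few lines copied from the question, the variable names input and result are made up for the example, and duplicates are removed with sort -u, which the question did not ask for. It uses -E rather than -P, so it does not need Perl-compatible regex support; -o is available in both GNU and BSD grep.

```shell
# Hypothetical sample file: a handful of lines copied from the question's input.
input=$(mktemp)
cat > "$input" <<'EOF'
202.125.132.76          host    av.google.com
    "autodiscover.google.com
  - bangladesh.google.com
*.ibank.google.com
107.154.104.16 - iTrade.google.com
119.159.231.12          host    av.google.com
EOF

# -E: extended regular expressions; -o: print only the matched text.
# The label must start with an alphanumeric character, so a hyphen that
# directly precedes a hostname is never captured as part of it.
# LC_ALL=C keeps the sort order independent of the user's locale.
result=$(grep -Eo '[[:alnum:]][[:alnum:]-]*\.google\.com$' "$input" | LC_ALL=C sort -u)

printf '%s\n' "$result"
rm -f "$input"
```

Like the command above, this captures only the single label directly in front of .google.com, so *.ibank.google.com is reported as ibank.google.com, and a bare *.google.com produces nothing.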