Grep and Regex: filter subdomains in a file
First of all, sorry: I should learn some grep and regex instead of asking this question, but I am short on time right now. I am definitely going to learn egrep eventually.
So here is the input:
202.125.132.76 host av.google.com
202.147.187.10 host cms1web.google.com
202.147.187.10 host cms2web.google.com
"autodiscover.google.com
"cms1web.google.com
"cms2web.google.com
"dialin.google.com
- afghanistan.google.com
- autodiscover.google.com
- bangladesh.google.com
- bdbkashonline.google.com
- cms1web.google.com
*.google.com
*.ibank.google.com
*.ibankintl.google.com
*.itrade.google.com
202.125.133.232 403 host autodiscover.google.com
104.40.82.191 - EnterpriseEnrollment.google.com
107.154.104.16 - iTrade.google.com
107.154.108.2 - MIS.google.com
116.71.129.169 testpaymentapi.google.com
119.159.231.12 host av.google.com
The output should be:
av.google.com
cms1web.google.com
cms2web.google.com
autodiscover.google.com
and so on...
I want only the names matching *.google.com in the result, one per line, and nothing else.
I don't want the double quotes and hyphens at all, only the subdomains as shown above.
Thanks if you could help me :)
$ grep -Po '^[^-*"]*?\K[[:alnum:]-]+\.google\.com$' input
av.google.com
cms1web.google.com
cms2web.google.com
autodiscover.google.com
testpaymentapi.google.com
av.google.com
- non-greedily match and discard a sequence of characters not including `-`, `*`, or `"`
- then match and output a sequence of alphanumeric characters and hyphens (although your input doesn't have any, they are legal in a domain name) followed by `.google.com`
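To see the two steps in action, here is a small self-contained sketch that runs the same pattern over a handful of lines. Most of the sample is copied from your input; the `my-site` line is a made-up addition of mine to show a hyphenated name, and the variable names `sample` and `matches` are only for illustration. It assumes GNU grep built with PCRE support, since `-P` and `\K` are not available in every grep.

```shell
# Sample lines: mostly from the question, plus one hypothetical
# hyphenated host (my-site) that is NOT in the original input.
sample='202.125.132.76 host av.google.com
"dialin.google.com
- afghanistan.google.com
*.ibank.google.com
107.154.104.16 - iTrade.google.com
10.0.0.1 host my-site.google.com
116.71.129.169 testpaymentapi.google.com'

# ^[^-*"]*?                    lazily skip a prefix containing no -, * or "
# \K                           drop that prefix from what -o prints
# [[:alnum:]-]+\.google\.com$  the name itself, which must end the line
matches=$(printf '%s\n' "$sample" |
  grep -Po '^[^-*"]*?\K[[:alnum:]-]+\.google\.com$')

printf '%s\n' "$matches"
# prints:
#   av.google.com
#   my-site.google.com
#   testpaymentapi.google.com
```

The lines that start with `"`, `- ` or `*.`, and the one with ` - ` before the name, produce no output: the discarded prefix cannot contain those characters, and the part that is kept has to run unbroken up to `.google.com`.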