How to get the URL from a file using a shell script
I have a file that contains a URL. I'm trying to extract the URL from that file using a shell script.
In the file, the URL is like this:
('URL', 'http://url.com');
I tried to use the following:
cat file.php | grep 'URL' | awk '{ print $2 }'
It gives the output as:
'http://url.com');
But I need to get only url.com in a variable inside the shell script. How can I accomplish this?
You can do everything with a simple grep:
grep -oP "http://\K[^']+" file.php
From man grep:
-P, --perl-regexp
    Interpret PATTERN as a Perl regular expression (PCRE, see
    below). This is highly experimental and grep -P may warn of
    unimplemented features.
-o, --only-matching
    Print only the matched (non-empty) parts of a matching line,
    with each such part on a separate output line.
The trick is to use \K which, in Perl regex, means discard everything matched to the left of the \K. So, the regular expression looks for strings starting with http:// (which is then discarded because of the \K) followed by as many non-' characters as possible. Combined with -o, this means that only the URL, without the http:// prefix, will be printed.
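To get the result into a variable, as the question asks, the command can be wrapped in command substitution. This is a minimal sketch, assuming GNU grep built with PCRE support (for -P); the temporary directory and the variable name url are only illustrative, and the sample file is recreated so the snippet runs on its own:

```shell
# Recreate the sample file from the question in a temporary directory.
tmpdir=$(mktemp -d)
printf "%s\n" "('URL', 'http://url.com');" > "$tmpdir/file.php"

# Capture grep's output with command substitution.
# Assumes GNU grep with PCRE support (-P).
url=$(grep -oP "http://\K[^']+" "$tmpdir/file.php")

echo "$url"
rm -rf "$tmpdir"
```

If the file contains more than one matching line, url will hold one match per line, so you may want to pipe through head -n 1 first.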
You could also do it in Perl directly:
perl -ne "print if s/.*http:\/\/(.+)\'.*/\$1/" file.php
Something like this?
grep 'URL' file.php | rev | cut -d "'" -f 2 | rev
or
grep 'URL' file.php | cut -d "'" -f 4 | sed s/'http:\/\/'/''/g
The sed step in the second command strips out the leading http://.
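If the value ends up in a shell variable anyway, the prefix can probably be removed without sed by using parameter expansion. A rough sketch, assuming a POSIX shell; the names tmpfile, full_url and host are made up for illustration:

```shell
# Sample line from the question, written to a temporary file.
tmpfile=$(mktemp)
printf "%s\n" "('URL', 'http://url.com');" > "$tmpfile"

# The fourth quote-delimited field is the full URL.
full_url=$(grep 'URL' "$tmpfile" | cut -d "'" -f 4)

# Strip the leading scheme with POSIX parameter expansion.
host=${full_url#http://}

echo "$host"
rm -f "$tmpfile"
```

The ${var#pattern} form removes the shortest match of the pattern from the start of the value, so a URL without the http:// prefix is left unchanged.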
Try this:
awk -F// '{print $2}' file.php | cut -d "'" -f 1
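Note that the pipeline above prints something for every line of the file, including empty output for lines without //. A possible variant that does the whole job in awk and only acts on lines containing URL is sketched below; it assumes the URL is always the fourth single-quote-delimited field, and the names sample and domain are hypothetical:

```shell
# Sample line from the question, written to a temporary file.
sample=$(mktemp)
printf "%s\n" "('URL', 'http://url.com');" > "$sample"

# Split on single quotes, act only on lines containing URL,
# drop the leading http:// from the fourth field, and print it.
domain=$(awk -F"'" '/URL/ { sub(/^http:\/\//, "", $4); print $4 }' "$sample")

echo "$domain"
rm -f "$sample"
```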