sed: extracting value of a key-value pair in a URL query string
Even simpler, if you just want the abc:
echo 'http://www.youtube.com/watch?v=abc&g=xyz' | awk -F'[=&]' '{print $2}'
If you want the xyz:
echo 'http://www.youtube.com/watch?v=abc&g=xyz' | awk -F'[=&]' '{print $4}'
EXPLANATION:
awk: a scripting language that automatically processes input files line by line, splitting each line into fields. So, when you process a file with awk, for each line, the first field is $1, the second $2, etc. up to $N. By default awk uses blanks (spaces and tabs) as the field separator.
-F'[=&]': -F is used to change the field delimiter from blanks to something else. In this case, I am giving it a class of characters. Square brackets ([ ]) are used by many languages to denote groups of characters. So, specifically, -F'[=&]' means that awk should use both & and = as field delimiters.
Therefore, given the input string from your question, using & and = as delimiters, awk will read the following fields:
http://www.youtube.com/watch?v=abc&g=xyz
|----------- $1 -------------| --- - ---
                                |  |  |
                                |  |  ----- $4
                                |  -------- $3
                                ----------- $2
So, all you need to do is print whichever one you want, for example {print $4}.
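To see the split for yourself, the following sketch prints every field next to its number. The name show_fields is only an illustrative wrapper I made up, not part of awk:

```shell
# show_fields is a hypothetical helper name used only for this illustration.
# It splits its argument on = and & and prints each field with its index.
show_fields() {
    printf '%s\n' "$1" |
    awk -F'[=&]' '{ for (i = 1; i <= NF; i++) print "$" i " = " $i }'
}

show_fields 'http://www.youtube.com/watch?v=abc&g=xyz'
```

For the URL from the question this should list four fields, with abc as $2 and xyz as $4.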
You said you also want to check that the string is a valid YouTube URL. A plain sed substitution will not do that: if the line does not match the regex you give it, sed simply prints the entire line unchanged. You can use a tool like Perl to print only if the regex matches:
echo 'http://www.youtube.com/watch?v=abc&g=xyz' |
perl -ne 's/http.*www.youtube.com\/watch\?v=(.+?)&.+/$1/ && print'
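For comparison, sed can also be told to stay silent on non-matching lines by combining the -n option with the p flag of the s command. This is a minimal sketch; the function name youtube_id and the exact pattern are my own assumptions about what counts as a valid URL, not something taken from the question:

```shell
# youtube_id is a hypothetical helper; the pattern is an assumption about
# what a "valid" watch URL looks like (http or https, www.youtube.com host).
youtube_id() {
    printf '%s\n' "$1" |
    # -n suppresses default output; the trailing p prints only on a match.
    sed -n 's|^https\{0,1\}://www\.youtube\.com/watch?v=\([^&]*\).*|\1|p'
}

youtube_id 'http://www.youtube.com/watch?v=abc&g=xyz'   # prints abc
youtube_id 'http://example.com/watch?v=abc&g=xyz'       # prints nothing
```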
Finally, to simply print abc you can use the standard UNIX tool cut:
echo 'http://www.youtube.com/watch?v=abc&g=xyz' |
cut -d '=' -f 2 | cut -d '&' -f 1
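If you need "xyz", try this (GNU sed). The leading .* is greedy, so it consumes everything up to the last =, and the capture group holds the final value: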
echo 'http://www.youtube.com/watch?v=abc&g=xyz' | sed 's/.*=\([[:alnum:]]*\).*/\1/'
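All of the commands above depend on the position of the parameter in the URL. If the order may vary, a lookup by key name is more robust. The sketch below is one possible way to do that with awk; get_param is a hypothetical helper, and it assumes every parameter has the form key=value with no literal ?, & or = inside the values (and no #fragment at the end):

```shell
# get_param URL KEY: print the value of query parameter KEY (hypothetical helper).
get_param() {
    printf '%s\n' "$1" |
    awk -F'[?&=]' -v key="$2" '{
        # $1 is everything before the ?; after that, keys and values alternate.
        for (i = 2; i < NF; i += 2)
            if ($i == key) { print $(i + 1); exit }
    }'
}

get_param 'http://www.youtube.com/watch?v=abc&g=xyz' v   # prints abc
get_param 'http://www.youtube.com/watch?g=xyz&v=abc' g   # prints xyz
```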