Extract numbers from string using shell
Using a shell script, how do I extract two sets of numbers from a string like "R14C11"? I'd like to get the result as an AppleScript list.
R14C11 -> {14, 11}
R5C9 -> {5, 9}
"R" and "C" will be constants but the number of digits in each set of numbers can vary.
There are many ways to do this; one is to use sed:
echo R5C9 | sed -E 's|R(.*)C(.*)|{\1, \2}|'
Or, if you want to ensure that only input with the correct format is matched:
echo R5C9 | sed -E 's|R([[:digit:]]+)C([[:digit:]]+)|{\1, \2}|'
Some explanations:
- -E enables extended regular expressions, which among other things makes the pattern easier to write.
- s|SOURCE|TARGET| is the substitution command that transforms SOURCE into TARGET.
- R([[:digit:]]+)C([[:digit:]]+) is the source pattern we are looking for: an R followed by at least one digit ([[:digit:]]+), followed by C, followed again by at least one digit.
- The target replaces the matched source, with \1 standing for the text matched within the first () in the source and \2 for the second.
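Note that sed prints non-matching lines unchanged. If malformed input should produce no output instead, one possible variant anchors the pattern and prints only after a successful substitution. This is a minimal sketch, assuming a sed that supports -E (such as GNU sed or the BSD sed shipped with macOS); the function name rc_to_list_sed is hypothetical and only used for illustration.

```shell
# Hypothetical helper: prints "{row, col}" only for well-formed "R<row>C<col>" input.
rc_to_list_sed() {
  # -n suppresses automatic printing; the trailing p prints only when the
  # substitution succeeded. ^ and $ reject leading or trailing characters.
  printf '%s\n' "$1" | sed -nE 's|^R([[:digit:]]+)C([[:digit:]]+)$|{\1, \2}|p'
}

rc_to_list_sed R14C11   # prints {14, 11}
rc_to_list_sed R5C9     # prints {5, 9}
rc_to_list_sed foo      # prints nothing
```

The unanchored commands above would echo "foo" back as-is, which may or may not be what the calling AppleScript expects.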
You can also just use bash itself:
[[ "R5C9" =~ R([0-9]+)C([0-9]+) ]] && echo "{${BASH_REMATCH[1]}, ${BASH_REMATCH[2]}}"
- [[ "R5C9" =~ R([0-9]+)C([0-9]+) ]] matches the text, with R([0-9]+)C([0-9]+) being basically the same as the source pattern above.
- Matching parts within () get assigned to the shell array BASH_REMATCH.
- The echo is only executed if the match ([[ ... ]]) was successful, and then prints the reformatted expression (which is a bit confusing to read because the various {} mean different things: the outer pair is literal output, while each ${...} is a shell parameter expansion).
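To reuse the bash approach on arbitrary input, it could be wrapped in a function along these lines. This is a sketch rather than a definitive implementation: it requires bash (not plain sh), the function name rc_to_list is hypothetical, and the ^ and $ anchors are an addition so that input with extra characters is rejected.

```shell
#!/usr/bin/env bash
# Hypothetical helper: converts "R<row>C<col>" into "{row, col}".
rc_to_list() {
  # Keeping the regex in a variable avoids quoting pitfalls with =~.
  local re='^R([0-9]+)C([0-9]+)$'
  if [[ $1 =~ $re ]]; then
    # printf keeps the literal braces apart from the ${...} expansions.
    printf '{%s, %s}\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  else
    return 1   # malformed input: print nothing and signal failure
  fi
}

rc_to_list R14C11                      # prints {14, 11}
rc_to_list R5C9                        # prints {5, 9}
rc_to_list R5C9x || echo "no match"    # prints no match
```

Using printf with a format string also sidesteps the readability issue mentioned above, since the output braces no longer sit next to the expansion braces.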