Extract version number from a string

I have a string with components and version numbers:

data-c(kuh-small1);divider-bin-1.4.4;divider-conf-1.3.3-w(1,16);storage-bin-1.5.4;storage-conf-1.5.0-w(1);worker-bin-4.5.1;worker-conf-4.4.1-c(kuh)-win2

For a shell script, I need to extract the version number of the divider binary. So I need to yield:

1.4.4

What would be a good way to do this? With sed?


Following Kent's answers, this can work:

grep -Po '(?<=divider-bin-)\d\.\d\.\d'

and even better:

grep -Po '(?<=divider-bin-)[^;]+'

It matches everything after divider-bin- up to the next ; character. This way any N.N. ... .N format will work, no matter how many blocks there are or how many digits each block has.

Test:

$ echo "data-c(kuh-small1);divider-bin-1.4.4;divider-conf-1.3.3-w(1,16);storage-bin-1.5.4;storage-conf-1.5.0-w(1);worker-bin-4.5.1;worker-conf-4.4.1-c(kuh)-win2" | grep -Po '(?<=divider-bin-)[^;]+'
1.4.4
$ echo "data-c(kuh-small1);divider-bin-1.4;divider-conf-1.3.3-w(1,16);storage-bin-1.5.4;storage-conf-1.5.0-w(1);worker-bin-4.5.1;worker-conf-4.4.1-c(kuh)-win2" | grep -Po '(?<=divider-bin-)[^;]+'
1.4
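
The same "take everything after divider-bin- up to the next ;" idea can also be sketched with bash parameter expansion alone, which may help where grep -P (a GNU extension) is not available. This is a different technique from the grep commands above, shown only as a minimal sketch; the variable names rest and version are made up for illustration.

```shell
#!/usr/bin/env bash
# Shortened sample input taken from the question.
string='data-c(kuh-small1);divider-bin-1.4.4;divider-conf-1.3.3-w(1,16);storage-bin-1.5.4'

case $string in
    *divider-bin-*)
        # Drop everything up to and including the first "divider-bin-".
        rest=${string#*divider-bin-}
        # Drop everything from the first ";" onwards.
        version=${rest%%;*}
        echo "$version"   # prints 1.4.4
        ;;
    *)
        # Without this check, a missing component would yield a wrong value.
        echo "divider-bin not found" >&2
        ;;
esac
```

No external process is started, but the pattern is a shell glob rather than a regular expression, so it cannot validate that the captured text really looks like a version.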

This more general answer should work for most inputs that contain a dotted version number.

I use one of these three one-line perl commands depending on the expected input string:

  1. The input string always contains exactly one version

     perl -pe '($_)=/([0-9]+([.][0-9]+)+)/' 
    
  2. To extract several versions (several lines)

     perl -pe 'if(($_)=/([0-9]+([.][0-9]+)+)/){$_.="\n"}'
    
  3. To extract one single version (the first one) in all cases

     perl -pe 'if(($v)=/([0-9]+([.][0-9]+)+)/){print"$v\n";exit}$_=""'
    

  1. The simplest

The first command line does not print a trailing newline:

> gcc --version | perl -pe '($_)=/([0-9]+([.][0-9]+)+)/'
5.3.1

> bash --version | perl -pe '($_)=/([0-9]+([.][0-9]+)+)/'
4.2.46

> perl -pe '($_)=/([0-9]+([.][0-9]+)+)/' <<< 'A2 33. Z-0.1.2.3.4..5'
0.1.2.3.4

> uname -a | perl -pe '($_)=/([0-9]+([.][0-9]+)+)/'
3.10.0

> lsb_release | perl -pe '($_)=/([0-9]+([.][0-9]+)+)/'
4.1

Store the version within a shell variable:

> v=$( gcc --version | perl -pe '($_)=/([0-9]+([.][0-9]+)+)/' )
> echo "GCC version is '$v'"
GCC version is '5.3.1'

But it fails when the input contains several version numbers, because the matches are printed back to back:

> gwenview --version
Qt: 4.8.5
KDE Development Platform: 4.14.8
Gwenview: 4.10.4

> gwenview --version | perl -pe '($_)=/([0-9]+([.][0-9]+)+)/'
4.8.54.14.84.10.4

  2. Extract one version per line

> gwenview --version | perl -pe 'if(($_)=/([0-9]+([.][0-9]+)+)/){$_.="\n"}' 
4.8.5
4.14.8
4.10.4

> mvn --version
Apache Maven 3.0.4
Maven home: /usr/share/maven
Java version: 1.7.0_25, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-7-openjdk-amd64/jre
Default locale: fr_FR, platform encoding: UTF-8
OS name: "linux", version: "3.11.0-13-generic", arch: "amd64", family: "unix"

> mvn --version | perl -pe 'if(($_)=/([0-9]+([.][0-9]+)+)/){$_.="\n"}'
3.0.4
1.7.0
3.11.0

  3. Extract the first version only

> gwenview --version | perl -pe 'if(($v)=/([0-9]+([.][0-9]+)+)/){print"$v\n";exit}$_=""'
4.8.5

Create an alias:

> alias extractor='perl -pe '\''if(($v)=/([0-9]+([.][0-9]+)+)/){print"$v\n";exit}$_=""'\'''

Or, if your bash supports $'...' quoting:

> alias extractor=$'perl -pe \'if(($v)=/([0-9]+([.][0-9]+)+)/){print"$v\n";exit}$_=""\''

> v=$( mvn --version | extractor )
> echo "The maven version is '$v'"
The maven version is '3.0.4'
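
Inside a script, a function may be more convenient than an alias, since bash does not expand aliases in non-interactive shells unless expand_aliases is set. The following is a minimal sketch that reuses the third one-liner unchanged; the name extract_version and the sample input are only illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the "first version only" one-liner above.
# Reads text on stdin and prints the first dotted number it finds.
extract_version() {
    perl -pe 'if(($v)=/([0-9]+([.][0-9]+)+)/){print"$v\n";exit}$_=""'
}

# Sample input mimicking the first lines of "mvn --version".
v=$(printf 'Apache Maven 3.0.4\nJava version: 1.7.0_25\n' | extract_version)
echo "The maven version is '$v'"   # prints: The maven version is '3.0.4'
```

A function also avoids the nested quoting that the alias definitions need.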

sed can handle this easily:

string="data-c(kuh-small1);divider-bin-1.4.4;divider-conf-1.3.3-w(1,16);storage-bin-1.5.4;storage-conf-1.5.0-w(1);worker-bin-4.5.1;worker-conf-4.4.1-c(kuh)-win2"

echo "$string" | sed "s/^.*divider-bin-\([0-9.]*\).*/\1/"
1.4.4

You can tighten it up further, for example by capturing the version only up to the next ";":

sed "s/^.*divider-bin-\([^;]*\).*/\1/"
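
One thing to watch for with the sed commands above: when the pattern does not match, sed prints the input line unchanged, so a script could mistake the whole string for a version. A minimal sketch of one way to guard against that, using -n together with the p flag so that only substituted lines are printed (the variable name version and the error message are made up for illustration):

```shell
#!/usr/bin/env bash
# Shortened sample input taken from the question.
string='data-c(kuh-small1);divider-bin-1.4.4;divider-conf-1.3.3-w(1,16)'

# -n suppresses automatic printing; the trailing p prints a line only
# when the substitution actually happened on it.
version=$(printf '%s\n' "$string" | sed -n 's/^.*divider-bin-\([^;]*\).*/\1/p')

if [ -n "$version" ]; then
    echo "$version"   # prints 1.4.4
else
    echo "divider-bin not found" >&2
fi
```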