How to extract two numbers from two strings and calculate the difference in Bash?

I have a text file which contains (among others) the following lines:

{chapter}{{1}Einleitung}{27}{chapter.1}  
{chapter}{{2}Grundlagen}{35}{chapter.2}

How can I

  • get the 2 lines from this text file (they will always contain }Einleitung and }Grundlagen, respectively),
  • extract the 2 page numbers (in this case 27 and 35),
  • calculate the difference 35-27 = 8 and
  • save the difference (8) of the two numbers in a variable

Perhaps with a bash script in Mac OS X?


Solution 1:

I do not know if Mac OS X has awk. If it does, this should work:

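DIFFERENZ=$(awk 'BEGIN {
    FS = "[{}]+"              # any run of curly brackets separates fields
}
{
    if ($4 == "Einleitung")
        EINLEITUNG = $5       # page number of the Einleitung chapter
    if ($4 == "Grundlagen")
        GRUNDLAGEN = $5       # page number of the Grundlagen chapter
}
END {
    print GRUNDLAGEN - EINLEITUNG
}' textfile)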

How it works:

  • FS="[{}]+" sets the field separator to any run of one or more curly brackets.
  • $4 refers to the fourth field. Because every line starts with a curly bracket, $1 is empty, so $4 is the third non-empty field (the chapter title) and $5 is the page number.
  • DIFFERENZ=$(...) runs the command ... and stores its output in DIFFERENZ.
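To see the field numbering for yourself, here is a small sketch that prints every field awk produces for the first sample line from the question. The variable names line and fields are only for illustration.

```shell
# Sample line from the question.
line='{chapter}{{1}Einleitung}{27}{chapter.1}'

# Split it with the same field separator and print each field with its number.
fields=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "[{}]+" }
    { for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
# $1=[]
# $2=[chapter]
# $3=[1]
# $4=[Einleitung]
# $5=[27]
# $6=[chapter.1]
# $7=[]
```

The empty $1 and $7 come from the curly brackets at the very start and end of the line.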

Solution 2:

calc.awk:

BEGIN {
    FS="[}][{]";       # split lines by '}{' (bracketed, since a bare '{' is special in some regex dialects)
    e=0;               # set variable 'e' to 0
    g=0;               # set variable 'g' to 0
}

/Einleitung/ { e=$3; } # 'Einleitung' matches, extract the page
/Grundlagen/ { g=$3;}  # 'Grundlagen' matches, extract the page

END {
    print g-e;         # print difference
}

You can call it via:

$> awk -f calc.awk < in.txt

It will print 8. You can store that number in a bash variable like this:

$> nr=$(awk -f calc.awk < in.txt)

If you want it more compact, you can also inline calc.awk as a one-liner instead of keeping it in a separate file:

$> nr=$(awk 'BEGIN{FS="[}][{]";g=0;e=0}/Einleitung/{e=$3}/Grundlagen/{g=$3}END{print g-e}' < in.txt)
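One caveat: if one of the two chapter lines is missing from the file, the awk program still prints a number (the missing page counts as 0), which can be misleading. Below is a minimal sketch of a guarded variant. The function name page_diff and the flags have_e and have_g are my own additions, not part of the answer above, and the temporary file only stands in for in.txt.

```shell
# Hypothetical helper: print the page difference for the file given as $1,
# or exit with status 1 if either chapter line is missing.
page_diff() {
    awk 'BEGIN { FS = "[}][{]" }
        /Einleitung/ { e = $3; have_e = 1 }
        /Grundlagen/ { g = $3; have_g = 1 }
        END {
            if (!have_e || !have_g) exit 1
            print g - e
        }' "$1"
}

# Demo with the two sample lines from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' '{chapter}{{1}Einleitung}{27}{chapter.1}' \
              '{chapter}{{2}Grundlagen}{35}{chapter.2}' > "$tmp"

if nr=$(page_diff "$tmp"); then
    echo "difference: $nr"      # prints "difference: 8" for the sample lines
else
    echo "chapter line missing" >&2
fi
rm -f "$tmp"
```

Because the exit status of the command substitution is passed through the assignment, the if can tell a real result apart from a missing chapter.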

Solution 3:

Pure bash 4.x; this prints the page difference between each pair of consecutive chapters:

unset page_last title_last page_cur title_cur
# Capture group 1 is the chapter title, group 2 the page number.
re='\{chapter\}\{\{[[:digit:]]+\}([^}]+)\}\{([[:digit:]]+)\}'
while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
        title_cur=${BASH_REMATCH[1]} page_cur=${BASH_REMATCH[2]}
        # Skip the first chapter: there is no previous page number to compare with.
        if [[ -n $page_last ]]; then
            diff=$((page_cur-page_last))
            echo "${diff} pages between \"${title_last}\" and \"${title_cur}\""
        fi
        title_last=$title_cur page_last=$page_cur
    fi
done < "$myfile"
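If you only need the single difference from the question stored in a variable, a shell-only variant could look like the sketch below. Note that it swaps the regex for plain parameter expansion, so it should also run on older bash versions. The function name chapter_page and the variables marker, close, page_e, page_g and difference are hypothetical, and the temporary file stands in for your real file.

```shell
# Hypothetical helper: print the page number of the chapter titled $1 in file $2.
chapter_page() {
    marker="}$1}{"      # text between the chapter number and the page number
    close='}'
    while IFS= read -r line; do
        case $line in
            *"$marker"*)
                rest=${line#*"$marker"}             # drop everything up to and including the marker
                printf '%s\n' "${rest%%"$close"*}"  # keep what precedes the next closing bracket
                return 0
                ;;
        esac
    done < "$2"
    return 1            # no line with that title found
}

# Demo with the two sample lines from the question in a temporary file.
myfile=$(mktemp)
printf '%s\n' '{chapter}{{1}Einleitung}{27}{chapter.1}' \
              '{chapter}{{2}Grundlagen}{35}{chapter.2}' > "$myfile"

page_e=$(chapter_page Einleitung "$myfile")
page_g=$(chapter_page Grundlagen "$myfile")
difference=$((page_g - page_e))
echo "$difference"      # prints 8 for the sample lines
rm -f "$myfile"
```

The marker includes the curly brackets around the title, so a chapter whose title merely contains the word Einleitung somewhere else on the line would not match.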