Variables for constructing patterns for use with sed

I have written a bash function to print sections of text enclosed between lines matching ## mode: org and ## # End of org in a file, with an empty line between sections. Before the ##, there can be any number of spaces.

Here is an example of a file to extract information from.

file: test.sh

## mode: org
## * Using case statement
## # End of org
case $arg in
 ("V")
   echo "Author"
   ;;
 (*)
   ## mode: org
   ## ** Silent Error Reporting Mode (SERM) in getopts
   ## *** Detects warnings without printing built-in messages.
   ## *** Enabled by colon {:} as first character in shortopts.
   ## # End of org
   break
   ;;
esac

The desired output would be

Code:

* Using case statement

** Silent Error Reporting Mode (SERM) in getopts
*** Detects warnings without printing built-in messages.
*** Enabled by colon {:} as first character in shortopts.

Here is the function I am using

capture-org ()
{
  local efile="$1"

  local begsec="## mode: org"
  local endsec="## # End of org"

  sed -n "/^[[:space:]]*${begsec}$/,/^[[:space:]]*${endsec}$/s/ *//p'" "$efile" |
   sed 's/^'"${begsec}"'$/\n'"${begsec}"'/' |
   sed '/^'"${begsec}"'$/d' | sed '/^'"${endsec}"'$/d' | cut -c 3-
}

I would like to simplify the function, using variables to construct patterns. But need some assistance to compile commands together such that I do not have to call sed so many times.

Perhaps using awk would be a better strategy.

capture-org ()
{
  local efile="$1"

  local begsec='^[[:space:]]*## mode: org$'
  local endsec='^[[:space:]]*## # End of org$'

  sed -n "/${begsec}/,/${endsec}/s/ *//p" "$efile" |
   sed 's/^## # End of org$/## # End of org\n/' |
   sed '/^## mode: org$/d' | sed '/^## # End of org$/d' | cut -c 3-
}

Solution 1:

I would indeed use something more sophisticated for this. Like awk:

$ awk -v start="$begsec" -v end="$endsec" \
    '{ 
        if($0~start){want=1; next} 
        if($0~end){want=0; print ""; next} 
        gsub(/\s*#+\s*/,""); 
     } want' file
* Using case statement

** Silent Error Reporting Mode (SERM) in getopts
*** Detects warnings without printing built-in messages.
*** Enabled by colon {:} as first character in shortopts.

Or, using your last function there as a template:

capture-rec ()
{

  local begsec='## mode: org'
  local endsec='## # End of org'

  awk -v start="$begsec" -v end="$endsec" \
    '{ 
        if($0~start){want=1; next} 
        if($0~end){want=0; print ""; next} 
        gsub(/\s*#+\s*/,""); 
     } want' "$1"
}

One caveat that may be important is that this doesn't not require that the $begsec and $endsec be the only things on the line other than leading whitespace like your approach did, it simply searches for them anywhere on the line. I am assuming this isn't a very big deal considering what you are looking for, but if it is, you can use this instead which will remove whitespace at the beginning and end of the line before matching:

capture-rec ()
{

  local begsec='## mode: org'
  local endsec='## # End of org'

    awk -v start="$begsec" -v end="$endsec" \
    '{ 
        sub(/^[[:space:]]*/,"");
        sub(/[[:space:]]*$/,"");
        if($0==start){ want=1; next} 
        if($0==end){   want=0; print ""; next} 
        gsub(/\s*#+\s*/,""); 
     } want' "$1"
}

Variables for constructing patterns for use with sed

Solution 1:

Related

Recent Posts