Enable grep to exactly match the regular expression
How can I make the command 'grep' return only the text matched by the regular expression?
By default, grep prints the whole line when the line contains a string that matches the expression, which is not handy when searching for specific content.
For instance, I have vocabulary files in the following format:
**word**
1. Definition:
2. Usage
3. Others
I'd like to retrieve all the words from these files to make a wordlist:
grep '\*\*[^*]*\*\*'
It returns bulks of content.
How can I make grep catch only the 'word'?
Use awk.
This command will "extract" a bulk list of words, assuming the file is in the format you specified above:
awk '/\*\*/,/\*\*/ {print substr($0, 3, length($0)-4)}' <filename>
Example:
For this example, assume we have a text file called words.txt with the following content:
**test**
1. Definition:
2. Usage
3. Others
**foo**
1. Definition:
2. Usage
3. Others
**bar**
1. Definition:
2. Usage
3. Others
$ awk '/\*\*/,/\*\*/ {print substr($0, 3, length($0)-4)}' words.txt
test
foo
bar
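The example can also be reproduced as a small script. The sketch below is only an illustration: it recreates words.txt in a scratch directory (the mktemp location and the variable names are assumptions, not part of the original example) and additionally runs a single-pattern variant, /\*\*/, to check that it yields the same list for this input.

```shell
# Recreate the sample file in a scratch directory (hypothetical location).
workdir=$(mktemp -d)
cat > "$workdir/words.txt" <<'EOF'
**test**
1. Definition:
2. Usage
3. Others
**foo**
1. Definition:
2. Usage
3. Others
**bar**
1. Definition:
2. Usage
3. Others
EOF

# Range form, as used in the command above.
range_out=$(awk '/\*\*/,/\*\*/ {print substr($0, 3, length($0)-4)}' "$workdir/words.txt")

# Single-pattern variant: act only on lines containing two asterisks in a row.
single_out=$(awk '/\*\*/ {print substr($0, 3, length($0)-4)}' "$workdir/words.txt")

# Print the extracted words, one per line.
printf '%s\n' "$range_out"

# Report whether the two variants agree on this input.
if [ "$range_out" = "$single_out" ]; then
    echo "both forms print the same words"
fi

# Clean up the scratch directory.
rm -rf "$workdir"
```

Both variants rely on every word line having exactly the form **word**, with nothing before or after the asterisks.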
What it's doing
/\*\*/,/\*\*/
This is the pattern range. I could have done this by looking for the first set of asterisks (/\*\*/) and been done, but I used a full range for completeness. Because both patterns match on the same line, the range starts and ends on each **word** line, so one method is no more "right" than the other.
{print substr($0, 3, length($0)-4)}
This prints the substring (of the string **word**) starting at the 3rd character, with a length of the whole string (length($0)) minus four characters (the four asterisks).
<filename>
This is the input file for the awk command to process.
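As an aside, if you would rather stay with grep as the question asks, here is a minimal sketch. It assumes your grep provides the -o (only-matching) option, which GNU and BSD grep do. It is an alternative to the awk command above, not a replacement for it, and the sample input and variable names are made up for illustration.

```shell
# Hypothetical sample input, mirroring the format in the question.
sample=$(printf '%s\n' '**test**' '1. Definition:' '2. Usage' '3. Others' '**foo**' '1. Definition:')

# -o prints only the matched text, one match per line, instead of the whole line.
matches=$(printf '%s\n' "$sample" | grep -o '\*\*[^*]*\*\*')

# The matches still carry the surrounding asterisks; tr deletes them.
words=$(printf '%s\n' "$matches" | tr -d '*')

# Print the bare words, one per line.
printf '%s\n' "$words"
```

Unlike the awk version, this does not depend on the word starting at the 3rd character of the line, but it needs the extra tr step to drop the asterisks.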