I know only a little about wildcard usage in Word.

wildcards - case sensitive

TO FIND

enter mark                       ^13
tab                              ^t
any lowercase letter             [a-z]
any uppercase letter             [A-Z]
any letter                       [A-Za-z]
any digit                        [0-9]
any digit from 6 to 9            [6-9]
any letter from d to k           [d-k]
any word containing only letters (<[A-Za-z]@>)
any word containing only digits  (<[0-9]@>)
for grouping (for replace)       (   )
any character(s) between ...     (*)
any paragraph                    ^13(*)^13

TO REPLACE

To replace first group   \1
To replace second group  \2
enter mark               ^p
tab                      ^t

I want to know more about this. Can anyone help me?


Solution 1:

Adapted from this article

Search Operators:

? - Any Character. (regex equivalent: .)

Example: d?g finds dig, dog, and dug

[-] - Character in Range. (regex equivalent: same)

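Example: [a-m]end finds bend, fend, lend, and mend (the first character in this case is a, m, or any letter between)

< - Beginning of Word. (regex equivalent: \b, or \< in some tools such as GNU grep; ^ anchors a line, not a word)

Example: <tele finds telemarketing, telephone, and television

> - End of Word. (regex equivalent: \b, or \> in some tools such as GNU grep; $ anchors a line, not a word)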

Example: tion> finds aggravation, inspiration, and institution
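
For comparison, here is a minimal Python sketch of the two word-edge examples above. It assumes that \b is the closest regex counterpart to Word's < and >, and it adds \w* so the whole word is returned; the sample sentence is made up for illustration.

```python
import re

# Made-up sample text for illustration.
text = "A telephone call about television inspiration"

# Word "<tele": \b anchors at a word edge, \w* extends to the rest of the word.
starts_with_tele = re.findall(r"\btele\w*", text)

# Word "tion>": \w* covers the start of the word, \b requires the word to end after "tion".
ends_with_tion = re.findall(r"\w*tion\b", text)

print(starts_with_tele)  # ['telephone', 'television']
print(ends_with_tion)    # ['inspiration']
```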

() - Expression. (regex equivalent: (), a capturing group; each group can be reused in the Replace box as \1, \2, and so on)

Example: Lets you "nest" search expressions within a search term. For instance, <(pre)*(ed)> finds presorted and prevented
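
Groups matter most when replacing, since \1 and \2 refer back to what each pair of parentheses matched. The sketch below shows the same idea in Python. The Word pattern in the comment is an assumed equivalent, and swap_first_two_words is a hypothetical helper name, not something from Word.

```python
import re

# Assumed Word equivalent (wildcards on):
#   Find:    (<[A-Za-z]@>) (<[A-Za-z]@>)
#   Replace: \2 \1
# In regex, \b marks a word edge and + plays the role of Word's @.
pattern = re.compile(r"\b([A-Za-z]+)\b \b([A-Za-z]+)\b")

def swap_first_two_words(text: str) -> str:
    """Swap the first pair of adjacent words using group references."""
    return pattern.sub(r"\2 \1", text, count=1)

print(swap_first_two_words("hello world again"))  # world hello again
```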

[!] - Not. (regex equivalent: [^])

Example: Finds the text but excludes the characters inside the brackets; t[!ae]ll finds till and toll but not tall and tell

{n} - Exactly n Occurrences. (regex equivalent: same)

Example: Finds the specified number of occurrences of the letter immediately before the {; to{2} finds too and tool but not to

{n,} - At Least n Occurrences. (regex equivalent: same)

Example: Adding a , after the number tells Word to look for at least that number of occurrences; a{4,} finds four or more of the letter a in a row

{n,m} - Between n and m Occurrences. (regex equivalent: same)

Example: 10{2,3} finds 100 and 1000 but not 10
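
Since the three counting forms are written the same way in regex, the examples above can be checked directly in Python. This is only a sketch; the word lists are the ones from the examples plus a made-up string of a's.

```python
import re

# {n}: exactly two o's after the t ("to" has only one).
words = ["to", "too", "tool"]
exactly_two = [w for w in words if re.search(r"to{2}", w)]

# {n,}: four or more a's in a row.
at_least_four = re.findall(r"a{4,}", "aa aaaa aaaaaa")

# {n,m}: a 1 followed by two or three zeros, matching the whole string.
numbers = ["10", "100", "1000"]
two_to_three = [n for n in numbers if re.fullmatch(r"10{2,3}", n)]

print(exactly_two)    # ['too', 'tool']
print(at_least_four)  # ['aaaa', 'aaaaaa']
print(two_to_three)   # ['100', '1000']
```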

@ - Previous 1 or More. (regex equivalent: +)

Example: Finds one or more of the character immediately preceding the @; ^13@^t finds one or more paragraph marks followed by a tab mark (with wildcards on, the Find box needs ^13 rather than ^p)

* - 0 or More Characters. (regex equivalent: .*)

Example: Matches any run of characters, including none; des*t finds matches in descent, desert, dessert, and destruct

[] - One of the specified characters. (regex equivalent: same)

Example: b[aeiou]t finds bat, bet, bit, and but

[!a-z] - Any single character with the exception of the ones in the range inside the bracket. (regex equivalent: [^a-z])

Example: m[!o-z]st finds mast and mist but not most or must
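
The bracket examples carry over to regex almost unchanged; the only difference is that Word's [!...] becomes [^...]. A small Python sketch, where matching is a hypothetical helper and the extra test words (brt, send, tend) are made up:

```python
import re

def matching(pattern: str, words: list[str]) -> list[str]:
    """Return the words that the pattern matches in full."""
    return [w for w in words if re.fullmatch(pattern, w)]

# Word b[aeiou]t: one of the listed characters.
print(matching(r"b[aeiou]t", ["bat", "bet", "bit", "but", "brt"]))
# ['bat', 'bet', 'bit', 'but']

# Word [a-m]end: any character in the range a to m.
print(matching(r"[a-m]end", ["bend", "fend", "lend", "mend", "send", "tend"]))
# ['bend', 'fend', 'lend', 'mend']

# Word t[!ae]ll and m[!o-z]st: negation is written ^ in regex.
print(matching(r"t[^ae]ll", ["till", "toll", "tall", "tell"]))  # ['till', 'toll']
print(matching(r"m[^o-z]st", ["mast", "mist", "most", "must"]))  # ['mast', 'mist']
```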

Solution 2:

This looks like a non-standard notation for regular expressions, frequently abbreviated as regex or regexp. They are a tremendously important tool to learn if you do any serious text processing. As you have already understood, regexes allow for powerful pattern matching and substitution. The notation you provided closely resembles the standard, so I could recognise it. There is an industry standard, POSIX, and a de facto standard, Perl regex. The next paragraph is boring history; skip it if you want.

POSIX regexes are used in many user-facing tools on POSIX-compliant operating systems (think Linux and its not-so-distant relatives). The canonical example is grep, which lets you search for text in files; the text to match is specified as a regex. Perl, a programming language, took the concept and extended it greatly for its own purposes. Later, a subset of this functionality was made available as a code library, PCRE. All sorts of software embed this library, most notably text editors.

I can see a few differences from what I am used to in the notation above. Word's symbol for escape sequences is ^; normally it is \. "Only digits" is used often, so it has an abbreviation in Perl: \d is equivalent to the character class [0-9]; similarly, \w means word characters and is equivalent to [0-9a-zA-Z_]. Word's notation appears cumbersome by comparison. I do not know the other limitations of Word, so I encourage you to switch to a text editor with PCRE support.
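
To see the shorthand classes in action, here is a small sketch using Python's re module rather than Perl. One assumption to note: in Python 3, \d and \w also match non-ASCII digits and letters by default, so the re.ASCII flag is passed to get the [0-9] and [0-9a-zA-Z_] behaviour described above. The sample string is made up.

```python
import re

# Made-up sample text for illustration.
sample = "Order 66 shipped_to Bay-7"

# \d versus the explicit class [0-9].
digits_short = re.findall(r"\d", sample, flags=re.ASCII)
digits_long = re.findall(r"[0-9]", sample)

# \w versus the explicit class [0-9a-zA-Z_].
words_short = re.findall(r"\w+", sample, flags=re.ASCII)
words_long = re.findall(r"[0-9a-zA-Z_]+", sample)

print(digits_short == digits_long)  # True
print(words_short == words_long)    # True
print(words_short)  # ['Order', '66', 'shipped_to', 'Bay', '7']
```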

You should first learn about whitespace matching (abbreviation \s) and repetition (+ and *). Perl's regexes are explained in perlrequick, perlretut and perlre. To start experimenting right now, use the Flash-based RegExr.
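
As a starting point, here is a short sketch of \s, + and * in Python's re module (not Perl itself, though these three behave the same way there). The sample text is made up.

```python
import re

# Made-up sample with mixed spaces, tabs and a newline.
text = "one  two\t\tthree\nfour"

# \s matches any whitespace character; + means "one or more".
tokens = re.split(r"\s+", text)
collapsed = re.sub(r"\s+", " ", text)

# * means "zero or more", so ab*c accepts "ac"; + requires at least one b.
star_matches = re.fullmatch(r"ab*c", "ac") is not None
plus_matches = re.fullmatch(r"ab+c", "ac") is not None

print(tokens)        # ['one', 'two', 'three', 'four']
print(collapsed)     # one two three four
print(star_matches)  # True
print(plus_matches)  # False
```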