Using regex to extract string [duplicate]

Let's say I have:

a = r''' Example
This is a very annoying string
that takes up multiple lines
and h@s a// kind{s} of stupid symbols in it
ok String'''

I need a way to replace (or just delete) any text between "This" and "ok" so that afterwards a equals:

a = "Example String"

I can't find any wildcards that seem to work. Any help is much appreciated.


Solution 1:

You need a regular expression:

>>> import re
>>> re.sub('\nThis.*?ok','',a, flags=re.DOTALL)
' Example String'

Solution 2:

Another method is to use string splits:

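def replaceTextBetween(originalText, delimiterA, delimiterB, replacementText):
    # Everything before the first occurrence of delimiterA
    leadingText = originalText.split(delimiterA)[0]
    # Text after the first occurrence of delimiterB (up to the next one, if any)
    trailingText = originalText.split(delimiterB)[1]

    # The delimiters themselves are kept; only the text between them is replaced
    return leadingText + delimiterA + replacementText + delimiterB + trailingText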

Limitations:

  • Does not check if the delimiters exist
  • Assumes that there are no duplicate delimiters
  • Assumes that delimiters are in correct order
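
The limitations above could be worked around with str.find instead of split. The following is only a minimal sketch: the name replace_between and the decision to raise ValueError on a missing delimiter are assumptions for illustration, not part of the original answer. Like the split version, it keeps the delimiters and replaces only the text between them.

```python
def replace_between(text, start, end, replacement=""):
    """Replace whatever lies between the first `start` and the next `end`.

    Hypothetical helper: the name and the ValueError behaviour are
    illustrative choices, not part of the original answer.
    """
    start_idx = text.find(start)
    if start_idx == -1:
        raise ValueError(f"start delimiter {start!r} not found")

    # Look for the end delimiter only after the start delimiter, so
    # delimiters in the wrong order are reported instead of mishandled.
    content_begin = start_idx + len(start)
    end_idx = text.find(end, content_begin)
    if end_idx == -1:
        raise ValueError(f"end delimiter {end!r} not found after {start!r}")

    # Keep both delimiters; swap out only the text between them.
    return text[:content_begin] + replacement + text[end_idx:]


a = "Example\nThis is a very annoying string\nok String"
print(repr(replace_between(a, "This", "ok")))
```

Because the search for the end delimiter starts after the start delimiter, repeated delimiters later in the string are simply left untouched.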

Solution 3:

The DOTALL flag is the key. Ordinarily, the '.' character doesn't match newlines, so you don't match across lines in a string. If you set the DOTALL flag, re will match '.*' across as many lines as it needs to.
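
To see the difference the flag makes, here is a small sketch that runs the same substitution with and without it. The sample string is a shortened stand-in for the one in the question, and the variable names are made up for this example.

```python
import re

text = "Example\nThis spans\ntwo lines\nok String"
pattern = r"\nThis.*?ok"

# Without DOTALL, '.' stops at the first newline, so the pattern never
# reaches "ok" and the string comes back unchanged.
without_flag = re.sub(pattern, "", text)

# With DOTALL, '.' also matches '\n', so the match can run across lines.
with_flag = re.sub(pattern, "", text, flags=re.DOTALL)

print(repr(without_flag))
print(repr(with_flag))
```

The non-greedy '.*?' matters as well: it stops at the first "ok" after "This" rather than the last one in the string.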