Replace multiple regex string matches in a file
Solution 1:
This is one task for which regular expressions can really help:
    import re

    def replacemany(adict, astring):
        # Build one alternation of all keys, escaped so they match literally.
        pat = '|'.join(re.escape(s) for s in adict)
        there = re.compile(pat)
        # Called once per match; looks up the replacement for the matched key.
        def onerepl(mo):
            return adict[mo.group()]
        return there.sub(onerepl, astring)

    if __name__ == '__main__':
        d = {'k1': 'zap', 'k2': 'flup'}
        print(replacemany(d, 'a k1, a k2 and one more k1'))
Run as the main script, this prints

    a zap, a flup and one more zap

as desired.
This focuses on strings, not files, of course: the replacement, per se, occurs in a string-to-string transformation. The advantage of the RE-based approach is that looping is reduced: all strings to be replaced are matched in a single pass, thanks to the regular expression engine. The re.escape calls ensure that strings containing special characters are treated just as literals (no weird meanings;-), the vertical bars mean "or" in the RE pattern language, and the sub method calls the nested onerepl function for each match, passing the match object so the .group() call easily retrieves the specific string that was just matched and needs to be replaced.
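To make the role of re.escape and of the "or" bars concrete, here is a minimal sketch, not part of the original answer. The function name replacemany_longest_first and the sample keys are made up for illustration. It also sorts the keys longest first, on the assumption that some keys may be prefixes of others: Python's alternation tries the alternatives left to right and takes the first one that matches, so without the sort a key like 'k1' could win over 'k10'.

```python
import re

def replacemany_longest_first(adict, astring):
    # Hypothetical variant: sort keys longest first so that 'k10'
    # is tried before its prefix 'k1' in the alternation.
    keys = sorted(adict, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    # The lambda plays the role of onerepl: look up the matched key.
    return pattern.sub(lambda mo: adict[mo.group()], astring)

# 'a.b' contains a regex metacharacter; re.escape makes the dot literal,
# so 'axb' in the input is left alone.
d = {'a.b': 'X', 'k1': 'zap', 'k10': 'ten'}
result = replacemany_longest_first(d, 'a.b axb k1 k10')
print(result)  # X axb zap ten
```

If none of your keys is a prefix of another, the sort makes no difference and the original replacemany behaves the same way.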
To work at file level:

    with open(final, 'w') as fin:
        with open(initial, 'r') as ini:
            fin.write(replacemany(mydict, ini.read()))
The with statement is recommended, to ensure proper closure of the files; if you're stuck with Python 2.5, use from __future__ import with_statement at the start of your module or script to gain use of the with statement.
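For completeness, here is one way the file-level step could be wrapped up and tried end to end. This is only a sketch under assumptions: the helper name replace_in_file and the file names initial.txt and final.txt are invented for the demonstration, and the whole input is read into memory at once, which is fine for modest files but not for huge ones.

```python
import os
import re
import tempfile

def replacemany(adict, astring):
    # Same single-pass substitution as in the answer above.
    pat = '|'.join(re.escape(s) for s in adict)
    there = re.compile(pat)
    return there.sub(lambda mo: adict[mo.group()], astring)

def replace_in_file(initial, final, adict):
    # Hypothetical helper: read everything, substitute in memory,
    # then write the result to a separate output file.
    with open(initial, 'r') as ini:
        text = ini.read()
    with open(final, 'w') as fin:
        fin.write(replacemany(adict, text))

# Demonstration on throwaway files in a temporary directory.
tmpdir = tempfile.mkdtemp()
initial = os.path.join(tmpdir, 'initial.txt')
final = os.path.join(tmpdir, 'final.txt')
with open(initial, 'w') as f:
    f.write('a k1, a k2 and one more k1\n')

replace_in_file(initial, final, {'k1': 'zap', 'k2': 'flup'})

with open(final, 'r') as f:
    output = f.read()
print(output)
```

Reading the input before opening the output for writing also means a failure while reading does not leave a freshly truncated output file behind.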