Replace multiple regex string matches in a file

Solution 1:

This is one task for which regular expressions can really help:

import re

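def replacemany(adict, astring):
  # One alternation of all keys; re.escape makes each key match literally.
  pat = '|'.join(re.escape(s) for s in adict)
  there = re.compile(pat)
  # Called once per match: look up the replacement for the matched key.
  def onerepl(mo): return adict[mo.group()]
  return there.sub(onerepl, astring)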

if __name__ == '__main__':
  d = {'k1': 'zap', 'k2': 'flup'}
  print(replacemany(d, 'a k1, a k2 and one more k1'))

Run as the main script, this prints "a zap, a flup and one more zap", as desired.

This focuses on strings, not files, of course -- the replacement, per se, occurs in a string-to-string transformation. The advantage of the RE-based approach is that looping is reduced: all strings to be replaced are matched in a single pass, thanks to the regular expression engine. The re.escape calls ensure that strings containing special characters are treated just as literals (no weird meanings;-), the vertical bars mean "or" in the RE pattern language, and the sub method calls the nested onerepl function for each match, passing the match-object so the .group() call easily retrieves the specific string that was just matched and needs to be replaced.
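One caveat with the alternation: Python's RE engine takes the first alternative that matches at a given position, not the longest, so if one key is a prefix of another (say 'k1' and 'k10') the result can depend on the order of the keys. A minimal sketch of a variant that sorts the keys longest-first and also guards against an empty dict; the name replacemany_longest_first is made up for illustration and is not part of the code above:

```python
import re

def replacemany_longest_first(adict, astring):
    # Hypothetical variant of replacemany; the name is illustrative only.
    if not adict:
        # An empty pattern matches the empty string everywhere, and the
        # lookup adict[''] would then raise KeyError, so return early.
        return astring
    # Longest keys first, so 'k10' is tried before its prefix 'k1'.
    keys = sorted(adict, key=len, reverse=True)
    there = re.compile('|'.join(re.escape(k) for k in keys))
    return there.sub(lambda mo: adict[mo.group()], astring)

d = {'k1': 'zap', 'k10': 'flup'}
result = replacemany_longest_first(d, 'a k1 and a k10')
print(result)  # a zap and a flup
```

If none of your keys is a prefix of another, the sort changes nothing and the original replacemany behaves the same.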

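To work at file level, read the whole input file into a string, transform it, and write the result to the output file (here final, initial and mydict stand for your output path, input path and replacement dictionary):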

with open(final, 'w') as fin:
  with open(initial, 'r') as ini:
    fin.write(replacemany(mydict, ini.read()))

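The with statement is recommended because it closes both files even if an error occurs; if you're stuck with Python 2.5, put from __future__ import with_statement at the start of your module or script to gain use of the with statement. Two things to keep in mind: ini.read() loads the entire file into memory, and final must be a different path from initial, since opening final for writing truncates it before anything is read.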
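For completeness, here is a self-contained sketch of the file-level version wrapped in a function. The name replace_in_file and the temporary file names are assumptions for the demo, not anything from the question. It reads the input fully before opening the output, so it would also cope with both paths being the same file:

```python
import os
import re
import tempfile

def replacemany(adict, astring):
    # Same approach as above: one alternation of the escaped keys.
    there = re.compile('|'.join(re.escape(s) for s in adict))
    return there.sub(lambda mo: adict[mo.group()], astring)

def replace_in_file(adict, initial, final):
    # Hypothetical helper: read everything first, then write the result.
    with open(initial, 'r') as ini:
        text = ini.read()
    with open(final, 'w') as fin:
        fin.write(replacemany(adict, text))

# Demo on throwaway files in a temporary directory.
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, 'in.txt')
dst = os.path.join(tmpdir, 'out.txt')
with open(src, 'w') as f:
    f.write('a k1, a k2 and one more k1')

replace_in_file({'k1': 'zap', 'k2': 'flup'}, src, dst)

with open(dst, 'r') as f:
    output = f.read()
print(output)  # a zap, a flup and one more zap
```

For files too large to hold in memory you would need a different strategy (for example line by line, provided no key spans a line break); that is outside what this sketch covers.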