How to do sed like text replace with python?
I would like to enable all apt repositories in this file
cat /etc/apt/sources.list
## Note, this file is written by cloud-init on first boot of an instance
## modifications made here will not survive a re-bundle.
## if you wish to make changes you can:
## a.) add 'apt_preserve_sources_list: true' to /etc/cloud/cloud.cfg
## or do the same in user-data
## b.) add sources in /etc/apt/sources.list.d
#
# See http://help.ubuntu.com/community/UpgradeNotes for how to upgrade to
# newer versions of the distribution.
deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick main
deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick main
## Major bug fix updates produced after the final release of the
## distribution.
deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates main
deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates main
## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team. Also, please note that software in universe WILL NOT receive any
## review or updates from the Ubuntu security team.
deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick universe
deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick universe
deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates universe
deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates universe
## N.B. software from this repository is ENTIRELY UNSUPPORTED by the Ubuntu
## team, and may not be under a free licence. Please satisfy yourself as to
## your rights to use the software. Also, please note that software in
## multiverse WILL NOT receive any review or updates from the Ubuntu
## security team.
# deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick multiverse
# deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick multiverse
# deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates multiverse
# deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-updates multiverse
## Uncomment the following two lines to add software from the 'backports'
## repository.
## N.B. software from this repository may not have been tested as
## extensively as that contained in the main release, although it includes
## newer versions of some applications which may provide useful features.
## Also, please note that software in backports WILL NOT receive any review
## or updates from the Ubuntu security team.
# deb http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-backports main restricted universe multiverse
# deb-src http://us-east-1.ec2.archive.ubuntu.com/ubuntu/ maverick-backports main restricted universe multiverse
## Uncomment the following two lines to add software from Canonical's
## 'partner' repository.
## This software is not part of Ubuntu, but is offered by Canonical and the
## respective vendors as a service to Ubuntu users.
# deb http://archive.canonical.com/ubuntu maverick partner
# deb-src http://archive.canonical.com/ubuntu maverick partner
deb http://security.ubuntu.com/ubuntu maverick-security main
deb-src http://security.ubuntu.com/ubuntu maverick-security main
deb http://security.ubuntu.com/ubuntu maverick-security universe
deb-src http://security.ubuntu.com/ubuntu maverick-security universe
# deb http://security.ubuntu.com/ubuntu maverick-security multiverse
# deb-src http://security.ubuntu.com/ubuntu maverick-security multiverse
With sed this is a simple sed -i 's/^# deb/deb/' /etc/apt/sources.list
what's the most elegant ("pythonic") way to do this?
Solution 1:
You can do that like this:
with open("/etc/apt/sources.list", "r") as sources:
lines = sources.readlines()
with open("/etc/apt/sources.list", "w") as sources:
for line in lines:
sources.write(re.sub(r'^# deb', 'deb', line))
The with statement ensures that the file is closed correctly, and re-opening the file in "w"
mode empties the file before you write to it. re.sub(pattern, replace, string) is the equivalent of s/pattern/replace/ in sed/perl.
Edit: fixed syntax in example
Solution 2:
Authoring a homegrown sed
replacement in pure Python with no external commands or additional dependencies is a noble task laden with noble landmines. Who would have thought?
Nonetheless, it is feasible. It's also desirable. We've all been there, people: "I need to munge some plaintext files, but I only have Python, two plastic shoelaces, and a moldy can of bunker-grade Maraschino cherries. Help."
In this answer, we offer a best-of-breed solution cobbling together the awesomeness of prior answers without all of that unpleasant not-awesomeness. As plundra notes, David Miller's otherwise top-notch answer writes the desired file non-atomically and hence invites race conditions (e.g., from other threads and/or processes attempting to concurrently read that file). That's bad. Plundra's otherwise excellent answer solves that issue while introducing yet more – including numerous fatal encoding errors, a critical security vulnerability (failing to preserve the permissions and other metadata of the original file), and premature optimization replacing regular expressions with low-level character indexing. That's also bad.
Awesomeness, unite!
import re, shutil, tempfile
def sed_inplace(filename, pattern, repl):
'''
Perform the pure-Python equivalent of in-place `sed` substitution: e.g.,
`sed -i -e 's/'${pattern}'/'${repl}' "${filename}"`.
'''
# For efficiency, precompile the passed regular expression.
pattern_compiled = re.compile(pattern)
# For portability, NamedTemporaryFile() defaults to mode "w+b" (i.e., binary
# writing with updating). This is usually a good thing. In this case,
# however, binary writing imposes non-trivial encoding constraints trivially
# resolved by switching to text writing. Let's do that.
with tempfile.NamedTemporaryFile(mode='w', delete=False) as tmp_file:
with open(filename) as src_file:
for line in src_file:
tmp_file.write(pattern_compiled.sub(repl, line))
# Overwrite the original file with the munged temporary file in a
# manner preserving file attributes (e.g., permissions).
shutil.copystat(filename, tmp_file.name)
shutil.move(tmp_file.name, filename)
# Do it for Johnny.
sed_inplace('/etc/apt/sources.list', r'^\# deb', 'deb')
Solution 3:
massedit.py (http://github.com/elmotec/massedit) does the scaffolding for you leaving just the regex to write. It's still in beta but we are looking for feedback.
python -m massedit -e "re.sub(r'^# deb', 'deb', line)" /etc/apt/sources.list
will show the differences (before/after) in diff format.
Add the -w option to write the changes to the original file:
python -m massedit -e "re.sub(r'^# deb', 'deb', line)" -w /etc/apt/sources.list
Alternatively, you can now use the api:
>>> import massedit
>>> filenames = ['/etc/apt/sources.list']
>>> massedit.edit_files(filenames, ["re.sub(r'^# deb', 'deb', line)"], dry_run=True)
Solution 4:
This is such a different approach, I don't want to edit my other answer.
Nested with
since I don't use 3.1 (Where with A() as a, B() as b:
works).
Might be a bit overkill to change sources.list, but I want to put it out there for future searches.
#!/usr/bin/env python
from shutil import move
from tempfile import NamedTemporaryFile
with NamedTemporaryFile(delete=False) as tmp_sources:
with open("sources.list") as sources_file:
for line in sources_file:
if line.startswith("# deb"):
tmp_sources.write(line[2:])
else:
tmp_sources.write(line)
move(tmp_sources.name, sources_file.name)
This should ensure no race conditions of other people reading the file. Oh, and I prefer str.startswith(...) when you can do without a regexp.
Solution 5:
If you are using Python3 the following module will help you: https://github.com/mahmoudadel2/pysed
wget https://raw.githubusercontent.com/mahmoudadel2/pysed/master/pysed.py
Place the module file into your Python3 modules path, then:
import pysed
pysed.replace(<Old string>, <Replacement String>, <Text File>)
pysed.rmlinematch(<Unwanted string>, <Text File>)
pysed.rmlinenumber(<Unwanted Line Number>, <Text File>)