Is there a command-line utility app which can find a specific block of lines in a text file, and replace it?

UPDATE (see end of question)

The text "search and replace" utilities I've seen seem to search only on a line-by-line basis...

Is there a command-line tool which can locate one block of lines (in a text file), and replace it with another block of lines?

For example: Does the test file contain this exact group of lines:

'Twas brillig, and the slithy toves
Did gyre and gimble in the wabe:
All mimsy were the borogoves,  
And the mome raths outgrabe. 

'Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!'

I want this, so that I can replace multiple lines of text in a file and know I'm not overwriting the wrong lines.

I would never replace "The Jabberwocky" (Lewis Carroll), but it makes a novel example :)

UPDATE:
..(sub-update) My comment below about when not to use sed is meant only in this sense: don't push any tool too far beyond its design intent. (I use sed quite often, and consider it invaluable.)

I just now found an interesting web page about sed and when not to use it.
So, because of all the sed answers, I'll post the link: it is part of the sed FAQ on SourceForge.

Also, I'm pretty sure there is some way diff can do the job of locating the block of text (once it's located, the replacement is quite straightforward, using head and tail). 'diff' dumps all the necessary data, but I haven't yet worked out how to filter it (I'm still working on it).

This simple python script should do the task:


#!/usr/bin/env python

# Syntax: multiline-replace.py input.txt search.txt replacement.txt

import sys

inp = open(sys.argv[1]).read()
needle = open(sys.argv[2]).read()
replacement = open(sys.argv[3]).read()

sys.stdout.write(inp.replace(needle,replacement))

Like most other solutions, it has the disadvantage that the whole file is slurped into memory at once. For small text files, it should work well enough, however.
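If the point is to be sure the right lines get replaced, one possible variation on the script above is to count the matches first and refuse to change anything unless the block occurs exactly once. This is only a sketch: the function name replace_block_once is made up for illustration, and it still holds the whole text in memory.

```python
def replace_block_once(text, needle, replacement):
    """Replace `needle` in `text` only if it occurs exactly once.

    Hypothetical helper, not part of the original script.
    """
    # str.count counts non-overlapping occurrences of the block.
    count = text.count(needle)
    if count != 1:
        raise ValueError("expected exactly one match, found %d" % count)
    return text.replace(needle, replacement, 1)
```

A ValueError then means the file was left alone, either because the block was missing or because it appeared more than once.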

Approach 1: temporarily change newlines into something else

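The following snippet swaps newlines with pipes, performs the replacement, and swaps the separators back. Because the whole file becomes one long line, the utility may choke if that line is extremely long. You can choose any character to swap with, as long as it does not occur in your search string or anywhere else in the file (otherwise it would be turned into a newline on the way back).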

<old.txt tr '\n' '|' |
sed 's/\(|\|^\)'\''Twas … toves|Did … Bandersnatch!'\''|/new line 1|new line 2|/g' |
tr '|' '\n' >new.txt

Approach 2: change the utility's record separator

Awk and Perl can treat runs of blank lines as the record separator. With awk, pass -vRS= (empty RS variable). With Perl, pass -000 (“paragraph mode”) or set $/="". This is not helpful here though, since you have a multi-paragraph search string.

Awk and Perl also support setting any string as the record separator. Set RS or $/ to any string that is not in your search string.

<old.txt perl -pe '
    BEGIN {$/ = "|"}
    s/^'\''Twas … toves\nDid … Bandersnatch!'\''$/new line 1\nnew line 2/mg
' >new.txt

Approach 3: work on the whole file

Some utilities easily let you read the whole file into memory and work on it.

<old.txt perl -0777 -pe '
    s/^'\''Twas … toves\nDid … Bandersnatch!'\''$/new line 1\nnew line 2/mg
' >new.txt
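For comparison, roughly the same whole-file idea can be sketched in Python with the re module. The helper name replace_whole_lines is hypothetical, and the sketch assumes the search block is passed without a trailing newline. re.escape makes the block match literally, and re.MULTILINE lets ^ and $ anchor at line boundaries, so the block only matches when it starts and ends on whole lines.

```python
import re

def replace_whole_lines(text, block, replacement):
    """Replace `block` only where it spans complete lines of `text`.

    Hypothetical helper; `block` is assumed to have no trailing newline.
    """
    # Escape the block so it is matched literally, then anchor it
    # to the start and end of a line.
    pattern = re.compile("^" + re.escape(block) + "$", re.MULTILINE)
    # A function is used as the replacement so that backslashes in
    # `replacement` are inserted as-is, not treated as escapes.
    return pattern.sub(lambda match: replacement, text)
```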

Approach 4: program

Read the lines one by one. Start with an empty buffer. If you see the “'Twas” line and the buffer is empty, put it in the buffer. If you see the “Did gyre” line and there's one line in the buffer, append the current line to the buffer, and so on. If you've just appended the “Bandersnatch” line, output the replacement text and empty the buffer. If the current line didn't go into the buffer, print the buffer contents, print the current line and empty the buffer.

psusi shows a sed implementation. In sed, the buffer concept is built-in; it's called the hold space. In awk or perl, you'd just use a variable (perhaps two, one for the buffer contents and one for the number of lines).
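The buffering idea can be sketched in Python as follows. Note that this is a slight variation rather than a literal transcription: it keeps a sliding window of the last few lines instead of resetting the buffer on a mismatch, so that a failed partial match cannot hide a real match that starts inside it. The function name replace_block is made up for illustration, and lines are compared exactly, including their line endings.

```python
from collections import deque

def replace_block(lines, search, replacement):
    """Yield `lines` with each occurrence of the block `search` replaced.

    Hypothetical helper; all three arguments are iterables of lines.
    """
    search = list(search)
    replacement = list(replacement)
    if not search:
        # Nothing to look for: pass the input through unchanged.
        yield from lines
        return
    window = deque()  # holds at most len(search) pending lines
    for line in lines:
        window.append(line)
        if len(window) > len(search):
            # The oldest line can no longer be part of a match.
            yield window.popleft()
        if list(window) == search:
            yield from replacement
            window.clear()
    # Emit whatever is still pending at end of input.
    yield from window
```

Only as many lines as the search block contains are held in memory at any time, so this also works on files too large to slurp. Passing open file objects for lines and search, and writing the result with sys.stdout.writelines, would give a command-line filter.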

I was sure there had to be a way to do this with sed. After some googling I came across this:

http://austinmatzko.com/2008/04/26/sed-multi-line-search-and-replace/

Based on that I ended up writing:

sed -n '1h;1!H;${;g;s/foo\nbar/jar\nhead/g;p;}' < x

Which correctly took the contents of x:

foo
bar

And spit out:

jar
head
