Sed replace between 2 strings with special character
I have an XML file containing some code, and to use it with xmllink I need to remove a link from it.
XML file containing:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<PackingList xmlns="Link to somewhere#">
<morecode></morecode>
Using sed 'sed s/PackingList.*\>/PackingList/g' xmlfile
gives me the following result (on the 2nd line):
<PackingList#">
while it should be
<PackingList>
What am I doing wrong?
Solution 1:
Three things are wrong:
- The first quote in the sed command should be before the s/ option, not before sed itself. I presume this is a typing error.
- The > character has no special meaning in regular expressions, and must not be escaped. The sequence \> has special significance: it means end of word, and because .* is "greedy" it matches up to the end of the last word on the line, hence the retention of the #".
- If you match the source >, this will be included in the string to be replaced, so it must also appear in the replacement string.
So your edit command should be:
sed 's/PackingList.*>/PackingList>/g' xmlfile
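A quick way to check the difference is to feed the sample line through both expressions. This is only a minimal sketch, assuming GNU sed (where \> matches the end of a word); the variable names line, broken and fixed are illustrative and not part of the question.

```shell
# Sample line from the question (variable names are illustrative).
line='<PackingList xmlns="Link to somewhere#">'

# Original attempt: in GNU sed, \> is an end-of-word anchor, so the greedy .*
# stops after the last word on the line ("somewhere") and leaves #"> behind.
broken=$(printf '%s\n' "$line" | sed 's/PackingList.*\>/PackingList/g')

# Corrected: an unescaped > is a literal character, and it is repeated in the
# replacement because the match consumes it.
fixed=$(printf '%s\n' "$line" | sed 's/PackingList.*>/PackingList>/g')

printf '%s\n' "$broken"   # with GNU sed: <PackingList#">
printf '%s\n' "$fixed"    # <PackingList>
```

Other sed implementations may treat \> differently, so the first result can vary; the corrected expression does not depend on that extension.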
This is similar to jherran's solution, but takes account of your original attempt at matching. It might be neater to match up to the trailing double-quote:
sed 's/PackingList.*"/PackingList/g' xmlfile
If you would rather not rely on greediness alone (and want the expression to be more readable), match both double-quotes explicitly:
sed 's/PackingList.*".*"/PackingList/g' xmlfile
Note that any subsequent XML tags on the same line may be deleted by any of the above: to avoid this, use:
sed 's/PackingList[^>]*"[^>]*"/PackingList/g' xmlfile
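To see why the bracket expression matters, the sketch below runs a greedy expression and the [^>]* version against a hypothetical line that has a second tag after PackingList. The sample line and the variable names are made up for illustration.

```shell
# Hypothetical input: a second tag follows on the same line.
line='<PackingList xmlns="Link to somewhere#"><morecode></morecode>'

# Greedy .* runs to the last > on the line, so <morecode></morecode> is lost.
greedy=$(printf '%s\n' "$line" | sed 's/PackingList.*>/PackingList>/g')

# [^>]* cannot cross a >, so the match stays inside the opening tag.
bounded=$(printf '%s\n' "$line" | sed 's/PackingList[^>]*"[^>]*"/PackingList/g')

printf '%s\n' "$greedy"    # <PackingList>
printf '%s\n' "$bounded"   # <PackingList><morecode></morecode>
```

The bounded form only removes the attribute inside the opening tag, which is usually what is wanted when several tags share a line.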
Solution 2:
Try this way:
sed 's/PackingList.*/PackingList>/g' xmlfile