Remove everything except regex match in Vim

My specific case is a text document that contains lots of text and IPv4 addresses. I want to remove everything except for the IP addresses.

I can use :vglobal to search for ([0-9]{1,3}\.){3}[0-9]{1,3} and remove all lines without IP addresses, but after that I only know how to search for the whole line and select the matching text. Is there an easier way?

In short, I'm looking for a way to do the following without using an external program (like grep):

grep --extended-regexp --only-matching --regexp="([0-9]{1,3}\.){3}[0-9]{1,3}"

Calling grep from vim may require adapting my regex (e.g. removing \v). Vim's incremental search shows me that I've got the pattern right, and I don't want to verify my regex in grep too.


Edit: Thanks to Peter, here's the function I now use. (C is the register I generally clobber in my functions.)

"" Remove all text except what matches the current search result
"" The opposite of :%s///g (which clears all instances of the current search).
function! ClearAllButMatches()
    let old = @c
    let @c=""
    %s//\=setreg('C', submatch(0), 'l')/g
    %d _
    put c
    0d _
    let @c = old
endfunction

Edit2: I made it a command that accepts ranges (but defaults to whole file).

"" Remove all text except what matches the current search result. Will put each
"" match on its own line. This is the opposite of :%s///g (which clears all
"" instances of the current search).
function! s:ClearAllButMatches() range
    let is_whole_file = a:firstline == 1 && a:lastline == line('$')

    let old_c = @c

    let @c=""
    exec a:firstline .','. a:lastline .'sub//\=setreg("C", submatch(0), "l")/g'
    exec a:firstline .','. a:lastline .'delete _'
    put! c

    "" I actually want the above to replace the whole selection with c, but I'll
    "" settle for removing the blank line that's left when deleting the file
    "" contents.
    if is_whole_file
        $delete _
    endif

    let @c = old_c
endfunction
command! -range=% ClearAllButMatches <line1>,<line2>call s:ClearAllButMatches()
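
A usage sketch for the command above, assuming the function and the command have already been sourced. Setting @/ directly is only one way to establish the current search; searching interactively with / has the same effect. The line numbers in the ranged call are arbitrary examples.

```vim
" Make the IPv4 pattern the current search (same as typing it after /).
:let @/ = '\v([0-9]{1,3}\.){3}[0-9]{1,3}'

" Either keep only the matches in the whole buffer
" (the command's default range is %) ...
:ClearAllButMatches

" ... or restrict the operation to a range, e.g. lines 10 through 20.
:10,20ClearAllButMatches
```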

Solution 1:

This effect can be accomplished by using a sub-replace-special substitution together with a linewise setreg():

:let @a=""
:%s//\=setreg('A', submatch(0), 'l')/g
:%d _
:pu a
:0d _

or all in one line as such:

:let @a=""|%s//\=setreg('A', submatch(0), 'l')/g|%d _|pu a|0d _

Overview: use a substitution to append each match to register "a" linewise, then replace the entire buffer with the contents of register "a".

Explanation:

  1. let @a="" empties the "a" register that we will be appending to
  2. %s//\=setreg('A', submatch(0), 'l')/g substitutes globally using the last search pattern
  3. \=expr replaces each match with the result of evaluating the expression
  4. submatch(0) returns the entire text that just matched
  5. setreg('A', submatch(0), 'l') appends the matched text to @a (note the capital "A", which means append), linewise
  6. %d _ deletes every line into the black hole register (aka @_)
  7. pu a puts the contents of @a into the buffer
  8. 0d _ deletes the first line, which is the empty line left over from step 6

Concerns:

  • This will trash one of your registers. This example trashes @a.
  • It uses the last search pattern, although you can put whatever pattern you want into the substitute command: %s/<pattern>/\=setreg('A', submatch(0), 'l')/g
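
If the register concern above matters, one possible variation is to collect the matches in a list instead. This is only a sketch, not the method from the answer: the function name KeepOnlyMatches is made up for illustration, and it assumes the pattern you want is already the last search pattern.

```vim
" Hypothetical helper (the name is an assumption, not part of the answer).
" Keeps only the text matching the last search pattern, one match per
" line, without touching any register.
function! KeepOnlyMatches() abort
    let l:matches = []
    " add() returns the list, so [-1] yields the match just appended;
    " each match is replaced by itself and the text stays the same.
    " The e flag suppresses the error when nothing matches.
    %s//\=add(l:matches, submatch(0))[-1]/ge
    if empty(l:matches)
        " Nothing matched: leave the buffer alone.
        return
    endif
    " Empty the buffer, then write the collected matches from line 1 on.
    %delete _
    call setline(1, l:matches)
endfunction
```

After searching for the pattern, it would be run with :call KeepOnlyMatches(). Because setline() overwrites the single empty line that %delete leaves behind, no extra cleanup line needs to be removed afterwards.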

For more help

:h :s\=
:h :let-@
:h submatch()
:h setreg()
:h :d
:h :pu

Solution 2:

Assuming <ip> is your regex to match an IP address, I presume you could do something like:

:%s/.\{-}\(<ip>\).*/\1/g

where \1 is the first matched group (the address alone), and .\{-} is used for non-greedy matching.
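
As a concrete sketch, here is the same idea with the IPv4 pattern from the question filled in, written in very magic (\v) syntax so the parentheses and braces need no backslashes. Two caveats apply: because the trailing .* swallows the rest of the line, only the first address on each line survives, and lines without any address are left unchanged, so the optional :v step is included to remove them first.

```vim
" Optional first step: drop lines that contain no address at all,
" since :s leaves non-matching lines untouched.
:v/\v([0-9]{1,3}\.){3}[0-9]{1,3}/d

" Very magic version of the substitution above. Group 1 is the whole
" address; group 2 is the repeated "digits plus dot" part inside it.
:%s/\v.{-}(([0-9]{1,3}\.){3}[0-9]{1,3}).*/\1/
```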

Solution 3:

In short, I'm looking for a way to do this without leaving vim

Easy enough:

:1,$! grep --extended-regexp --only-matching --regexp="([0-9]{1,3}\.){3}[0-9]{1,3}"

(though I actually voted up icecrime's substitution answer)
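
For reference, a shorter spelling of the same filter, assuming a Unix-like shell and a grep that supports -o (GNU and BSD grep both do). Note that the pattern uses [0-9] rather than Vim's \v and \d, which grep's extended syntax does not understand.

```vim
" % is shorthand for the range 1,$; -E and -o are the short forms of
" --extended-regexp and --only-matching.
:%!grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}'
```

If grep finds nothing, the filtered range is replaced by its empty output; u undoes the filter.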

Solution 4:

:set nowrapscan
:let @a=""
gg0qac/\v(\d{1,3}\.){3}\d{1,3}<CR><CR><Esc>//e+1<CR>@aq@adG

Explanation:

  1. set nowrapscan: stop searches from wrapping around the end of the file.
  2. let @a="": empty the a register.
  3. gg0: go to the first column (0) of the first line (gg).
  4. qa: start recording a macro into register a.
  5. c/{pattern}<CR>: change up to the next match of the pattern.
  6. c{motion}<CR><Esc>: replace the text covered by the motion with a newline (here {motion} is /{pat}<CR>).
  7. //e+1<CR>: search for the last pattern and land one character to the right of the end of the match (this wraps around a newline, but if your lines look like this: IP<newline>IP, there may be problems).
  8. @a: execute the @a macro (it is empty while you are recording it, but once recording is finished it will repeat steps 5-7 until it gets an error).
  9. q: end recording @a.
  10. @a: execute the @a macro.
  11. dG: delete to the end of the file.