Is there a way to extract duplicate lines in Sublime Text?

I need to perform two operations in Sublime Text: extract unique lines and extract duplicate lines. For example, given the input

a
b
a

Extract duplicates should result in:

a

and Extract unique should result in:

b

Is there a built-in operation or a plugin to do that?


Solution 1:

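You can find duplicate lines by running Sort Lines first, then searching (with regular expression mode enabled in the Find panel) for the pattern below. It uses the line anchors ^ and $ and the backreference \1 to match a line that is immediately followed by an identical line: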

^(.+)$\n^\1$

Follow that with Find All, copy the selection, paste it into a new tab, and run Permute Lines | Unique. What remains in the new tab is one copy of each duplicated line.
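
For anyone who wants to check what the regex matches outside the editor, here is a minimal Python sketch of the same idea. It is not a Sublime plugin, and the function name adjacent_duplicates is made up for illustration. It sorts the lines, applies the same pattern in multiline mode, and keeps one copy of each match, which roughly mirrors Sort Lines, Find All, and Permute Lines | Unique.

```python
import re

# Same pattern as above; re.MULTILINE makes ^ and $ match at line boundaries.
DUP_PATTERN = re.compile(r"^(.+)$\n^\1$", re.MULTILINE)


def adjacent_duplicates(text):
    """Return one copy of each line that occurs more than once in text."""
    # Sorting first puts identical lines next to each other (Sort Lines).
    sorted_text = "\n".join(sorted(text.splitlines()))
    found = []
    for match in DUP_PATTERN.finditer(sorted_text):
        line = match.group(1)
        # Keep only the first copy of each duplicated line (Permute Lines | Unique).
        if line not in found:
            found.append(line)
    return found


print(adjacent_duplicates("a\nb\na"))
```

Because the pattern uses .+, empty lines are never reported as duplicates, in this sketch or in the editor.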

Solution 2:

Unfortunately I don't have access to Sublime Text at the moment, so I'm not able to test this, but I believe something like the following might work for you:

  1. Sort the lines via the Edit -> Sort Lines command
  2. Install the Highlight Duplicates plugin, and use it to highlight all the duplicate lines
  3. Cut the highlighted lines to the Clipboard, and paste them into a New File
  4. The lines that remain in the original file are your Extract Unique lines
  5. In the New File, select all the text, and remove duplicate lines via the Edit -> Permute Lines -> Unique command
  6. The lines that remain in the New File are your Extract Duplicates lines

I'm not entirely sure that step #1 is actually necessary, but I included it just in case.
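
If scripting is an option, the end result of the steps above can be sketched in a few lines of plain Python. This is an illustration under my own assumptions, not something the plugin does internally, and split_lines is a hypothetical helper name. It counts each line, so no sorting is needed.

```python
from collections import Counter


def split_lines(text):
    """Return (unique_lines, duplicate_lines) for the given text."""
    lines = text.splitlines()
    counts = Counter(lines)
    # "Extract unique": lines that occur exactly once, in original order.
    uniques = [line for line in lines if counts[line] == 1]
    # "Extract duplicates": one copy of each line that occurs more than once,
    # in order of first appearance (Counter keeps insertion order).
    duplicates = [line for line in counts if counts[line] > 1]
    return uniques, duplicates


uniques, duplicates = split_lines("a\nb\na")
print(uniques)
print(duplicates)
```

For the input in the question, this gives b as the only unique line and a as the only duplicate.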

Solution 3:

I found the easiest way to do this in Sublime Text was to sort the lines (F5 on Mac), run Permute Lines > Unique, then view the diff with git. Assuming the file is tracked in git, the lines the diff shows as removed are the duplicates.

Solution 4:

I had the same problem (show me the dupes), didn't find an easy Sublime-based answer, and fell back to Unix commands. In my file, the data I wanted to check for duplicates was in columns 11-56:

cut -c 11-56 myfile.dat | sort | uniq -d

Posted here as an FYI to others.
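
For systems without those Unix tools, a rough Python equivalent of the pipeline might look like the sketch below. The function name duplicate_fields and its parameters are hypothetical. Columns are 1-based and inclusive, as with cut -c. One assumption to note: Python sorts by code point, while sort follows the current locale, so the output order can differ for some inputs.

```python
from collections import Counter


def duplicate_fields(lines, start=11, end=56):
    """Return the sorted column slices that occur on more than one line."""
    # cut -c START-END: 1-based, inclusive character columns.
    fields = [line.rstrip("\n")[start - 1:end] for line in lines]
    counts = Counter(fields)
    # sort | uniq -d: one copy of each repeated value, in sorted order.
    return sorted(field for field, n in counts.items() if n > 1)


sample = ["xxabc", "yyabc", "zzdef"]
print(duplicate_fields(sample, start=3, end=5))
```

To run it on a real file, pass the open file object, for example duplicate_fields(open("myfile.dat")).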