Is there a way to extract duplicate lines in Sublime Text?
I need to perform two operations in Sublime Text: extract unique lines and extract duplicate lines. For example, for the input:
a
b
a
Extract duplicates should result in:
a
and Extract unique should result in:
b
Is there a built-in operation or a plugin to do that?
Solution 1:
You can find duplicate lines easily by running Sort Lines, then searching for this regex, which uses the line boundary markers ^ and $ and the back reference \1:
^(.+)$\n^\1$
Follow that with Find All, Copy, Paste into a new tab, and Permute Lines | Unique, and you've extracted them.
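For anyone who wants to check the regex outside the editor, here is a rough Python sketch of the same idea: sort, match adjacent identical lines, then de-duplicate the matches. The function name is made up for illustration, and Python's `re` module may differ from Sublime Text's regex engine in edge cases, so treat it as an approximation rather than exactly what the editor does.

```python
import re


def duplicate_lines_via_regex(text):
    """Return each line that occurs more than once, one copy per line."""
    # Sort so that identical lines end up next to each other (Sort Lines).
    sorted_text = "\n".join(sorted(text.splitlines()))
    # Same pattern as above; MULTILINE lets ^ and $ match at line boundaries.
    pattern = re.compile(r"^(.+)$\n^\1$", re.MULTILINE)
    # Keep the first sighting of each matched line (Permute Lines | Unique).
    found = []
    for match in pattern.finditer(sorted_text):
        line = match.group(1)
        if line not in found:
            found.append(line)
    return found


print(duplicate_lines_via_regex("a\nb\na"))
```

Note that `.+` needs at least one character, so repeated empty lines are not reported.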
Solution 2:
Unfortunately I don't have access to Sublime Text at the moment, so I'm not able to test this, but I believe something like the following might work for you:
- Sort the lines via the Edit -> Sort Lines command
- Install the Highlight Duplicates plugin, and use it to highlight all the duplicate lines
- Cut the highlighted lines to the Clipboard, and paste them into a New File
- The lines that remain in the original file are your Extract Unique lines
- In the New File, select all the text, and remove duplicate lines via the Edit -> Permute Lines -> Unique command
- The lines that remain in the New File are your Extract Duplicates lines
I'm not entirely sure that the first step (sorting) is actually necessary, but I included it just in case.
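To make the intended end result of those steps concrete, here is a minimal Python sketch that splits the lines into the two groups the question asks for. It does not use Sublime Text or the plugin at all; the function name is hypothetical, and it just counts occurrences.

```python
from collections import Counter


def split_unique_and_duplicates(text):
    """Return (unique_lines, duplicate_lines) in order of first appearance."""
    lines = text.splitlines()
    counts = Counter(lines)
    # "Extract unique": lines that occur exactly once.
    unique = [line for line in lines if counts[line] == 1]
    # "Extract duplicates": one copy of each line that occurs more than once.
    duplicates = [line for line in dict.fromkeys(lines) if counts[line] > 1]
    return unique, duplicates


unique, duplicates = split_unique_and_duplicates("a\nb\na")
print(unique, duplicates)
```

Counting does not depend on line order, so no sorting is needed in this version.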
Solution 3:
I found the easiest way to do this in Sublime Text was to sort the lines (F5 on Mac), run Permute Lines > Unique, then view the diff with git. The lines the diff shows as removed are the duplicates.
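If git isn't handy, the same comparison can be approximated with Python's standard `difflib` module. This is a sketch that swaps `difflib.ndiff` in for `git diff`, with a made-up function name; it compares the sorted lines against the sorted-and-uniqued lines and collects whatever was removed.

```python
import difflib


def duplicates_from_diff(text):
    """Diff sorted lines against their de-duplicated version."""
    before = sorted(text.splitlines())      # after Sort Lines
    after = list(dict.fromkeys(before))     # after Permute Lines > Unique
    # ndiff prefixes lines present only in `before` with "- ".
    removed = [entry[2:] for entry in difflib.ndiff(before, after)
               if entry.startswith("- ")]
    # A line with three or more copies is removed more than once; keep one.
    return list(dict.fromkeys(removed))


print(duplicates_from_diff("a\nb\na"))
```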
Solution 4:
Had the same problem (show me the dupes)... didn't find an easy Sublime-based answer and fell back to using Unix commands (my file had the data I wanted to find the duplicates of in columns 11-56):
cut -c 11-56 myfile.dat | sort | uniq -d
Posted here as an FYI to others.
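For those without the Unix tools, here is a rough Python equivalent of that pipeline. The function name and its `start`/`end` parameters are made up for illustration; the columns are 1-based and inclusive, as with `cut -c`. Python sorts by code point, so the ordering may differ from `sort` under some locales.

```python
from collections import Counter


def duplicated_fields(lines, start=11, end=56):
    """Return the column ranges that repeat, like cut | sort | uniq -d."""
    # cut -c start-end: 1-based inclusive columns become the slice [start-1:end].
    fields = [line.rstrip("\n")[start - 1:end] for line in lines]
    counts = Counter(fields)
    # uniq -d prints one copy of each value seen more than once.
    return sorted(field for field, n in counts.items() if n > 1)


# Small demo using columns 3-5 instead of 11-56.
print(duplicated_fields(["xxabc1", "yyabc2", "zzdef3"], start=3, end=5))
```

For a real file, something like `duplicated_fields(open("myfile.dat"))` should work, assuming the file fits in memory.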