How can I sort the lines in a text file, by the length of each line, in Notepad++?
How can I sort a text file by line length in Notepad++? Is there a plugin available for this task?
If there is no plugin, what are the first and perhaps second tutorials to read in order to write the plugin myself?
Solution 1:
This answer is inspired by a YouTube video. Updated to maintain original sort order, if that is important.
Notepad++ has a built-in TextFX tool that sorts selected lines alphabetically. This tool can be hijacked to sort by the length of the lines by placing spaces on the left of each line, and making sure that all the lines are the same length.
"The Zoo" comes alphabetically before "Their House" because the space is treated as a character and comes before "i". __X (pretending the underscores are really spaces) will similarly come alphabetically before _XX. The idea in this answer is to add spaces and line numbers so that __________092dog will be sorted above _003alligator.
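As a rough illustration of why the padding trick works, here is a small Python sketch. It is not something Notepad++ runs; the width of 16 and the variable names are assumptions made for the example. Once every string is right-justified to the same length, a plain alphabetical sort compares the leading spaces first:

```python
# Example lines from the paragraph above: "dog" was originally line 92,
# "alligator" was line 3.
entries = [(3, "alligator"), (92, "dog")]

WIDTH = 16  # assumed total width; must be >= the longest numbered line

# Prefix each line with its zero-padded line number, then right-justify
# with spaces so that every string has the same length.
padded = [f"{number:03d}{text}".rjust(WIDTH) for number, text in entries]

# A space sorts before any digit or letter, so the string with more
# leading spaces (the shorter original line) compares as smaller.
ordered = sorted(padded)

for line in ordered:
    print(line.replace(" ", "_"))  # show spaces as underscores
```

The dog line comes out first even though its line number (092) is larger, because the comparison is decided by the extra leading spaces before the digits are ever reached.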
I'll use the following as example data:
Lorem
ipsum
dolor
sit
amet
consectetur
adipisicing
Step 1. Add line numbers.
(Note added by barlop: we are not sorting by these line numbers; we are sorting by the length of the lines. The line numbers record the original order, so that when two or more lines are of equal length they can be kept in that original order.)
Assuming your text file only has the data in it, place the text cursor (the vertical line) into the very first position of the file. Then in the Edit menu select Column Editor... (Alt+C). Choose "Number to Insert", start with 1, increase by 1, and include leading zeros. Note that this will retain the original ordering when sorting from shortest string to longest string. Reverse all lines first if you want to sort longest to shortest.
1Lorem
2ipsum
3dolor
4sit
5amet
6consectetur
7adipisicing
Step 2. Pad all lines with leading spaces.
Place the text cursor (the vertical line) into the very first position of the file. Then in the Edit menu select Column Editor... (Alt+C). Insert enough spaces so that the shortest line of data will be padded out to the length of the longest line of data. If your shortest line has 4 characters, and your longest 44, then make sure you insert at least 40 spaces.
__________1Lorem
__________2ipsum
__________3dolor
__________4sit
__________5amet
__________6consectetur
__________7adipisicing
Step 3. Trim lines to a uniform length.
Use the following Regular Expression Find/Replace (Ctrl+H) to keep only the right-most characters of each line, using a width equal to or greater than the length of your longest data line.
^.*(.{50})$
Replace all with $1. That will trim everything except the right-most 50 characters of every line. If your data is longer (or shorter) than 50, adjust the {50} in the Regular Expression.
(Note added by barlop: the idea here is that the shortest lines end up with the most spaces at the beginning.)
_______1Lorem
_______2ipsum
_______3dolor
_________4sit
________5amet
_6consectetur
_7adipisicing
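To check the trimming regex outside the editor, the same pattern can be tried in Python. This is only a sketch: Python's `re` module writes the replacement as `\1` where Notepad++ uses `$1`, and the width of 13 (instead of 50) is an assumption chosen to match the example output above.

```python
import re

WIDTH = 13  # assumed: the longest numbered line (12 chars) plus one space

numbered = ["1Lorem", "2ipsum", "3dolor", "4sit", "5amet",
            "6consectetur", "7adipisicing"]

# Step 2: prefix every line with the same run of 10 spaces.
padded = [" " * 10 + line for line in numbered]

# Step 3: keep only the right-most WIDTH characters of each line.
pattern = re.compile(r"^.*(.{%d})$" % WIDTH)
trimmed = [pattern.sub(r"\1", line) for line in padded]

for line in trimmed:
    print(line.replace(" ", "_"))  # show spaces as underscores
```

Note that the pattern only matches lines that are at least WIDTH characters long, which is why Step 2 has to add enough spaces first.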
Step 4. Sort the lines.
Select all of the text (Ctrl+A). Via the TextFX menu, go to TextFX > TextFX Tools > Sort lines case sensitive (at column). Your data should now be in length order, from shortest to longest. If you want them in order from longest to shortest, uncheck the TextFX > TextFX Tools > + Sort ascending option before sorting. Note that the line numbers are then reversed as well, so lines of equal length come out in reverse order unless you reversed all lines in Step 1.
_________4sit
________5amet
_______1Lorem
_______2ipsum
_______3dolor
_6consectetur
_7adipisicing
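The sort itself can be mimicked in Python as a sanity check. This sketch assumes that TextFX's case-sensitive sort compares characters by code, the way Python's `sorted()` does; that has not been verified against TextFX here. The variable names are made up for the example.

```python
WIDTH = 13  # assumed: same uniform width as in the example above

numbered = ["1Lorem", "2ipsum", "3dolor", "4sit", "5amet",
            "6consectetur", "7adipisicing"]

# Equivalent of steps 2 and 3: left-pad every line to the same width.
trimmed = [line.rjust(WIDTH) for line in numbered]

# Ascending: most leading spaces (shortest line) first; ties are decided
# by the line number that follows the spaces.
ascending = sorted(trimmed)

# Descending: longest first, but ties now appear in reverse line order.
descending = sorted(trimmed, reverse=True)

print([s.strip() for s in ascending])
print([s.strip() for s in descending])
```

In the descending result the equal-length lines come out as 3dolor, 2ipsum, 1Lorem, which is the reversal of line numbers mentioned above.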
Step 5. Remove leading spaces.
Use another Regular Expression Find/Replace (Ctrl+H) to match the leading spaces.
^ *\d{4}
That's a space between the caret and asterisk. Replace all with nothing. That will remove all leading spaces and the inserted line numbers, if you had 4-digit line numbers. Replace the {4} with the correct number of digits in your line numbers.
sit
amet
Lorem
ipsum
dolor
consectetur
adipisicing
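The clean-up regex can be tried the same way in Python. This is a sketch only: `DIGITS = 1` is an assumption that matches the seven-line example, where the line numbers have a single digit, and the list simply reproduces the sorted output from Step 4.

```python
import re

DIGITS = 1  # assumed: number of digits in the inserted line numbers

# Sorted output of Step 4, rebuilt with real spaces (width 13).
sorted_lines = [s.rjust(13) for s in
                ["4sit", "5amet", "1Lorem", "2ipsum", "3dolor",
                 "6consectetur", "7adipisicing"]]

# Step 5: strip the leading spaces and exactly DIGITS digits.
prefix = re.compile(r"^ *\d{%d}" % DIGITS)
cleaned = [prefix.sub("", line) for line in sorted_lines]

print("\n".join(cleaned))
```

Using a fixed digit count rather than `\d+` matters when a data line itself starts with a digit: only the inserted line number is removed, and the digits that belong to the data are left alone.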
MACRO
I recorded the above steps using Notepad++'s macro feature, and it doesn't work. I'm not sure which step fails, and I haven't diagnosed why. You could probably use AutoHotKey to automate this if you do it repeatedly.
Solution 2:
No, I don't think there is. The closest is the TextFX plugin, but that is a character-based sort, not a line-length-based one. Your best bet is to throw the text into a spreadsheet and sort it there (using a separate computed column with the LEN() function).
Solution 3:
You can use SQL in N++ on CSV files! For example, if you have:
col1;
hgfhfghfhg;
khjfhgfhfghfgh;
kjhfhgfhfhgfghfhf;
lkjgjghjhg;
lkjgjg;
you can execute the command select * from data order by length(col1) desc to sort descending.
"data" refers to the current file, and "col1" is the name of the first (and only) column.
Unfortunately, there is probably a bug that prevents omitting the delimiter at the end of each line in one-column text.
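The query can be tried outside Notepad++ as well. The sketch below uses Python's built-in `sqlite3` module with an in-memory table named `data`; that setup is an assumption for illustration, since the answer does not say which plugin or SQL engine it relies on. The trailing `;` delimiters are left out of the stored values.

```python
import sqlite3

# Values of the single column "col1" from the example above.
rows = ["hgfhfghfhg", "khjfhgfhfghfgh", "kjhfhgfhfhgfghfhf",
        "lkjgjghjhg", "lkjgjg"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (col1 TEXT)")
conn.executemany("INSERT INTO data (col1) VALUES (?)", [(r,) for r in rows])

# Same query as in the answer: longest values first.
query = "SELECT * FROM data ORDER BY length(col1) DESC"
result = [record[0] for record in conn.execute(query)]
conn.close()

print("\n".join(result))
```

SQL does not promise any particular order for rows of equal length, so add a second sort key to the ORDER BY clause if the order of ties matters.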
Solution 4:
Or if you happen to have Linux and NEdit:
ctrl-a
alt-r
perl -e 'print sort { length($a) <=> length($b) } <>'
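If Perl is not available, a similar filter can be sketched in Python. The function name `sort_by_length` is made up for this example. Unlike the Perl one-liner, it strips line endings before measuring, so a final line without a newline is not treated as one character shorter.

```python
import io


def sort_by_length(stream_in, stream_out):
    """Copy lines from stream_in to stream_out, shortest first."""
    lines = stream_in.read().splitlines()
    # sorted() is stable, so lines of equal length keep their original order.
    for line in sorted(lines, key=len):
        stream_out.write(line + "\n")


# Small demonstration using in-memory streams instead of real files.
demo_in = io.StringIO("Lorem\nipsum\ndolor\nsit\namet\nconsectetur\nadipisicing\n")
demo_out = io.StringIO()
sort_by_length(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

To use it as a filter command in an editor, call `sort_by_length(sys.stdin, sys.stdout)` after importing `sys`.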