Deleting Duplicated Lines In TEXT File?

I am trying to clean up a text file in which, for some reason, every line is duplicated 3 times. Can I get rid of the duplicates with a regex or some other trick, or do you know of software that could do that? The text file looks like this:

Party Started 10:17 (89/1/2)
Party Started 10:17 (89/1/2)
Party Started 10:17 (89/1/2)
Jessica At Dinner 17:54 (89/1/2)
Jessica At Dinner 17:54 (89/1/2)
Jessica At Dinner 17:54 (89/1/2)

How can I clean it up and get rid of the duplicated lines? The file is about 69,587 lines long.

Solution 1:

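You could use uniq, a standard Unix utility available in any bash environment. It collapses each run of identical adjacent lines into a single line, which is exactly the pattern in your file. Just type: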

uniq filewithdup.txt > filenew.txt
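If uniq is not available, the same adjacent-duplicate behavior can be sketched in Python. This is only an illustration, not how uniq is implemented; the function name collapse_adjacent_duplicates and the sample list are made up for the example.

```python
from itertools import groupby


def collapse_adjacent_duplicates(lines):
    """Keep one copy of each run of identical consecutive lines.

    Hypothetical helper for illustration; it mimics what plain `uniq` does.
    """
    # groupby() groups consecutive equal items, so each key is one run.
    return [key for key, _run in groupby(lines)]


sample = [
    "Party Started 10:17 (89/1/2)",
    "Party Started 10:17 (89/1/2)",
    "Party Started 10:17 (89/1/2)",
    "Jessica At Dinner 17:54 (89/1/2)",
    "Jessica At Dinner 17:54 (89/1/2)",
    "Jessica At Dinner 17:54 (89/1/2)",
]

for line in collapse_adjacent_duplicates(sample):
    print(line)
```

As with uniq, a repeated line that is separated from its earlier copy by other lines is kept, because only consecutive copies are merged.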

Solution 2:

Since you mention MS Office, I'll give you a native Windows solution.

If you are using Windows 7 or later, Windows PowerShell is built in (on Vista it is a separate download). You can use the Get-Unique cmdlet, which the documentation describes as follows:

The Get-Unique cmdlet compares each item in a sorted list to the next item, eliminates duplicates, and returns only one instance of each item. The list must be sorted for the cmdlet to work properly.

Get-Content input.txt | Get-Unique | Set-Content output.txt

If the file is not sorted, you can use Sort-Object -Unique instead. It also works on already sorted input, but it reorders the lines and removes every repeated line, not only adjacent ones, so do not use it if you want to keep duplicates that have other lines between them.

Get-Content input.txt | Sort-Object -Unique | Set-Content output.txt
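The difference between the two approaches (sorting changes the line order, while adjacent-only removal does not) can be illustrated with a small Python sketch. The helper names below are hypothetical, and the third function shows an alternative that neither cmdlet provides: removing all repeats while keeping the original order.

```python
def unique_sorted(lines):
    """Roughly analogous to sorting and keeping unique lines: order is lost."""
    return sorted(set(lines))


def unique_preserving_order(lines):
    """Drop every repeated line but keep first occurrences in original order."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(lines))


log = ["Party Started", "Jessica At Dinner", "Party Started", "Alarm"]

print(unique_sorted(log))
print(unique_preserving_order(log))
```

For a chronological log like the one in the question, the order-preserving variant is usually the safer choice if the duplicates are not guaranteed to be adjacent.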

Solution 3:

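Regex was tagged, so here is a pattern that matches a whole line together with any immediately following copies of it. Replace each match with the first capture group ($1 or \1, depending on the tool):

/^(.+)(\r?\n\1)+$/gm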
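A regex for this job needs line anchors and a repetition on the duplicated part; without them, a pattern can match partial lines or leave two of the three copies behind. The sketch below applies such a pattern with Python's re module. The function name dedupe_with_regex is made up for the example, and the pattern is one possible formulation, not the only one.

```python
import re

# ^(.+)          one whole non-empty line (MULTILINE makes ^ and $ per-line)
# (?:\r?\n\1)+   one or more immediate repeats of that same line
# $              the last repeat must end at a line end, so "abc" followed
#                by "abcd" is not treated as a duplicate
DUPLICATE_RUN = re.compile(r"^(.+)(?:\r?\n\1)+$", re.MULTILINE)


def dedupe_with_regex(text):
    """Collapse each run of identical consecutive lines to a single line."""
    return DUPLICATE_RUN.sub(r"\1", text)


text = (
    "Party Started 10:17 (89/1/2)\n"
    "Party Started 10:17 (89/1/2)\n"
    "Party Started 10:17 (89/1/2)\n"
    "Jessica At Dinner 17:54 (89/1/2)\n"
    "Jessica At Dinner 17:54 (89/1/2)\n"
    "Jessica At Dinner 17:54 (89/1/2)\n"
)

print(dedupe_with_regex(text), end="")
```

Because the pattern uses (.+) rather than (.*), runs of blank lines are left untouched.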
