"Remove Duplicates" feature does not remove all duplicates

When the Data Tools>Remove Duplicates command is used on the following list, some duplicates are not removed:

Key
46509
46509
51442
51442
73367
73367
80048
80048
88396
88396
102576
102576
108095
108095
118613
118613
127224
127224
137549
137549
140164
140164
151679
151679
152975
152975
154986
154986
167715
167715
169166
169166
170427
170427
176303
176303
187038
187038
199856
199856
206589
206589
225371
225371
229767
229767
230680
230680
245333
245333
253697
253697
256002
256002
262094
262094
273060
273060
275584
275584
285569
285569
301930
301930
304993
304993
305965
305965
307299
307299
311438
311438
314461
314461
340545
340545
344517
344517
355035
355035
358387
358387
359982
359982
363904
363904
368642
368642
378666
378666
378744
378744
382396
382396
393542
393542
394320
394320
395258
395258
402047
402047
421796
421796
427972
427972
442651
442651
442654
442654
445786
445786
447124
447124
456682
456682
467303
467303
470437
470437
476340
476340
480976
480976
481450
481450
498967
498967
516068
516068
517592
517592
522621
522621
524678
524678
529405
529405
532909
532909
535096
535096
540835
540835
553197
553197
566893
566893
567730
567730
570751
570751
572259
572259
581701
581701
589753
589753
597923
597923
602282
602282
612011
612011
643774
643774
649471
649471
654159
654159
661681
661681
680578
680578
709714
709714
728036
728036
728680
728680
740252
740252
750843
750843
763430
763430
777724
777724
777870
777870
779703
779703
787034
787034
789776
789776
793859
793859
798479
798479
812410
812410
816599
816599
820169
820169
821236
821236
833511
833511
833701
833701
845925
845925
854408
854408
882853
882853
891710
891710
906251
906251
929802
929802
982039
982039
984503
984503
984539
984539
987889
987889
997188
997188
1002869
1002869
1018493
1018493
1020735
1020735
1028780
1028780
1032460
1032460
1047306
1047306
1052136
1052136
1054482
1054482
1062633
1062633
1066509
1066509
1069500
1069500
1075339
1075339
1082282
1082282
1100206
1100206
1105407
1105407
1111409
1111409
1112175
1112175
1112407
1112407
1113400
1113400
1117243
1117243
1121732
1121732
1152461
1152461
1153727
1153727
1157252
1157252
1163384
1163384
1195324
1195324
1197373
1197373
1201828
1201828
1202786
1202786
1216328
1216328
1223541
1223541
1225848
1225848
1228595
1228595
1239281
1239281
1264378
1264378
1267348
1267348
1291510
1291510
1309891
1309891
1321106
1321106
1322886
1322886
1327690
1327690
1344145
1344145
1346250
1346250
1352350
1352350
1357920
1357920
1363386
1363386
1365046
1365046
1370383
1370383
1376666
1376666
1383834
1383834
1414472
1414472
1418967
1418967
1419811
1419811
1421587
1421587
1432639
1432639
1442854
1442854
1455236
1455236
1461896
1461896
1466376
1466376
1477670
1477670
1480154
1480154
1481801
1481801
1498209
1498209
1504776
1504776
1511904
1511904
1017854_447
1017854_447
1085496_19
1085496_19
1280601_77
1280601_77
1350662_135
1350662_135
141944_382
141944_382
1525893_483
1525893_483
455968_177
455968_177
469495_157
469495_157
641583_74
641583_74
704267_63
704267_63
709992_199
709992_199
819399_224
819399_224
832188_358
832188_358
857189_467
857189_467
898328_334
898328_334
91033_245
91033_245
960992_341
960992_341
999706_361
999706_361
BLA1048669
BLA1048669
BLA1052457
BLA1052457
BLA1055176
BLA1055176
BLA1060585
BLA1060585
BLA1067125
BLA1067125
BLA107601
BLA107601
BLA1087509
BLA1087509
BLA1097390
BLA1097390
BLA1122471
BLA1122471
BLA1126436
BLA1126436
BLA112694
BLA112694
BLA1148652
BLA1148652
BLA1149504
BLA1149504
BLA1155256
BLA1155256
BLA1222348
BLA1222348
BLA1225575
BLA1225575
BLA1227779
BLA1227779
BLA124071
BLA124071
BLA1247411
BLA1247411
BLA1258830
BLA1258830
BLA1281625
BLA1281625
BLA128165
BLA128165
BLA1305848
BLA1305848
BLA1307751
BLA1307751
BLA130926
BLA130926
BLA1334270
BLA1334270
BLA1335674
BLA1335674
BLA1340102
BLA1340102
BLA1345080
BLA1345080
BLA1353398
BLA1353398
BLA1369834
BLA1369834
BLA1424977
BLA1424977
BLA142669
BLA142669
BLA1429760
BLA1429760
BLA1443592
BLA1443592
BLA1447085
BLA1447085
BLA1453639
BLA1453639
BLA1456355
BLA1456355
BLA1458745
BLA1458745
BLA148663
BLA148663
BLA1494220
BLA1494220
BLA1522880
BLA1522880
BLA152562
BLA152562
BLA1548628
BLA1548628
BLA182644
BLA182644
BLA186707
BLA186707
BLA188073
BLA188073
BLA199127
BLA199127
BLA207148
BLA207148
BLA209939
BLA209939
BLA232119
BLA232119
BLA240725
BLA240725
BLA267275
BLA267275
BLA303829
BLA303829
BLA346044
BLA346044
BLA348268
BLA348268
BLA367863
BLA367863
BLA386897
BLA386897
BLA390799
BLA390799
BLA415724
BLA415724
BLA425291
BLA425291
BLA427093
BLA427093
BLA462355
BLA462355
BLA46821
BLA46821
BLA477637
BLA477637
BLA501111
BLA501111
BLA522126
BLA522126
BLA530906
BLA530906
BLA576525
BLA576525
BLA639361
BLA639361
BLA644530
BLA644530
BLA65754
BLA65754
BLA670003
BLA670003
BLA687525
BLA687525
BLA709367
BLA709367
BLA714281
BLA714281
BLA719328
BLA719328
BLA736457
BLA736457
BLA759669
BLA759669
BLA76056
BLA76056
BLA767666
BLA767666
BLA780853
BLA780853
BLA793029
BLA793029
BLA796639
BLA796639
BLA803743
BLA803743
BLA815298
BLA815298
BLA828243
BLA828243
BLA864030
BLA864030
BLA864110
BLA864110
BLA882892
BLA882892
BLA885665
BLA885665
BLA897478
BLA897478
BLA903861
BLA903861
BLA91627
BLA91627
BLA925523
BLA925523
BLA928243
BLA928243
BLA932395
BLA932395
BLA935341
BLA935341
BLA940705
BLA940705
BLA966386
BLA966386

A message is displayed indicating 293 duplicate values were removed and 307 unique values remain. 7 duplicates are not removed:

1280601_77
1525893_483
455968_177
469495_157
641583_74
704267_63
709992_199

The initial list consist of Excel-ordered values of:

  1. numbers between 40,000 and 1,500,000
  2. numbers in the same range with and added _ and a 1-999 suffix
  3. numbers in the same range prefixed with BLA

Each value is present exactly twice. There are no formula involved. I already checked for extra spaces.

Immediately reapplying the Remove Duplicates command does not remove the remaining duplicate values.
Reapplying the Remove Duplicates command after a reverse-sort seems to remove the remaining duplicates.
Reverse-sorting before using the first Remove Duplicates changes the result (213 duplicate values removed and 387 unique values remained), only in the non-suffixed non-prefixed group.

Can you reproduce this bug in Excel 2007 like I do? What triggers it? Could it the composite (numbers and text) data set?

I'm not looking for alternatives. I want to know when not to use the feature and revert to already known (Excel 2003 era) alternatives.


I've tried a Right(a1;x) function on twelve copies of your list (with records lentgh from 1 to 12 caracters from the right). The idea was to see if it has something to do with the lentgh of the records.
Once the formula was extended down I copied/pasted the values only before using the remove duplicate command. It worked fine every time !(300 duplicated values removed. Rest 300 values) So though I stil don't know what's happening, a solution to your problem is to use Right(a1,12)on your list and copy/paste values only from the resulting list. Just copy/paste the values only from the first list does not work. You have to use a formula to keep the solution inside Excel (or else you can copy/paste in a text file as mentionned in your comment).


I have run across this issue in every version of Excel, and I have yet to see an explanation as to why this happens. Regardless, there is a procedure for correcting this aberration. The following example is from a situation where I copy a range of part numbers ( more than 30,000 records) from one report into a separate worksheet, then remove the duplicates(picks up after the data has been copied to a new tab):

1) Always sort your data (smallest-to-largest) first. If you have more than one field, make sure you sort by the field from which you will be removing the duplicates:

'Determine Last Row of Data To Set Range
Cells(1,1).Select
Selection.End(xlDown).Select
lastRow = ActiveCell.Row

'Define And Select Ranges For Duplicate Removal
Set mNumRange = Range(Cells(1, 1), Cells(lastRow, 1))

'Sort Range For Futile Attempt At Duplicate Removal
mNumRange.Select
ActiveWorkbook.Worksheets("TAB_3").Sort.SortFields.Clear
ActiveWorkbook.Worksheets("TAB_3").Sort.SortFields.Add Key:=mNumRange, _
               SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
With ActiveWorkbook.Worksheets("TAB_3").Sort
   .SetRange mNumRange
   .Header = xlYes
   .MatchCase = False
   .Orientation = xlTopToBottom
   .Apply
End With

2) Run "Data -> Remove Duplicates" on the selected field. This will remove the vast majority of the duplicates:

'Partially Remove Duplicates <sigh>
mNumRange.Select
With Selection
   .RemoveDuplicates Columns:=1, Header:=xlYes
End With
Cells(1, 1).Select

3) Redefine the new range (the remaining data will still be sorted smallest-to-largest)

'Redefine Range After Partial Duplicate Removal
Selection.End(xlDown).Select
lastRow = ActiveCell.Row
Set mNumRange = Range(Cells(2, 1), Cells(lastRow, 1)) 

4) Now, run a row-by-row procedure to remove the remaining duplicates. As the data is sorted smallest-to-largest, any duplicates will be in adjoining rows:

'Execute Row-By-Row Duplicate Removal Procedure <sigh>
rowCounter = 2
deleteCounter = 0
Cells(1, 1).Select
Application.ScreenUpdating = False
Do
Cells(rowCounter, 1).Select
   If ActiveCell.Value = "" Then
      Cells(1, 1).Select
      Exit Do
   End If
   If ActiveCell.Value = ActiveCell.Offset(-1, 0).Value Then
      Selection.EntireRow.Delete
      deleteCounter = deleteCounter + 1
      rowCounter = rowCounter - 1
   End If
rowCounter = rowCounter + 1
Loop
Application.ScreenUpdating = True
MsgBox deleteCounter & " ADDITIONAL DUPLICATES DELETED", vbOKOnly

The MsgBox will let you know how many additional duplicates were deleted.