Quicker way to get all unique values of a column in VBA?
Use Excel's AdvancedFilter function to do this.
Using Excels inbuilt C++ is the fastest way with smaller datasets, using the dictionary is faster for larger datasets. For example:
Copy values in Column A and insert the unique values in column B:
Range("A1:A6").AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Range("B1"), Unique:=True
It works with multiple columns too:
Range("A1:B4").AdvancedFilter Action:=xlFilterCopy, CopyToRange:=Range("D1:E1"), Unique:=True
Be careful with multiple columns as it doesn't always work as expected. In those cases I resort to removing duplicates which works by choosing a selection of columns to base uniqueness. Ref: MSDN - Find and remove duplicates
Here I remove duplicate columns based on the third column:
Range("A1:C4").RemoveDuplicates Columns:=3, Header:=xlNo
Here I remove duplicate columns based on the second and third column:
Range("A1:C4").RemoveDuplicates Columns:=Array(2, 3), Header:=xlNo
Loading the values in an array would be much faster:
Dim data(), dict As Object, r As Long
Set dict = CreateObject("Scripting.Dictionary")
data = ActiveSheet.UsedRange.Columns(1).Value
For r = 1 To UBound(data)
dict(data(r, some_column_number)) = Empty
Next
data = WorksheetFunction.Transpose(dict.keys())
You should also consider early binding for the Scripting.Dictionary:
Dim dict As New Scripting.Dictionary ' requires `Microsoft Scripting Runtime` '
Note that using a dictionary is way faster than Range.AdvancedFilter on large data sets.
As a bonus, here's a procedure similare to Range.RemoveDuplicates to remove duplicates from a 2D array:
Public Sub RemoveDuplicates(data, ParamArray columns())
Dim ret(), indexes(), ids(), r As Long, c As Long
Dim dict As New Scripting.Dictionary ' requires `Microsoft Scripting Runtime` '
If VarType(data) And vbArray Then Else Err.Raise 5, , "Argument data is not an array"
ReDim ids(LBound(columns) To UBound(columns))
For r = LBound(data) To UBound(data) ' each row '
For c = LBound(columns) To UBound(columns) ' each column '
ids(c) = data(r, columns(c)) ' build id for the row
Next
dict(Join$(ids, ChrW(-1))) = r ' associate the row index to the id '
Next
indexes = dict.Items()
ReDim ret(LBound(data) To LBound(data) + dict.Count - 1, LBound(data, 2) To UBound(data, 2))
For c = LBound(ret, 2) To UBound(ret, 2) ' each column '
For r = LBound(ret) To UBound(ret) ' each row / unique id '
ret(r, c) = data(indexes(r - 1), c) ' copy the value at index '
Next
Next
data = ret
End Sub
PowerShell is a very powerful and efficient tool. This is cheating a little, but shelling PowerShell via VBA opens up lots of options
The bulk of the code below is simply to save the current sheet as a csv file. The output is another csv file with just the unique values
Sub AnotherWay()
Dim strPath As String
Dim strPath2 As String
Application.DisplayAlerts = False
strPath = "C:\Temp\test.csv"
strPath2 = "C:\Temp\testout.csv"
ActiveWorkbook.SaveAs strPath, xlCSV
x = Shell("powershell.exe $csv = import-csv -Path """ & strPath & """ -Header A | Select-Object -Unique A | Export-Csv """ & strPath2 & """ -NoTypeInformation", 0)
Application.DisplayAlerts = True
End Sub
it's funny because i've had to read these instructions over and over again, but it think i worked out a much faster way to do this:
Set data = ws.UsedRange
dim unique as variant
unique = WorksheetFunction.Unique(data)
And then you can do whatever you want with the unique
array such as iterating it:
For i = LBound(unique) To UBound(unique)
Range("Q" & i) = indexes(i, 1)
Next
Try this
Option Explicit
Sub UniqueValues()
Dim ws As Worksheet
Dim uniqueRng As Range
Dim myCol As Long
myCol = 5 '<== set it as per your needs
Set ws = ThisWorkbook.Worksheets("unique") '<== set it as per your needs
Set uniqueRng = GetUniqueValues(ws, myCol)
End Sub
Function GetUniqueValues(ws As Worksheet, col As Long) As Range
Dim firstRow As Long
With ws
.Columns(col).RemoveDuplicates Columns:=Array(1), header:=xlNo
firstRow = 1
If IsEmpty(.Cells(1, col)) Then firstRow = .Cells(1, col).End(xlDown).row
Set GetUniqueValues = Range(.Cells(firstRow, col), .Cells(.Rows.Count, col).End(xlUp))
End With
End Function
it should be quite fast and without the drawback NeepNeepNeep told about