What is the most efficient/quickest way to loop through rows in VBA (excel)?

I know VBA in Excel isn't the quickest of things - but I need the most efficient (i.e. quickest) way to loop through a large sample of rows.

Currently I have:

For Each c In Range("$A$2:$A$" & Cells(Rows.count, "A").End(xlUp).row
    ' do stuff
Next c

The 'do stuff' includes insert a row here and there (so I need to keep the dynamic lookup of the range.)

Any ideas (looking at 10,000 rows+)?

EDIT I am already using

Application.ScreenUpdating = False
Application.Calculation = xlManual

Solution 1:

If you are just looping through 10k rows in column A, then dump the row into a variant array and then loop through that.

You can then either add the elements to a new array (while adding rows when needed) and using Transpose() to put the array onto your range in one move, or you can use your iterator variable to track which row you are on and add rows that way.

Dim i As Long
Dim varray As Variant

varray = Range("A2:A" & Cells(Rows.Count, "A").End(xlUp).Row).Value

For i = 1 To UBound(varray, 1)
    ' do stuff to varray(i, 1)
Next

Here is an example of how you could add rows after evaluating each cell. This example just inserts a row after every row that has the word "foo" in column A. Not that the "+2" is added to the variable i during the insert since we are starting on A2. It would be +1 if we were starting our array with A1.

Sub test()

Dim varray As Variant
Dim i As Long

varray = Range("A2:A10").Value

'must step back or it'll be infinite loop
For i = UBound(varray, 1) To LBound(varray, 1) Step -1
    'do your logic and evaluation here
    If varray(i, 1) = "foo" Then
       'not how to offset the i variable 
       Range("A" & i + 2).EntireRow.Insert
    End If
Next

End Sub

Solution 2:

EDIT Summary and reccomendations

Using a for each cell in range construct is not in itself slow. What is slow is repeated access to Excel in the loop (be it reading or writing cell values, format etc, inserting/deleting rows etc).

What is too slow depends entierly on your needs. A Sub that takes minutes to run might be OK if only used rarely, but another that takes 10s might be too slow if run frequently.

So, some general advice:

  1. keep it simple at first. If the result is too slow for your needs, then optimise
  2. focus on optimisation of the content of the loop
  3. don't just assume a loop is needed. There are sometime alternatives
  4. if you need to use cell values (a lot) inside the loop, load them into a variant array outside the loop.
  5. a good way to avoid complexity with inserts is to loop the range from the bottom up
    (for index = max to min step -1)
  6. if you can't do that and your 'insert a row here and there' is not too many, consider reloading the array after each insert
  7. If you need to access cell properties other than value, you are stuck with cell references
  8. To delete a number of rows consider building a range reference to a multi area range in the loop, then delete that range in one go after the loop

eg (not tested!)

Dim rngToDelete as range
for each rw in rng.rows
    if need to delete rw then

        if rngToDelete is nothing then
            set rngToDelete = rw
        else
            set rngToDelete = Union(rngToDelete, rw)
        end if

    endif
next
rngToDelete.EntireRow.Delete

Original post

Conventional wisdom says that looping through cells is bad and looping through a variant array is good. I too have been an advocate of this for some time. Your question got me thinking, so I did some short tests with suprising (to me anyway) results:

test data set: a simple list in cells A1 .. A1000000 (thats 1,000,000 rows)

Test case 1: loop an array

Dim v As Variant
Dim n As Long

T1 = GetTickCount
Set r = Range("$A$1", Cells(Rows.Count, "A").End(xlUp)).Cells
v = r
For n = LBound(v, 1) To UBound(v, 1)
    'i = i + 1
    'i = r.Cells(n, 1).Value 'i + 1
Next
Debug.Print "Array Time = " & (GetTickCount - T1) / 1000#
Debug.Print "Array Count = " & Format(n, "#,###")

Result:

Array Time = 0.249 sec
Array Count = 1,000,001

Test Case 2: loop the range

T1 = GetTickCount
Set r = Range("$A$1", Cells(Rows.Count, "A").End(xlUp)).Cells
For Each c In r
Next c
Debug.Print "Range Time = " & (GetTickCount - T1) / 1000#
Debug.Print "Range Count = " & Format(r.Cells.Count, "#,###")

Result:

Range Time = 0.296 sec
Range Count = 1,000,000

So,looping an array is faster but only by 19% - much less than I expected.

Test 3: loop an array with a cell reference

T1 = GetTickCount
Set r = Range("$A$1", Cells(Rows.Count, "A").End(xlUp)).Cells
v = r
For n = LBound(v, 1) To UBound(v, 1)
    i = r.Cells(n, 1).Value
Next
Debug.Print "Array Time = " & (GetTickCount - T1) / 1000# & " sec"
Debug.Print "Array Count = " & Format(i, "#,###")

Result:

Array Time = 5.897 sec
Array Count = 1,000,000

Test case 4: loop range with a cell reference

T1 = GetTickCount
Set r = Range("$A$1", Cells(Rows.Count, "A").End(xlUp)).Cells
For Each c In r
    i = c.Value
Next c
Debug.Print "Range Time = " & (GetTickCount - T1) / 1000# & " sec"
Debug.Print "Range Count = " & Format(r.Cells.Count, "#,###")

Result:

Range Time = 2.356 sec
Range Count = 1,000,000

So event with a single simple cell reference, the loop is an order of magnitude slower, and whats more, the range loop is twice as fast!

So, conclusion is what matters most is what you do inside the loop, and if speed really matters, test all the options

FWIW, tested on Excel 2010 32 bit, Win7 64 bit All tests with

  • ScreenUpdating off,
  • Calulation manual,
  • Events disabled.