Load csv file into a VBA array rather than Excel Sheet

Solution 1:

Okay, looks like you need two things: stream the data from the file, and populate a 2-D array.

I have a 'Join2d' and a 'Split2d' function lying around (I recall posting them in another reply on StackOverflow a while ago). Do look at the comments in the code, there are things you might need to know about efficient string-handling if you're handling large files.

However, it's not a complicated function to use: just paste the code if you're in a hurry.

Streaming the file is simple BUT we're making assumptions about the file format: are the lines in the file delimited by Carriage-Return characters or the Carriage-Return-and-Linefeed character pair? I'm assuming 'CR' rather than CRLF, but you need to check that.

Another assumption about the format is that numeric data will appear as-is, and string or character data will be encapsulated in quote marks. This should be true, but often isn't... And stripping out the quote marks adds a lot of processing - lots of allocating and deallocating strings - which you really don't want to be doing in a large array. I've short-cut the obvious cell-by-cell find-and-replace, but it's still an issue on large files.

If your file has commas embedded in the string values, this code won't work: and don't try to code up a parser that picks out the encapsulated text and skips these embedded commas when splitting-up the rows of data into individual fields, because this intensive string-handling can't be optimised into a fast and reliable csv reader by VBA.

Anyway: here's the source code: watch out for line-breaks inserted by StackOverflow's textbox control:

Running the code:

Note that you'll need a reference to the Microsoft Scripting Runtime (system32\scrrun32.dll)

Private Sub test()
    Dim arrX As Variant
    arrX = ArrayFromCSVfile("MyFile.csv")
End Sub

Streaming a csv file.

Note that I'm assuming your file is in the temp folder: C:\Documents and Settings[$USERNAME]\Local Settings\Temp You'll need to use filesystem commands to copy the file into a local folder: it's always quicker than working across the network.

Public Function ArrayFromCSVfile( _
    strName As String, _
    Optional RowDelimiter As String = vbCr, _
    Optional FieldDelimiter = ",", _
    Optional RemoveQuotes As Boolean = True _
) As Variant

    ' Load a file created by FileToArray into a 2-dimensional array
    ' The file name is specified by strName, and it is exected to exist
    ' in the user's temporary folder. This is a deliberate restriction:
    ' it's always faster to copy remote files to a local drive than to
    ' edit them across the network

    ' RemoveQuotes=TRUE strips out the double-quote marks (Char 34) that
    ' encapsulate strings in most csv files.

    On Error Resume Next

    Dim objFSO As Scripting.FileSystemObject
    Dim arrData As Variant
    Dim strFile As String
    Dim strTemp As String

    Set objFSO = New Scripting.FileSystemObject
    strTemp = objFSO.GetSpecialFolder(Scripting.TemporaryFolder).ShortPath
    strFile = objFSO.BuildPath(strTemp, strName)
    If Not objFSO.FileExists(strFile) Then  ' raise an error?
        Exit Function
    End If

    Application.StatusBar = "Reading the file... (" & strName & ")"

    If Not RemoveQuotes Then
        arrData = Join2d(objFSO.OpenTextFile(strFile, ForReading).ReadAll, RowDelimiter, FieldDelimiter)
        Application.StatusBar = "Reading the file... Done"
    Else
        ' we have to do some allocation here...

        strTemp = objFSO.OpenTextFile(strFile, ForReading).ReadAll
        Application.StatusBar = "Reading the file... Done"

        Application.StatusBar = "Parsing the file..."

        strTemp = Replace$(strTemp, Chr(34) & RowDelimiter, RowDelimiter)
        strTemp = Replace$(strTemp, RowDelimiter & Chr(34), RowDelimiter)
        strTemp = Replace$(strTemp, Chr(34) & FieldDelimiter, FieldDelimiter)
        strTemp = Replace$(strTemp, FieldDelimiter & Chr(34), FieldDelimiter)

        If Right$(strTemp, Len(strTemp)) = Chr(34) Then
            strTemp = Left$(strTemp, Len(strTemp) - 1)
        End If

        If Left$(strTemp, 1) = Chr(34) Then
            strTemp = Right$(strTemp, Len(strTemp) - 1)
        End If

        Application.StatusBar = "Parsing the file... Done"
        arrData = Split2d(strTemp, RowDelimiter, FieldDelimiter)
        strTemp = ""
    End If

    Application.StatusBar = False

    Set objFSO = Nothing
    ArrayFromCSVfile = arrData
    Erase arrData
End Function

Split2d Creates a 2-dimensional VBA array from a string:

Public Function Split2d(ByRef strInput As String, _
    Optional RowDelimiter As String = vbCr, _
    Optional FieldDelimiter = vbTab, _
    Optional CoerceLowerBound As Long = 0 _
    ) As Variant

    ' Split up a string into a 2-dimensional array.

    ' Works like VBA.Strings.Split, for a 2-dimensional array.
    ' Check your lower bounds on return: never assume that any array in
    ' VBA is zero-based, even if you've set Option Base 0
    ' If in doubt, coerce the lower bounds to 0 or 1 by setting
    ' CoerceLowerBound
    ' Note that the default delimiters are those inserted into the
    '  string returned by ADODB.Recordset.GetString

    On Error Resume Next

    ' Coding note: we're not doing any string-handling in VBA.Strings -
    ' allocating, deallocating and (especially!) concatenating are SLOW.
    ' We're using the VBA Join & Split functions ONLY. The VBA Join,
    ' Split, & Replace functions are linked directly to fast (by VBA
    ' standards) functions in the native Windows code. Feel free to
    ' optimise further by declaring and using the Kernel string functions
    ' if you want to.

    ' ** THIS CODE IS IN THE PUBLIC DOMAIN **
    '    Nigel Heffernan   Excellerando.Blogspot.com

    Dim i   As Long
    Dim j   As Long

    Dim i_n As Long
    Dim j_n As Long

    Dim i_lBound As Long
    Dim i_uBound As Long
    Dim j_lBound As Long
    Dim j_uBound As Long

    Dim arrTemp1 As Variant
    Dim arrTemp2 As Variant

    arrTemp1 = Split(strInput, RowDelimiter)

    i_lBound = LBound(arrTemp1)
    i_uBound = UBound(arrTemp1)

    If VBA.LenB(arrTemp1(i_uBound)) <= 0 Then
        ' clip out empty last row: a common artifact in data
         'loaded from files with a terminating row delimiter
        i_uBound = i_uBound - 1
    End If

    i = i_lBound
    arrTemp2 = Split(arrTemp1(i), FieldDelimiter)

    j_lBound = LBound(arrTemp2)
    j_uBound = UBound(arrTemp2)

    If VBA.LenB(arrTemp2(j_uBound)) <= 0 Then
     ' ! potential error: first row with an empty last field...
        j_uBound = j_uBound - 1
    End If

    i_n = CoerceLowerBound - i_lBound
    j_n = CoerceLowerBound - j_lBound

    ReDim arrData(i_lBound + i_n To i_uBound + i_n, j_lBound + j_n To j_uBound + j_n)

    ' As we've got the first row already... populate it
    ' here, and start the main loop from lbound+1

    For j = j_lBound To j_uBound
        arrData(i_lBound + i_n, j + j_n) = arrTemp2(j)
    Next j

    For i = i_lBound + 1 To i_uBound Step 1

        arrTemp2 = Split(arrTemp1(i), FieldDelimiter)

        For j = j_lBound To j_uBound Step 1
            arrData(i + i_n, j + j_n) = arrTemp2(j)
        Next j

        Erase arrTemp2

    Next i

    Erase arrTemp1

    Application.StatusBar = False

    Split2d = arrData

End Function

Join2D Turns a 2-dimensional VBA array to a string:

Public Function Join2d(ByRef InputArray As Variant, _
    Optional RowDelimiter As String = vbCr, _
    Optional FieldDelimiter = vbTab, _
    Optional SkipBlankRows As Boolean = False _
    ) As String

    ' Join up a 2-dimensional array into a string. Works like the standard
    '  VBA.Strings.Join, for a 2-dimensional array.
    ' Note that the default delimiters are those inserted into the string
    '  returned by ADODB.Recordset.GetString

    On Error Resume Next

    ' Coding note: we're not doing any string-handling in VBA.Strings -
    ' allocating, deallocating and (especially!) concatenating are SLOW.
    ' We're using the VBA Join & Split functions ONLY. The VBA Join,
    ' Split, & Replace functions are linked directly to fast (by VBA
    ' standards) functions in the native Windows code. Feel free to
    ' optimise further by declaring and using the Kernel string functions
    ' if you want to.

    ' ** THIS CODE IS IN THE PUBLIC DOMAIN **
    '   Nigel Heffernan   Excellerando.Blogspot.com

    Dim i As Long
    Dim j As Long

    Dim i_lBound As Long
    Dim i_uBound As Long
    Dim j_lBound As Long
    Dim j_uBound As Long

    Dim arrTemp1() As String
    Dim arrTemp2() As String

    Dim strBlankRow As String

    i_lBound = LBound(InputArray, 1)
    i_uBound = UBound(InputArray, 1)

    j_lBound = LBound(InputArray, 2)
    j_uBound = UBound(InputArray, 2)

    ReDim arrTemp1(i_lBound To i_uBound)
    ReDim arrTemp2(j_lBound To j_uBound)

    For i = i_lBound To i_uBound

        For j = j_lBound To j_uBound
            arrTemp2(j) = InputArray(i, j)
        Next j

        arrTemp1(i) = Join(arrTemp2, FieldDelimiter)

    Next i

    If SkipBlankRows Then

        If Len(FieldDelimiter) = 1 Then
            strBlankRow = String(j_uBound - j_lBound, FieldDelimiter)
        Else
            For j = j_lBound To j_uBound
                strBlankRow = strBlankRow & FieldDelimiter
            Next j
        End If

        Join2d = Replace(Join(arrTemp1, RowDelimiter), strBlankRow, RowDelimiter, "")
        i = Len(strBlankRow & RowDelimiter)

        If Left(Join2d, i) = strBlankRow & RowDelimiter Then
            Mid$(Join2d, 1, i) = ""
        End If

    Else

        Join2d = Join(arrTemp1, RowDelimiter)

    End If

    Erase arrTemp1

End Function

Share and enjoy.

Solution 2:

Yes read it as a text file.

See this example

Option Explicit

Sub Sample()
    Dim MyData As String, strData() As String

    Open "C:\MyFile.CSV" For Binary As #1
    MyData = Space$(LOF(1))
    Get #1, , MyData
    Close #1
    strData() = Split(MyData, vbCrLf)
End Sub

FOLLOWUP

Like I mentioned below in the comments, AFAIK, there is no direct way of filling a 2d Array from a csv. You will have to use the code that I gave above and then split it per line and finally filling up a 2D array which can be cumbersome. Filling up a column is easy but if you specifically want say from Row 5 to Col 7 Data then it becomes cumbersome as you will have to check if there are sufficient columns/rows in the data. Here is a basic example to get Col B in a 2D Array.

NOTE: I have not done any error handling. I am sure you can take care of that.

Let's say our CSV File looks likes this.

enter image description here

When you run this code

Option Explicit

Const Delim As String = ","

Sub Sample()
    Dim MyData As String, strData() As String, TmpAr() As String
    Dim TwoDArray() As String
    Dim i As Long, n As Long

    Open "C:\Users\Siddharth Rout\Desktop\Sample.CSV" For Binary As #1
    MyData = Space$(LOF(1))
    Get #1, , MyData
    Close #1
    strData() = Split(MyData, vbCrLf)

    n = 0

    For i = LBound(strData) To UBound(strData)
        If Len(Trim(strData(i))) <> 0 Then
            TmpAr = Split(strData(i), Delim)
            n = n + 1
            ReDim Preserve TwoDArray(1, 1 To n)
            '~~> TmpAr(1) : 1 for Col B, 0 would be A
            TwoDArray(1, n) = TmpAr(1)
        End If
    Next i

    For i = 1 To n
        Debug.Print TwoDArray(1, i)
    Next i
End Sub

You will get the output as shown below

enter image description here

BTW, I am curious that since you are doing this in Excel, why not use inbuilt Workbooks.Open or QueryTables method and then read the range into a 2D array? That would be much simpler...