How to use Power Query to import multiple CSV files into a single Excel worksheet as worksheet columns?
I have a recent version of Excel (but very little experience with Excel). From a folder containing many .csv
files, I want to import a few of the files as columns to a single Excel worksheet. I figured out how select the desired files by filename with a Power Query, and each file has the same number of rows. (They each also happen to have a single column, but I welcome a multicolumn solution.) A web search revealed that a Power Query could append rows from each file to produce a single column in a combined sheet, but that is not what I want. I also do not want to combine the columns into a summary column. As an example, if I have three .csv
files after filtering in the Power Query editor, I would like to import these into a single sheet with three columns (one for each file). Ideally the result would be a single table, but that is of secondary importance.
How to proceed?
To do this in Power Query, you will need some custom code.
I assume you are selecting the files from the Data => From Folders
dialog,
The code below starts at that point, and then adds code that runs after you have filtered the table to select the requisite files.
- Read in the csv data into a column of tables
- Expand the tables column which puts all the data into a single column.
- Then, the magic: use a custom function to Pivot the data using the file name
- Note that there are various arguments for the
Csv.Document
function which I have left out, not knowing anything about the data that you are accessing except that it is a csv document.
Read the code comments and also the reference in the custom function for more info.
M Code
let
Source = Folder.Files("C:\Users\ron\Desktop\PQ Test"),
// Select (filter) the particular files you want, then
//Remove all except the "link" and file name columns
#"Removed Other Columns" = Table.SelectColumns(Source,{"Content", "Name"}),
//convert each csv to a Table
csvTables = Table.AddColumn(#"Removed Other Columns", "csv", each Csv.Document([Content])),
//remove the column of links
remLinks = Table.RemoveColumns(csvTables,{"Content"}),
//Expand the tables into a single column
#"Expanded csv" = Table.ExpandTableColumn(remLinks, "csv", {"Column1"}, {"Column1"}),
//Pivot on the file name
pivot = fnPivotAll(#"Expanded csv","Name", "Column1")
in
pivot
Custom Function M Code reNAME this query: fnPivotAll
//credit: Cam Wallace https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/
(Source as table,
ColToPivot as text,
ColForValues as text)=>
let
PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
#"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
TableFromRecordOfLists = (rec as record, fieldnames as list) =>
let
PartialRecord = Record.SelectFields(rec,fieldnames),
RecordToList = Record.ToList(PartialRecord),
Table = Table.FromColumns(RecordToList,fieldnames)
in
Table,
#"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
#"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
#"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
#"Expanded Values"
Results
note that column headers are file names
The simplest solution is the manual copy-and-paste of columns, for the case where there are only a few files.
For other cases, the solution would be to use a VBA macro, using the old Visual Basic for Applications. For more information see the article Create or run a macro.
I have found two rather similar VBA macros for joining CSV files horizontally, all found within the same folder, written for a variable number of columns.
I list the macros below, but I have not tested any of them.
Macro 1
This comes from the post Import multiple csv files in one excel spread sheet - include file name in the first row. The macro here also renames the columns.
Sub GetFromCSVs()
Dim WB As Workbook
Dim R As Range
Dim bFirst As Boolean
Dim stFile As String
Dim stPath As String
stPath = "C:\Temp\CSVs\" ' change the path to suit
stPath = "C:\Users\gebruiker\Documents\Excel\"
stFile = Dir(stPath & "*.csv")
bFirst = True
Set R = Workbooks.Add(xlWorksheet).Sheets(1).Range("A1")
Do Until stFile = ""
Set WB = Workbooks.Open(stPath & stFile, ReadOnly:=True)
If bFirst Then
WB.Sheets(1).Range("A1").CurrentRegion.Columns(1).Copy Destination:=R.Offset(1)
Set R = R.Offset(, 1)
bFirst = False
End If
R.Value = Left(stFile, Len(stFile) - 4)
WB.Sheets(1).Range("A1").CurrentRegion.Columns(2).Copy Destination:=R.Offset(1)
Set R = R.Offset(, 1)
WB.Close SaveChanges:=False
stFile = Dir ' next file
Loop
End Sub
Macro 2
This macro comes from the article How to import multiple csv files in one excel spread sheet.
Sub GetFromCSVs()
Dim WB As Workbook
Dim R As Range
Dim bFirst As Boolean
Dim stFile As String
Dim stPath As String
stPath = "C:\Temp\CSVs\" ' change the path to suit
stFile = Dir(stPath & "*.csv")
bFirst = True
Set R = Workbooks.Add(xlWorksheet).Sheets(1).Range("A1")
Do Until stFile = ""
Set WB = Workbooks.Open(stPath & stFile, ReadOnly:=True)
If bFirst Then
WB.Sheets(1).Range("A1").CurrentRegion.Copy Destination:=R
Set R = R.Offset(, 2)
bFirst = False
Else
WB.Sheets(1).Range("A1").CurrentRegion.Columns(2).Copy Destination:=R
Set R = R.Offset(, 1)
End If
WB.Close saveChanges:=False
stFile = Dir() ' next file
Loop
End Sub