Transform multiple rows into columns with unique key
I am quite new to access so i was wondering if anyone could help me with the following requirement. Im trying to create a query to extract tables values combine them into the same row grouped at an unique identifier (ID).
I've looked at trying to use crosstab queries but they seem to more for performing calculations and also using a PIVOT function but also it doesn't seem to meet my requirement.
Any help really appreciated. Please see below re example of data.
My Data has the following Format
ID EMAIL COMMENTS
1 [email protected] <TEXT>
1 [email protected] <TEXT>
2 [email protected] <TEXT>
2 [email protected] <TEXT>
2 [email protected] <TEXT>
Desired Output
ID COL1 COL2 COL3 COL4 COLN COLN+1
1 [email protected] <TEXT> [email protected] <TEXT>
2 [email protected] <TEXT> [email protected] <TEXT> [email protected] <TEXT>
Solution 1:
Indoors today waiting out the hurricane, so I figured I'd create this custom solution. Answers for all of these steps are found elsewhere, but it was not simple to sort through all of the contrived solutions so I hope this answer is more useful all around.
The basic answer for changing the rows to columns is here. But unlike the this question's data, the sample data for that answer was already properly sequenced. As long as the column values [ID] and [Email] form unique pairs and have NO NULL values, then its possible to generate the proper sequencing using an aggregate subquery or a call to an Access aggregate function. To reduce the overall number of queries, I went ahead and generated the transformed column names in the same query. (If the values are not unique or have null values, the sequencing will be off and the final results will be missing some data.)
The second challenge here is that there are two columns that need to be transformed, but the Access SQL Transform statement (i.e. the Crosstab query) only allows one transform column per query. Creating two separate queries and then joining them is rather simple, but since the Crosstab queries generate a dynamic (i.e. undetermined) number of columns, it is not really possible to interleave the Email and Comments columns without manual effort of picking each column in order. Further, explicitly specifying which columns are in the query destroys the dynamic aspect of the crosstab queries and will either leave additional column out, or it will generate an error if the overall number of columns is reduced.
UPDATE: Just after posting the original solution (now labeled Solution 2), I realized that I could resolve the column interleaving problem by actually going backward one step... first generate more rows --one row for each email address and a separate row for each comment-- before the final transform placing them on the same row altogether again.
Solution 1
Save the following query and name it [Sequenced]. To facilitate the correct column ordering in the final output, I used the word 'Remark' instead of 'Comment' since it sorts after 'Email':
SELECT Data.ID, Data.Email, Data.Comments,
1 + DCount("[ID]","Data","[ID]=" & [ID] & " and [Email]<'" & Replace([Email],"'","''") & "'") AS SeqNum,
Format([SeqNum],"000") & ' Email' AS EmailColumn,
Format([SeqNum],"000") & ' Remark' AS CommentsColumn
FROM Data
ORDER BY Data.ID, Data.Email;
Save the following query and name it [Backwards]:
SELECT ID, EmailColumn AS ColumnName, Email AS [Value]
FROM Sequenced
UNION SELECT ID, CommentsColumn AS ColumnName, Comments AS [Value]
FROM Sequenced
ORDER BY [ID], [ColumnName];
Save the following query and name it [InterleavedCrosstab]:
TRANSFORM First(Backwards.Value) AS FirstOfValue
SELECT Backwards.ID
FROM Backwards
GROUP BY Backwards.ID
ORDER BY Backwards.ID, Backwards.ColumnName
PIVOT Backwards.ColumnName;
Solution 2
Save the following query and name it [Sequenced2]:
SELECT Data.ID, Data.Email, Data.Comments,
1 + DCount("[ID]","Data","[ID]=" & [ID] & " and [Email]<'" & Replace([Email],"'","''") & "'") AS SeqNum,
'Email' & Format([SeqNum],"000") AS EmailColumn,
'Comments' & Format([SeqNum],"000") AS CommentsColumn
FROM Data
ORDER BY Data.ID, Data.Email;
Save the following query and name it [EmailCrosstab]:
TRANSFORM First(Sequenced2.Email) AS FirstOfEmail
SELECT Sequenced2.ID
FROM Sequenced2
GROUP BY Sequenced2.ID
ORDER BY Sequenced2.ID
PIVOT Sequenced2.EmailColumn;
Save the following query and name it [CommentsCrosstab]:
TRANSFORM First(Sequenced2.Comments) AS FirstOfComments
SELECT Sequenced2.ID
FROM Sequenced2
GROUP BY Sequenced2.ID
ORDER BY Sequenced2.ID
PIVOT Sequenced2.CommentsColumn;
Finally, the most general result query will return ALL columns, but they will not be interleaved and there will duplicate [ID] columns:
SELECT EmailCrosstab.*,
CommentsCrosstab.*
FROM CommentsCrosstab INNER JOIN EmailCrosstab
ON CommentsCrosstab.ID = EmailCrosstab.ID;
Here's a prettified version but only has up to 3 email and comment columns:
SELECT EmailCrosstab.ID,
EmailCrosstab.Email001,CommentsCrosstab.Comments001,
EmailCrosstab.Email002,CommentsCrosstab.Comments002,
EmailCrosstab.Email003,CommentsCrosstab.Comments003
FROM CommentsCrosstab INNER JOIN EmailCrosstab
ON CommentsCrosstab.ID = EmailCrosstab.ID;
Solution 3
I had already typed up the following VBA procedure when I realized the query-only solution was rather easy, so here's a bonus alternative.
Public Sub CustomTransform()
'* This code assumes that the field values
'* [ID] and [Email] constitute a unique pair
'* and that there are NO NULL values.
Dim i As Integer, MaxIDRows As Integer
Dim IDSeq As Integer
Dim lastID As Long
Dim IDstring As String
Dim tbl As TableDef
Dim idx As Index
Dim db As Database
Dim rsSrc As Recordset2, rsResult As Recordset2
Const resultTable As String = "Customer Crosstab"
Set db = CurrentDb
MaxIDRows = db.OpenRecordset( _
"SELECT Max(Counter.Rows) AS MaxRows" & _
" FROM ( SELECT Count(Data.[ID]) AS [Rows]" & _
" FROM Data GROUP BY Data.[ID]) AS Counter" _
).Fields(0).Value
'* Column count <= 254 : ID + N * (Email + Comment columns)
If MaxIDRows = 0 Then
MsgBox "No data.", vbOKOnly Or vbExclamation, "No Data"
Exit Sub
ElseIf MaxIDRows >= 252 / 2 Then
MsgBox "Maximum number of columns exceeded.", _
vbOKOnly Or vbExclamation, "Maximum Columns Exceeded"
Exit Sub
End If
On Error Resume Next
db.TableDefs.Delete resultTable
Err.Clear
On Error GoTo 0
Set tbl = db.CreateTableDef(resultTable)
With tbl
' Create fields and append them to the new TableDef
' object. This must be done before appending the
' TableDef object to the TableDefs collection of the
' Northwind database.
.Fields.Append .CreateField("ID", dbLong)
For i = 1 To MaxIDRows
IDstring = Format(i, "000")
.Fields.Append .CreateField("Email" & IDstring, dbText, 255)
.Fields.Append .CreateField("Comments" & IDstring, dbText, 255)
Next
Set idx = .CreateIndex("Primary Key")
idx.Fields.Append idx.CreateField("ID")
idx.Primary = True
.Indexes.Append idx
End With
db.TableDefs.Append tbl
Set rsResult = db.OpenRecordset(resultTable, dbOpenTable)
Set rsSrc = db.OpenRecordset( _
"SELECT ID, Email, Comments" & _
" FROM Data" & _
" ORDER BY ID, Email")
lastID = -1
Do Until rsSrc.EOF
If rsSrc!id <> lastID Then
If lastID <> -1 Then
rsResult.Update
End If
IDSeq = 0
rsResult.AddNew
rsResult!id = rsSrc!id
End If
lastID = rsSrc!id
IDSeq = IDSeq + 1
IDstring = Format(IDSeq, "000")
rsResult.Fields("Email" & IDstring) = rsSrc!email
rsResult.Fields("Comments" & IDstring) = rsSrc!Comments
rsSrc.MoveNext
Loop
rsSrc.Close
If rsResult.EditMode <> dbEditNone Then
rsResult.Update
End If
rsResult.Close
Debug.Print "CustomTransform Done"
End Sub