Grouping labels and concatenating their text values (like a pivot table)
I've got a spreadsheet with data like this:
Product | Attribute ----------+---------- Product A | Cyan Product B | Cyan Product C | Cyan Product A | Magenta Product C | Magenta Product B | Yellow Product C | Yellow Product A | Black Product B | Black
What I'd like to do group everything by Column A and have Column B be a comma-delimited list of values that share Column A in common, like so:
Product | Attribute ----------+-------------------------- Product A | Cyan,Magenta,Black Product B | Cyan,Yellow,Black Product C | Cyan,Magenta,Yellow,Black
Unfortunately, Pivot Tables only know how to work with number values, and the furthest it goes towards this is counting the number of times Column A occurs.
I was able to pull this off ultimately by importing the data into a MySQL database and using MySQL's GROUP_CONCAT(Attribute)
function in a query with a GROUP BY Product
clause, but after banging my head on my desk repeatedly while attempting to figure out an Excel solution.
For future reference, is this possible in Excel without macros? Whether it is or not, how would one pull this off?
Solution 1:
| A | B
---+-----------+-----------
1 | PRODUCT | ATTRIBUTE
2 | Product A | Cyan
3 | Product B | Cyan
4 | Product C | Cyan
5 | Product A | Magenta
6 | Product C | Magenta
7 | Product B | Yellow
8 | Product C | Yellow
9 | Product A | Black
10 | Product B | Black
Assuming row 1:1 is header row.
Sort by column A to group by product
-
Prepare data in comma-separated format in column C by entering into C2 the following formula and copy down to C3:C10.
=IF(A2<>A1, B2, C1 & "," & B2)
Identify useful rows by entering into D2
=A2<>A3
and copy down to D3:D10.-
Copy column C:D, then paste special as value (AltE-S-V-Enter). You will now get:
Product A Cyan Cyan FALSE Product A Magenta Cyan,Magenta FALSE Product A Black Cyan,Magenta,Black TRUE Product B Cyan Cyan FALSE Product B Yellow Cyan,Yellow FALSE Product B Black Cyan,Yellow,Black TRUE Product C Cyan Cyan FALSE Product C Magenta Cyan,Magenta FALSE Product C Yellow Cyan,Magenta,Yellow TRUE
Remove useless rows by filtering
FALSE
in column D with AutoFilter, then delete those rows.Finish. Column A & C is what you need.
Solution 2:
I know it is an old post but I had this challenge today. I used the PowerQuery add-in from Microsoft (NOTE: it is built into Excel 2016 by default).
- Select your table
- Under the POWER QUERY tab (or DATA in 2016), select "From Table"
- Click on the "Product" column
- under the Transform tab, select "Group By"
- On the View tab, make sure "Formula Bar" is checked
-
Change the formula
FROM:
= Table.Group(#"Changed Type", {"Product"}, {{"Count", each Table.RowCount(_), type number}})
TO:
= Table.Group(#"Changed Type", {"Product"}, {{"Attributes", each Text.Combine([Attribute], ", "), type text}})
Step 6 is leveraging the Power Query (M) Formulas to perform data manipulations not exposed through the basic operations provided in the UI. Microsoft has a full reference available online for all the advanced functions available in Power Query.
Solution 3:
Here are a couple of approaches, both "non-macro"...
With a small data set, after first sorting it by product (similar to GROUP BY Product), you could first copy the "Product" column, paste it elsewhere, then remove duplicates. Next, copy the "Attributes" for each product and "paste special, TRANSPOSE" next to each Product. Then concatenate a comma with each of your transposed attributes in a final results column. Admittedly all this "copy/paste special/transpose" would get old quickly if you have a long list of Products.
If you have lots of data, using a few formulas you can work your way to the final result, as shown below. The formulas in F2, G2, H2, I2 and N2 are indicated by the blue arrows. Copy those to the rows below as needed. Note that J2:L2 use the same formula as I2. Also, the F2 formula refers to a named range "Products" that spans the range A:A .