Excel formula to prefix a column based on another column?

Is there a way to do so in Excel?

My Excel:

enter image description here

My desired output:

enter image description here


=TEXTJOIN(";",TRUE,"AVS"&LET(list,FILTERXML("<a><z>"&SUBSTITUTE(B1,CHAR(10),"</z><z>")&"</z></a>","//z"),TRIM(LEFT(list,IFERROR(FIND(" ",list),LEN(list))))))

This will take the text in B1, separate it by line breaks, pull all the text before the first space on each line, add AVS to the beginning, and concatenate them all with semicolons. Add this formula to A1 and copy / paste down.

  • FILTERXML("<a><z>"&SUBSTITUTE(B1,CHAR(10),"</z><z>")&"</z></a>","//z") is where most of the magic happens. The function is not explicitly meant for this purpose but you can use it to turn a delimited string into an array instead. This lets us split up the string into individual cells. All the XML tags like <a> and <z> are to make the string look like XML. This is why we used </z><z> earlier. (Reference)
  • LET(list,FILTERXML(~) defines list to be equal to the array of text we get out of the function above. It lets us reference it later by this name instead of pasting the entire function each time.
  • IFERROR(FIND(" ",list),LEN(list)) finds the first space in each item in the list. This will error if space isn't found at all, so we return the entire string in that case. If you just want the first 5 characters no matter want, you could replace this section with just the number 5 instead.
  • TRIM(LEFT(list,IFERROR(~))) pulls the first however many characters and then trims off any white space. You could account for this trimming inside the IFERROR(~) by subtracting one but, in the edge case where there is no space found, that would make the function return all but the last character in the string.
  • LET(list,FILTERXML(~),TRIM(~)), then, returns a neat list of just the first however many characters from each line of the original text.
  • TEXTJOIN(";",TRUE,"AVS"&LET(~)) adds "AVS" to the beginning of each of the items in that neat list and then combines the whole list into one string with a semicolon between each item.

Screenshot


Here's a way to do it with PowerQuery. It may seem like a long post, but I like to add a lot of images. It's point and click mostly and doesn't involve formulas, so it will handle large datasets more easily. As the source data grows, all you will need to do is right-click and refresh the query from the workbook. You won't need to repeat these steps nor expand any ranges in formulas.

Add a header to your data, then put your cursor in your data and use Data>Get & Transform Data>From Table/Range.

When the PowerQuery Editor opens, you'll see this:

enter image description here

Use Add Column>Index Column>From 1, so that you see this:

enter image description here

Now expand the Queries pane on the left of the Power Query Editor, right-click the query and choose 'Duplicate', so that you have two versions of the same query.

In the first query, do the following:

Now right-click the column header and use Split Column>By Delimiter, configured like this:

enter image description here

You'll see this:

enter image description here

Now use Add Column>Custom Column, configured like this:

enter image description here

You'll see this:

enter image description here

Now use Home>Group By, configured like this:

enter image description here

You'll see this:

enter image description here

This is because we tried to sum some text.

Go to Home>Advanced Editor. Change this:

#"Grouped Rows" = Table.Group(#"Added Custom", {"Index"}, {{"unique_ids", each List.Sum([unique_id]), type text}})

To this:

#"Grouped Rows" = Table.Group(#"Added Custom", {"Index"}, {{"unique_ids", each Text.Combine( [unique_id] , ";"), type text}})

Note that List.Sum was changed to Text.Combine.

You'll see this:

enter image description here

Now use Home>Merge Queries, configured like this:

enter image description here

Click the double-arrow at the top of the 'Table' column and configure like this:

enter image description here

You'll see this:

enter image description here

Now you can either remove the index or not, then use Home>Close & Load to get your results back into the workbook.

Here's the full query from the advanced editor for the first query:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 1, 1, Int64.Type),
    #"Changed Type" = Table.TransformColumnTypes(#"Added Index",{{"orig", type text}}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"orig", Splitter.SplitTextByDelimiter("#(lf)", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "orig"),
    #"Added Custom" = Table.AddColumn(#"Split Column by Delimiter", "unique_id", each "AVS" & Text.Start([orig],5)),
    #"Grouped Rows" = Table.Group(#"Added Custom", {"Index"}, {{"unique_ids", each Text.Combine( [unique_id] , ";"), type text}}),
    #"Merged Queries" = Table.NestedJoin(#"Grouped Rows", {"Index"}, #"Table1 (2)", {"Index"}, "Table1 (2)", JoinKind.Inner),
    #"Expanded Table1 (2)" = Table.ExpandTableColumn(#"Merged Queries", "Table1 (2)", {"orig"}, {"orig"})
in
    #"Expanded Table1 (2)"

And for the second query:

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 1, 1, Int64.Type),
    #"Changed Type" = Table.TransformColumnTypes(#"Added Index",{{"orig", type text}})
in
    #"Changed Type"