Excel prefixes or suffixes

I have a column in Excel that contains a comma delimited list:

Header
1, 61
61
1, 61, 161
5, 55

I would like to extract this data so that I can count the occurrences of each item so that I would have the following results:

Count of Items
1    |    2
5    |    1
55   |    1
61   |    3
161  |    1

I tried countif with "*" but its a mess because in this case I have prefixes or suffixes (1,61,161)

PLEASE HELP!


Option 1:

I would like to suggest UDF (User Defined Function) which not only counts occurrence of comma separated numbers but counts text also.

enter image description here

How it works:

  • Press Alt+F11 to get VB editor then Copy & Paste this code as Module.

    Option Explicit
    
    Function CountOccurrence(SearchRange As Range, Phrase As String) As Long
    
    Dim RE As Object, MC As Object
    Dim sPat As String
    Dim V As Variant
    Dim I As Long, J As Long
    
    V = SearchRange
    
    Set RE = CreateObject("vbscript.regexp")
    With RE
    .Global = True
    .MultiLine = True
    .ignorecase = True
    .Pattern = "(?:^|,\s*)" & Phrase & "(?:\s*,|$)"
    End With
    
    For I = 1 To UBound(V, 1)
    If RE.test(V(I, 1)) Then J = J + 1
    Next I
    
    CountOccurrence = J
    
    End Function
    
  • Enter Criteria in Range H18:H26, then enter this Formula in Cell I18 & fill it down.

=CountOccurrence($G$18:$G$24,H18)

Option 2:

Enter this Formula in Cell I18 & fill it down.

=SUMPRODUCT(--ISNUMBER(FIND(H18,$G$18:$G$24)))

Adjust cell references as needed.


  1. First you need to split your numbers into separate cells:
    • select your data
    • on data tab select "split to columns"
    • select "delimited", next
    • select data is separated by "comma", finish
  2. now you can use COUNTIF, e.g.
    =COUNTIF($A$2:$C$5,E2)

enter image description here


It looks like you already have a solution, but I'll toss in a non-VBA solution that handles dynamic data. It uses some helper columns that you can prepopulate to an arbitrarily large range. When there is no associated data, the cells will be blank. Some of helper columns could be eliminated; they're included to minimize repetition, but all of the helper columns can be hidden.

enter image description here

Your data is in column A. Column B determines the number of values in each entry based on the number of commas. The formula in B3 is:

=IF(ISBLANK(A3),"",LEN(A3)-LEN(SUBSTITUTE(A3,",",""))+1)

Column C is just the cumulative component count from column B. C2 is entered as 0. The formula in C3 is:

=IF(ISBLANK(A3),"",SUM(B3:B$3))

Populate columns B and C to as many rows as you might ever have data for. You can always extend those columns if needed.

Column E is just for convenience. It provides an index for the parsed values. You could hard-code the 1 and then add 1 to it for each successive row. For no good reason, I based it on the row number (the -2 in the formula below is to adjust the values to start at 1). Cells beyond the total number of values get a blank. My formula in E3:

=IF(ROW()-2>MAX($C$2:$C$10),"",ROW()-2)

Column F is just to avoid formula repetition. It pulls the relevant Column A entry from which the current component will be parsed. The formula in F3 is:

=IFERROR(OFFSET($A$2,MATCH(E3-1,$C$2:$C$10,1),0),"")

It finds the appropriate entry by comparing the component number in column E with the cumulative component count in column C.

Column G is the parsed component values, all in a single contiguous column that is easy to work with. The formula in G3 is:

=IFERROR(TRIM(MID(SUBSTITUTE(F3,",",REPT(" ",LEN(F3))),(E3-INDEX($C$2:$C$10,MATCH(E3-1,$C$2:$C$10,1))-1)*LEN(F3)+1,LEN(F3))),"")

This determines which element to parse from the column F entry by subtracting the cumulative element count of the last "completed" input record from the current element number.

Columns E through G should be propagated to enough rows to cover the anticipated number of component values (at least several times the number of data rows). Note that all the formulas above that refer to the range $C$2:$C$10 should be adjusted to include the full range of your data.

Now that you have all of the parsed elements in a nice column, there are a variety of ways to aggregate them and get the counts. I used a pivot table, which also gives you the list of unique values at the same time.

Select the full pre-populated range of column G for the pivot table. Use that field for the rows window and the values window (select count as the aggregation). The range will include blanks in the unused rows, so use the built-in filter to deselect blanks.

When the data changes, just refresh the pivot table and verify that any new element values are selected in the filter.