Excel Help - Is it possible to return the column number of the 2nd match within a horizontal array?
I'm trying to return the column number that corresponds to the 2nd occurrence of zero in the following table:
B C D E F G
2 0 0 3 4 9
For example, the desired formula will return a 3 representing column D.
If that is not possible to return the column number within the array then is it possible to return a 4 instead (column number within the worksheet)?
I'm sure somebody will have a much neater way to do this, but here's a quick answer I've hacked together.
It requires you to specify the range (B1:G1 in your example) in some MATCH
formulas, and also to specify that range again in R1C1 format for use in the INDIRECT
function.
=MATCH(0,INDIRECT("R1C"&MATCH(0,B1:G1,0)+1&":R1C7",0),0)+MATCH(0,B1:G1,0)
Walkthrough
We start by finding the column holding your first matched value, that's this bit of the formula: MATCH(0,B1:G1,0)
.
Then once we know that the first match is in column C, we can run another MATCH
that looks in the range to the right of Column C - ie Column D onwards. That's what the INDIRECT
is doing. It uses R1C1 referencing so that it can take your first MATCH
as an input: INDIRECT("R1C"&MATCH(0,B1:G1,0)+1&":R1C7",0)
.
So then we run our second MATCH
like this: MATCH(0,INDIRECT("R1C"&MATCH(0,B1:G1,0)+1&":R1C7",0),0)
.
That will output the number of the column counting up from Column D (ie 1 in this case). So we need to add the number of columns we skipped on the left, to get our final number - this is just a repeat of the first MATCH: +MATCH(0,B1:G1,0)
.
Note - For flexibility use INDEX() or OFFSET() rather than INDIRECT()
Blm raises excellent points about the inflexibility of using INDIRECT
- read his answer proposing INDEX
or OFFSET
instead.
By popular demand... :-), plus there are a couple of problems with the INDIRECT
version, which I describe below.
First, two other solutions. The only substantive differences are how the second range to search is calculated. Andi used INDIRECT
, I use INDEX
and OFFSET
respectively. One other small difference you'll notice, instead of hardcoding the value to search for, I get it from a cell, which just makes it easier to test the formulas by varying the number being searched for. I've included all three here so it's easy to see them next to each other, the first is just copied from Andi's answer (with the hardcoded 0 replaced with the same cell reference I use in the others, B3):
=MATCH(B3,INDIRECT("R1C"&MATCH(B3,B1:G1,0)+1&":R1C7",0),0)+MATCH(B3,B1:G1,0)
=MATCH(B3,INDEX(B1:G1,0,MATCH(B3,B1:G1,0)+1):G1,0)+MATCH(B3,B1:G1,0)
=MATCH(B3,OFFSET(B1:G1,0,MATCH(B3,B1:G1,0)):G1,0)+MATCH(B3,B1:G1,0)
The second argument to the first (in the formula, it's matching the second occurrence of the search number) MATCH
is the difference:
INDIRECT("R1C"&MATCH(B3,B1:G1,0)+1&":R1C7",0)
INDEX(B1:G1,0,MATCH(B3,B1:G1,0)+1):G1
OFFSET(B1:G1,0,MATCH(B3,B1:G1,0)):G1
Andi described the first. The second and third work by calculating a reference to the cell after the match (the INDEX(...)
and OFFSET(...)
), and then extending that to a range that ends with the last cell (the :G1
part). So if searching for 0, MATCH(B3,B1:G1,0)
returns 2, so the INDEX
version becomes:
INDEX(B1:G1,0,2+1):G1
INDEX(B1:G1,0,3):G1
D1:G1
The OFFSET
version behaves similarly.
Although the three are on the face of it equivalent, there are actually two things that would make me not use the INDIRECT
version. The first is that because it calculates an absolute column, it doesn't work in slightly different circumstances. Let's say the final value in the table is changed from a 9 to a 3. Then if we place 3 in B3 to search for the second 3, the formulas should all return 6. However, the INDIRECT
version returns 5. Here's why. In that case, MATCH(B3,B1:G1,0)
returns 4, so the INDIRECT
becomes:
INDIRECT("R1C"&4+1&":R1C7",0)
INDIRECT("R1C"&5&":R1C7",0)
INDIRECT("R1C5:R1C7",0)
which is equivalent to E1:G1. However, the first cell of E1:G1 is the cell containing the first 3, so the MATCH
will just find the first 3 again and return 1, which gets added to the first MATCH
value (4), returning 5.
The issue is that the column calculation is absolute and doesn't take into account the fact that the table being searched starts in the second column. So even when searching for 0, the INDIRECT
formula is "wrong", in that it's finding the first 0 twice, but it looks like it works because the second 0 is immediately to the first 0's right, so the MATCH
looking for the second 0 returns 1, which happens to be the difference between the column numbers of the two 0's. When the difference between the column numbers of the two numbers being searched for isn't 1, then the INDIRECT
formula will return the wrong value.
That's fixable by making the column reference relative (and assuming that the formula is in the same column as the start of the table):
=MATCH(B3,INDIRECT("R1C["&MATCH(B3,B1:G1,0)&"]:R1C7",0),0)+MATCH(B3,B1:G1,0)
Now the INDIRECT
is:
INDIRECT("R1C["&4&"]:R1C7",0)
INDIRECT("R1C[4]:R1C7",0)
which is F1:G1, so now the second MATCH
will start searching in the cell to the right of the first match, and will return 2, resulting in a total of 6, which is correct.
That works, but there's still a problem, related to my parenthetical "assuming that the formula is in the same column as the start of the table". The INDIRECT
version (even the fixed one) is fairly fragile. For example, if I insert a new row above the table of numbers, the INDEX
and OFFSET
versions continue to work, because Excel automatically updates all the references. But because all but one of the references in the INDIRECT
version are text, Excel can't update them, so they'll continue to refer to the first row, which is now something else. If you remove the first column or add additional columns to the left of the table, it has similar problems. So in this case, I'd probably opt for the INDEX
or OFFSET
version, just to "future proof" the spreadsheet a bit.