How to get VLOOKUP to return the *last* match?
You can use an array formula to get data from the last matching record.
=INDEX(IF($A$1:$A$20="c",$B$1:$B$20),MAX(IF($A$1:$A$20="c",ROW($A$1:$A$20))))
Enter the formula using Ctrl+Shift+Enter.
This works like the INDEX
/MATCH
construction of a VLOOKUP
, but with a conditional MAX
used instead of MATCH
.
Note that this assumes that your table starts at row 1. If your data starts at a different row, you will need to adjust the ROW(...)
part by subtracting the difference between the top row and 1.
(Answering here as no separate question for sorted data.)
If the data were sorted, you could use VLOOKUP
with the range_lookup
argument TRUE
(or omitted, since it's the default), which is officially described for Excel as "search for approximate match".
In other words, for sorted data:
- setting the last argument to
FALSE
returns the first value, and - setting the last argument to
TRUE
returns the last value.
This is largely undocumented and obscure, but dates to VisiCalc (1979), and today holds at least in Microsoft Excel, LibreOffice Calc, and Google Sheets. It is ultimately due to the initial implementation of LOOKUP
in VisiCalc (and thence VLOOKUP
and HLOOKUP
), when there was no fourth parameter. The value is found by binary search, using inclusive left bound and exclusive right bound (a common and elegant implementation), which results in this behavior.
Technically this means that one starts the search with the candidate interval [0, n)
, where n
is the length of the array, and the loop invariant condition is that A[imin] <= key && key < A[imax]
(the left bound is <= the target, the right bound, which starts one after the end, is > the target; to validate, either check values at endpoints before, or check result after), and successively bisecting and choosing whichever side preserves this invariant: by exclusion one side will, until you get to an interval with 1 term, [k, k+1)
, and the algorithm then returns k
. This need not be an exact match (!): it's just the closest match from below. In case of duplicate matches, this results in returning the last match, as it requires that the next value be greater than the key (or the end of the array). In case of duplicates you need some behavior, and this is reasonable and easy to implement.
This behavior is stated explicitly in this old Microsoft Knowledge Base article (emphasis added): "XL: How to Return the First or Last Match in an Array" (Q214069):
You can use the LOOKUP() function to search for a value within an array of sorted data and return the corresponding value contained in that position within another array. If the lookup value is repeated within the array, it returns the last match encountered. This behavior is true for the VLOOKUP(), HLOOKUP(), and LOOKUP() functions.
Official documentation for some spreadsheets follow; in neither is the "last match" behavior stated, but it's implied in the Google Sheets documentation:
-
Microsoft Excel
TRUE assumes the first column in the table is sorted either numerically or alphabetically, and will then search for the closest value.
-
Google Sheets:
If
is_sorted
isTRUE
or omitted, the nearest match (less than or equal to the search key) is returned
If the values in the search array are sequential (i.e. you're looking for the largest value, such as the latest date), you don't even need to use the INDIRECT function. Try this simple code:
=MAX(IF($A$1:$A$20="c",$B$1:$B$20,)
Again, enter the formula using CTRL + SHIFT + ENTER