I have a workbook with a Summary sheet and a Data Set sheet with a linked table containing 700k+ records of financial data on investments. (In my particular case, it is real estate properties; but this is the same scenario as tracking stocks and dividends.) The Summary Sheet contains one row per property (investment) and shows the amount (of money) originally injected (i.e., input or invested) per property, with the expectation that the monetary output (or return on investment) will at some point equal that original injection/investment amount. The Data Set shows daily production (i.e., monetary output) per property, with one row per property per date (i.e., has many rows for each property) and cannot be edited. It should be noted that the PropName column is unreliable (inaccurate/inconsistent), and so PropID should be used to correlate the two sheets.

I would like to have a formula in the "100% Recovery Date" column on the Summary sheet that will tell me the date that the total (cumulative) Output reaches or exceeds 100% of Input (i.e., the date that the investment has been recouped). I assumed that I would need a formula to do the following, though I may be wrong:

  1. Sum all production (return on investment) data for each Property, from oldest to newest (data set is already sorted this way),

    UNTIL it reaches 100% of the Original Input/investment (shown in Total Input column on Summary sheet),

  2. then return the date (located in a field in the Data Set sheet) when 100% recovery is met,

  3. If 100% recovery is not met, return "TBD".

For example, the desired 100%RecoveryDate for property 0764 (“Prop 1”) would be 6/7/2013 (from the sixth data row of the Data Set sheet) because 1.667000055+ 5.000999928 +1.667000055 + 3.334000111 + 5.000999928 (from the first six rows of the Data Set sheet) = 16.670000077, which is greater than 13 (Total Input for property 0764).

Any and all constructive feedback is welcome and most appreciated!

Sample of Summary Sheet:

PropID   PropName   TotalInput   CurrentOutput  100%RecoveryDate  ProgressToRecovery
 0764     Prop 1          13         71,820                              189%
 0736     Prop 2      30,711        134,746                              439%
 1680     Prop 3      25,014         52,887                              211%
 4078     Prop 4      29,494         36,705                              124%
 5226     Prop 5      43,983         41,438                               94%
 6427     Prop 6      28,786         50,855                              177%
 6683     Prop 7      19,231         60,501                              315%
 6739     Prop 8      28,350         48,229                              170%
 9153     Prop 9      37,888         28,125                               74%
 8020     Prop 10     31,429         41,094                              131%

Sample of Data Set:

PropID   PropName     Date          Output
 0764     Prop 1    6/1/2013      1.667000055
 0764     Prop 1    6/3/2013      5.000999928
 0764     Prop 1    6/7/2013      1.667000055
 0764     Prop 1    6/10/2013     3.334000111
 0736     Prop 2    6/19/2013   361
 0764     Prop 1    6/19/2013     5.000999928
 0764     Prop 1    6/22/2013     6.668000221
 0764     Prop 1    7/12/2013     3.334000111
 1680     Prop 3    7/17/2013   389
 0764     Prop 1    7/23/2013    10.00200081
 0736     Prop 2    8/2/2013    236
 4078     Prop 4    8/22/2013   236
 0764     Prop 1    8/25/2013     6.668000221
 0764     Prop 1    8/30/2013     3.334000111
 0764     Prop 1    8/31/2013     5.000999928
 0764     Prop 1    9/11/2013     1.667000055
 6427     Prop 6    9/15/2013  1018
 1680     Prop 3    9/16/2013   389
 0764     Prop 1    9/20/2013     6.668000221
 0764     Prop 1    9/22/2013    10.00200081
 0764     Prop 1    9/23/2013    10.00200081
 5226     Prop 5    9/23/2013   125

Create a dummy sheet.  Link it to the Data Set sheet as follows:

  • Click in cell A1, and either

    • type ='Data Set'!A1, or
    • type =, click on the “Data Set” tab, and click in cell A1 on that sheet

    and then type Enter or click on the checkmark to the left of the Formula Bar.

  • Click in the Name Box (to the left of the Formula Bar) and type a range that covers all the rows in the Data Set sheet, including (at a minimum) the “PropID”, “Date” and “Output” columns.  (These are columns A, C, and D in the example, and you say you have 700K+ rows, so you might type A1:D999999.)
  • Type Enter.  For illustration: illustration of Excel gymnastics
  • Click in the Formula Bar (which should say ='Data Set'!A1) and type Ctrl+Enter.

Then,

  • Enter =IF(SUMIFS($D$2:$D2,$A$2:$A2,$A2)>VLOOKUP($A2, Summary!A:C, 3,FALSE), ROW(), "") into cell G2 and type Enter.
  • Enter =MIN(IF(($A$2:$A$999998=$A2), ($G$2:$G$999998), 999999)) into cell H2 and type Ctrl+Shift+Enter, thus making it an “array formula”.
  • Drag these down to row 999999.

    Unfortunately, I can’t figure out how to get the Name Box trick to work for the array formula; you may just need to manually drag it.

  • Go to the Summary sheet.
  • In cell E2 (the first “100%RecoveryDate”), enter: =IF(ISERROR(VLOOKUP(A2,Dummy!A:H,8,FALSE)), "No Data!", IF(VLOOKUP(A2,Dummy!A:H,8,FALSE)=999999, "TBD", INDEX(Dummy!C:C,VLOOKUP(A2,Dummy!A:H,8,FALSE)))).
  • Format the cell as a date.
  • Center it if you want.
  • Then drag it down to the last row of data on the Summary sheet.
Note: You will need to modify the above if your Data Set ever reaches 1000000 (one million) rows.  I hope the places are obvious.