Python pandas find the largest value after date of each row
We can speed up the whole process with numpy: broadcast the values against themselves, then use idxmax
to get, for each row, the index of the first later row whose value is
greater than the current row's, then assign that row's date back.
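import numpy as np
import pandas as pd
s = df['value'].values
# entry [i, j] is s[j] - s[i]; triu keeps j >= i; idxmax(1) is the first later row with a larger value (0 if none)
idx = pd.DataFrame(np.triu(s-s[:,None])).gt(0).idxmax(1)
df['new'] = df['date'].reindex(idx.replace(0,-1)).values  # 0 -> -1 (absent label), so reindex yields NaN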
df
Out[158]:
       date  value         new
0  20200101     10  20200102.0
1  20200102     16  20200104.0
2  20200103     14  20200104.0
3  20200104     18         NaN
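
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach, with the intermediate steps split out. The sample frame is rebuilt from the output above. The names `diff` and `mask` are only illustrative and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the output shown above.
df = pd.DataFrame({
    "date": [20200101, 20200102, 20200103, 20200104],
    "value": [10, 16, 14, 18],
})

s = df["value"].to_numpy()

# diff[i, j] = s[j] - s[i]; np.triu zeroes every entry with j < i,
# so only the current row and later rows are compared.
diff = np.triu(s - s[:, None])

# mask[i, j] is True when a later row j has a larger value than row i.
mask = pd.DataFrame(diff).gt(0)

# First True column per row; a row with no True at all gives 0.
idx = mask.idxmax(axis=1)

# Column 0 can never be a real match (a match needs j > i >= 0), so 0 means
# "no later row is larger". Mapping it to -1, a label that is not in the
# index, makes reindex return NaN for that row.
df["new"] = df["date"].reindex(idx.replace(0, -1)).to_numpy()

print(df)
```

Note that the broadcast builds an n x n matrix, so memory grows quadratically with the number of rows. That is fine for small frames, but it can get expensive on large ones.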