DataSet in Panda: Increase Offset if next value is smaller then previous one
i have following DataSet initially:
value;date
100;2021-01-01
160;2021-02-01
250;2021-02-15
10;2021-03-01
90;2021-04-01
150;2021-04-15
350;2021-06-01
20;2021-07-01
100;2021-08-01
10;2021-08-10
Whenever the value "Value" drops (e.g. from 250 to 10 on 2021-03-01), I want to save the old value as offset.
When the value drops again (e.g. from 350 to 20 on 2021-07-01) I want to add the new offset to the old one (350 + 250).
Afterwards I want to add the offsets with the values, so that I get the following DataSet at the end:
value;date;offset;corrected_value
100;2021-01-01;0;100
160;2021-02-01;0;160
250;2021-02-15;0;250
10;2021-03-01;250;260
90;2021-04-01;250;340
150;2021-04-15;250;400
350;2021-06-01;250;600
20;2021-07-01;600;620
100;2021-08-01;600;700
10;2021-08-10;700;710
My current (terrible) approach whichis not working:
df['date'] = pd.to_datetime(df['date'])
df.index = df['date']
del df['date']
df.drop_duplicates(keep='first')
df['previous'] = df['value'].shift(1)
def pn(current, previous, offset):
if not pd.isna(previous):
if current < previous:
return previous + offset
return offset
df['offset'] = 0
df['offset'] = df.apply(lambda row: pn(row['value'], row['previous'], row['offset']), axis=1)
Your help is so much appreciated, thank you!
Cheers
Solution 1:
Find the desired positions in column 'value' with pd.Series.diff
and pd.Series.shift
. Fill with 0 and compute the cumsum
. Add the 'offset' column to 'value'
df['offset'] = df.value[df.value.diff().lt(0).shift(-1, fill_value=False)]
df['offset'] = df.offset.shift(1).fillna(0).cumsum().astype('int')
df['correct_value'] = df.offset + df.value
df
Output
value date offset correct_value
0 100 2021-01-01 0 100
1 160 2021-02-01 0 160
2 250 2021-02-15 0 250
3 10 2021-03-01 250 260
4 90 2021-04-01 250 340
5 150 2021-04-15 250 400
6 350 2021-06-01 250 600
7 20 2021-07-01 600 620
8 100 2021-08-01 600 700
9 10 2021-08-10 700 710