Splitting a geometric progression efficiently in Python (Pythonic way)
I am trying to compute a column that involves a (split) geometric progression. Is there an efficient way of doing it? The data set has millions of rows. I need the column "Traded_quantity":
 | | Marker | Action | Traded_quantity
---|---|---|---|---
2019-11-05 | 09:25 | 0 | | 0
 | 09:35 | 2 | BUY | 3
 | 09:45 | 0 | | 0
 | 09:55 | 1 | BUY | 4
 | 10:05 | 0 | | 0
 | 10:15 | 3 | BUY | 56
 | 10:24 | 6 | BUY | 8128
turtle = 2         # user defined
base_quantity = 1  # user defined

def turtle_split(row):
    if row['Action'] == 'BUY':
        return base_quantity * (turtle ** row['Marker'] - 1) // (turtle - 1)
    else:
        return 0

df['Traded_quantity'] = df.apply(turtle_split, axis=1).round(0).astype(int)
Calculation
For 0th Row, Traded_quantity should be zero (because the Marker is zero).
For 1st Row, Traded_quantity should be (1x1) + (1x2) = 3. Marker 2 is split into 1 and 1: the first 1 is multiplied by base_quantity (1x1), the second 1 is multiplied by the first result times turtle (1x2), and the two numbers are summed.
For 2nd Row, Traded_quantity should be zero (because the Marker is zero).
For 3rd Row, Traded_quantity should be (2x2) = 4. Marker 1 is multiplied by the last split from row 1 times turtle, i.e. 2x2.
For 4th Row, Traded_quantity should be zero (because the Marker is zero).
For 5th Row, Traded_quantity should be (4x2)+(4x2x2)+(4x2x2x2) = 56. Marker 3 is split into 1, 1 and 1: the first 1 is multiplied by the last split from row 3 times turtle (4x2 = 8), the second 1 by the first result times turtle (8x2 = 16), the third 1 by the second result times turtle (16x2 = 32), and the three numbers are summed.
For 6th Row, Traded_quantity should be (32x2)+(32x2x2)+(32x2x2x2)+(32x2x2x2x2)+(32x2x2x2x2x2) = 8128
Whenever there is a BUY, the traded quantity is calculated starting from the last batch of the previous BUY times turtle.
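Written as a recurrence, I believe the rule looks like the sketch below. The symbols are my own shorthand and not part of the data: t is turtle, b_0 is base_quantity, m_k is the Marker of the k-th BUY row, Q_k is its Traded_quantity, and b_k is the batch size carried forward to the next BUY.

```latex
\begin{aligned}
Q_k &= b_{k-1}\left(1 + t + \dots + t^{m_k - 1}\right) = b_{k-1}\,\frac{t^{m_k} - 1}{t - 1},\\
b_k &= b_{k-1}\,t^{m_k}, \qquad k = 1, 2, \dots
\end{aligned}
```

With t = 2 and b_0 = 1, the first BUY (m_1 = 2) gives Q_1 = 3 and b_1 = 4, the second BUY (m_2 = 1) gives Q_2 = 4 and b_2 = 8, and the third BUY (m_3 = 3) gives Q_3 = 56.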
It turns out the code generates the correct Traded_quantity only when there are no zeros in Marker. Once there is a gap of one or more zeros, the plain geometric progression formula does not help: I need the previous figure (from a cache) to recalculate Traded_quantity. I tried lru_cache with recursion, but it didn't work.
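This should work. Keep base_quantity as running state and, after every BUY, advance it to the first batch size of the next BUY: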
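def turtle_split(row):
    global base_quantity  # running batch size, updated after every BUY
    if row['Action'] == 'BUY':
        # sum of Marker batches, starting at the current base_quantity
        summation = base_quantity * (turtle ** row['Marker'] - 1) // (turtle - 1)
        # the next BUY starts where this one stopped
        base_quantity = base_quantity * turtle ** row['Marker']
        return summation
    else:
        return 0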
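If the global bothers you, or df.apply with axis=1 is too slow on millions of rows, the same idea can be sketched as a plain loop over the two columns. This is only a sketch under a few assumptions: turtle_split_column and next_batch are names I made up, turtle is assumed to be greater than 1, and Python integers are used on purpose because the batch size grows geometrically and would eventually overflow a fixed-width integer.

```python
def turtle_split_column(markers, actions, base_quantity=1, turtle=2):
    """Return the traded quantity for each row, carrying the batch size forward.

    Hypothetical helper: assumes turtle > 1 and integer markers.
    """
    next_batch = base_quantity  # first batch size of the next BUY
    traded = []
    for marker, action in zip(markers, actions):
        if action == "BUY" and marker > 0:
            growth = turtle ** int(marker)
            # next_batch * (1 + turtle + ... + turtle**(marker - 1))
            traded.append(next_batch * (growth - 1) // (turtle - 1))
            # the next BUY continues where this one stopped
            next_batch *= growth
        else:
            traded.append(0)
    return traded


# Sample data from the question (non-BUY rows are given an empty Action here).
markers = [0, 2, 0, 1, 0, 3, 6]
actions = ["", "BUY", "", "BUY", "", "BUY", "BUY"]
traded = turtle_split_column(markers, actions)
```

With a DataFrame you would call it as df['Traded_quantity'] = turtle_split_column(df['Marker'].tolist(), df['Action'].tolist()). For the sample markers it gives 0, 3, 0, 4, 0, 56 and, following the rule as described (six batches starting at 64), 4032 for the last row rather than the 8128 shown in the question's table.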