How to invoke pandas.rolling.apply with parameters from multiple columns?
Solution 1:
Define your own roll
We can create a function that takes a window size argument w and any other keyword arguments. We use this to build a new DataFrame on which we call groupby, passing the keyword arguments along via kwargs.
Note: using stride_tricks.as_strided is not strictly necessary, but it is succinct and, in my opinion, appropriate.
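from numpy.lib.stride_tricks import as_strided as stride
import pandas as pd

def roll(df, w, **kwargs):
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides
    # Overlapping windows as a strided view: shape (n_windows, w, n_columns)
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))
    # One small DataFrame per window, keyed by the index label of its first row
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })
    # Group by the window key so each group is one window
    return rolled_df.groupby(level=0, **kwargs)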
roll(df, 2).mean()
       Open      High       Low    Close
0  133.0350  133.2975  132.8250  132.930
1  132.9325  133.1200  132.6750  132.745
2  132.7425  132.8875  132.6075  132.710
3  132.7075  132.7875  132.6000  132.720
We can also use the pandas.DataFrame.pipe method to the same effect:
df.pipe(roll, w=2).mean()
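To tie this back to the question, the grouped windows can also be passed to apply with a function that reads several columns at once. The following is a minimal sketch rather than part of the original answer: the demo data and the window_range helper are made up for illustration, and roll is repeated here only so the snippet runs on its own.

```python
from numpy.lib.stride_tricks import as_strided as stride
import pandas as pd


def roll(df, w, **kwargs):
    # Same function as in the answer, repeated so this snippet is self-contained.
    v = df.values
    d0, d1 = v.shape
    s0, s1 = v.strides
    a = stride(v, (d0 - (w - 1), w, d1), (s0, s0, s1))
    rolled_df = pd.concat({
        row: pd.DataFrame(values, columns=df.columns)
        for row, values in zip(df.index, a)
    })
    return rolled_df.groupby(level=0, **kwargs)


def window_range(window):
    # Hypothetical multi-column function: highest High minus lowest Low.
    return window["High"].max() - window["Low"].min()


# Made-up prices, for illustration only.
demo = pd.DataFrame({"High": [3.0, 5.0, 4.0], "Low": [1.0, 2.0, 3.0]})
ranges = roll(demo, 2).apply(window_range)
```

Each result is labeled with the index of the first row in its window, because roll zips the windows against df.index from the start.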
OLD ANSWER
Panel has been deprecated (and has since been removed from pandas). See above for the updated answer.
See https://stackoverflow.com/a/37491779/2336654
Define our own roll:
import numpy as np
import pandas as pd

def roll(df, w, **kwargs):
    # Requires an old pandas version that still ships pd.Panel
    roll_array = np.dstack([df.values[i:i+w, :] for i in range(len(df.index) - w + 1)]).T
    panel = pd.Panel(roll_array,
                     items=df.index[w-1:],
                     major_axis=df.columns,
                     minor_axis=pd.Index(range(w), name='roll'))
    return panel.to_frame().unstack().T.groupby(level=0, **kwargs)
You should then be able to:
roll(df, 2).apply(your_function)
Using mean:
roll(df, 2).mean()
major      Open      High       Low    Close
1      133.0350  133.2975  132.8250  132.930
2      132.9325  133.1200  132.6750  132.745
3      132.7425  132.8875  132.6075  132.710
4      132.7075  132.7875  132.6000  132.720
f = lambda df: df.sum(1)
roll(df, 2, group_keys=False).apply(f)
   roll
1  0       532.345
   1       531.830
2  0       531.830
   1       531.115
3  0       531.115
   1       530.780
4  0       530.780
   1       530.850
dtype: float64
Solution 2:
Since your rolling window is not too large, you can also place the shifted rows side by side in a single DataFrame and then use the apply function to reduce each row.
For example, given the following dataset df:
              Open      High     Low   Close
Date
2017-11-07  258.97  259.3500  258.09  258.67
2017-11-08  258.47  259.2200  258.15  259.11
2017-11-09  257.73  258.3900  256.36  258.17
2017-11-10  257.73  258.2926  257.37  258.09
2017-11-13  257.31  258.5900  257.27  258.33
You can add the rolling data to this DataFrame with:
window = 2
df1 = pd.DataFrame(index=df.index)
for i in range(window):
    # Shift by i rows so that row t also carries the values from row t - i
    df_shifted = df.shift(i).copy()
    df_shifted.columns = ["{}-{}".format(s, i) for s in df.columns]
    df1 = df1.join(df_shifted)
df1
            Open-0    High-0   Low-0  Close-0  Open-1    High-1   Low-1  Close-1
Date
2017-11-07  258.97  259.3500  258.09   258.67     NaN       NaN     NaN      NaN
2017-11-08  258.47  259.2200  258.15   259.11  258.97  259.3500  258.09   258.67
2017-11-09  257.73  258.3900  256.36   258.17  258.47  259.2200  258.15   259.11
2017-11-10  257.73  258.2926  257.37   258.09  257.73  258.3900  256.36   258.17
2017-11-13  257.31  258.5900  257.27   258.33  257.73  258.2926  257.37   258.09
Each row now holds all the rolling data you need, so you can apply your function row by row:
df1.apply(AccumulativeSwingIndex, axis=1)
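As a concrete illustration of the row-wise apply, here is a minimal sketch using the sample data above. AccumulativeSwingIndex is not shown in this thread, so the true_range function below is a hypothetical stand-in that simply reads columns from both the current row ("-0") and the previous row ("-1").

```python
import pandas as pd

# Data copied from the example above.
df = pd.DataFrame(
    {
        "Open": [258.97, 258.47, 257.73, 257.73, 257.31],
        "High": [259.35, 259.22, 258.39, 258.2926, 258.59],
        "Low": [258.09, 258.15, 256.36, 257.37, 257.27],
        "Close": [258.67, 259.11, 258.17, 258.09, 258.33],
    },
    index=pd.Index(
        ["2017-11-07", "2017-11-08", "2017-11-09", "2017-11-10", "2017-11-13"],
        name="Date",
    ),
)

window = 2
df1 = pd.DataFrame(index=df.index)
for i in range(window):
    df_shifted = df.shift(i)
    df_shifted.columns = ["{}-{}".format(s, i) for s in df.columns]
    df1 = df1.join(df_shifted)


def true_range(row):
    # Hypothetical stand-in for AccumulativeSwingIndex: the current row's
    # range, widened to include the previous row's close.
    high = max(row["High-0"], row["Close-1"])
    low = min(row["Low-0"], row["Close-1"])
    return high - low


# Drop the first row, whose "-1" columns are NaN because nothing precedes it.
result = df1.dropna().apply(true_range, axis=1)
```

Dropping the incomplete first row keeps the function free of NaN handling; whether that is appropriate depends on how your own function should treat partial windows.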
Solution 3:
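Here's a workaround I came up with. Iterating over a Rolling object (pandas 1.1 or later) yields each window as a DataFrame, so fn receives every column of the window: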
df['new_col'] = list(map(fn, df.rolling(2)))
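A minimal sketch of how this might look end to end, assuming pandas 1.1 or later (where Rolling objects can be iterated). The data and the body of fn are made up for illustration. Note that the first windows are shorter than the window size, so fn has to decide what to return for them.

```python
import numpy as np
import pandas as pd


def fn(window):
    # Hypothetical window function: highest High minus lowest Low.
    # Early windows hold fewer than 2 rows, so return NaN for them.
    if len(window) < 2:
        return np.nan
    return window["High"].max() - window["Low"].min()


# Made-up prices, for illustration only.
demo = pd.DataFrame({"High": [3.0, 5.0, 4.0], "Low": [1.0, 2.0, 3.0]})
demo["new_col"] = list(map(fn, demo.rolling(2)))
```

Because iteration yields one window per row, the resulting list has the same length as the DataFrame and can be assigned directly as a column.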