pandas DataFrame: Get previous month value where there are missing transaction and cannot shift()

Is there any way to implement this without generating dummy row ?

Here is my data source.

Group   Store   Month   Revenue
Group1  A   201611  10
Group1  A   201612  20
Group1  A   201701  30
Group1  B   201611  40
Group1  B   201701  60
Group2  C   201611  70
Group2  C   201612  80
Group2  C   201702  100

And here is desired output

Group Store Month Revenue Month_LM Revenue_LM
Group1  A   201611  10  201610  
Group1  A   201612  20  201611  10
Group1  A   201701  30  201612  20
Group1  B   201611  50  201610  
Group1  B   201701  70  201612  
Group1  B   201702  80  201701  70
Group2  C   201611  90  201610  
Group2  C   201612  100 201611  90
Group2  C   201702  120 201701  

The problem is at B, C (please notice missing 201612 of B and 201701 of C) If I do shift(), I will get value of previous month (in transaction, but not in business logic)

I managed to get Month_LM by using

def get_lm(month):
d = datetime.strptime(month+"01","%Y%m%d")
d = d - relativedelta(months=1)
return d.strftime("%Y%m")
df['LM'] = df['MONTH'].apply(lambda x:get_lm(str(x)))

But I have no idea how to get Revenue of "Month" based on "Month_LM" value ? df.lookup maybe ?

Thank you.


Solution 1:

I change the Month to datetime format, you would like it back, you can do,df.Month.dt.year*100+df.Month.dt.month, and in my solution I did not using column Month_LM

df.Month=pd.to_datetime(df.Month,format='%Y%m')
df['Rev']=df.groupby('Group').apply(lambda x :x.Revenue.shift()* (x.Month.dt.year*12+x.Month.dt.month).diff().eq(1)).replace(0,np.nan).values
df
Out[1080]: 
    Group Store      Month  Revenue  Month_LM   Rev
0  Group1     A 2016-11-01       10    201610   NaN
1  Group1     A 2016-12-01       20    201611  10.0
2  Group1     A 2017-01-01       30    201612  20.0
3  Group1     B 2016-11-01       50    201610   NaN
4  Group1     B 2017-01-01       70    201612   NaN
5  Group1     B 2017-02-01       80    201701  70.0
6  Group2     C 2016-11-01       90    201610   NaN
7  Group2     C 2016-12-01      100    201611  90.0
8  Group2     C 2017-02-01      120    201701   NaN