Compute the distance between two coordinates from different rows

I would like to create a new column in the data frame which consists of the distances between the location of the current transaction and the location of the last transaction.

I have the lat and long for each location and have used the haversine formula to compute the distance between two coordinates.

import numpy as np

def haversine(lat1, lon1, lat2, lon2):
    # Convert degrees to radians
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    # Haversine formula
    a = np.sin(dlat / 2.0) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2.0) ** 2

    c = 2 * np.arcsin(np.sqrt(a))
    km = 6367 * c  # Radius of earth in kilometers. (Use 3956 for miles)
    return km

However, I am trying to adapt it so that it computes the distance from the previous row (which holds the previous location):

for i in range(0,df.shape[0]-1):
    df['Dist_last_trans'] = \
        haversine(df['merch_lat'].iloc[i-1], df['merch_long'].iloc[i-1],
                     df['merch_lat'].iloc[i], df['merch_long'].iloc[i])
   

but then the output is the same for every row, which is clearly wrong.

Any help would be greatly appreciated.


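I have reproduced your case with a toy dataframe. The problem is that you are not specifying a row during assignment. Assigning a scalar to df['Dist_last_trans'] sets the whole column at once, so every iteration of your loop overwrites all rows and only the value from the last iteration survives. The toy example uses a column named Diff_last_trans: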

>>> import pandas as pd
>>> data = [['Alex',10],['Bob',12],['Clarke',13]]
>>> df = pd.DataFrame(data,columns=['Name','Diff_last_trans'])
>>> df['Diff_last_trans']
0    10
1    12
2    13
Name: Diff_last_trans, dtype: int64
>>> df['Diff_last_trans'] =3
>>> df['Diff_last_trans']
0    3
1    3
2    3
Name: Diff_last_trans, dtype: int64

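Specify the row label and the column together in a single .loc call (chained indexing such as df.loc[1]['Diff_last_trans'] = 2 may assign to a temporary copy and leave df unchanged):

>>> df.loc[1, 'Diff_last_trans'] = 2
>>> df['Diff_last_trans']
0    3
1    2
2    3
Name: Diff_last_trans, dtype: int64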

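In your case, the assignment inside the loop becomes

df.loc[i, 'Dist_last_trans'] = \
        haversine(df['merch_lat'].iloc[i-1], df['merch_long'].iloc[i-1],
                  df['merch_lat'].iloc[i], df['merch_long'].iloc[i])

Note that .loc selects by label while .iloc selects by position, so this assumes the default 0..n-1 index. Also, with i = 0 the expression iloc[i-1] wraps around to the last row, and range(0, df.shape[0]-1) stops one row short of the end; looping over range(1, df.shape[0]) avoids both problems.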
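
As an alternative to the loop, the whole column can probably be computed in one step by shifting the coordinate columns down one row with Series.shift(). This is a different technique from the row-by-row fix above, not a correction of it. The following is a minimal sketch: the coordinates are made up for illustration, and haversine is repeated from the question only so that the snippet runs on its own.

```python
import numpy as np
import pandas as pd


def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; inputs are in degrees (scalars or Series)."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2.0) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2.0) ** 2
    return 6367 * 2 * np.arcsin(np.sqrt(a))


# Made-up coordinates, only for illustration.
df = pd.DataFrame({
    "merch_lat": [40.7128, 34.0522, 41.8781],
    "merch_long": [-74.0060, -118.2437, -87.6298],
})

# shift() moves every value down one row, so row i is paired with
# the coordinates of row i-1. The first row has no predecessor and gets NaN.
df["Dist_last_trans"] = haversine(
    df["merch_lat"].shift(), df["merch_long"].shift(),
    df["merch_lat"], df["merch_long"],
)

print(df)
```

Because the NumPy functions inside haversine operate element-wise on a Series, no explicit loop is needed, and the first row is left as NaN rather than being compared with the last row.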