return constant numbers over partition
Solution 1:
You can use the lag function to compare each row with the previous row in its window and flag the rows where the website changes. A running sum over those flags then gives the desired result.
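from pyspark.sql import Window
import pyspark.sql.functions as f

w = Window.partitionBy('user_name').orderBy('event_actions_order')
(df
    # 1 where website differs from the previous row (or there is no previous row), else 0
    .withColumn('change', f.when(f.lag('website').over(w) == f.col('website'), 0).otherwise(1))
    # running sum of the change flags within each user
    .withColumn('test', f.sum('change').over(w))
    .drop('change')
).show()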
+-------------------+---------+-------+----+
|event_actions_order|user_name|website|test|
+-------------------+---------+-------+----+
|                  1|   user_1|foo.com|   1|
|                  2|   user_1|foo.com|   1|
|                  3|   user_1|bar.com|   2|
|                  4|   user_1|foo.com|   3|
+-------------------+---------+-------+----+
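To check the logic without a Spark session, the same idea (flag each change, then keep a running count per user) can be sketched in plain Python. This is only an illustration, not part of the Spark solution: the function name running_change_count and the tuple layout are made up for this example, and it assumes website is never null.

```python
from itertools import groupby
from operator import itemgetter


def running_change_count(rows):
    """rows: iterable of (user_name, event_actions_order, website) tuples.

    Hypothetical helper mirroring the lag + running-sum approach;
    assumes website is never None.
    """
    result = []
    # Sort by user, then by event order (partitionBy + orderBy in the Spark version).
    ordered = sorted(rows, key=itemgetter(0, 1))
    for user, group in groupby(ordered, key=itemgetter(0)):
        previous = None  # no previous row at the start of each partition
        counter = 0
        for _, order, website in group:
            # Count a change whenever the website differs from the previous row.
            if website != previous:
                counter += 1
            result.append((order, user, website, counter))
            previous = website
    return result


rows = [
    ("user_1", 1, "foo.com"),
    ("user_1", 2, "foo.com"),
    ("user_1", 3, "bar.com"),
    ("user_1", 4, "foo.com"),
]
for row in running_change_count(rows):
    print(row)
```

For the sample rows this should yield the same test values as the Spark output above: 1, 1, 2, 3.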