Using a column value as a parameter to a Spark DataFrame function
One option is to use pyspark.sql.functions.expr, which allows you to use column values as inputs to Spark SQL functions.
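For reference, here is a minimal input DataFrame matching the expected output shown at the end (an assumption reconstructed from that output, not part of the original setup):

# Assumed input, reconstructed from the expected output below
df = spark.createDataFrame([("X", 3), ("Y", 1), ("Z", 2)], ["letter", "rpt"])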
Based on @user8371915's comment, I have found that the following works:
from pyspark.sql.functions import expr
df.select(
    '*',
    # repeat "," rpt times, split on "," (rpt+1 elements), then posexplode to get positions 0..rpt
    expr('posexplode(split(repeat(",", rpt), ","))').alias("index", "col")
).where('index > 0').drop("col").sort('letter', 'index').show()
#+------+---+-----+
#|letter|rpt|index|
#+------+---+-----+
#| X| 3| 1|
#| X| 3| 2|
#| X| 3| 3|
#| Y| 1| 1|
#| Z| 2| 1|
#| Z| 2| 2|
#+------+---+-----+
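The where('index > 0') filter is needed because repeating "," rpt times and splitting on "," yields rpt + 1 empty strings, so posexplode emits positions 0 through rpt; dropping position 0 leaves exactly rpt rows per letter.

If you are on Spark 2.4 or later, a similar expression using the built-in sequence SQL function should also work (a sketch of the same idea, not something from the original answer):

# Spark 2.4+ sketch: sequence(1, rpt) builds the array [1, ..., rpt], explode turns it into rows
df.select('*', expr('explode(sequence(1, rpt))').alias('index')) \
    .sort('letter', 'index').show()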