Summing common column values by pattern matching on column names in PySpark

example dataframe:

lens    INV.NO  INV.NO  INV.NO
First   1       3       4
Second  2       5       3
First   3       7       2
Second  4       8       1

Using pattern matching (^INV.NO*), I want to pick up all three columns that match and create a new column plain_sum holding their row-wise sum, in PySpark.

lens    INV.NO  INV.NO  INV.NO  plain_sum
First   1       3       4       8
Second  2       5       3       10
First   3       7       2       12
Second  4       8       1       13

You can use a simple list comprehension to collect the columns whose names match your prefix into a list, then use reduce with add to sum them row-wise (the example below uses columns prefixed 'plain-', matching the df.show() output):

from functools import reduce
from operator import add
from pyspark.sql.functions import col

# collect every column whose name contains the 'plain-' prefix
col_list = [c for c in df.columns if 'plain-' in c]

# row-wise sum of all matched columns
df = df.withColumn("plain_sum", reduce(add, [col(x) for x in col_list]))

df.show(truncate=False)

+------+-------------+-------------+-------------+---------+
|lens  |plain-dev 103|plain-pre 104|plain-rte 107|plain_sum|
+------+-------------+-------------+-------------+---------+
|First |1            |3            |4            |8        |
|Second|2            |5            |3            |10       |
|First |3            |7            |2            |12       |
|Second|4            |8            |1            |13       |
+------+-------------+-------------+-------------+---------+
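If you want to match the question's ^INV.NO pattern rather than a fixed substring, re.match can build the same column list; note that the dot should be escaped as \. so it matches a literal period instead of any character. A minimal sketch of just the name-matching step (the specific column names here are assumptions, since the question's table shows three identical "INV.NO" headers):

```python
import re

# hypothetical column names echoing the question's duplicated "INV.NO" headers
columns = ["lens", "INV.NO 103", "INV.NO 104", "INV.NO 107"]

# escape the dot so it matches a literal "." rather than any character
pattern = re.compile(r"^INV\.NO")

# keep only columns whose names start with "INV.NO"
col_list = [c for c in columns if pattern.match(c)]
print(col_list)  # the three INV.NO columns; "lens" is excluded
```

The resulting col_list can then be passed to reduce(add, [col(x) for x in col_list]) exactly as in the snippet above.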