Concatenate columns containing list values in Spark DataFrame
I have a Spark DataFrame with 2 columns, each holding list values. I want to create a new column that concatenates the two columns, so that the list values from both end up in a single list. For example:
Column 1 has a row value - [A,B]
Column 2 has a row value - [C,D]
The output should be in a new column, i.e.
Column 3 (newly created column) with row value - [A,B,C,D]
Note: the column values are stored as lists.
Please help me implement this with PySpark. Thanks.
We can use a UDF:
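>>> from pyspark.sql import functions as F
>>> from pyspark.sql.types import ArrayType, StringType
>>> # The UDF receives each array cell as a Python list; + joins the two lists
>>> udf1 = F.udf(lambda x, y: x + y, ArrayType(StringType()))
>>> df = df.withColumn('col3', udf1('col1', 'col2'))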
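One caveat: the lambda above will fail on any row where either column is null, because Python cannot add None to a list. A minimal sketch of a null-safe row function follows, written in plain Python so it can be checked without a Spark session. The name concat_lists is hypothetical, and treating a null as an empty list is an assumption about what you want for such rows.

```python
def concat_lists(x, y):
    """Concatenate two list values from one row.

    Assumption: a null cell (None in Python) is treated as an empty list,
    so the result is never None.
    """
    return (x or []) + (y or [])


# Quick check on plain Python lists, mirroring the example in the question
print(concat_lists(["A", "B"], ["C", "D"]))
print(concat_lists(["A", "B"], None))
```

It should be usable in place of the lambda, i.e. F.udf(concat_lists, ArrayType(StringType())). If you are on Spark 2.4 or later, the built-in F.concat('col1', 'col2') also accepts array columns and avoids the Python UDF overhead, though it returns null when either input is null.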