PySpark: How to fill null values in a DataFrame for specific columns only?

apache-spark pyspark spark-dataframe

Solution 1:

df.fillna(0, subset=['a', 'b'])

The subset parameter restricts the fill to the listed columns; all other columns are left unchanged. It is available in Spark 1.3.1 and later.

Solution 2:

Use a dictionary to fill values of certain columns:

df.fillna({'a': 0, 'b': 0})
