How to get the name of a DataFrame column in PySpark?

Solution 1:

You can get the column names from the DataFrame's schema, for example with df.schema.names on a DataFrame called df.
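
As a minimal sketch of reading names from the schema: the helper name column_names and the variable df are assumptions made for illustration, not part of the original answer. It relies only on the schema attribute of a DataFrame and the names attribute of its StructType.

```python
def column_names(df):
    """Return the column names of a PySpark DataFrame, in schema order.

    `df` is assumed to be a pyspark.sql.DataFrame; the helper name is
    hypothetical and only wraps the attribute access for clarity.
    """
    # df.schema is a StructType; its .names attribute lists the field names.
    # df.columns exposes the same names as a plain list.
    return list(df.schema.names)
```

For a DataFrame built with spark.createDataFrame([(1, "a")], ["id", "label"]), column_names(df) would return ['id', 'label'], the same list as df.columns.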


Printing the schema with printSchema() is also a useful way to visualize it.
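
A small sketch of inspecting the schema, assuming df is a pyspark.sql.DataFrame. The helper name describe_schema is hypothetical; it prints the schema tree and also returns the column names with their types, in case the printed form is not enough.

```python
def describe_schema(df):
    """Print the schema tree and return (name, type) pairs for each column.

    `df` is assumed to be a pyspark.sql.DataFrame; the helper name is
    hypothetical.
    """
    # printSchema() writes an indented tree of the columns to stdout.
    df.printSchema()
    # Each StructField carries the column name and its data type.
    return [(field.name, field.dataType.simpleString())
            for field in df.schema.fields]
```

The returned pairs carry the same information as df.dtypes, so that property may be all you need if you do not want the printed tree.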


Solution 2:

For a Column object, the only way is to go down one level to the underlying JVM column, which PySpark exposes through the private _jc attribute.
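
A minimal sketch of that approach, assuming col is a pyspark.sql.Column from a classic (JVM-backed) session. The helper name column_expr_string is hypothetical, and _jc is a private attribute, so this may break between PySpark versions or where no JVM handle exists.

```python
def column_expr_string(col):
    """Return the JVM-side string form of a PySpark Column.

    `col` is assumed to be a pyspark.sql.Column backed by a JVM object;
    the helper name is hypothetical.
    """
    # _jc is the private py4j handle to the underlying JVM Column.
    # It is not public API, so treat this as a workaround.
    return col._jc.toString()
```

For a plain column reference such as df["id"], the string is the column's name; for a derived expression it is the expression text rather than a single name.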


This is also how it is converted to a str in the pyspark code itself.

From pyspark/sql/

def __repr__(self):
    return 'Column<%s>' % self._jc.toString().encode('utf8')