How do I check for equality using Spark Dataframe without SQL Query?

I had the same issue, and the following syntax worked for me:

df.filter(df("state")==="TX").show()

I'm using Spark 1.6.


There is another simple sql like option. With Spark 1.6 below also should work.

df.filter("state = 'TX'")

This is a new way of specifying sql like filters. For a full list of supported operators, check out this class.


You should be using where, select is a projection that returns the output of the statement, thus why you get boolean values. where is a filter that keeps the structure of the dataframe, but only keeps data where the filter works.

Along the same line though, per the documentation, you can write this in 3 different ways

// The following are equivalent:
peopleDf.filter($"age" > 15)
peopleDf.where($"age" > 15)
peopleDf($"age" > 15)