Newbetuts
New posts in apache-spark-sql
How to filter out a null value from a Spark DataFrame
scala
apache-spark
apache-spark-sql
spark-dataframe
Spark UDAF with ArrayType as bufferSchema performance issues
scala
performance
apache-spark
apache-spark-sql
user-defined-functions
How to iterate over a batch DataFrame in parallel in PySpark
apache-spark
pyspark
apache-spark-sql
How to use Column.isin with list?
scala
apache-spark
apache-spark-sql
How to pivot Spark DataFrame?
dataframe
apache-spark
pyspark
apache-spark-sql
pivot
Better way to convert a string field into timestamp in Spark
scala
apache-spark
apache-spark-sql
How to construct a DataFrame from an Excel (xls, xlsx) file in Scala Spark?
excel
scala
apache-spark
apache-spark-sql
spark-excel
Spark Scala: How to explode a column into multiple rows
scala
apache-spark
apache-spark-sql
GroupBy column and filter rows with maximum value in PySpark
python
apache-spark
pyspark
apache-spark-sql
How to find the count of null and NaN values for each column in a PySpark DataFrame efficiently?
apache-spark
pyspark
apache-spark-sql
Removing duplicates from rows based on specific columns in an RDD/Spark DataFrame
apache-spark
apache-spark-sql
pyspark
How to write unit tests in Spark 2.0+?
scala
unit-testing
apache-spark
junit
apache-spark-sql
Cast column containing multiple string date formats to DateTime in Spark
python
apache-spark
pyspark
apache-spark-sql
Spark DataFrame: Computing row-wise mean (or any aggregate operation)
python
apache-spark
apache-spark-sql
pyspark
PySpark: filter a DataFrame by columns of another DataFrame
python-2.7
apache-spark
dataframe
pyspark
apache-spark-sql
Spark SQL queries vs DataFrame functions
sql
performance
apache-spark
dataframe
apache-spark-sql
Updating a DataFrame column in Spark
python
dataframe
apache-spark
pyspark
apache-spark-sql
Get current number of partitions of a DataFrame
python
scala
dataframe
apache-spark
apache-spark-sql
How to connect to a remote Hive server from Spark [duplicate]
apache-spark
hive
apache-spark-sql
spark-thriftserver
How to define a custom aggregation function to sum a column of Vectors?
scala
apache-spark
apache-spark-sql
aggregate-functions
apache-spark-ml