New posts in apache-spark-sql
Automatically and Elegantly flatten DataFrame in Spark SQL
scala
apache-spark
apache-spark-sql
How do I call a UDF on a Spark DataFrame using Java?
java
apache-spark
apache-spark-sql
user-defined-functions
Apache Spark -- Assign the result of UDF to multiple dataframe columns
python
apache-spark
pyspark
apache-spark-sql
user-defined-functions
PySpark: withColumn() with two conditions and three outcomes
apache-spark
hive
pyspark
apache-spark-sql
hiveql
Aggregate function count usage with groupBy in Spark
java
scala
apache-spark
pyspark
apache-spark-sql
Apache Spark: Get number of records per partition
scala
apache-spark
hadoop
apache-spark-sql
partitioning
PySpark - rename more than one column using withColumnRenamed
apache-spark
pyspark
apache-spark-sql
rename
Perform a typed join in Scala with Spark Datasets
scala
apache-spark
join
apache-spark-sql
apache-spark-dataset
Create new Dataframe with empty/null field values
scala
apache-spark
dataframe
apache-spark-sql
Pyspark: Filter dataframe based on multiple conditions
sql
filter
pyspark
apache-spark-sql
pyspark-sql
How to compare two DataFrames and print the columns that differ in Scala
scala
apache-spark
apache-spark-sql
compare
Median / quantiles within PySpark groupBy
apache-spark
pyspark
apache-spark-sql
pyspark-sql
PySpark: multiple conditions in when clause
python
apache-spark
dataframe
pyspark
apache-spark-sql
Generate a Spark StructType / Schema from a case class
apache-spark
apache-spark-sql
How to flatten a struct in a Spark dataframe?
java
apache-spark
pyspark
apache-spark-sql
Why does using a UDF in a SQL query lead to a Cartesian product?
sql
apache-spark
apache-spark-sql
Apache Spark how to append new column from list/array to Spark dataframe
scala
apache-spark
dataframe
apache-spark-sql
What are possible reasons for receiving TimeoutException: Futures timed out after [n seconds] when working with Spark
scala
apache-spark
apache-spark-sql
spark-dataframe
PySpark: how to resample frequencies
apache-spark
pyspark
apache-spark-sql
time-series
Take n rows from a spark dataframe and pass to toPandas()
python
apache-spark-sql
spark-dataframe