New posts in apache-spark-sql

Passing a data frame column and external list to udf under withColumn

python apache-spark pyspark apache-spark-sql user-defined-functions

Read multiline JSON in Apache Spark

json apache-spark apache-spark-sql

Spark / Scala: forward fill with last observation

scala apache-spark apache-spark-sql

Renaming columns for PySpark DataFrame aggregates

dataframe apache-spark pyspark apache-spark-sql

Join two data frames, select all columns from one and some columns from the other

dataframe apache-spark pyspark apache-spark-sql

Create column from array of struct in PySpark

python apache-spark pyspark apache-spark-sql

Updating JSON column using window cumulative via PySpark

python sql apache-spark pyspark apache-spark-sql

MatchError while accessing vector column in Spark 2.0

scala apache-spark apache-spark-sql apache-spark-mllib apache-spark-ml

Concatenate two PySpark dataframes

python apache-spark pyspark apache-spark-sql

How to zip two (or more) DataFrames in Spark

scala apache-spark dataframe apache-spark-sql

Defining a UDF that accepts an Array of objects in a Spark DataFrame?

scala apache-spark dataframe apache-spark-sql user-defined-functions

What is the difference between spark.sql.shuffle.partitions and spark.default.parallelism?

performance apache-spark hadoop apache-spark-sql

Filter Spark DataFrame based on another DataFrame that specifies denylist criteria

dataframe apache-spark pyspark apache-spark-sql

Pivot String column on PySpark DataFrame

python apache-spark dataframe pyspark apache-spark-sql

Including null values in an Apache Spark Join

sql scala apache-spark join apache-spark-sql

How to save DataFrame directly to Hive?

scala apache-spark hive apache-spark-sql

Renaming column names of a DataFrame in Spark Scala

scala apache-spark dataframe apache-spark-sql

How to interact with each element of an ArrayType column in pyspark?

apache-spark pyspark apache-spark-sql

What are the various join types in Spark?

scala apache-spark apache-spark-sql spark-dataframe apache-spark-2.0

winutils error: Error while running Spark on Windows

scala apache-spark hadoop apache-spark-sql