Newbetuts
New posts in apache-spark
Spark: how to get the number of written rows?
apache-spark
PySpark: forward fill with last observation for a DataFrame
apache-spark
pyspark
apache-spark-sql
spark-dataframe
Importing spark.implicits._ in scala
scala
apache-spark
Difference in Used, Committed and Max Heap Memory
java
apache-spark
memory-management
jvm
spark-streaming
Why does format("kafka") fail with "Failed to find data source: kafka." (even with uber-jar)?
apache-spark
apache-spark-sql
spark-structured-streaming
uberjar
Why does a job fail with "No space left on device", but df says otherwise?
apache-spark
Difference between == and === in Scala, Spark
scala
apache-spark
Apache Spark: What is the equivalent implementation of RDD.groupByKey() using RDD.aggregateByKey()?
apache-spark
rdd
pyspark
TaskSchedulerImpl: Initial job has not accepted any resources;
java
apache-spark
cassandra
datastax
Apache Spark Python Cosine Similarity over DataFrames
python
apache-spark
pyspark
apache-spark-sql
cosine-similarity
Replace null values in Spark DataFrame
scala
apache-spark
dataframe
PySpark groupByKey returning pyspark.resultiterable.ResultIterable
python
apache-spark
pyspark
How do I install pyspark for use in standalone scripts?
python
apache-spark
Couldn't run pyspark on windows cmd and conda cmd
python
apache-spark
pyspark
conda
Why does Spark think this is a cross / Cartesian join?
apache-spark
dataframe
pyspark
apache-spark-sql
How to run multiple jobs in one SparkContext from separate threads in PySpark?
python
multithreading
apache-spark
pyspark
Apache Spark, add a "CASE WHEN ... ELSE ..." calculated column to an existing DataFrame
scala
apache-spark
dataframe
apache-spark-sql
How to create a Spark DataFrame from a Map(String, Any) in Scala?
scala
apache-spark
Convert null values to empty array in Spark DataFrame
apache-spark
dataframe
apache-spark-sql
apache-spark-1.5
Apache Spark: dealing with case statements
apache-spark
pyspark
spark-dataframe
rdd
pyspark-sql