Why do spark-submit and spark-shell fail with "Failed to find Spark assembly JAR. You need to build Spark before running this program."?

I was trying to run spark-submit and got "Failed to find Spark assembly JAR. You need to build Spark before running this program." When I try to run spark-shell, I get the same error. What do I have to do in this situation?


Solution 1:

On Windows, I found that if Spark is installed in a directory that has a space in the path (e.g. C:\Program Files\Spark), it will fail to launch. Move it to the root or to another directory with no spaces.
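
For example, in a Windows command prompt (the paths below are only illustrative; use your own install location):

move "C:\Program Files\Spark" C:\Spark
setx SPARK_HOME C:\Spark

Note that setx only affects new command prompts, so open a fresh one before retrying spark-shell.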

Solution 2:

Your Spark package doesn't include the compiled Spark code, which is why you get this error from the spark-submit and spark-shell scripts.

Download one of the pre-built versions from the "Choose a package type" section of the Spark download page.
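
On Linux or macOS, a minimal sketch of fetching and unpacking a pre-built package from the command line (the version and Hadoop flavour below are just examples; pick whatever the download page currently offers):

wget https://archive.apache.org/dist/spark/spark-2.4.5/spark-2.4.5-bin-hadoop2.7.tgz
tar -xzf spark-2.4.5-bin-hadoop2.7.tgz
cd spark-2.4.5-bin-hadoop2.7
./bin/spark-shell

The pre-built package ships the compiled jars, so spark-submit and spark-shell no longer complain about a missing assembly.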

Solution 3:

Try running mvn -DskipTests clean package first to build Spark.
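
This assumes you are working from a Spark source checkout. A rough sequence (you can use the bundled build/mvn wrapper instead of a system Maven):

cd spark                          (root of the Spark source tree; path is an example)
./build/mvn -DskipTests clean package
./bin/spark-shell

The build can take a while; -DskipTests just skips the test suite to speed it up.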

Solution 4:

If your Spark binaries are in a folder whose name contains spaces (for example, "Program Files (x86)"), it won't work. I renamed the folder to "Program_Files", and then the spark-shell command worked in cmd.
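
To double-check before retrying, for example in cmd (illustrative only, and assuming you point SPARK_HOME at your Spark folder):

echo %SPARK_HOME%
"%SPARK_HOME%\bin\spark-shell"

If the echoed path still contains spaces, move Spark and update SPARK_HOME as described in Solution 1.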

Solution 5:

In my case, I installed Spark with pip3 install pyspark on macOS, and the error was caused by an incorrect SPARK_HOME variable. It works when I run a command like the one below:

PYSPARK_PYTHON=python3 SPARK_HOME=/usr/local/lib/python3.7/site-packages/pyspark python3 wordcount.py a.txt
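
If you're not sure where pip put PySpark, you can ask Python for the package directory and use that as SPARK_HOME (the path below is an example; yours depends on your Python version):

python3 -c "import pyspark; print(pyspark.__path__[0])"
export SPARK_HOME=/usr/local/lib/python3.7/site-packages/pyspark
export PYSPARK_PYTHON=python3
python3 wordcount.py a.txt

Exporting the variables once per shell session avoids having to prefix every command.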