Spark - Error "A master URL must be set in your configuration" when submitting an app

Solution 1:

The TLDR:

.config("spark.master", "local")

A list of the options for spark.master in Spark 2.2.1

I ended up on this page after trying to run a simple Spark SQL Java program in local mode. To do this, I found that I could set spark.master using:

SparkSession spark = SparkSession
    .builder()
    .appName("Java Spark SQL basic example")
    .config("spark.master", "local")
    .getOrCreate();

An update to my answer:

To be clear, this is not what you should do in a production environment. In a production environment, spark.master should be specified in one of a couple of other places: either in $SPARK_HOME/conf/spark-defaults.conf (this is where Cloudera Manager will put it), or on the command line when you submit the app (e.g. spark-submit --master yarn).

If you specify spark.master to be 'local' in this way, Spark will try to run in a single JVM, as indicated by the comments below. If you then try to specify --deploy-mode cluster, you will get the error 'Cluster deploy mode is not compatible with master "local"'. This is because setting spark.master=local means that you are NOT running in cluster mode.

Instead, for a production app, within your main function (or in functions called by your main function), you should simply use:

SparkSession
    .builder()
    .appName("Java Spark SQL basic example")
    .getOrCreate();

This will use the configurations specified on the command line/in config files.
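For example, a rough sketch of both options (the class and jar names below are placeholders, not from the question):

# in $SPARK_HOME/conf/spark-defaults.conf
spark.master    yarn

# or on the command line at submit time
spark-submit --master yarn --deploy-mode cluster --class com.example.MyApp my-app.jar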

Also, to be clear: --master and "spark.master" are the exact same parameter, just specified in different ways. Setting spark.master in code, as in my answer above, will override attempts to set --master, and will override values in spark-defaults.conf, so don't do it in production. It's great for tests, though.
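For instance, in a test you might pin the master via the builder instead (a minimal sketch; the app name is arbitrary):

SparkSession spark = SparkSession
    .builder()
    .appName("local test")
    .master("local[*]")   // equivalent to setting spark.master; for local tests only
    .getOrCreate();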

Also, see this answer, which links to a list of the options for spark.master in Spark 2.2.1 and what each one actually does.

Solution 2:

This worked for me after replacing

SparkConf sparkConf = new SparkConf().setAppName("SOME APP NAME");

with

SparkConf sparkConf = new SparkConf().setAppName("SOME APP NAME").setMaster("local[2]").set("spark.executor.memory","1g");
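For context, a minimal, self-contained sketch of how such a conf is typically used with the RDD API (the class name and the tiny job are just illustrations, not from the original):

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class SomeApp {
    public static void main(String[] args) {
        // "local[2]" runs Spark in-process with two worker threads,
        // so no external master URL is needed.
        SparkConf sparkConf = new SparkConf()
            .setAppName("SOME APP NAME")
            .setMaster("local[2]")
            .set("spark.executor.memory", "1g");

        JavaSparkContext sc = new JavaSparkContext(sparkConf);
        long count = sc.parallelize(Arrays.asList(1, 2, 3, 4)).count();
        System.out.println("count = " + count);
        sc.stop();
    }
}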

Found this solution on some other thread on Stack Overflow.

Solution 3:

Where is the sparkContext object defined? Is it inside the main function?

I too faced the same problem; the mistake I made was initializing the sparkContext outside the main function, directly inside the class.

When I initialized it inside the main function, it worked fine.
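For illustration, a minimal sketch of the placement that worked (class and app names are placeholders):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class MyApp {
    // Creating the context up here as a class-level field was the mistake;
    // build it inside main instead, after the master has been configured.
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("MyApp")
            .setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // ... job logic goes here ...

        sc.stop();
    }
}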

Solution 4:

If "spark.master" is not set explicitly, Spark expects a master URL such as spark://HOST:PORT, and the following code tries to get a session from the standalone cluster running at HOST:PORT, expecting the HOST:PORT value to be in the Spark config file.

SparkSession spark = SparkSession
    .builder()
    .appName("SomeAppName")
    .getOrCreate();

"org.apache.spark.SparkException: A master URL must be set in your configuration" states that HOST:PORT is not set in the spark configuration file.

To avoid worrying about the value of "HOST:PORT", set spark.master to local:

SparkSession spark = SparkSession
    .builder()
    .appName("SomeAppName")
    .config("spark.master", "local")
    .getOrCreate();

Here is the link for the list of formats in which the master URL can be passed to spark.master.
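For quick reference, commonly used master URL formats include (a non-exhaustive list, paraphrased from the Spark documentation):

local               run Spark locally with one worker thread
local[K]            run Spark locally with K worker threads
local[*]            run Spark locally with as many threads as logical cores
spark://HOST:PORT   connect to a standalone cluster master
yarn                connect to a YARN cluster
mesos://HOST:PORT   connect to a Mesos cluster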

Reference: Spark Tutorial - Setup Spark Ecosystem

Solution 5:

Just add .setMaster("local") to your code, as shown below:

val conf = new SparkConf().setAppName("Second").setMaster("local") 

It worked for me! Happy coding!