out of Memory Error in Hadoop

I tried installing Hadoop following this http://hadoop.apache.org/common/docs/stable/single_node_setup.html document. When I tried executing this

bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+' 

I am getting the following Exception

java.lang.OutOfMemoryError: Java heap space

Please suggest a solution so that i can try out the example. The entire Exception is listed below. I am new to Hadoop I might have done something dumb . Any suggestion will be highly appreciated.

anuj@anuj-VPCEA13EN:~/hadoop$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
11/12/11 17:38:22 INFO util.NativeCodeLoader: Loaded the native-hadoop library
11/12/11 17:38:22 INFO mapred.FileInputFormat: Total input paths to process : 7
11/12/11 17:38:22 INFO mapred.JobClient: Running job: job_local_0001
11/12/11 17:38:22 INFO util.ProcessTree: setsid exited with exit code 0
11/12/11 17:38:22 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@e49dcd
11/12/11 17:38:22 INFO mapred.MapTask: numReduceTasks: 1
11/12/11 17:38:22 INFO mapred.MapTask: io.sort.mb = 100
11/12/11 17:38:22 WARN mapred.LocalJobRunner: job_local_0001
java.lang.OutOfMemoryError: Java heap space
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:949)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
11/12/11 17:38:23 INFO mapred.JobClient:  map 0% reduce 0%
11/12/11 17:38:23 INFO mapred.JobClient: Job complete: job_local_0001
11/12/11 17:38:23 INFO mapred.JobClient: Counters: 0
11/12/11 17:38:23 INFO mapred.JobClient: Job Failed: NA
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1257)
    at org.apache.hadoop.examples.Grep.run(Grep.java:69)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.examples.Grep.main(Grep.java:93)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

For anyone using RPM or DEB packages, the documentation and common advice is misleading. These packages install hadoop configuration files into /etc/hadoop. These will take priority over other settings.

The /etc/hadoop/hadoop-env.sh sets the maximum java heap memory for Hadoop, by Default it is:

   export HADOOP_CLIENT_OPTS="-Xmx128m $HADOOP_CLIENT_OPTS"

This Xmx setting is too low, simply change it to this and rerun

   export HADOOP_CLIENT_OPTS="-Xmx2048m $HADOOP_CLIENT_OPTS"

You can assign more memory by editing the conf/mapred-site.xml file and adding the property:

  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx1024m</value>
  </property>

This will start the hadoop JVMs with more heap space.


Another possibility is editing hadoop-env.sh, which contains export HADOOP_CLIENT_OPTS="-Xmx128m $HADOOP_CLIENT_OPTS". Changing 128m to 1024m helped in my case (Hadoop 1.0.0.1 on Debian).


After trying so many combinations, finally I concluded the same error on my environment (Ubuntu 12.04, Hadoop 1.0.4) is due to two issues.

  1. Same as Zach Gamer mentioned above.
  2. don't forget to execute "ssh localhost" first. Believe or not! No ssh would throw an error message on Java heap space as well.