Hadoop FileAlreadyExistsException: Output directory hdfs://<namenode public dns>:9000/input already exists

Solution 1:

When you run the following command:

hadoop jar Tasks.jar ProgrammingAssigment/Tasks /input /output

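If the jar's manifest already declares a main class, everything after the jar name is passed to main, so the args array will contain the following elements: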

args[0]     ProgrammingAssigment/Tasks
args[1]     /input
args[2]     /output

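Your code most likely uses args[0] as the input directory and args[1] as the output directory, so /input ends up being treated as the output directory, and it already exists. Try omitting the ProgrammingAssigment/Tasks parameter; my guess is that it is not needed. If it is needed for some reason, use args[1] and args[2] in your code for the input and output directories, respectively.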
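If you want the driver to work whether or not the class name is passed on the command line, one option is to take the input and output directories from the last two arguments. This is only a minimal sketch: ArgPaths and resolvePaths are hypothetical names, not part of Hadoop or of your code, and it assumes the two paths are always the final arguments.

```java
public class ArgPaths {
    // Hypothetical helper: returns {input, output}, taken from the LAST two
    // arguments, so an extra leading class-name argument is simply ignored.
    static String[] resolvePaths(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException("Usage: [mainClass] <input> <output>");
        }
        return new String[] { args[args.length - 2], args[args.length - 1] };
    }

    public static void main(String[] args) {
        // Print what the driver would use, to check the mapping before running a job.
        String[] paths = resolvePaths(args);
        System.out.println("input  = " + paths[0]);
        System.out.println("output = " + paths[1]);
    }
}
```

In the real driver you would then pass paths[0] to FileInputFormat.addInputPath and paths[1] to FileOutputFormat.setOutputPath instead of hard-coding args[0] and args[1].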

Regarding the timeout you get, I have no idea what causes it. You could try increasing the maxRetries or sleepTime values that the message mentions.