Hadoop FileAlreadyExistsException: Output directory hdfs://<namenode public dns>:9000/input already exists
Solution 1:
When you run the following command:
hadoop jar Tasks.jar ProgrammingAssigment/Tasks /input /output
The args array will contain the following elements:
args[0] ProgrammingAssigment/Tasks
args[1] /input
args[2] /output
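If your code reads the input directory from args[0] and the output directory from args[1], the job ends up treating /input as its output directory, which is why Hadoop complains that it already exists.
Try omitting the ProgrammingAssigment/Tasks parameter; my guess is that it is not needed. If it is needed for some reason, then use args[1] and args[2] in your code for the input and output directories, respectively.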
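As an alternative to hard-coding the indices, the driver could take the two directories from the end of args, so it works whether or not the extra first argument is present. This is only a sketch in plain Java with no Hadoop dependency: the class name JobArgs and the helper resolvePaths are made up for illustration, and it assumes the input and output paths are always the last two arguments.

```java
final class JobArgs {
    /**
     * Returns {inputDir, outputDir} taken from the last two command-line
     * arguments, so any leading extra token (such as a class name) is ignored.
     * Hypothetical helper, not part of Hadoop.
     */
    static String[] resolvePaths(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException(
                "Usage: hadoop jar Tasks.jar [mainClass] <input> <output>");
        }
        String input = args[args.length - 2];
        String output = args[args.length - 1];
        return new String[] { input, output };
    }

    public static void main(String[] args) {
        String[] paths = resolvePaths(args);
        // In an actual driver these values would be handed to
        // FileInputFormat.addInputPath(job, new Path(paths[0])) and
        // FileOutputFormat.setOutputPath(job, new Path(paths[1])).
        System.out.println("input  = " + paths[0]);
        System.out.println("output = " + paths[1]);
    }
}
```

With this approach, both `hadoop jar Tasks.jar ProgrammingAssigment/Tasks /input /output` and `hadoop jar Tasks.jar /input /output` would resolve to the same pair of directories.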
As for the timeout you get, I am not sure what causes it. You could try increasing the maxRetries or sleepTime values mentioned in the message.