ERROR 1066: Unable to open iterator for alias in Pig, Generic solution

A very common error message in Apache Pig is:

ERROR 1066: Unable to open iterator for alias

There are several questions where this error is mentioned, but none of them give a generic approach for dealing with it. Hence this question:

What should you do when you get an ERROR 1066: Unable to open iterator for alias?


Solution 1:

The message "ERROR 1066: Unable to open iterator for alias myAlias" suggests that there is something going wrong in the line where you use myAlias.

However, you will usually see this error when something went wrong BEFORE the point where you try to use this alias. So the first thing to do is look further up in the error output and check whether this is truly the first error that was thrown.

Here is what I found to be an efficient way to deal with this error when I did not easily spot an earlier error:

  1. Run the code until just before the line where you first define the alias.
  2. Check the output carefully for any mention of ERROR (often it is in the last lines, but sometimes it appears earlier).
  3. By now you probably have an error. If so, deal with it and go to 1.
  4. If no error appears before the alias is defined, evaluate the line where the alias occurs.
  5. If an error occurs, deal with it and go to 4. If no error occurs, run the code until just before you use the alias for the second time, and go to 3.
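
As a rough illustration of the steps above, a session in the Pig shell might look like the sketch below. The file name input.csv, the schema, and the aliases raw, raw_head and myAlias are hypothetical and only stand in for your own script.

```pig
-- Hypothetical input file and schema, used only for illustration.
raw = LOAD 'input.csv' USING PigStorage(',') AS (name:chararray, value:int);

-- Steps 1-2: check what is defined before the alias from the error message.
DESCRIBE raw;              -- prints the schema without running a job
raw_head = LIMIT raw 10;   -- keep the output small
DUMP raw_head;             -- runs a job; read the output for ERROR

-- Step 4: evaluate the line that defines the alias, then inspect it.
myAlias = FOREACH raw GENERATE name, value * 2 AS doubled;
DESCRIBE myAlias;
DUMP myAlias;
```

If the first DUMP already fails, the problem is in the LOAD statement or the input data rather than in the line that mentions myAlias.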

Notes:

  • To easily run Pig code line by line, open Pig on the command line (simply type pig or pig -useHCatalog, for example).
  • If you get confused, make sure you only define the alias once. (I believe this is good practice in general.)

Solution 2:

I once received this error when using the SUM function. I was summing values that had nulls among them. After filtering out the null values in the prior lines, it worked properly.
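
A minimal sketch of that fix, assuming the nulls were indeed the cause, could look like this. The file name input.csv, the schema, and all alias names are hypothetical.

```pig
-- Hypothetical input file and schema, used only for illustration.
data = LOAD 'input.csv' USING PigStorage(',') AS (name:chararray, value:int);

-- Drop rows whose value is null before aggregating.
clean = FILTER data BY value IS NOT NULL;

-- Sum the remaining values per name.
grouped = GROUP clean BY name;
totals = FOREACH grouped GENERATE group AS name, SUM(clean.value) AS total;
DUMP totals;
```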