Hadoop truncated/inconsistent counter name

There's nothing in the Hadoop code that truncates counter names after they have been initialized. So, as you've already pointed out, mapreduce.job.counters.counter.name.max controls the maximum counter name length (with a default value of 64 characters).

This limit is applied during calls to AbstractCounterGroup.addCounter/findCounter. The relevant source code is the following:

@Override
public synchronized T addCounter(String counterName, String displayName,
                                 long value) {
  String saveName = Limits.filterCounterName(counterName);
  ...

and the filtering itself looks like this:

public static String filterName(String name, int maxLen) {
  return name.length() > maxLen ? name.substring(0, maxLen - 1) : name;
}

public static String filterCounterName(String name) {
  return filterName(name, getCounterNameMax());
}
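
For illustration, here is a minimal standalone sketch (plain Java, not Hadoop code) that reproduces the same truncation logic. Note that substring(0, maxLen - 1) keeps only maxLen - 1 characters, so with the default limit of 64 the stored name ends up 63 characters long:

public class CounterNameTruncationDemo {

  // Same logic as the filterName method shown above, copied here for a standalone demo
  static String filterName(String name, int maxLen) {
    return name.length() > maxLen ? name.substring(0, maxLen - 1) : name;
  }

  public static void main(String[] args) {
    // 64 is the default value of mapreduce.job.counters.counter.name.max
    int maxLen = 64;
    String longName =
        "MY_VERY_DESCRIPTIVE_COUNTER_NAME_THAT_EXCEEDS_THE_DEFAULT_SIXTY_FOUR_CHARACTER_LIMIT";
    String saved = filterName(longName, maxLen);
    System.out.println(saved.length() + ": " + saved); // prints 63 and the truncated name
  }
}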

As you can see, the counter name is stored truncated according to mapreduce.job.counters.counter.name.max. In turn, Limits.init(Configuration conf) is called from only a handful of places in the Hadoop code; during normal task startup it happens in the YarnChild class (LocalContainerLauncher does the equivalent for uber-mode jobs):

class YarnChild {

  private static final Logger LOG = LoggerFactory.getLogger(YarnChild.class);

  static volatile TaskAttemptID taskid = null;

  public static void main(String[] args) throws Throwable {
    Thread.setDefaultUncaughtExceptionHandler(new YarnUncaughtExceptionHandler());
    LOG.debug("Child starting");

    final JobConf job = new JobConf(MRJobConfig.JOB_CONF_FILE);
    // Initing with our JobConf allows us to avoid loading confs twice
    Limits.init(job);
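
This is what ties the YarnChild snippet to the filterCounterName code above: once Limits.init has been called with the job's configuration, getCounterNameMax() returns the value of mapreduce.job.counters.counter.name.max from that configuration (or the default 64). Here is a minimal sketch to check the effective limit, assuming Hadoop's mapreduce client jars are on the classpath and that the class lives at org.apache.hadoop.mapreduce.counters.Limits:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.counters.Limits;

public class CounterLimitCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Example override; without it, getCounterNameMax() should report 64
    conf.setInt("mapreduce.job.counters.counter.name.max", 128);
    Limits.init(conf);
    System.out.println("Effective counter name limit: " + Limits.getCounterNameMax());
  }
}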

I believe you need to perform the following steps to fix the counter name issue you observe:

  1. Adjust the mapreduce.job.counters.counter.name.max config value (see the sketch below)
  2. Restart YARN/MapReduce service
  3. Re-run your job

I think you will still see truncated counter names for old jobs.
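
For step 1, the property would normally go into mapred-site.xml on the cluster. As a quick way to try the effect on a single job, you can also set it from the driver; this is only a sketch based on the YarnChild snippet above, which initializes Limits from the job's own configuration, and I haven't verified that every consumer of counter names (e.g. the job history server) honours a per-job override:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class LongCounterNamesJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Raise the counter name limit from the default 64 characters.
    // The value 160 is just an example; pick whatever fits your names.
    conf.setInt("mapreduce.job.counters.counter.name.max", 160);

    Job job = Job.getInstance(conf, "long-counter-names");
    // ... set mapper/reducer/input/output as usual and submit
  }
}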


getName() seems to be deprecated

Alternatively, getUri(), which comes with a default maximum length of 255, can be used instead.

Documentation link: getUri()

I have not tried it personally, but it seems to be a possible fix for this problem.