Thursday, October 23, 2014

Hadoop_Troubleshooting: fair-scheduler.xml Does Not Take Effect After Revising

When using the Fair Scheduler in YARN, we don't need to restart the Hadoop cluster after fair-scheduler.xml is altered, as stated in the official documentation:

The Fair Scheduler contains configuration in two places -- algorithm parameters are set in HADOOP_CONF_DIR/mapred-site.xml, while a separate XML file called the allocation file, located by default in HADOOP_CONF_DIR/fair-scheduler.xml, is used to configure pools, minimum shares, running job limits and preemption timeouts. The allocation file is reloaded periodically at runtime, allowing you to change pool settings without restarting your Hadoop cluster.

However, there are times when changes to fair-scheduler.xml don't take effect and we have no idea what went wrong. Here's how to find out.

First, go to the directory '$HADOOP_HOME/logs'.
cd $HADOOP_HOME/logs

Then open the file 'yarn-hadoop-resourcemanager-*.log'.
vim yarn-hadoop-resourcemanager-[it_depends].log

In that file, search for 'ERROR' starting from the bottom, looking for records like the following:
2014-10-24 10:39:39,037 ERROR org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager: Failed to reload fair scheduler config file - will use existing allocations.
java.util.IllegalFormatConversionException: d != org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl
        at java.util.Formatter$FormatSpecifier.failConversion(Formatter.java:4045)
        at java.util.Formatter$FormatSpecifier.printInteger(Formatter.java:2748)
        at java.util.Formatter$FormatSpecifier.print(Formatter.java:2702)
        at java.util.Formatter.format(Formatter.java:2488)
        at java.util.Formatter.format(Formatter.java:2423)
        at java.lang.String.format(String.java:2797)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.loadQueue(QueueManager.java:460)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.reloadAllocs(QueueManager.java:312)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.reloadAllocsIfNecessary(QueueManager.java:243)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.update(FairScheduler.java:270)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$UpdateThread.run(FairScheduler.java:255)
        at java.lang.Thread.run(Thread.java:722)
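
The manual search from the bottom of the log can also be scripted. The following is only a minimal sketch: the class name LastErrorFinder and its method are hypothetical, and it assumes the log is UTF-8 text small enough to read into memory and that error records contain the token " ERROR " as in the record above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Hadoop: finds the most recent ERROR record in a log.
public class LastErrorFinder {

    // Walks the lines from the bottom up and returns the first one containing " ERROR ".
    public static Optional<String> lastError(List<String> lines) {
        for (int i = lines.size() - 1; i >= 0; i--) {
            if (lines.get(i).contains(" ERROR ")) {
                return Optional.of(lines.get(i));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LastErrorFinder <path-to-resourcemanager-log>");
            return;
        }
        // Assumption: the whole log fits in memory and is valid UTF-8.
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        System.out.println(lastError(lines).orElse("No ERROR records found"));
    }
}
```

Pass the path of the ResourceManager log as the only argument. The stack trace lines that follow the printed record still need to be read in the log itself.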

The stack trace shows that the exception is thrown at line 460 of class 'org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager', inside a call to String.format.

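Tracing this back to the source code, we find the statement that throws the exception:
LOG.warn(String.format("Queue %s has max resources %d less than min resources %d", queueName, maxQueueResources.get(queueName), minQueueResources.get(queueName)));

The %d specifiers expect integers, but the exception message (d != ...ResourcePBImpl) shows that the arguments passed are Resource objects, so String.format fails and the reload is aborted. Judging by its text, this warning is only built when a queue's max resources are less than its min resources, so the place to look is the queue definitions in fair-scheduler.xml: check that no queue has a maxResources value smaller than its minResources.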
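
The failure can be reproduced outside Hadoop. The sketch below is an illustration only: FakeResource is a hypothetical stand-in for YARN's Resource class, and the class and method names are made up for this example. It shows that %d rejects a non-numeric object while %s accepts it.

```java
import java.util.IllegalFormatConversionException;

// Illustration only; none of these names come from Hadoop.
public class FormatConversionDemo {

    // Hypothetical stand-in for YARN's Resource: an object, not an integer.
    static final class FakeResource {
        private final int memoryMb;
        private final int vcores;

        FakeResource(int memoryMb, int vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }

        @Override
        public String toString() {
            return "<memory:" + memoryMb + ", vCores:" + vcores + ">";
        }
    }

    // Same format string as the statement in QueueManager: %d with object arguments.
    public static String formatWithD(String queue, Object max, Object min) {
        return String.format(
            "Queue %s has max resources %d less than min resources %d", queue, max, min);
    }

    // Variant using %s, which formats any object via its toString().
    public static String formatWithS(String queue, Object max, Object min) {
        return String.format(
            "Queue %s has max resources %s less than min resources %s", queue, max, min);
    }

    public static void main(String[] args) {
        FakeResource max = new FakeResource(1024, 1);
        FakeResource min = new FakeResource(2048, 2);
        try {
            formatWithD("root.demo", max, min);
        } catch (IllegalFormatConversionException e) {
            // %d only accepts integral types, so a FakeResource argument is rejected.
            System.out.println("Caught: " + e);
        }
        System.out.println(formatWithS("root.demo", max, min));
    }
}
```

The demo only mirrors the format string; it says nothing about how any particular Hadoop release handles this warning.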


© 2014-2017 jason4zhu.blogspot.com All Rights Reserved 
If reposting, please credit the source: Jason4Zhu
