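How to Speed Up Block Migration in the Hadoop Balancer

June 19th, 2015 by klose | Posted under Internet Applications, Massive Data Storage and Processing.

How can the Hadoop Balancer be made to migrate blocks faster?

1) Increase the bandwidth each DataNode can use for balancing.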

dfs.datanode.balance.bandwidthPerSec
52428800

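This value is the bandwidth cap enforced by BlockBalanceThrottler in DataXceiverServer. The unit is bytes per second, so 52428800 means 50 MB/s. If the bandwidth of the machines' NICs or of the switches is limited, lower the value accordingly. The Hadoop default is 1048576 (1 MB/s).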
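To get a feel for what the throttle means in practice, here is a rough back-of-the-envelope sketch. It assumes a single 128 MB block (134217728 bytes) has the whole throttle to itself and ignores disk and network overhead; the class and method names are made up for illustration and are not part of Hadoop.

```java
// Rough estimate only: real throughput also depends on disks, the network,
// and how many moves share the DataNode's balancing bandwidth.
class BalancerBandwidthEstimate {

    /** Seconds needed to move one block at the given throttle (bytes per second). */
    static double secondsToMove(long blockSizeBytes, long bandwidthBytesPerSec) {
        if (bandwidthBytesPerSec <= 0) {
            throw new IllegalArgumentException("bandwidth must be positive");
        }
        return (double) blockSizeBytes / bandwidthBytesPerSec;
    }

    public static void main(String[] args) {
        long blockSize = 134217728L;       // one 128 MB block
        long defaultBandwidth = 1048576L;  // 1 MB/s, the default quoted above
        long tunedBandwidth = 52428800L;   // 50 MB/s, the value configured above

        System.out.println("default: " + secondsToMove(blockSize, defaultBandwidth) + " s");
        System.out.println("tuned:   " + secondsToMove(blockSize, tunedBandwidth) + " s");
    }
}
```

Under these assumptions one block takes about 128 seconds at the default and under 3 seconds at 50 MB/s, which is why the default throttle is usually the first thing to look at.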

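2) Raise the upper limit on the number of Xceivers a DataNode can use to move blocks.
The number of Xceivers a DataNode can use for balancing at the same time is limited by BlockBalanceThrottler. You can increase the following setting as appropriate.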

dfs.datanode.balance.max.concurrent.moves
50

The default is 5. If you change this setting only in the Balancer's hdfs-site.xml and not in the DataNodes' configuration, the Balancer logs WARN messages like the following:

2015-06-18 15:54:24,253 WARN org.apache.hadoop.hdfs.server.balancer.Dispatcher: Failed to move blk_1366768180_1100055981849 with size=134217728 from 172.22.6.25:1004:DISK to 172.22.5.23:1004:DISK through 172.22.5.99:1004: block move is failed: Not able to receive block 1366768180 from /172.22.6.5:33544 because threads quota is exceeded.

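Looking at DataXceiverServer, the upper limit on the number of Xceivers doing balancer work at the same time is 5. Block migration only gets faster once this parameter is raised on the DataNodes.
The relevant source code: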

/** Check if the block move can start.
 *
 * Return true if the thread quota is not exceeded and
 * the counter is incremented; False otherwise.
 */
synchronized boolean acquire() {
  if (numThreads >= maxThreads) {
    return false;
  }
  numThreads++;
  return true;
}

(Here maxThreads is the value controlled by 'dfs.datanode.balance.max.concurrent.moves'.)
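The effect of a mismatch between the Balancer and the DataNode can be illustrated with a small standalone sketch. MoveQuota below is a hypothetical stand-in for the DataNode-side counter, not the actual Hadoop class; its acquire() follows the snippet above, while release() and the admitted() helper are assumptions added only to make the example runnable.

```java
class MoveQuotaDemo {

    /** Hypothetical stand-in for the DataNode-side thread quota. */
    static class MoveQuota {
        private final int maxThreads;
        private int numThreads = 0;

        MoveQuota(int maxThreads) {
            this.maxThreads = maxThreads;
        }

        /** Same logic as the acquire() shown above. */
        synchronized boolean acquire() {
            if (numThreads >= maxThreads) {
                return false;
            }
            numThreads++;
            return true;
        }

        /** Assumed counterpart: frees one slot when a move finishes. */
        synchronized void release() {
            if (numThreads > 0) {
                numThreads--;
            }
        }
    }

    /** Counts how many of `requested` simultaneous moves get a slot. */
    static int admitted(int maxThreads, int requested) {
        MoveQuota quota = new MoveQuota(maxThreads);
        int ok = 0;
        for (int i = 0; i < requested; i++) {
            if (quota.acquire()) {
                ok++;
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        // Balancer asks for 50 concurrent moves, DataNode still allows 5.
        System.out.println("DataNode limit 5:  " + admitted(5, 50) + " of 50 admitted");
        // Both sides configured with 50.
        System.out.println("DataNode limit 50: " + admitted(50, 50) + " of 50 admitted");
    }
}
```

With the DataNode still at 5, only 5 of the 50 requested moves get a slot. The rest are turned away, which corresponds to the "threads quota is exceeded" warning in the Balancer log.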

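Note in particular that raising this value leaves a large number of IPC handlers on the NameNode busy with the DataNodes' blockReceivedAndDeleted calls. In the thread dump below, one such handler is parked waiting for the FSNamesystem write lock: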

"IPC Server handler 19 on 52310" daemon prio=10 tid=0x00007ffcfceac000 nid=0x8548 waiting on condition [0x00007fe3c80be000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00007fe5d2ff7950> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:945)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeLock(FSNamesystem.java:1495)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processIncrementalBlockReport(FSNamesystem.java:6115)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReceivedAndDeleted(NameNodeRpcServer.java:1106)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolServerSideTranslatorPB.java:209)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26386)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

Locked ownable synchronizers:
- <0x00007fe5d5049e80> (a java.util.concurrent.locks.ReentrantLock$FairSync)

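As a result, the average processing time of RPC requests goes up.
For example, after this value was set to 50, the RPC call queue grew longer, and both the RPC queue time and the processing time increased.
[Chart: CallQueueLength]
[Chart: rpc_queue_time]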

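This slows down ordinary operations. The recommendation is therefore to raise the number of threads each DataNode can devote to the HDFS balancer, which is set through dfs.datanode.balance.max.concurrent.moves, then watch the HDFS load in real time and control the overall speed by adjusting the same setting on the Balancer side.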
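As a loose illustration of "watch the load and adjust", the sketch below picks the next Balancer-side value from an observed NameNode call queue length. The policy (halve when the queue is long, add 5 when it is short), the thresholds, and all names are assumptions invented for this example. Nothing like this ships with Hadoop, and how the queue length is collected is left out.

```java
class BalancerPacing {

    /**
     * Hypothetical policy for choosing the next Balancer-side value of
     * dfs.datanode.balance.max.concurrent.moves.
     *
     * @param current         value currently used by the Balancer
     * @param callQueueLength observed NameNode RPC call queue length
     * @param lowWater        below this the NameNode is considered idle enough
     * @param highWater       above this the NameNode is considered overloaded
     * @param minMoves        never go below this
     * @param maxMoves        never exceed this (e.g. the DataNode-side limit)
     */
    static int nextConcurrentMoves(int current, int callQueueLength,
                                   int lowWater, int highWater,
                                   int minMoves, int maxMoves) {
        int next = current;
        if (callQueueLength > highWater) {
            next = current / 2;   // back off quickly under pressure
        } else if (callQueueLength < lowWater) {
            next = current + 5;   // ramp up slowly when there is headroom
        }
        return Math.max(minMoves, Math.min(maxMoves, next));
    }

    public static void main(String[] args) {
        // Illustrative numbers only.
        System.out.println(nextConcurrentMoves(50, 200, 10, 100, 5, 50));
        System.out.println(nextConcurrentMoves(25, 3, 10, 100, 5, 50));
    }
}
```

Backing off faster than ramping up is a deliberate choice here: a congested NameNode affects every client, so it seems safer to lose some balancing speed than to keep the call queue long.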

From Binospace, post: How to Speed Up Block Migration in the Hadoop Balancer
