Hadoop 3.1.1 High-Availability (HA) Installation and Deployment – YARN Configuration, Pitfalls, and Fixes
  • Category: Big Data
  • Published: 2019-05-29

Preface

In the previous three chapters, we gave the Hadoop cluster high-availability (HA) capability. Next, we will enable the YARN component to prepare for MapReduce computation.
YARN (Yet Another Resource Negotiator) is Hadoop's resource manager: a general-purpose resource management system that provides unified resource management and scheduling for the applications running on top of it. Its introduction brings major benefits to the cluster in terms of utilization, unified resource management, and data sharing.
The server node roles are assigned as follows:

Node      Existing roles              New roles
node01    NN-1, ZKFC, JNN             –
node02    NN-2, DN, ZK, ZKFC, JNN     NM
node03    DN, ZK, JNN                 RM, NM
node04    DN, ZK                      RM, NM

RM: ResourceManager
NM: NodeManager
The detailed configuration and steps are as follows:

1. Configure etc/hadoop/mapred-site.xml

vi etc/hadoop/mapred-site.xml

The configuration is as follows:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

This tells MapReduce to use YARN as its scheduling framework.

2. Configure yarn-site.xml

vi etc/hadoop/yarn-site.xml

The configuration is as follows:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>node03</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>node04</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>node02:2181,node03:2181,node04:2181</value>
    </property>
</configuration>

Note: the yarn.resourcemanager.cluster-id value here (the YARN cluster name) must not be the same as the HDFS nameservice (dfs.nameservices) configured earlier.
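To double-check, you can print the HDFS nameservice configured earlier and confirm it differs from cluster1 (a quick check with hdfs getconf, assuming the HDFS HA setup from the previous chapters):

# The output must NOT equal the yarn.resourcemanager.cluster-id above
hdfs getconf -confKey dfs.nameservices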

3. Configure yarn-env.sh

vi etc/hadoop/yarn-env.sh

Add the following lines:

export YARN_RESOURCEMANAGER_USER=root
export HADOOP_SECURE_DN_USER=yarn
export YARN_NODEMANAGER_USER=root

4. Distribute the configuration files to node02, node03, and node04

cd etc/hadoop
scp mapred-site.xml yarn-site.xml yarn-env.sh root@node02:`pwd`
scp mapred-site.xml yarn-site.xml yarn-env.sh root@node03:`pwd`
scp mapred-site.xml yarn-site.xml yarn-env.sh root@node04:`pwd`
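If you prefer, a loop does the same thing (a small convenience, assuming passwordless SSH as root is already in place from the earlier chapters):

# Copy the three changed files to every other node in one go
for h in node02 node03 node04; do
  scp mapred-site.xml yarn-site.xml yarn-env.sh root@$h:`pwd`
done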

5. Start YARN

start-yarn.sh

Alternatively, restart the whole Hadoop cluster:

start-all.sh

Startup succeeded.
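A quick sanity check before running any jobs (rm1 and rm2 are the IDs configured in yarn-site.xml above):

# Each node should show its expected daemons (ResourceManager on
# node03/node04, NodeManager on node02/node03/node04)
jps

# Ask each ResourceManager for its HA state; one should report
# "active" and the other "standby"
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2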

6. Pitfalls

At this point the cluster has started successfully, and YARN is running as well.
However, running the bundled wordcount example throws all sorts of baffling errors. Let's defuse the errors hit along the way one by one, starting from how the example is launched.
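For reference, the example was run roughly like this (a sketch: the examples jar ships with Hadoop 3.1.1, while /input, /output, and words.txt are placeholder names):

# Upload some text to HDFS and run the bundled wordcount example
hdfs dfs -mkdir -p /input
hdfs dfs -put words.txt /input
hadoop jar /opt/hadoop/hadoop-3.1.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar wordcount /input /output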
Error 1: class could not be loaded
A stack trace similar to the following appears:

Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Please check whether your etc/hadoop/mapred-site.xml contains the below configuration:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
</property>

For more detailed output, check the application tracking page: http://node04:8088/cluster/app/application_1559123047303_0002 Then click on links to logs of each attempt.. Failing the application.

The root cause is that the MapReduce classpath is not set. (As the error message above suggests, setting HADOOP_MAPRED_HOME via the yarn.app.mapreduce.am.env, mapreduce.map.env, and mapreduce.reduce.env properties is an alternative fix.)
The exact classpath value can be obtained with the following command:

hadoop classpath

Then add the result to mapred-site.xml (the paths below reflect this cluster's install directory, /opt/hadoop/hadoop-3.1.1; adjust them to your own):

<property>
    <name>mapreduce.application.classpath</name>
    <value>/opt/hadoop/hadoop-3.1.1/etc/hadoop:/opt/hadoop/hadoop-3.1.1/share/hadoop/common/lib/*:/opt/hadoop/hadoop-3.1.1/share/hadoop/common/*:/opt/hadoop/hadoop-3.1.1/share/hadoop/hdfs:/opt/hadoop/hadoop-3.1.1/share/hadoop/hdfs/lib/*:/opt/hadoop/hadoop-3.1.1/share/hadoop/hdfs/*:/opt/hadoop/hadoop-3.1.1/share/hadoop/mapreduce/lib/*:/opt/hadoop/hadoop-3.1.1/share/hadoop/mapreduce/*:/opt/hadoop/hadoop-3.1.1/share/hadoop/yarn:/opt/hadoop/hadoop-3.1.1/share/hadoop/yarn/lib/*:/opt/hadoop/hadoop-3.1.1/share/hadoop/yarn/*</value>
</property>

Error 2: NullPointerException

ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.NullPointerException
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:178)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:122)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:979)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1293)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1761)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1757)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1691)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.mapreduce.v2.app.client.MRClientService.getHttpPort(MRClientService.java:177)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:159)
    ... 14 more

The root cause is that the yarn.resourcemanager.webapp.address.rm1 and yarn.resourcemanager.webapp.address.rm2 server addresses are not configured. The port already defaults to 8088, yet when wordcount runs the ApplicationMaster has to read these addresses back and hits an NPE when they are absent, which is rather baffling; it seems even the upstream developers drop the ball sometimes.
Add the following to yarn-site.xml:

<property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>node03:8088</value>
</property>
<property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>node04:8088</value>
</property>
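After restarting YARN, both web endpoints should answer (a quick check; 8088 is the default port used above, and the standby RM normally redirects to the active one):

# Query the ResourceManager REST API on both hosts
curl -s http://node03:8088/ws/v1/cluster/info
curl -s http://node04:8088/ws/v1/cluster/info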

Summary

Pay special attention to the yarn.resourcemanager.cluster-id setting: it must differ from the name used in the earlier HDFS configuration. I got this wrong the first time around, which is why startup kept failing.
There are other pitfalls too, and in practice they often take even more time to resolve. The problems above are the ones I ran into; I hope this helps.
With that, the YARN component has been added to the Hadoop 3.1.1 cluster.



When reprinting articles from this site, please credit the source: 呦呦工作室
