| Hostname | IP Address | Roles | Install Directory (all nodes) | Install User (all nodes) |
| --- | --- | --- | --- | --- |
| sht-sgmhadoopnn-01 | 172.16.101.55 | namenode, resourcemanager | /usr/local/hadoop (symlink to /usr/local/hadoop-2.7.4), /usr/local/zookeeper (symlink to /usr/local/zookeeper-3.4.9) | root |
| sht-sgmhadoopnn-02 | 172.16.101.56 | namenode, resourcemanager | | |
| sht-sgmhadoopdn-01 | 172.16.101.58 | datanode, nodemanager, journalnode, zookeeper | | |
| sht-sgmhadoopdn-02 | 172.16.101.59 | datanode, nodemanager, journalnode, zookeeper | | |
| sht-sgmhadoopdn-03 | 172.16.101.60 | datanode, nodemanager, journalnode, zookeeper | | |
| sht-sgmhadoopdn-04 | 172.16.101.66 | datanode, nodemanager | | |
Node sht-sgmhadoopdn-04 has failed and must be removed from the current cluster.
1. Modify the configuration files on the active node sht-sgmhadoopnn-01

slaves (drop sht-sgmhadoopdn-04; one hostname per line):

```
sht-sgmhadoopdn-01
sht-sgmhadoopdn-02
sht-sgmhadoopdn-03
```
hdfs-site.xml (with only three datanodes left, dfs.replication must not exceed 3):

```xml
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
<property>
    <name>dfs.hosts</name>
    <value>/usr/local/hadoop/etc/hadoop/hdfs_includes</value>
</property>
<property>
    <name>dfs.hosts.exclude</name>
    <value>/usr/local/hadoop/etc/hadoop/hdfs_excludes</value>
</property>
```
yarn-site.xml (the ResourceManager reads its host lists from yarn.resourcemanager.nodes.include-path / yarn.resourcemanager.nodes.exclude-path; the dfs.hosts* names are HDFS properties and are ignored by YARN):

```xml
<property>
    <name>yarn.resourcemanager.nodes.include-path</name>
    <value>/usr/local/hadoop/etc/hadoop/yarn_includes</value>
</property>
<property>
    <name>yarn.resourcemanager.nodes.exclude-path</name>
    <value>/usr/local/hadoop/etc/hadoop/yarn_excludes</value>
</property>
```
If these host-list settings are already present, go straight to step 2; otherwise add them to the corresponding files and restart the HDFS and YARN clusters so that they take effect.
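If a restart is needed, the stock sbin scripts that ship with Hadoop 2.x cover it. A minimal sketch, where `restart_cmd` is a hypothetical helper that only prints the command so it can be reviewed before running:

```shell
# Assumes the symlinked install directory from the table above.
HADOOP_HOME=/usr/local/hadoop

# Hypothetical helper: print the restart command for one service
# ("dfs" or "yarn") instead of running it blindly.
restart_cmd() {
    printf '%s/sbin/stop-%s.sh && %s/sbin/start-%s.sh\n' \
        "$HADOOP_HOME" "$1" "$HADOOP_HOME" "$1"
}

restart_cmd yarn    # review the printed command, then run it on the active node
restart_cmd dfs
```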
2. Create the whitelist (include) and blacklist (exclude) files

Whitelist (note `>>` for the second and later hosts; repeating `>` would overwrite the file each time and leave only the last host). The same three hosts also go into yarn_includes, as the refresh log in step 3 shows:

```shell
# echo sht-sgmhadoopdn-01 >  /usr/local/hadoop/etc/hadoop/hdfs_includes
# echo sht-sgmhadoopdn-02 >> /usr/local/hadoop/etc/hadoop/hdfs_includes
# echo sht-sgmhadoopdn-03 >> /usr/local/hadoop/etc/hadoop/hdfs_includes
# echo sht-sgmhadoopdn-01 >  /usr/local/hadoop/etc/hadoop/yarn_includes
# echo sht-sgmhadoopdn-02 >> /usr/local/hadoop/etc/hadoop/yarn_includes
# echo sht-sgmhadoopdn-03 >> /usr/local/hadoop/etc/hadoop/yarn_includes
```

Blacklist:

```shell
# echo sht-sgmhadoopdn-04 > /usr/local/hadoop/etc/hadoop/hdfs_excludes
# echo sht-sgmhadoopdn-04 > /usr/local/hadoop/etc/hadoop/yarn_excludes
```
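Each list can also be written in a single command, which avoids the overwrite-vs-append pitfall entirely. A sketch, where `write_hosts` is a hypothetical helper, not a Hadoop tool:

```shell
# Hypothetical helper: write one hostname per line into a hosts file.
write_hosts() {
    file="$1"; shift
    printf '%s\n' "$@" > "$file"
}

CONF_DIR=/usr/local/hadoop/etc/hadoop   # per the install layout above
# write_hosts "$CONF_DIR/hdfs_includes" sht-sgmhadoopdn-01 sht-sgmhadoopdn-02 sht-sgmhadoopdn-03
# write_hosts "$CONF_DIR/yarn_includes" sht-sgmhadoopdn-01 sht-sgmhadoopdn-02 sht-sgmhadoopdn-03
# write_hosts "$CONF_DIR/hdfs_excludes" sht-sgmhadoopdn-04
# write_hosts "$CONF_DIR/yarn_excludes" sht-sgmhadoopdn-04
```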
Then update the files above on all nodes in the cluster.
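Distributing the files can be scripted along these lines. A sketch under stated assumptions: `push_conf` is a hypothetical helper, the node list is taken from the cluster table above, and root SSH access is assumed; it only prints the scp commands so they can be reviewed first:

```shell
CONF_DIR=/usr/local/hadoop/etc/hadoop
NODES="sht-sgmhadoopnn-02 sht-sgmhadoopdn-01 sht-sgmhadoopdn-02 sht-sgmhadoopdn-03 sht-sgmhadoopdn-04"

# Hypothetical helper: print the scp commands that copy the updated
# configuration files to one node.
push_conf() {
    for f in slaves hdfs-site.xml yarn-site.xml \
             hdfs_includes hdfs_excludes yarn_includes yarn_excludes; do
        echo "scp $CONF_DIR/$f root@$1:$CONF_DIR/"
    done
}

for n in $NODES; do push_conf "$n"; done   # review the output, then pipe to sh
```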
3. Refresh the node lists from the active node sht-sgmhadoopnn-01

This command tells the ResourceManager to re-read yarn-site.xml, decommission the hosts listed in yarn_excludes from the YARN cluster, and stop assigning jobs to them:

```shell
# yarn rmadmin -refreshNodes
```
Log output:

```
2019-03-30 00:40:48,751 INFO org.apache.hadoop.conf.Configuration: found resource yarn-site.xml at file:/usr/local/hadoop-2.7.4/etc/hadoop/yarn-site.xml
2019-03-30 00:40:48,755 INFO org.apache.hadoop.util.HostsFileReader: Setting the includes file to /usr/local/hadoop/etc/hadoop/yarn_includes
2019-03-30 00:40:48,755 INFO org.apache.hadoop.util.HostsFileReader: Setting the excludes file to /usr/local/hadoop/etc/hadoop/yarn_excludes
2019-03-30 00:40:48,755 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
2019-03-30 00:40:48,755 INFO org.apache.hadoop.util.HostsFileReader: Adding sht-sgmhadoopdn-01 to the list of included hosts from /usr/local/hadoop/etc/hadoop/yarn_includes
2019-03-30 00:40:48,755 INFO org.apache.hadoop.util.HostsFileReader: Adding sht-sgmhadoopdn-02 to the list of included hosts from /usr/local/hadoop/etc/hadoop/yarn_includes
2019-03-30 00:40:48,755 INFO org.apache.hadoop.util.HostsFileReader: Adding sht-sgmhadoopdn-03 to the list of included hosts from /usr/local/hadoop/etc/hadoop/yarn_includes
2019-03-30 00:40:48,756 INFO org.apache.hadoop.util.HostsFileReader: Adding sht-sgmhadoopdn-04 to the list of excluded hosts from /usr/local/hadoop/etc/hadoop/yarn_excludes
2019-03-30 00:40:48,757 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Deactivating Node sht-sgmhadoopdn-04:21570 as it is now DECOMMISSIONED
2019-03-30 00:40:48,758 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: sht-sgmhadoopdn-04:21570 Node Transitioned from RUNNING to DECOMMISSIONED
2019-03-30 00:40:48,756 INFO org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=root IP=172.16.101.55 OPERATION=refreshNodes TARGET=AdminService RESULT=SUCCESS
2019-03-30 00:40:48,772 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Removed node sht-sgmhadoopdn-04:21570 clusterResource: <memory:12288, vCores:24>
2019-03-30 00:40:48,909 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: Disallowed NodeManager nodeId: sht-sgmhadoopdn-04:21570 hostname: sht-sgmhadoopdn-04
```
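After the refresh, the node's state can be double-checked from the command line: `yarn node -list -all` also lists DECOMMISSIONED nodes, which the default listing omits. The small filter below is a hypothetical convenience for scripting the check:

```shell
# Hypothetical helper: succeeds when the given host appears as
# DECOMMISSIONED in the `yarn node -list -all` output piped into it.
is_decommissioned() {
    grep -F "$1" | grep -q DECOMMISSIONED
}

# On the ResourceManager host:
#   yarn node -list -all | is_decommissioned sht-sgmhadoopdn-04 && echo "node is out"
```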