各位在搭建完Hadoop后,是不是会偶尔出现进程挂掉的情况呢?这里的进程指的是NameNode/DataNode/ResourceManager/NodeManager等。在排除搭建时有问题外,写一套自动检测重启的脚本,就非常有必要了。
这里介绍一个shell方法,每间隔60s检测一次,发现异常即重启,并将重启记录写入MySQL,便于后续分析
首先要在MySQL的bigdata库建表:
1 2 3 4 5
| CREATE TABLE bigdata.hadoop_process_log ( `node` varchar(255) DEFAULT NULL COMMENT '节点', `typeName` varchar(255) DEFAULT NULL COMMENT '进程类型', `logtime` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间' ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='大数据hadoop进程重启记录表';
|
脚本如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97
| #!/bin/bash
while : do
rebootYarnFlag=0 rebootHdfsFlag=0
Fn_CheckHdp() { ssh -p 52001 $1 "$JAVA_HOME/bin/jps|egrep -w $2" > /dev/null }
Fn_RebootHdp() { case $1 in "yarn") ssh -p 52001 hadoop3-215 "sh $HADOOP_HOME/sbin/stop-yarn.sh; sh $HADOOP_HOME/sbin/start-yarn.sh" > /dev/null rebootYarnFlag=1 ;; "hdfs") ssh -p 52001 hadoop3-215 "sh $HADOOP_HOME/sbin/stop-dfs.sh;sh $HADOOP_HOME/sbin/start-dfs.sh" > /dev/null rebootHdfsFlag=1 ;; *) esac }
Fn_LoadHdpLog() { sql="insert into hadoop_process_log(node,typename) values(\"$1\",\"$2\")" mysql -hhadoop3-217 -uroot -p111111 -P23306 bigdata -e "$sql" }
Fn_CheckHdp hadoop3-215 ResourceManager if [ 0 -ne $? ] then echo "ResourceManager异常,即将重启YARN进程" Fn_RebootHdp yarn Fn_LoadHdpLog hadoop3-215 ResourceManager fi
Fn_CheckHdp hadoop3-215 NameNode if [ 0 -ne $? ] then echo "NameNode异常,即将重启HDFS进程" Fn_RebootHdp hdfs Fn_LoadHdpLog hadoop3-215 NameNode fi
for host in hadoop3-213 hadoop3-214 hadoop3-215 hadoop3-216 hadoop3-217 hadoop3-218 hadoop3-219 do echo ==================== $host ====================
Fn_CheckHdp $host NodeManager if [ 0 -ne $? ] && [ $rebootYarnFlag -eq 0 ] then echo "NodeManager异常,即将重启yarn进程" Fn_RebootHdp yarn Fn_LoadHdpLog $host NodeManager fi Fn_CheckHdp $host DataNode if [ 0 -ne $? ] && [ $rebootHdfsFlag -eq 0 ] then echo "DataNode异常,即将重启HDFS进程" Fn_RebootHdp hdfs Fn_LoadHdpLog $host DataNode fi done
echo "Hadoop进程检测完毕!"
sleep 60 done
|
各位小伙伴记得修改哦,不要把主机端口等参数弄错了!最后后台运行就可以了,日志表如下图: