陈瑞祺, post #1
1. Edit conf/core-site.xml and add:

    <property>
      <name>fs.checkpoint.period</name>
      <value>3600</value>
      <description>The number of seconds between two periodic checkpoints.</description>
    </property>
    <property>
      <name>fs.checkpoint.size</name>
      <value>67108864</value>
      <description>The size of the current edit log (in bytes) that triggers a periodic checkpoint even if the fs.checkpoint.period hasn't expired.</description>
    </property>
    <property>
      <name>fs.checkpoint.dir</name>
      <value>/data/work/hdfs/namesecondary</value>
      <description>Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. If this is a comma-delimited list of directories then the image is replicated in all of the directories for redundancy.</description>
    </property>

fs.checkpoint.period is how often a checkpoint of the HDFS image is taken. The default is 1 hour (3600 seconds).
fs.checkpoint.size is the edit log size that triggers a checkpoint even if the period has not expired yet. The default is 64 MB (67108864 bytes).

2. Edit conf/hdfs-site.xml and add:

    <property>
      <name>dfs.http.address</name>
      <value>master:50070</value>
      <description>The address and the base port where the dfs namenode web ui will listen on. If the port is 0 then the server will start on a free port.</description>
    </property>

Change 0.0.0.0 to the namenode's IP address (the hostname master is used here).

3. Restart hadoop, then check that it started successfully

Log in to the machine running the secondarynamenode and run jps to confirm the SecondaryNameNode process is there.
Then go to the secondarynamenode directory, /data/work/hdfs/namesecondary.
Correct result: the directory contains the checkpoint files.
If there is nothing there, be patient. The files are only generated once the configured checkpoint period or size has been reached.

4. Recovery

Simulate a namenode crash:

1) Kill the namenode process

    [root@master name]# jps
    11749 NameNode
    12339 Jps
    11905 JobTracker
    [root@master name]# kill 11749

2) Delete the contents of the folder that dfs.name.dir points to, here /data/work/hdfs/name

    [root@master name]# rm -rf *

This deletes everything under name, but the name directory itself must still exist.

3) Copy the namesecondary directory from the secondarynamenode over to the namenode's namesecondary

    [root@master hdfs]# scp -r slave-001:/data/work/hdfs/namesecondary/ ./

4) Start the namenode

    [root@master /data]# hadoop namenode -importCheckpoint

Note that the option starts with a plain ASCII hyphen. After a normal start a lot of log output scrolls by on the screen, and at that point the namenode can be accessed again.
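To make the two settings from step 1 more concrete, here is a small sketch of the trigger rule as I understand it: a checkpoint happens when either the period has elapsed or the edit log has reached the size threshold. The function name checkpoint_due and its arguments are made up for illustration, they are not part of Hadoop, and the real check lives inside the secondarynamenode.

```shell
#!/bin/sh
# Illustrative only: mimics the "period OR size" checkpoint trigger.
# checkpoint_due is a hypothetical helper, not a Hadoop command.

# Usage: checkpoint_due ELAPSED_SECONDS EDITLOG_BYTES [PERIOD_SECONDS] [SIZE_BYTES]
checkpoint_due() {
    elapsed=$1
    edits_bytes=$2
    period=${3:-3600}       # fs.checkpoint.period from core-site.xml
    size=${4:-67108864}     # fs.checkpoint.size from core-site.xml (64 MB)

    # Either condition alone is enough to trigger a checkpoint.
    if [ "$elapsed" -ge "$period" ] || [ "$edits_bytes" -ge "$size" ]; then
        echo yes
    else
        echo no
    fi
}

# 10 minutes since the last checkpoint, 1 MB of edits: prints "no".
checkpoint_due 600 1048576
# 10 minutes since the last checkpoint, 64 MB of edits: prints "yes".
checkpoint_due 600 67108864
```

This is also why the namesecondary directory in step 3 can stay empty for a while on a quiet cluster: with little write traffic the size threshold is never hit, so you wait for the full period.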
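Before running the import in step 4, it may help to verify the two preconditions the steps rely on: the dfs.name.dir directory exists but is empty, and the local fs.checkpoint.dir copy actually has files in it. Below is a rough sketch of such a check. preflight_import is a hypothetical helper name, and the paths are passed in as arguments rather than read from the Hadoop config.

```shell
#!/bin/sh
# Hypothetical pre-flight check to run before `hadoop namenode -importCheckpoint`.
# It only inspects directories; it does not call Hadoop itself.

# Usage: preflight_import NAME_DIR CHECKPOINT_DIR
# Prints "ok" and returns 0 if NAME_DIR exists and is empty and
# CHECKPOINT_DIR exists and is not empty. Otherwise prints the problem
# and returns 1.
preflight_import() {
    name_dir=$1
    ckpt_dir=$2

    if [ ! -d "$name_dir" ]; then
        echo "name dir missing: $name_dir"; return 1
    fi
    if [ -n "$(ls -A "$name_dir")" ]; then
        echo "name dir not empty: $name_dir"; return 1
    fi
    if [ ! -d "$ckpt_dir" ]; then
        echo "checkpoint dir missing: $ckpt_dir"; return 1
    fi
    if [ -z "$(ls -A "$ckpt_dir")" ]; then
        echo "checkpoint dir empty: $ckpt_dir"; return 1
    fi
    echo ok
}
```

With the paths from this post, one way to use it would be `preflight_import /data/work/hdfs/name /data/work/hdfs/namesecondary && hadoop namenode -importCheckpoint`, so the import only starts when both directories look right.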