How to deploy a Hadoop 2.7.3 + HA + YARN + ZooKeeper high-availability cluster

This article explains how to deploy a high-availability cluster with Hadoop 2.7.3, HA NameNodes, YARN, and ZooKeeper. These steps trip up a lot of people in practice, so the walkthrough below goes through the configuration and startup procedure in detail.


I. Installed versions:

JDK 1.8.0_111-b14
Hadoop 2.7.3
ZooKeeper 3.5.2

II. Installation steps:  

    Installing the JDK and configuring the cluster's prerequisite environment are not covered here.

1. Hadoop configuration

    Hadoop configuration mainly involves four files: core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml. Each is described in detail below.

  1. core-site.xml configuration
    <configuration>
    <property>
          <name>fs.defaultFS</name>
          <value>hdfs://cluster1</value>
          <description>Logical name of the HDFS NameNode service (NameNode HA); must match dfs.nameservices in hdfs-site.xml</description>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hadoop/tmp</value>
        <description>Default base directory for NameNode and DataNode data; the two can also be set separately in hdfs-site.xml</description>
    </property>
    <property>
            <name>ha.zookeeper.quorum</name>
            <value>master:2181,slave1:2181,slave2:2181</value>
            <description>Addresses and ports of the ZooKeeper ensemble; run an odd number of ZooKeeper nodes</description>
    </property>
    </configuration>
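Before distributing these files to the other nodes, it can save debugging time to confirm each one is well-formed XML, since a stray character in core-site.xml typically surfaces later as an opaque startup failure. The sketch below is a hypothetical helper (the `check_xml` name and sample path are illustrative, not part of Hadoop):

```shell
# Illustrative helper: validate that a Hadoop config file parses as XML.
check_xml() {
  python3 -c 'import sys, xml.dom.minidom as m; m.parse(sys.argv[1])' "$1" 2>/dev/null \
    && echo "OK $1" || echo "BAD $1"
}

# Demonstrate on a small sample containing the fs.defaultFS property above;
# on a real node you would point check_xml at $HADOOP_HOME/etc/hadoop/*.xml.
cat > /tmp/core-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cluster1</value>
  </property>
</configuration>
EOF
check_xml /tmp/core-site-sample.xml
```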

  2. hdfs-site.xml configuration (the most important one)
    <configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/usr/hadoop/hdfs/name</value>
        <description>Directory where the NameNode stores its metadata (dfs.name.dir is the deprecated 1.x name)</description>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/usr/hadoop/hdfs/data</value>
        <description>Directory where DataNodes store block data (dfs.data.dir is the deprecated 1.x name)</description>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>4</value>
        <description>Number of block replicas; the default is 3</description>
    </property>
    <property>
            <name>dfs.nameservices</name>
            <value>cluster1</value>
            <description>Logical name of the HDFS NameNode service (NameNode HA)</description>
    </property>
    <property>
            <name>dfs.ha.namenodes.cluster1</name>
            <value>ns1,ns2</value>
            <description>Logical names of the NameNodes within this nameservice</description>
    </property>
    <property>
            <name>dfs.namenode.rpc-address.cluster1.ns1</name>
            <value>master:9000</value>
            <description>RPC address and port of NameNode ns1</description>
    </property>
    <property>
            <name>dfs.namenode.http-address.cluster1.ns1</name>
            <value>master:50070</value>
            <description>Web UI address and port of NameNode ns1</description>
    </property>
    <property>
            <name>dfs.namenode.rpc-address.cluster1.ns2</name>
            <value>slave1:9000</value>
            <description>RPC address and port of NameNode ns2</description>
    </property>
    <property>
            <name>dfs.namenode.http-address.cluster1.ns2</name>
            <value>slave1:50070</value>
            <description>Web UI address and port of NameNode ns2</description>
    </property>
    <property>
            <name>dfs.namenode.shared.edits.dir</name>
            <value>qjournal://master:8485;slave1:8485;slave2:8485/cluster1</value>
            <description>URI of the JournalNode group the NameNodes read and write. The active NameNode writes its edit log to these JournalNodes, and the standby NameNode reads those edits and applies them to its in-memory namespace</description>
    </property>
    <property>
            <name>dfs.journalnode.edits.dir</name>
            <value>/usr/hadoop/journal</value>
            <description>Local directory on each JournalNode host for storing edit logs and other state</description>
    </property>
    <property>
               <name>dfs.ha.automatic-failover.enabled</name>
               <value>true</value>
               <description>Enable automatic failover. Automatic failover relies on the ZooKeeper ensemble and the ZKFailoverController (ZKFC), a ZooKeeper client that monitors NameNode health; a zkfc process must run on every NameNode host</description>
    </property>
    <property>
            <name>dfs.client.failover.proxy.provider.cluster1</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
            <description>Java class HDFS clients use to locate the active NameNode</description>
    </property>
    <property>
            <name>dfs.ha.fencing.methods</name>
            <value>sshfence</value>
            <description>Guards against split-brain in the HA cluster (two NameNodes serving as active at once, leaving the system inconsistent). In HDFS HA the JournalNodes only accept writes from one NameNode, so two active writers cannot corrupt the edit log; however, during a failover the previous active NameNode may still be answering client RPC requests, so a fencing mechanism is needed to kill it. The common method is sshfence, which requires the SSH key in dfs.ha.fencing.ssh.private-key-files and a connect timeout</description>
    </property>
    <property>
            <name>dfs.ha.fencing.ssh.private-key-files</name>
            <value>/home/hadoop/.ssh/id_rsa</value>
            <description>SSH private key used for fencing</description>
    </property>
    <property>
            <name>dfs.ha.fencing.ssh.connect-timeout</name>
            <value>30000</value>
            <description>SSH connect timeout in milliseconds</description>
    </property>
    </configuration>
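Since sshfence logs in from one NameNode host to the other to kill the stale active process, the hadoop user needs working passwordless SSH with the key named in dfs.ha.fencing.ssh.private-key-files. A minimal pre-flight check, sketched with an illustrative helper (`check_fencing_key` is not a Hadoop command):

```shell
# Illustrative helper: verify the fencing key file is present and readable.
check_fencing_key() {
  if [ -r "$1" ]; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

check_fencing_key /home/hadoop/.ssh/id_rsa   # path from the config above

# Manual follow-up from each NameNode host (not run here):
#   ssh -i /home/hadoop/.ssh/id_rsa -o BatchMode=yes slave1 true && echo "passwordless SSH ok"
```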

     

  3. mapred-site.xml configuration
    <configuration>
    <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
            <description>Run MapReduce on YARN; a fundamental difference from Hadoop 1.x</description>
    </property>
    <property>
            <name>mapreduce.jobhistory.address</name>
            <value>master:10020</value>
            <description>Address of the MapReduce JobHistory Server</description>
    </property>
    <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>master:19888</value>
            <description>Web address for browsing records of completed MapReduce jobs; the history server must be running for this to work</description>
    </property>
    <property>
       <name>mapreduce.jobhistory.done-dir</name>
       <value>/data/hadoop/done</value>
       <description>Where the JobHistory Server stores logs of completed jobs; default: /mr-history/done</description>
    </property>
    <property>
       <name>mapreduce.jobhistory.intermediate-done-dir</name>
       <value>hdfs://cluster1/mapred/tmp</value>
       <description>Where logs of in-flight MapReduce jobs are staged; default: /mr-history/tmp</description>
    </property>
    </configuration>

     

  4. yarn-site.xml configuration
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
        <description>Auxiliary service the NodeManager runs to serve map output during the shuffle</description>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>1024</value>
        <description>If this is set below 1024, the NodeManager will not start and logs:
NodeManager from  slavenode2 doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the NodeManager.</description>
    </property>
</configuration>
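The shutdown described above comes from YARN's minimum allocation (1024 MB by default): a NodeManager offering less memory than one minimum container is rejected at registration. A tiny sketch of that rule as a pre-restart sanity check (`check_nm_memory` is an illustrative helper, not part of Hadoop):

```shell
# Illustrative check mirroring YARN's registration rule: the NodeManager's
# advertised memory must cover at least one minimum allocation (1024 MB).
check_nm_memory() {
  if [ "$1" -ge 1024 ]; then
    echo "memory-mb=$1: ok"
  else
    echo "memory-mb=$1: below 1024, the NodeManager would be sent SHUTDOWN"
  fi
}

check_nm_memory 1024
check_nm_memory 512
```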

2. ZooKeeper configuration

    ZooKeeper configuration mainly involves two files: zoo.cfg and myid.

  1. conf/zoo.cfg configuration: first copy zoo_sample.cfg to zoo.cfg
    cp  zoo_sample.cfg  zoo.cfg

  2. vi zoo.cfg
    dataDir: path where ZooKeeper stores its data snapshots
    
    dataLogDir: path where ZooKeeper writes its transaction logs

    initLimit=10
    syncLimit=5
    clientPort=2181
    tickTime=2000
    dataDir=/usr/zookeeper/tmp/data
    dataLogDir=/usr/zookeeper/tmp/log
    server.1=master:2888:3888
    server.2=slave1:2888:3888
    server.3=slave2:2888:3888
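Since zoo.cfg must be identical on all three nodes, one option is to generate it once and copy it out. A sketch using the values above (the paths, hostnames, and scp destinations reflect this article's layout, so adjust them to yours):

```shell
# Write zoo.cfg with the settings used in this article; it can then be
# pushed to the other nodes with scp (shown as comments, not run here).
cat > zoo.cfg <<'EOF'
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/usr/zookeeper/tmp/data
dataLogDir=/usr/zookeeper/tmp/log
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
EOF

grep -c '^server\.' zoo.cfg   # number of quorum members
# scp zoo.cfg slave1:/usr/local/zookeeper-3.5.2-alpha/conf/
# scp zoo.cfg slave2:/usr/local/zookeeper-3.5.2-alpha/conf/
```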

  3. On each of the [master, slave1, slave2] nodes, create a file named myid in the dataDir directory
vi myid

    On master write: 1

    On slave1 write: 2

    On slave2 write: 3

    For example:

[hadoop@master data]$ vi myid 

1
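The myid value on each node must match that node's server.N line in zoo.cfg; a mismatch is a common reason a node refuses to join the quorum. The mapping can be sketched as a small helper (`host_to_myid` is illustrative, not a ZooKeeper tool):

```shell
# Illustrative helper: derive a node's myid from its hostname, following
# the server.N lines in the zoo.cfg above.
host_to_myid() {
  case "$1" in
    master) echo 1 ;;
    slave1) echo 2 ;;
    slave2) echo 3 ;;
    *) echo "unknown host: $1" >&2; return 1 ;;
  esac
}

# On a real node you would run:
#   host_to_myid "$(hostname -s)" > /usr/zookeeper/tmp/data/myid
host_to_myid master
host_to_myid slave1
host_to_myid slave2
```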

III. Starting the cluster

 1.zookeeper集群?jiǎn)?dòng)

    1. Start ZooKeeper on all three nodes:
bin/zkServer.sh start
    2. Check each node's status with zkServer.sh status; you should see one leader and two followers.
[hadoop@master hadoop-2.7.3]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.5.2-alpha/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: follower
[hadoop@slave1 root]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.5.2-alpha/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: leader
[hadoop@slave2 root]$ zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper-3.5.2-alpha/bin/../conf/zoo.cfg
Client port found: 2181. Client address: localhost.
Mode: follower
    3. Verify ZooKeeper (optional): run zkCli.sh
[hadoop@slave1 root]$ zkCli.sh
Connecting to localhost:2181
2016-12-18 02:05:03,115 [myid:] - INFO  [main:Environment@109] - Client environment:zookeeper.version=3.5.2-alpha-1750793, built on 06/30/2016 13:15 GMT
2016-12-18 02:05:03,118 [myid:] - INFO  [main:Environment@109] - Client environment:host.name=salve1
2016-12-18 02:05:03,118 [myid:] - INFO  [main:Environment@109] - Client environment:java.version=1.8.0_111
2016-12-18 02:05:03,120 [myid:] - INFO  [main:Environment@109] - Client environment:java.vendor=Oracle Corporation
2016-12-18 02:05:03,120 [myid:] - INFO  [main:Environment@109] - Client environment:java.home=/usr/local/jdk1.8.0_111/jre
2016-12-18 02:05:03,120 [myid:] - INFO  [main:Environment@109] - Client environment:java.class.path=/usr/local/zookeeper-3.5.2-alpha/bin/../build/classes:/usr/local/zookeeper-3.5.2-alpha/bin/../build/lib/*.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/slf4j-log4j12-1.7.5.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/slf4j-api-1.7.5.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/servlet-api-2.5-20081211.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/netty-3.10.5.Final.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/log4j-1.2.17.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/jline-2.11.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/jetty-util-6.1.26.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/jetty-6.1.26.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/javacc.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/jackson-mapper-asl-1.9.11.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/jackson-core-asl-1.9.11.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../lib/commons-cli-1.2.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../zookeeper-3.5.2-alpha.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../src/java/lib/*.jar:/usr/local/zookeeper-3.5.2-alpha/bin/../conf:.:/usr/local/jdk1.8.0_111/lib/dt.jar:/usr/local/jdk1.8.0_111/lib/tools.jar:/usr/local/zookeeper-3.5.2-alpha/bin:/usr/local/hadoop-2.7.3/bin
2016-12-18 02:05:03,120 [myid:] - INFO  [main:Environment@109] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2016-12-18 02:05:03,121 [myid:] - INFO  [main:Environment@109] - Client environment:java.io.tmpdir=/tmp
2016-12-18 02:05:03,121 [myid:] - INFO  [main:Environment@109] - Client environment:java.compiler=<NA>
2016-12-18 02:05:03,121 [myid:] - INFO  [main:Environment@109] - Client environment:os.name=Linux
2016-12-18 02:05:03,121 [myid:] - INFO  [main:Environment@109] - Client environment:os.arch=amd64
2016-12-18 02:05:03,121 [myid:] - INFO  [main:Environment@109] - Client environment:os.version=3.10.0-327.22.2.el7.x86_64
2016-12-18 02:05:03,121 [myid:] - INFO  [main:Environment@109] - Client environment:user.name=hadoop
2016-12-18 02:05:03,121 [myid:] - INFO  [main:Environment@109] - Client environment:user.home=/home/hadoop
2016-12-18 02:05:03,121 [myid:] - INFO  [main:Environment@109] - Client environment:user.dir=/tmp/hsperfdata_hadoop
2016-12-18 02:05:03,121 [myid:] - INFO  [main:Environment@109] - Client environment:os.memory.free=52MB
2016-12-18 02:05:03,123 [myid:] - INFO  [main:Environment@109] - Client environment:os.memory.max=228MB
2016-12-18 02:05:03,123 [myid:] - INFO  [main:Environment@109] - Client environment:os.memory.total=57MB
2016-12-18 02:05:03,146 [myid:] - INFO  [main:ZooKeeper@855] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@593634ad
Welcome to ZooKeeper!
2016-12-18 02:05:03,171 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1113] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2016-12-18 02:05:03,243 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@948] - Socket connection established, initiating session, client: /127.0.0.1:56184, server: localhost/127.0.0.1:2181
2016-12-18 02:05:03,252 [myid:localhost:2181] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1381] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x200220f5fe30060, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0]

2.hadoop集群?jiǎn)?dòng)

    1. First-time startup

        1.1 Start the JournalNode daemons on all three nodes, then run jps; a JournalNode process should appear.

sbin/hadoop-daemon.sh start journalnode
jps

JournalNode

        1.2 Format the NameNode on master (either NameNode host will do), then start the NameNode on that node.

bin/hdfs namenode -format
sbin/hadoop-daemon.sh start namenode

        1.3 On the other NameNode host, slave1, sync the metadata from master:

bin/hdfs namenode -bootstrapStandby

        1.4 Stop all HDFS services:

sbin/stop-dfs.sh

        1.5 Initialize the ZKFC state in ZooKeeper:

bin/hdfs zkfc -formatZK

        1.6 Start HDFS:

sbin/start-dfs.sh

        1.7 Start YARN:

sbin/start-yarn.sh
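The order of steps 1.1-1.7 matters: the JournalNodes must be up before formatting, and zkfc must be formatted before start-dfs.sh brings up the failover controllers. The sketch below collects the whole first-time sequence in one place; the `run` helper only echoes each command so the order is easy to review, and on the cluster you would execute them directly (paths relative to $HADOOP_HOME):

```shell
# First-time bring-up, in order. 'run' just echoes here; replace it with
# direct execution on the cluster.
run() { echo "STEP: $*"; }

run "sbin/hadoop-daemon.sh start journalnode   # on master, slave1 and slave2"
run "bin/hdfs namenode -format                 # on master only"
run "sbin/hadoop-daemon.sh start namenode      # on master"
run "bin/hdfs namenode -bootstrapStandby       # on slave1"
run "sbin/stop-dfs.sh"
run "bin/hdfs zkfc -formatZK"
run "sbin/start-dfs.sh"
run "sbin/start-yarn.sh"
```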
    2. Subsequent startups

        2.1 Simply start HDFS and YARN; the NameNode, DataNode, JournalNode, and DFSZKFailoverController processes all start automatically.

sbin/start-dfs.sh

        2.2 Start YARN:

sbin/start-yarn.sh
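Once both stacks are up, it is worth confirming that exactly one NameNode is active and that automatic failover actually works. The commands below are echoed as a reviewable sketch rather than executed; ns1/ns2 are the logical names from dfs.ha.namenodes.cluster1, and on the cluster you would run each hdfs command directly:

```shell
# Failover smoke test, echoed rather than executed.
check() { echo "CHECK: $*"; }

check "bin/hdfs haadmin -getServiceState ns1   # expect one of active/standby"
check "bin/hdfs haadmin -getServiceState ns2"
check "kill -9 <active-NameNode-pid>           # on the active NameNode host"
check "bin/hdfs haadmin -getServiceState ns2   # the standby should become active"
```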

IV. Checking the processes on each node

    4.1 master

[hadoop@master hadoop-2.7.3]$ jps
26544 QuorumPeerMain
25509 JournalNode
25704 DFSZKFailoverController
26360 Jps
25306 DataNode
25195 NameNode
25886 ResourceManager
25999 NodeManager

    4.2 slave1

[hadoop@slave1 root]$ jps
2289 DFSZKFailoverController
9400 QuorumPeerMain
2601 Jps
2060 DataNode
2413 NodeManager
2159 JournalNode
1983 NameNode

    4.3 slave2

[hadoop@slave2 root]$ jps
11984 DataNode
12370 Jps
2514 QuorumPeerMain
12083 JournalNode
12188 NodeManager

That wraps up deploying a Hadoop 2.7.3 + HA + YARN + ZooKeeper high-availability cluster. Thanks for reading.
