An Example Hadoop MapReduce Script for Log Statistics and Analysis

#!/bin/bash

############################

# write blank lines to separate this run's log entries from the previous run's

for i in $(seq 10)

do

 echo " " >> /u1/hadoop-stat/stat.log

done

echo "begin["`date "+%Y-%m-%d" -d "-1 days"`"]" >> /u1/hadoop-stat/stat.log

############################

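# remove every entry under $1 that is not the directory for yesterday's month

function removeFilepathNotCurrentMonth(){

# refuse to run without a valid base directory, so rm -rf can never act on "/<name>"

[ -n "$1" ] && [ -d "$1" ] || return 1

local month=$(date "+%Y-%m" -d "-1 days")

for file in "$1"/*

do

[ -e "$file" ] || continue

if [ "$month" != "${file##*/}" ]; then

rm -rf -- "$file"

fi

done

}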

GYLOG_PATH="/u1/hadoop-stat/gylog"

NGINXLOG_PATH="/u1/hadoop-stat/nginxlog"

echo "begin remove gylogpath's files not in current month" >> /u1/hadoop-stat/stat.log

removeFilepathNotCurrentMonth $GYLOG_PATH

echo "begin remove nginxlogpath's files not in current month" >> /u1/hadoop-stat/stat.log

removeFilepathNotCurrentMonth $NGINXLOG_PATH

############################

#scp file between hosts

day=`date "+%Y-%m-%d" -d "-1 days"`

month=`date "+%Y-%m" -d "-1 days"`

gyfilename="gylog-"$day".log"

gyfilepath=$GYLOG_PATH"/"$month

if [ ! -d "$gyfilepath" ]; then

mkdir -p "$gyfilepath"

fi

if [ ! -f "$gyfilepath/$gyfilename" ]; then

echo "begin scp gylog" >> /u1/hadoop-stat/stat.log

scp gy02:/u1/logs/gylog/$gyfilename $gyfilepath/

fi

nginxfilename="ngxinlog-"$day".log"

nginxfilepath=$NGINXLOG_PATH"/"$month

if [ ! -d "$nginxfilepath" ]; then

mkdir -p "$nginxfilepath"

fi

if [ ! -f "$nginxfilepath/$nginxfilename" ]; then

echo "begin scp nginxlog" >> /u1/hadoop-stat/stat.log

# copy the rotated access log straight to its dated name, so a failed scp leaves nothing to rename

scp gy01:/u1/logs/lbnginx/gy_access.log.1 "$nginxfilepath/$nginxfilename"

fi

###########################

#copy file to hadoop

GYLOG_HADOOP_PATH="/logs/gylog"

NGINXLOG_HADOOP_PATH="/logs/nginxlog"

# despite its name, monthhadoop holds yesterday's full date (the same value as $day)

monthhadoop=$day

gyhadoopfilepath=$GYLOG_HADOOP_PATH"/"$monthhadoop

gyhadoopfilepathinput=$gyhadoopfilepath"/input"

gyhadoopfilepathoutput=$gyhadoopfilepath"/output"

/u1/hadoop-1.0.1/bin/hadoop dfs -test -e $gyhadoopfilepath

if [ $? -ne 0 ]; then

echo "begin mkdir gyhadoopfilepath in hadoop because of not exist:"$gyhadoopfilepath >> /u1/hadoop-stat/stat.log

/u1/hadoop-1.0.1/bin/hadoop dfs -mkdir $gyhadoopfilepath

/u1/hadoop-1.0.1/bin/hadoop dfs -mkdir $gyhadoopfilepathinput

/u1/hadoop-1.0.1/bin/hadoop dfs -mkdir $gyhadoopfilepathoutput

fi

/u1/hadoop-1.0.1/bin/hadoop dfs -test -e $gyhadoopfilepathinput/$gyfilename

if [ $? -ne 0 ]; then

echo "begin copy gyhadoopfile to hadoop" >> /u1/hadoop-stat/stat.log

/u1/hadoop-1.0.1/bin/hadoop dfs -copyFromLocal $gyfilepath/$gyfilename $gyhadoopfilepathinput/

fi

nginxhadoopfilepath=$NGINXLOG_HADOOP_PATH"/"$monthhadoop

nginxhadoopfilepathinput=$nginxhadoopfilepath"/input"

nginxhadoopfilepathoutput=$nginxhadoopfilepath"/output"

/u1/hadoop-1.0.1/bin/hadoop dfs -test -e $nginxhadoopfilepath

if [ $? -ne 0 ]; then

echo "begin mkdir nginxhadoopfilepath in hadoop because of not exist:"$nginxhadoopfilepath >> /u1/hadoop-stat/stat.log

/u1/hadoop-1.0.1/bin/hadoop dfs -mkdir $nginxhadoopfilepath

/u1/hadoop-1.0.1/bin/hadoop dfs -mkdir $nginxhadoopfilepathinput

/u1/hadoop-1.0.1/bin/hadoop dfs -mkdir $nginxhadoopfilepathoutput

fi

/u1/hadoop-1.0.1/bin/hadoop dfs -test -e $nginxhadoopfilepathinput/$nginxfilename

if [ $? -ne 0 ]; then

echo "begin copy nginxhadoopfile to hadoop" >> /u1/hadoop-stat/stat.log

/u1/hadoop-1.0.1/bin/hadoop dfs -copyFromLocal $nginxfilepath/$nginxfilename $nginxhadoopfilepathinput/

fi

##########################

#begin hadoop stat

#echo "begin hadoop stat RequestTimeCount" >> /u1/hadoop-stat/stat.log

#/u1/hadoop-1.0.1/bin/hadoop jar /u1/hadoop-stat/stat.jar gy.log.mr.requestTime.RequestTimeCount $day

#echo "begin hadoop stat RequestCount" >> /u1/hadoop-stat/stat.log

#/u1/hadoop-1.0.1/bin/hadoop jar /u1/hadoop-stat/stat.jar gy.log.mr.request.RequestCount $day

echo "begin hadoop stat NginxCount" >> /u1/hadoop-stat/stat.log

/u1/hadoop-1.0.1/bin/hadoop jar /u1/hadoop-stat/stat.jar gy.log.mr.nginx.NginxCount $day

echo "begin hadoop stat GylogCount" >> /u1/hadoop-stat/stat.log

/u1/hadoop-1.0.1/bin/hadoop jar /u1/hadoop-stat/stat.jar gy.log.mr.gylog.GylogCount $day

##########################

#end for all

echo "end["`date "+%Y-%m-%d" -d "-1 days"`"]" >> /u1/hadoop-stat/stat.log

Notes:

/u1/hadoop-stat/stat.jar gy.log.mr.request.RequestCount

/u1/hadoop-stat/stat.jar gy.log.mr.nginx.NginxCount

/u1/hadoop-stat/stat.jar gy.log.mr.gylog.GylogCount

The MapReduce classes listed above implement custom statistics rules; develop your own to suit your needs.
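The script runs each job with its own `hadoop jar` line and does not check whether it succeeded. As a rough sketch only, and not part of the original script, the invocations could be wrapped in a loop that logs each job and stops at the first failure. `run_stat_jobs` is a hypothetical helper name, and the `HADOOP_BIN`, `STAT_JAR`, and `STAT_LOG` variables are assumptions that simply default to the paths used in the script.

```shell
# Assumed variables; the defaults mirror the paths hard-coded in the script.
HADOOP_BIN="${HADOOP_BIN:-/u1/hadoop-1.0.1/bin/hadoop}"
STAT_JAR="${STAT_JAR:-/u1/hadoop-stat/stat.jar}"
STAT_LOG="${STAT_LOG:-/u1/hadoop-stat/stat.log}"

# Hypothetical helper: run each driver class for the given day,
# logging progress and returning 1 as soon as one job fails.
run_stat_jobs() {
    local day="$1"
    shift
    local class
    for class in "$@"; do
        # ${class##*.} strips the package prefix, leaving e.g. "NginxCount"
        echo "begin hadoop stat ${class##*.}" >> "$STAT_LOG"
        if ! "$HADOOP_BIN" jar "$STAT_JAR" "$class" "$day"; then
            echo "failed hadoop stat ${class##*.}" >> "$STAT_LOG"
            return 1
        fi
    done
}
```

With this in place, the two active jobs would be started as `run_stat_jobs "$day" gy.log.mr.nginx.NginxCount gy.log.mr.gylog.GylogCount`.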

The rest of the script relies mainly on basic hadoop commands, so anyone familiar with Hadoop should find it easy to follow.
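The "check, then create or copy" pattern that the script repeats for the gylog and nginxlog paths can be sketched as two small helpers. This is an illustration under assumptions rather than the author's code: `hdfs_ensure_dir` and `hdfs_put_once` are hypothetical names, and `HADOOP_BIN` is an assumed variable defaulting to the binary path used in the script. Only the `dfs -test -e`, `dfs -mkdir`, and `dfs -copyFromLocal` commands already present in the script are used.

```shell
# Assumed variable; the default is the hadoop binary path from the script.
HADOOP_BIN="${HADOOP_BIN:-/u1/hadoop-1.0.1/bin/hadoop}"

# Hypothetical helper: create an HDFS directory only if it does not exist yet.
hdfs_ensure_dir() {
    "$HADOOP_BIN" dfs -test -e "$1" || "$HADOOP_BIN" dfs -mkdir "$1"
}

# Hypothetical helper: upload a local file unless a file with the same
# name is already present in the target HDFS directory.
hdfs_put_once() {
    local src="$1"
    local dest_dir="$2"
    local name="${src##*/}"   # file name without its local directory
    if ! "$HADOOP_BIN" dfs -test -e "$dest_dir/$name"; then
        "$HADOOP_BIN" dfs -copyFromLocal "$src" "$dest_dir/"
    fi
}
```

For example, `hdfs_ensure_dir "$gyhadoopfilepathinput"` followed by `hdfs_put_once "$gyfilepath/$gyfilename" "$gyhadoopfilepathinput"` would cover the gylog upload step, and re-running the script on the same day would skip the copy.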
