Hadoop in Practice (2): MapReduce Programming

MapReduce programming, using WordCount as the example: counting word occurrences in files.


1. Create a Java project in Eclipse and add the jars from Hadoop's lib directory, plus the jar in the Hadoop home directory, to the build path.

Create a WordCount class:

package org.scf.wordcount;

import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    // Mapper: emits (word, 1) for every whitespace-separated token in the line.
    public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Reducer: sums the partial counts collected for each word.
    public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class); // the reducer doubles as a combiner
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
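The tokenize-then-sum logic above can be sanity-checked without a cluster. The following is a minimal local sketch (the class name LocalWordCount and its count helper are illustrative, not part of the original project) that reproduces the same map/reduce behavior on in-memory lines:

```java
import java.util.*;

public class LocalWordCount {
    // Mirrors the job's flow: tokenize each line (map), then sum per word (reduce).
    public static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                // merge() adds 1 to the existing count, or inserts 1 for a new word
                counts.merge(tokenizer.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(new String[]{"hello world", "hello hadoop"});
        System.out.println(counts); // e.g. {world=1, hello=2, hadoop=1}
    }
}
```

If this prints the expected counts, the same logic should behave identically inside the mapper and reducer.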

2. Compile and run the class

Compile against the Hadoop core jar, package the classes, upload two sample input files to HDFS, run the job, and inspect the result:

cd /home/Hadoop/
mkdir wordcount_classes
javac -classpath /usr/hadoop-1.0.4/hadoop-core-1.0.4.jar -d /home/Hadoop/wordcount_classes WordCount.java
jar -cvf /home/Hadoop/wordcount.jar -C /home/Hadoop/wordcount_classes/ .
hadoop dfs -put /home/Hadoop/test.txt /user/root/wordcount/input/file2
hadoop dfs -put /home/Hadoop/test1.txt /user/root/wordcount/input/file3
hadoop jar /home/Hadoop/wordcount.jar org.scf.wordcount.WordCount /user/root/wordcount/input /user/root/wordcount/output
hadoop dfs -ls /user/root/wordcount/output
hadoop dfs -cat /user/root/wordcount/output/part-00000

With the default TextOutputFormat, each line of part-00000 is a word and its count separated by a tab.
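If you want to post-process the job's result outside HDFS, the tab-separated part-00000 lines are easy to parse. A small sketch (the class name OutputParser is illustrative, assuming the default "word\tcount" output format):

```java
import java.util.*;

public class OutputParser {
    // Parses TextOutputFormat lines of the form "word\tcount" into an ordered map.
    public static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String line : lines) {
            int tab = line.indexOf('\t');
            if (tab < 0) continue; // skip malformed lines
            result.put(line.substring(0, tab),
                       Integer.parseInt(line.substring(tab + 1).trim()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = parse(Arrays.asList("hadoop\t1", "hello\t2", "world\t1"));
        System.out.println(m); // {hadoop=1, hello=2, world=1}
    }
}
```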
