
Running wordcount with Hadoop's bundled example program

Posted: 2014-07-18 23:38:11

1. Start the Hadoop daemons

   bin/start-all.sh

2. Create an input folder under Hadoop's bin directory

   mkdir input

3. Change into the input directory, create two text files there, and write some content into them

  echo "hello excuse me fuck thank you" > test1.txt

  echo "hello how do you do thank you" > test2.txt
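Steps 2 and 3 can be combined and sanity-checked in one go; a minimal sketch using the same file names as above:

```shell
# Create the input directory and the two sample files (steps 2-3 combined)
mkdir -p input
echo "hello excuse me fuck thank you" > input/test1.txt
echo "hello how do you do thank you" > input/test2.txt

# Quick sanity check: per-file word counts before handing the data to Hadoop
wc -w input/test1.txt input/test2.txt
```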

4. From Hadoop's bin directory, run the jps command to confirm that Hadoop is up and running

2195 SecondaryNameNode
2245 JobTracker
2055 NameNode
2664 Jps
2314 TaskTracker
2123 DataNode

5. Upload the input folder to HDFS

JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -put input in

6. List the uploaded files on HDFS

JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop dfs -ls in
Found 3 items
-rw-r--r--   1 jia supergroup       6148 2014-07-16 22:56 /user/jia/in/.DS_Store
-rw-r--r--   1 jia supergroup         18 2014-07-16 22:56 /user/jia/in/tex1.txt
-rw-r--r--   1 jia supergroup         22 2014-07-16 22:56 /user/jia/in/tex2.txt
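Note the stray `.DS_Store` in the listing: it is macOS Finder metadata that got uploaded along with the text files, and Hadoop will treat it as a third input path (the job log below reports "Total input paths to process : 2", so it may be skipped, but it is cleaner not to upload it at all). A small sketch of removing it locally before the `-put`, reproducing the situation with placeholder files:

```shell
# Reproduce the situation: a local input dir containing a stray .DS_Store
mkdir -p input
touch input/.DS_Store input/test1.txt input/test2.txt

# Delete macOS Finder metadata so only the text files are uploaded
# with "bin/hadoop dfs -put input in"
find input -name '.DS_Store' -delete
```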

7. Run the bundled wordcount example

JIAS-MacBook-Pro:hadoop-0.20.2 jia$ bin/hadoop jar hadoop-0.20.2-examples.jar wordcount in put
14/07/16 23:06:52 INFO input.FileInputFormat: Total input paths to process : 2
14/07/16 23:06:52 INFO mapred.JobClient: Running job: job_201407162246_0001
14/07/16 23:06:53 INFO mapred.JobClient:  map 0% reduce 0%
14/07/16 23:07:03 INFO mapred.JobClient:  map 100% reduce 0%
14/07/16 23:07:15 INFO mapred.JobClient:  map 100% reduce 100%
14/07/16 23:07:17 INFO mapred.JobClient: Job complete: job_201407162246_0001
14/07/16 23:07:17 INFO mapred.JobClient: Counters: 17
14/07/16 23:07:17 INFO mapred.JobClient:   Map-Reduce Framework
14/07/16 23:07:17 INFO mapred.JobClient:     Combine output records=7
14/07/16 23:07:17 INFO mapred.JobClient:     Spilled Records=14
14/07/16 23:07:17 INFO mapred.JobClient:     Reduce input records=7
14/07/16 23:07:17 INFO mapred.JobClient:     Reduce output records=4
14/07/16 23:07:17 INFO mapred.JobClient:     Map input records=2
14/07/16 23:07:17 INFO mapred.JobClient:     Map output records=7
14/07/16 23:07:17 INFO mapred.JobClient:     Map output bytes=68
14/07/16 23:07:17 INFO mapred.JobClient:     Reduce shuffle bytes=52
14/07/16 23:07:17 INFO mapred.JobClient:     Combine input records=7
14/07/16 23:07:17 INFO mapred.JobClient:     Reduce input groups=4
14/07/16 23:07:17 INFO mapred.JobClient:   FileSystemCounters
14/07/16 23:07:17 INFO mapred.JobClient:     HDFS_BYTES_READ=40
14/07/16 23:07:17 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=246
14/07/16 23:07:17 INFO mapred.JobClient:     FILE_BYTES_READ=88
14/07/16 23:07:17 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=30
14/07/16 23:07:17 INFO mapred.JobClient:   Job Counters 
14/07/16 23:07:17 INFO mapred.JobClient:     Launched map tasks=2
14/07/16 23:07:17 INFO mapred.JobClient:     Launched reduce tasks=1
14/07/16 23:07:17 INFO mapred.JobClient:     Data-local map tasks=2
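The aggregation wordcount performs (map each word to 1, combine, then reduce per word) can be approximated locally with standard Unix tools, which is handy for predicting what the job should emit. This is only a local sketch over the two echoed lines, not the Hadoop job itself; the real result on HDFS would be read back with something like `bin/hadoop dfs -cat put/part-r-00000` (the part-file name is typical for this version, but an assumption here):

```shell
# Local approximation of wordcount: one word per line, then group and
# count, printing reducer-style <word TAB count> pairs.
printf '%s\n' "hello excuse me fuck thank you" "hello how do you do thank you" \
  | tr ' ' '\n' \
  | sort \
  | uniq -c \
  | awk '{print $2 "\t" $1}'
```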


Original post: http://www.cnblogs.com/aijianiula/p/3850002.html
