
Nutch 1-build

Date: 2014-01-16 00:10:46

1. Install software

Cygwin, JDK, Ant, Nutch

 

 

2. Configure

  • Environment variables

JAVA_HOME = C:\PROGRA~1\Java\jdk1.7.0_45

ANT_HOME = C:\PROGRA~1\Ant\apache-ant-1.9.3

PATH = ...
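Before building, it can help to confirm from the Cygwin shell that both variables are actually visible. The following is a minimal sketch, not part of the original steps; the function name check_build_env is hypothetical, and it only checks that each variable is non-empty, not that the path it holds is valid.

```shell
#!/bin/sh
# Report whether each named environment variable is set and non-empty.
# Returns 1 if any of them is missing, 0 otherwise.
# check_build_env is a hypothetical helper name used for illustration.
check_build_env() {
    status=0
    for name in "$@"; do
        # Indirect lookup: read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            printf 'missing: %s\n' "$name"
            status=1
        else
            printf 'ok: %s=%s\n' "$name" "$value"
        fi
    done
    return "$status"
}

check_build_env JAVA_HOME ANT_HOME || echo "set the missing variables before running ant"
```

If either line reports "missing", the variable was probably set after the Cygwin terminal was opened; opening a new terminal usually picks it up.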

 

  • Copy the source files

Copy the apache-nutch-2.2.1-src folder into the Cygwin home directory.

  • Build

Enter home/apache-nutch-2.2.1-src, then run the build:

ant

Downloading the dependencies takes about half an hour.
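Once ant finishes, a quick check that the build produced a runnable tree can save time later. This sketch assumes the output lands in runtime/local inside the source folder, which matches the directory used in the test step; runtime_ready is a hypothetical helper name, and the check only looks for the bin/nutch script.

```shell
#!/bin/sh
# Check that a Nutch source tree has produced a local runtime.
# $1 is the source directory; prints the runtime path on success.
# runtime_ready is a hypothetical helper name used for illustration.
runtime_ready() {
    src_dir=$1
    runtime_dir="$src_dir/runtime/local"
    if [ -f "$runtime_dir/bin/nutch" ]; then
        printf '%s\n' "$runtime_dir"
        return 0
    fi
    printf 'no runtime found under %s\n' "$src_dir" >&2
    return 1
}

# Usage, run from inside the source folder after the build:
#   ant && runtime_ready "$PWD"
```

The usage line is left as a comment so that sourcing the sketch does not start a build by itself.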

 

3. Test

Stan@Stan-PC ~/nutch/runtime/local
$ ls
bin  conf  lib  plugins  test

Stan@Stan-PC ~/nutch/runtime/local
$ bin/nutch
Usage: nutch COMMAND
where COMMAND is one of:
 inject         inject new urls into the database
 hostinject     creates or updates an existing host table from a text file
 generate       generate new batches to fetch from crawl db
 fetch          fetch URLs marked during generate
 parse          parse URLs marked during fetch
 updatedb       update web table after parsing
 updatehostdb   update host table after parsing
 readdb         read/dump records from page database
 readhostdb     display entries from the hostDB
 elasticindex   run the elasticsearch indexer
 solrindex      run the solr indexer on parsed batches
 solrdedup      remove duplicates from solr
 parsechecker   check the parser for a given url
 indexchecker   check the indexing filters for a given url
 plugin         load a plugin and run one of its classes main()
 nutchserver    run a (local) Nutch server on a user defined port
 junit          runs the given JUnit test
 or
 CLASSNAME      run the class named CLASSNAME
Most commands print help when invoked w/o parameters.

Stan@Stan-PC ~/nutch/runtime/local
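
The usage text above can double as a smoke test: if the command names can be pulled out of it, the script and its classpath are at least wired up. The sketch below assumes the exact layout shown in the transcript (a "where COMMAND is one of:" line, then one command per line, ending at the "or" line); list_commands is a hypothetical helper name.

```shell
#!/bin/sh
# Read Nutch usage text on stdin and print the command names, one per line.
# Assumes the layout shown in the transcript: after the "where COMMAND is
# one of:" line, each line is a command name followed by its description,
# and the list ends at a line containing only "or".
# list_commands is a hypothetical helper name used for illustration.
list_commands() {
    awk '/^where COMMAND is one of:/ { in_list = 1; next }
         in_list && $1 == "or"        { exit }
         in_list && NF >= 2           { print $1 }'
}

# Usage, run from runtime/local:
#   bin/nutch 2>&1 | list_commands
```

Piping the result through `wc -l` gives a quick count to compare against the list in the transcript.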

 

To be continued...


Original: http://www.cnblogs.com/harrysun/p/3516783.html
