Luffy Xuecheng (路飞学城) - Python Web Scraping Intensive Training - Chapter 3

Date: 2018-07-08 20:29:18

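It took me two days to finish the WeChat assignment for Chapter 2, and submitting it was a huge relief. Programming is a curious thing: you always feel the bug is about to be fixed, you keep debugging, and before you know it it is past 4 a.m. and the bug is still there. Perhaps life is like that. Effort does not guarantee success, but without effort you certainly will not succeed. Believe in yourself, never give up, and victory will eventually be yours.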

Selected study notes for Chapter 3:

Installing Scrapy on Windows
(1) Install Twisted
   a. pip3 install wheel
   b. Download the Twisted wheel from https://www.lfd.uci.edu/~gohlke/pythonlibs/#twisted
   c. Change to the download directory and run pip3 install Twisted-xxx.whl
(2) Install Scrapy
   d. pip3 install scrapy  -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
(3) Install pywin32
   e. pip3 install pywin32  -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
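
After the three install steps it can be useful to confirm that the packages are actually importable before creating a project. The following is only a minimal sketch: the helper name missing_modules is hypothetical, and the import names twisted, scrapy and win32api (the last one provided by pywin32) are assumed to correspond to the packages installed above.

```python
import importlib.util

# Import names assumed to correspond to Twisted, Scrapy and pywin32.
REQUIRED_MODULES = ["twisted", "scrapy", "win32api"]


def missing_modules(names):
    """Return the module names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("Twisted, Scrapy and pywin32 are all importable.")
```

If anything is reported as not importable, repeating the corresponding pip3 step is the natural first thing to try.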

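Creating a Scrapy project
Open cmd, change to the target directory, and run:
scrapy startproject edwin                           # create the crawler project
cd edwin
scrapy genspider chouti chouti.com                  # generate a spider for chouti.com from the default template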

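# Open chouti.py and edit the parse function to do what you want the spider to do
# If Windows raises an encoding error when printing, add the following near the top of chouti.py:
# import sys, io
# sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='gb18030')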

scrapy crawl chouti --nolog                         # crawl the page without printing log output
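
The Windows encoding workaround noted in the comments above wraps the console's byte stream in a text wrapper that encodes output as gb18030. A rough sketch of that idea, shown on an in-memory buffer so it can be tried without a console: the helper name make_text_stream and the errors="replace" setting are assumptions added for illustration and are not part of the original notes.

```python
import io


def make_text_stream(binary_stream, encoding="gb18030"):
    """Wrap a binary stream so that text written to it is encoded with `encoding`."""
    # errors="replace" (an assumption) substitutes unencodable characters
    # instead of raising UnicodeEncodeError.
    return io.TextIOWrapper(binary_stream, encoding=encoding, errors="replace")


# Demonstrate on an in-memory buffer instead of the real console.
buffer = io.BytesIO()
stream = make_text_stream(buffer)
stream.write("抽屉新热榜")
stream.flush()  # push the encoded bytes through to the underlying buffer
encoded = buffer.getvalue()
```

In chouti.py the same helper would be applied to the real console with sys.stdout = make_text_stream(sys.stdout.buffer), after importing sys.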


Available commands:
  bench         Run quick benchmark test
  check         Check spider contracts
  crawl         Run a spider
  edit          Edit spider
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  list          List available spiders
  parse         Parse URL (using its spider) and print the results
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Original: https://www.cnblogs.com/shajing/p/9281137.html
