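On Windows:
1. Install wheel first: pip install wheel
2. Download the Twisted wheel that matches your Python version and architecture from https://www.lfd.uci.edu/~gohlke/pythonlibs/#twisted
3. Install the downloaded Twisted wheel, for example: pip install Twisted-20.3.0-cp38-cp38-win32.whl
4. Install pywin32: pip install pywin32
5. Install Scrapy: pip install scrapy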
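The Twisted file name above encodes the interpreter it was built for: cp38 means CPython 3.8 and win32 means 32-bit Windows. The following is a minimal sketch, using only the standard library, for working out which tags your own interpreter needs. The helper names wheel_tags and matches_wheel are made up for illustration, and the sketch assumes CPython on Windows.

```python
import struct
import sys


def wheel_tags():
    """Return (python_tag, platform_tag) for the running interpreter.

    Hypothetical helper for illustration; assumes CPython on Windows.
    """
    # cp38 for CPython 3.8, cp310 for CPython 3.10, and so on
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    # Pointer size in bits tells a 32-bit interpreter from a 64-bit one
    bits = struct.calcsize("P") * 8
    platform_tag = "win_amd64" if bits == 64 else "win32"
    return python_tag, platform_tag


def matches_wheel(filename):
    """Check whether a wheel file name carries this interpreter's tags."""
    python_tag, platform_tag = wheel_tags()
    # Wheel names look like name-version-python-abi-platform.whl
    parts = filename[:-len(".whl")].split("-")
    return parts[-3] == python_tag and parts[-1] == platform_tag
```

Calling matches_wheel("Twisted-20.3.0-cp38-cp38-win32.whl") returns True only on a 32-bit CPython 3.8; on any other interpreter, pick the file whose tags match what wheel_tags() reports.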
On Linux:
Install Scrapy directly: pip install scrapy
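After installing on either platform, it can be useful to confirm which packages pip actually installed. The sketch below uses only the standard library (importlib.metadata, available from Python 3.8); the function names installed_versions and report are made up for illustration.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def report():
    # pywin32 only matters on Windows; the other three apply on both platforms
    names = ["wheel", "Twisted", "pywin32", "Scrapy"]
    for name, version in installed_versions(names).items():
        print(name, version or "not installed")
```

Calling report() prints one line per package; on Linux, pywin32 is expected to show as not installed.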
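Create the crawler project MyProjectMovie
1. Create the project, using https://www.1905.com/dianyinghao/ as the crawl target: scrapy startproject MyProjectMovie
2. Enter the project directory: cd MyProjectMovie
3. Create the spider file: scrapy genspider movie www.xxx.com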
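Project folder layout (as generated by the two scrapy commands above):
MyProjectMovie/
    scrapy.cfg
    MyProjectMovie/
        __init__.py
        items.py
        middlewares.py
        pipelines.py
        settings.py
        spiders/
            __init__.py
            movie.py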
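4. Edit movie.py
import scrapy


class MovieSpider(scrapy.Spider):
    # Spider name used on the command line: scrapy crawl movie
    name = 'movie'
    # Commented out so requests are not restricted to the placeholder domain
    # allowed_domains = ['www.xxx.com']
    start_urls = ['https://www.1905.com/dianyinghao/']

    def parse(self, response):
        # Print the page HTML, then the response object itself
        print(response.text)
        print(response)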
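5. Configure settings.py
# Send a browser User-Agent instead of Scrapy's default one
USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36'
# Do not check robots.txt before requesting pages
ROBOTSTXT_OBEY = False
# Log errors only, so the printed page output is easy to read
LOG_LEVEL = 'ERROR'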
6. Run the spider (from inside the project directory)
scrapy crawl movie
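The same run can be started from a Python script, which is handy when the crawl is one step in a larger job. This is a minimal sketch that simply shells out to the scrapy command; the function names build_crawl_command and run_spider are made up for illustration. It relies on the -s NAME=VALUE option of the scrapy command line, which overrides a value from settings.py for a single run.

```python
import subprocess


def build_crawl_command(spider_name, **setting_overrides):
    """Build the argument list for `scrapy crawl`, with optional -s overrides."""
    command = ["scrapy", "crawl", spider_name]
    for key, value in setting_overrides.items():
        # -s NAME=VALUE overrides a value from settings.py for this run only
        command += ["-s", "{}={}".format(key, value)]
    return command


def run_spider(project_dir, spider_name, **setting_overrides):
    """Run the spider from the project root (the folder holding scrapy.cfg)."""
    command = build_crawl_command(spider_name, **setting_overrides)
    # check=True raises CalledProcessError if scrapy exits with a non-zero code
    return subprocess.run(command, cwd=project_dir, check=True)
```

For example, run_spider("MyProjectMovie", "movie", LOG_LEVEL="INFO") would run the spider with more verbose logging than the ERROR level set in settings.py, without editing the file.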
Original: https://www.cnblogs.com/shiyi525/p/14256542.html