requests
is a very practical Python HTTP client library, frequently used when writing crawlers and testing server response data. It can fairly be said that requests meets the needs of today's web.
Official documentation: http://docs.python-requests.org/en/master/
The requests module is a third-party Python module for making network requests; its main purpose is to simulate a browser sending requests. It is powerful yet concise and efficient to use, and it occupies a dominant position in the web-scraping field.
Without parameters:
#Crawl the page data of the Sogou homepage
import requests

#1. Specify the URL
url = 'https://www.sogou.com/'
#2. Send the request
response = requests.get(url=url)
#3. Get the response data; .text returns string-typed data
page_text = response.text
#4. Persist the data to disk
with open('./sogou.html', 'w', encoding='utf-8') as fp:
    fp.write(page_text)
print('over!')
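A GET request can also carry query parameters without hand-building the URL: pass a dict via the `params` argument and requests URL-encodes it for you. A minimal sketch (the `/web` search path and `query` parameter name are assumptions for illustration); `requests.Request(...).prepare()` lets us inspect the encoded URL without actually sending anything:

```python
import requests

# Build (but do not send) a GET request; the `params` dict
# is URL-encoded into the query string automatically
req = requests.Request('GET', 'https://www.sogou.com/web',
                       params={'query': 'python'})
prepared = req.prepare()
print(prepared.url)  # the full URL with the encoded query string
```

In everyday use you would simply call `requests.get(url, params={...})`, which prepares and sends the request in one step.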
With parameters:
#Baidu Translate
import requests

url = 'https://fanyi.baidu.com/sug'
word = input('enter an English word: ')
#Package the request parameters
data = {
    'kw': word
}
#UA spoofing
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'
}
response = requests.post(url=url, data=data, headers=headers)
#.text returns a string; .json() deserializes the JSON body into a Python object
obj_json = response.json()
print(obj_json)
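The reason `response.json()` works here is that the `sug` endpoint returns a JSON body. As a rough illustration of what the deserialized object looks like, here is a sample payload parsed with the stdlib `json` module (the exact field names `errno`, `data`, `k`, `v` are an assumption based on commonly observed output, not guaranteed by the API):

```python
import json

# A sample payload in the shape the sug endpoint is commonly observed to return
sample = '{"errno": 0, "data": [{"k": "dog", "v": "n. dog; fellow"}]}'
obj = json.loads(sample)  # the same kind of object response.json() would give
for entry in obj['data']:
    print(entry['k'], '->', entry['v'])
```

Once the body is a plain dict/list structure, you can index into it directly instead of string-matching on `response.text`.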
Dynamically loaded data:
#Crawl the KFC restaurant locations for any given city
import requests

city = input('enter a cityName: ')
url = 'http://www.kfc.com.cn/kfccda/ashx/GetStoreList.ashx?op=keyword'
data = {
    "cname": "",
    "pid": "",
    "keyword": city,
    "pageIndex": "2",
    "pageSize": "10",
}
#UA spoofing
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'
}
response = requests.post(url=url, headers=headers, data=data)
#The response body is a JSON string
json_text = response.text
print(json_text)
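Since `response.text` here is a JSON string rather than HTML, it can be parsed and traversed like any other data. A sketch, assuming the response contains a `Table1` list whose items have `storeName` and `addressDetail` fields (field names inferred from this endpoint's commonly reported output, not verified against the live service):

```python
import json

# Example payload mimicking the endpoint's reported structure
json_text = '{"Table1": [{"storeName": "Nanjing Road", "addressDetail": "No. 88 Nanjing East Road"}]}'
stores = json.loads(json_text)
for store in stores.get('Table1', []):
    print(store['storeName'], store['addressDetail'])
```

In the real script you would replace the sample string with the `response.text` obtained above (or call `response.json()` directly).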
Original article: https://www.cnblogs.com/q455674496/p/11000236.html