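Question 1:
1. Why do we need to add headers in requests?
“When crawling, a site may refuse a client's login if the request carries no headers; in that case we add request headers so the request looks like it comes from a real browser.”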
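A minimal sketch of attaching browser-like headers, assuming a placeholder URL (`https://example.com/api/items`) and illustrative header values that are not from the original post. The request is only prepared, not sent, so the final headers can be inspected offline.

```python
import requests

# Placeholder URL and header values, chosen only for illustration.
URL = "https://example.com/api/items"
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36"
    ),
    "Accept": "application/json, text/plain, */*",
}


def build_request(url, headers):
    """Prepare a GET request without sending it, so its headers can be checked."""
    session = requests.Session()
    request = requests.Request("GET", url, headers=headers)
    # prepare_request merges the session defaults with the per-request headers;
    # per-request values win, so the default python-requests User-Agent is replaced.
    return session.prepare_request(request)


prepared = build_request(URL, BROWSER_HEADERS)
print(prepared.headers["User-Agent"])
```

To actually send it, the same headers can be passed directly, for example `requests.get(URL, headers=BROWSER_HEADERS)`.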
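2. How does the requests library keep a session?
To skip the CAPTCHA login and stay logged in (that is, keep a session), extract the cookie issued after login and add it to a requests Session object.
Note: a Session stores its cookies in a CookieJar (session.cookies), while a cookie we build by hand is usually a dict. Convert the dict to a CookieJar before assigning it to the session.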
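The dict-to-CookieJar conversion described in the note can be sketched as follows. The cookie name and value (`sessionid`, `abc123`) and the URL are placeholders, and the request is prepared without being sent.

```python
import requests
from requests.utils import cookiejar_from_dict, dict_from_cookiejar

# Placeholder cookie copied by hand (e.g. from the browser after logging in).
cookie_dict = {"sessionid": "abc123"}

session = requests.Session()
# Convert the plain dict into a CookieJar and hand it to the session.
session.cookies = cookiejar_from_dict(cookie_dict)

# Prepare (but do not send) a request to see the Cookie header the session adds.
prepared = session.prepare_request(requests.Request("GET", "https://example.com/"))
cookie_header = prepared.headers.get("Cookie")
print(cookie_header)

# The reverse conversion is also available.
round_trip = dict_from_cookiejar(session.cookies)
print(round_trip)
```

Cookies created from a dict carry no domain, so requests will attach them to requests for any host; set a domain explicitly if that matters.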
3. Ways to add cookies to a session object:
ref: https://blog.csdn.net/yong1xin/article/details/88542045
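Besides replacing `session.cookies` wholesale, cookies can be added to the existing jar. A short sketch, assuming placeholder cookie names, values, and hosts (`example.com`, `other.test`); nothing is sent over the network.

```python
import requests

session = requests.Session()

# Option 1: set a single cookie, scoped to one domain.
session.cookies.set("token", "t1", domain="example.com", path="/")

# Option 2: merge a plain dict into the existing jar (no domain restriction).
session.cookies.update({"lang": "en"})


def cookie_header_for(url):
    """Return the Cookie header the session would send to `url`."""
    prepared = session.prepare_request(requests.Request("GET", url))
    return prepared.headers.get("Cookie", "")


same_host = cookie_header_for("https://example.com/")
other_host = cookie_header_for("https://other.test/")
print(same_host)
print(other_host)
```

The domain-scoped cookie is only attached to requests for its own host, while the dict-based cookie goes to every host.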
Demo:
import json

import requests
import urllib3
from requests.cookies import RequestsCookieJar

# Suppress the InsecureRequestWarning triggered by verify=False
urllib3.disable_warnings()

url = 'https://12apxqa2019.gencos.com/APXLogin/api/authenticate'
# r = requests.get(url)
getPayload = {'loginname': 'admin', 'password': 'advs'}
r = requests.get(url, getPayload, verify=False)
r.encoding = 'utf-8'
print("url is", r.url)
# The sample response headers below show Content-Length: 0,
# so print the raw text instead of calling r.json()
print("body is: ", r.text)
print(r.status_code)
print(r.encoding)
print("Cookie is: ", r.cookies)
print("headers is: ", r.headers)

# Sample response headers returned by the login request:
# headers = {'Server': 'Microsoft-IIS/10.0', 'X-Powered-By': 'ASP.NET',
#   'Set-Cookie': 'AOAuth1=SessionCode=6F6A2645-F7F0-49B7-9E06-AB2C8560817A; expires=Mon, 14 Oct 2019 14:16:08 GMT;secure= ;path=/;version=1;HttpOnly, AODBID1=DatabaseIdentifierCode=MQAyAEEAUABYAFEAQQAyADAAMQA5AC4AQQBQAFgARgBpAHIAbQAEQ; expires=Mon, 14 Oct 2019 14:16:08 GMT;secure= ;path=/;version=1;HttpOnly',
#   'Date': 'Mon, 14 Oct 2019 13:51:08 GMT', 'Connection': 'close', 'Content-Length': '0'}

# Request headers copied from the browser. This demo does not pass them
# to any request; use headers=headers to send them.
headers = {
    'Accept': 'application/json, text/plain, */*',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9,zh-CN;q=0.8,zh;q=0.7',
    'Connection': 'keep-alive',
    'Cookie': 'AOAuth1=SessionCode=2F9C444F-5C5E-4085-A254-F7C3C6E13E47; AODBID1=DatabaseIdentifierCode=VgBNAEEATwBTAFUAUABEAEIAMgBcAFMAUQBMAF8ATABBAFQASQBOADEAXwAyADAAMQA2AC4AQQBQAFgARgBpAHIAbQAEQ',
    'Host': 'vmaosupdb2.gencos.com',
    'Referer': 'https://vmaosupdb2.gencos.com/APXUILogin/',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-origin',
    'SessionCode': '2F9C444F-5C5E-4085-A254-F7C3C6E13E47',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36',
}

# demo: add the cookies to a requests session object
# Create a CookieJar instance
cookie_jar = RequestsCookieJar()
# Convert the cookies we received into a dict
cookie_dict = requests.utils.dict_from_cookiejar(r.cookies)
# Convert the dict back into a CookieJar
cookies = requests.utils.cookiejar_from_dict(cookie_dict, cookiejar=cookie_jar, overwrite=True)

# Create the session and attach the cookies.
# From here on, make requests through the session; it stores the cookies the server sends back.
session = requests.session()
session.cookies = cookies

url2 = 'https://vmaosupdb2.gencos.com/APXLogin/api/internal/InterestedParty/search?$d=y&$m=n&$s=40&$c=-226021101'
payload2 = json.dumps({
    "Search": {
        "Query": {
            "Joins": [],
            "CriteriaList": [],
            "Entity": "InterestedParty"
        },
        "Options": {
            "PageNumber": 0,
            "PageSize": 50,
            "OrderBy": ["ContactName asc", "ReportHeading asc"],
            "UseDistinct": False
        }
    }
})
r2 = session.post(url2, data=payload2, verify=False)
print("newcookie is: ", r2.cookies)
print("reason is: ", r2.reason)
# Raise an exception on a 4xx/5xx status before reading the body
r2.raise_for_status()
print(r2.text)
print(r2.url)
print(r2.json())
print(r2.status_code)
print(r2.content)
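The point that a session keeps the cookies the server sends back can be checked without a real site. This is a sketch under stated assumptions: a throwaway local server (`DemoHandler`, with `/login` and `/whoami` paths invented for the example) stands in for the real login API.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in server: /login sets a cookie, any other path echoes the Cookie header."""

    def do_GET(self):
        self.send_response(200)
        if self.path == "/login":
            body = b"logged in"
            self.send_header("Set-Cookie", "sessionid=abc123; Path=/")
        else:
            body = self.headers.get("Cookie", "").encode()
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep the demo output quiet.
        pass


def run_demo():
    # Port 0 lets the OS pick a free port.
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        session = requests.Session()
        session.trust_env = False  # ignore proxy settings from the environment
        session.get(base + "/login", timeout=5)  # server sets the cookie here
        echoed = session.get(base + "/whoami", timeout=5).text  # cookie is sent back
        return session.cookies.get_dict(), echoed
    finally:
        server.shutdown()
        server.server_close()


stored, echoed = run_demo()
print(stored)
print(echoed)
```

If the login request itself is made through the session, as here, no manual cookie copying is needed; the dict-to-CookieJar step only matters when the cookie was obtained elsewhere.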
Original post: https://www.cnblogs.com/TestBetter/p/11674732.html