Admin-path scanning -- firewall bypass technique.py

Date: 2020-09-21 22:04:12

Impersonate a search-engine crawler to get past some firewalls.

# Impersonate a search-engine crawler, or a real user
import time

import requests

headers = {
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    'Upgrade-Insecure-Requests': '1',
    # Real-user UA: Kit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.105 Safari/537.36
    # Crawler UA: Mozilla/5.0 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html)
    # More crawler UAs: https://www.cnblogs.com/iack/p/3557371.html
    'User-Agent': 'Mozilla/5.0 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html)',
    'Sec-Fetch-Dest': 'document',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Sec-Fetch-Site': 'none',
    'Sec-Fetch-Mode': 'navigate',
    'Sec-Fetch-User': '?1',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7',
    'Cookie': 'xxx',  # use the cookie from your current session
}

url = 'http://192.168.0.103:8081/'

# If testing needs a proxy (or a proxy pool), pass proxies=proxy to requests.get
proxy = {
    'http': '127.0.0.1:7777'
}

with open('php_b.txt', encoding='utf-8') as wordlist:
    for path in wordlist:
        path = path.strip()
        if not path:
            continue  # skip blank lines in the wordlist
        urls = url + path
        try:
            code = requests.get(urls, headers=headers, verify=False).status_code
            # Report only paths that exist (200) or are forbidden (403)
            if code == 200 or code == 403:
                print(urls + '|' + str(code))
        except requests.RequestException:
            print('connecting error')
        # time.sleep(3)  # a delay is needed when simulating a user; optional in crawler mode (depends on request rate)
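
The loop above builds each target by concatenating the base URL with a line from the wordlist. As a minimal sketch of that step in isolation, the helpers below tidy the wordlist entries and join them to the base. The names clean_paths and join_url are hypothetical and not part of the original script; the sketch assumes entries may carry Windows line endings, blank lines, or a leading slash.

```python
from typing import Iterable, Iterator


def clean_paths(lines: Iterable[str]) -> Iterator[str]:
    """Yield non-empty wordlist entries with surrounding whitespace removed."""
    for line in lines:
        path = line.strip()  # also drops '\r' left by Windows line endings
        if path:
            yield path


def join_url(base: str, path: str) -> str:
    """Join base and path with exactly one slash between them."""
    return base.rstrip('/') + '/' + path.lstrip('/')


if __name__ == '__main__':
    # Hypothetical sample entries standing in for the contents of php_b.txt
    sample = ['admin.php\n', '\n', '/login.php\r\n']
    for p in clean_paths(sample):
        print(join_url('http://192.168.0.103:8081/', p))
```

Keeping the path handling separate from the request makes it easy to check the generated URLs before anything is sent.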

Original article: https://www.cnblogs.com/trevain/p/13708608.html
