Python Crawler Project

Date: 2021-09-06 00:16:10

Sending an HTTP GET request with urlopen
Sending an HTTP POST request with urlopen
Setting HTTP request headers

Sending an HTTP GET request with urlopen

import urllib.request

response = urllib.request.urlopen("https://www.cnblogs.com")
# Decode the response body as UTF-8
print(response.read().decode('utf-8'))
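
When a GET request needs query parameters, one option is to build the query string with urllib.parse.urlencode and append it to the URL. The sketch below only constructs the URL and makes no network call; the helper name build_get_url and the parameter values are illustrative assumptions, not part of the original post.

```python
from urllib import parse


def build_get_url(base_url, params):
    """Append URL-encoded query parameters to base_url (hypothetical helper)."""
    return base_url + "?" + parse.urlencode(params)


# Example values are assumptions chosen for illustration.
url = build_get_url("http://httpbin.org/get", {"name": "Bill", "age": 30})
print(url)
```

The resulting string can be passed to urllib.request.urlopen in the same way as a fixed URL.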

Hands-on example: JD.com
Retrieving HTTP message information

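import urllib.request

response = urllib.request.urlopen("https://www.jd.com")
print("type of response:", type(response))
# Status code, reason phrase, and HTTP version of the response
print("status:", response.status, " msg:", response.msg, " version", response.version)
# All response headers, first as a message object, then as a list of tuples
print('header:', response.headers, "  \n\n", response.getheaders())
# A single header looked up by name
print('headers-content-type:', response.getheader('Content-Type'))
print(response.read().decode('utf-8'))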
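
The code above decodes the body as UTF-8 unconditionally. A possible refinement is to read the charset from the Content-Type header and fall back to UTF-8 only when none is declared. This is a minimal sketch: pick_charset is a hypothetical helper, and an email.message.Message object stands in for response.headers (which is a subclass of it) so that the example runs without a network call.

```python
from email.message import Message


def pick_charset(headers, default="utf-8"):
    """Return the charset named in Content-Type, or `default` if none is given."""
    return headers.get_content_charset() or default


# Stand-in for response.headers; the header value is an assumed example.
sample = Message()
sample["Content-Type"] = "text/html; charset=GBK"
print(pick_charset(sample))     # gbk
print(pick_charset(Message()))  # utf-8
```

With a real response this would be used as response.read().decode(pick_charset(response.headers)).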

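Sending an HTTP POST request with urlopen

The form data must be converted to bytes before it is passed to urlopen. Supplying the data argument is what turns the request into a POST.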

import urllib.parse
import urllib.request

# URL-encode the form fields, then convert the string to bytes
data = bytes(urllib.parse.urlencode({'name': 'Bill', 'age': 30}), encoding='utf-8')
response = urllib.request.urlopen('http://httpbin.org/post', data=data)
print(response.read().decode('utf-8'))
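
To make the bytes conversion concrete, the following sketch shows the two steps separately without sending anything: urlencode produces a str, and encoding that str yields the bytes object that urlopen expects. The variable names are illustrative assumptions.

```python
from urllib import parse

fields = {"name": "Bill", "age": 30}

query = parse.urlencode(fields)   # str: the URL-encoded form body
body = query.encode("utf-8")      # bytes: the form accepted by urlopen(data=...)

print(query)
print(type(body))
# str.encode and bytes(..., encoding=...) give the same result
print(body == bytes(query, encoding="utf-8"))
```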

Catching a timeout with try/except

import socket
import urllib.error
import urllib.request

try:
    response = urllib.request.urlopen("http://httpbin.org/get", timeout=0.1)
except urllib.error.URLError as e:
    # isinstance() checks whether an object is an instance of a given type;
    # unlike comparing type(), it also matches subclasses.
    if isinstance(e.reason, socket.timeout):
        print('timed out')
print('continue....')
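
A timeout does not always arrive wrapped in URLError: when it occurs while waiting for the response rather than while connecting, urlopen can raise socket.timeout directly. The sketch below handles both cases. The names is_timeout and fetch are hypothetical helpers introduced for illustration, and fetch is only defined here, not called, so the block runs without network access.

```python
import socket
import urllib.error
import urllib.request


def is_timeout(exc):
    """Return True if exc represents a timeout, wrapped in URLError or not."""
    if isinstance(exc, urllib.error.URLError):
        return isinstance(exc.reason, socket.timeout)
    return isinstance(exc, socket.timeout)


def fetch(url, timeout=0.1):
    """Return the response body, or None if the request timed out."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except (urllib.error.URLError, socket.timeout) as exc:
        if is_timeout(exc):
            return None
        raise  # any other failure is left for the caller to handle


# Classification can be checked without touching the network.
print(is_timeout(urllib.error.URLError(socket.timeout("timed out"))))  # True
print(is_timeout(urllib.error.URLError("connection refused")))         # False
```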

Setting HTTP request headers

# Override the User-Agent and Host request headers, add a custom header,
# and submit the request to the web server
from urllib import request, parse

url = "http://httpbin.org/post"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.7113.93 Safari/537.36",
    "Host": "127.0.0.1",
    "who": 'my python'
}
# Named form_data rather than dict to avoid shadowing the built-in type
form_data = {
    'name': 'Bill',
    'age': 30
}

data = bytes(parse.urlencode(form_data), encoding='utf-8')
req = request.Request(url=url, data=data, headers=headers)
print(str(req) + "\n\n")
response = request.urlopen(req)
print(response.read().decode('utf-8'))
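
Before sending, a Request object can be inspected to confirm what will go out. The sketch below builds a request and reads back its method and headers without contacting the server. The helper build_request is a hypothetical name and the field values are assumed examples. Note that Request stores header names via str.capitalize(), so "who" is looked up as "Who" and "User-Agent" as "User-agent".

```python
from urllib import parse, request


def build_request(url, fields, headers):
    """Build a POST Request from form fields and headers (hypothetical helper)."""
    data = parse.urlencode(fields).encode("utf-8")
    return request.Request(url=url, data=data, headers=headers)


req = build_request(
    "http://httpbin.org/post",
    {"name": "Bill", "age": 30},
    {"User-Agent": "my-crawler/0.1", "who": "my python"},
)

print(req.get_method())              # POST, because data was supplied
print(req.get_header("Who"))         # my python
print(req.get_header("User-agent"))  # my-crawler/0.1
print(req.data)                      # b'name=Bill&age=30'
```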

Source: https://www.cnblogs.com/Zeker62/p/15227504.html
