
1.4.1 Downloading a web page with Python

Date: 2019-05-06 21:40:18
# -*- coding: utf-8 -*-

'''
Created on 2019-04-27

@author: lenovo
'''

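# First attempt with urllib3. Note that connection_from_url() returns a
# connection pool object, not the page content.
# import urllib3
# def download(url):
#     return urllib3.connection_from_url(url)
# 
# print(download('http://now.qq.com'))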





# In Python 3, urllib2 has been replaced by urllib.request

# import urllib.request
# def download(url):
#     return urllib.request.urlopen(url).read()
# 
# print(download('https://baijiahao.baidu.com/s?id=1632775818269407606&wfr=spider&for=pc'))


# import urllib.request
# def download(url):
#     print("Downloading:" + url)
#     try:
#         html = urllib.request.urlopen(url).read()
#     except urllib.request.URLError as e:
#         print("Download error:" , e.reason)
#         html = None
#     return html
# 
# print(download("htp://www.baidu.co"))


# import urllib.request
# def download(url, num_retries=2):
#     try:
#         html = urllib.request.urlopen(url).read()
#     except urllib.request.URLError as e:
#         print("Download error:" , e.reason)
#         html = None
#         if num_retries > 0 :
#             if hasattr(e, "code") and 500 <= e.code < 600 :
#                 return download(url, num_retries-1)
#     return html
#     
# # print(download("http://httpstat.us/500"))
# print(download("http://www.meetup.com/"))

import urllib.request
import urllib.error


def download(url, user_agent="wswp", num_retries=2):
    """Download url with a custom User-agent; retry on 5xx server errors."""
    print("Downloading:", url)
    headers = {"User-agent": user_agent}
    request = urllib.request.Request(url, headers=headers)
    try:
        html = urllib.request.urlopen(request).read()
    except urllib.error.URLError as e:
        print("Download error:", e.reason)
        html = None
        # Retry only on 5xx responses; HTTPError carries a .code attribute.
        if num_retries > 0 and hasattr(e, "code") and 500 <= e.code < 600:
            return download(url, user_agent, num_retries - 1)
    return html


print(download("http://www.meetup.com/"))
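
The retry behaviour of download() is hard to observe against a live site, because a real server rarely returns a 5xx error on demand. The following is a minimal offline sketch, not part of the original post: it restates the same download logic (retrying while num_retries > 0, as in the earlier commented-out version) and swaps urllib.request.urlopen for a stand-in using unittest.mock. The names fake_urlopen_factory and run_offline, the example.com URL, and the response body are all hypothetical and exist only for illustration.

```python
import io
import urllib.error
import urllib.request
from unittest import mock


def download(url, user_agent="wswp", num_retries=2):
    """Same logic as the download() function above, minus the progress print."""
    request = urllib.request.Request(url, headers={"User-agent": user_agent})
    try:
        return urllib.request.urlopen(request).read()
    except urllib.error.URLError as e:
        # Retry only on 5xx server errors; HTTPError carries a .code attribute.
        if num_retries > 0 and hasattr(e, "code") and 500 <= e.code < 600:
            return download(url, user_agent, num_retries - 1)
        return None


def fake_urlopen_factory(failures, body=b"<html>ok</html>"):
    """Hypothetical stand-in for urlopen: fails `failures` times with HTTP 500."""
    calls = {"count": 0}

    def fake_urlopen(request):
        calls["count"] += 1
        if calls["count"] <= failures:
            raise urllib.error.HTTPError(
                request.full_url, 500, "Internal Server Error", None, None
            )
        # A BytesIO object supports .read(), which is all download() needs.
        return io.BytesIO(body)

    return fake_urlopen, calls


def run_offline(failures, num_retries=2):
    """Run download() against the fake and return (result, number of attempts)."""
    fake, calls = fake_urlopen_factory(failures)
    with mock.patch("urllib.request.urlopen", fake):
        result = download("http://example.com/", num_retries=num_retries)
    return result, calls["count"]
```

Calling run_offline(2) simulates a server that fails twice and then recovers, so download() should succeed on its third attempt. With run_offline(3) the retries run out and the result is None.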

 


Original: https://www.cnblogs.com/xww115/p/10822196.html
