Scraping images with Python, in preparation for multithreading

Date: 2019-05-06 20:31:33

# -*- coding: utf-8 -*-
# In Python 3, urllib2, urlparse and related modules were merged into the urllib package
import urllib.request
import re

def getHtml(url):
    # Open the connection
    page = urllib.request.urlopen(url)
    # Read the page content (returned as bytes)
    html = page.read()
    # Debug output: print the raw page
    print(html)
    return html

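def getImg(html):
    # Regular expression: capture the .jpg URL in each src="..." that is followed by alt=
    reg = r'src="(.+?\.jpg)" alt='
    imgre = re.compile(reg)
    # Return all matching substrings as a list
    imgList = re.findall(imgre, html.decode('utf-8'))
    x = 0
    for imgurl in imgList:
        # Save each scraped resource to a local file named 0.jpg, 1.jpg, ...
        urllib.request.urlretrieve(imgurl, '%s.jpg' % x)
        x += 1
    return imgList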
# Enter the URL of the site you want to scrape
#url='https://www.113yq.com/pic/html28/index_3.html'
url = "http://pic.yxdown.com/list/0_0_1.html"
html = getHtml(url)
print(getImg(html))
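
The title mentions preparing for multithreading, but the script above downloads the images one at a time. Below is a minimal sketch of one way the download step might be parallelised, assuming the standard-library concurrent.futures.ThreadPoolExecutor (the original post does not use it). The names extract_image_urls, download_all, fetch and max_workers are hypothetical, and the HTML fragment and example.com URLs are made up so the demonstration runs without touching the network. It also resolves relative src paths with urljoin, which the original getImg does not do.

```python
import re
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urljoin

# Same pattern as in getImg: a .jpg URL inside src="..." followed by alt=
IMG_PATTERN = re.compile(r'src="(.+?\.jpg)" alt=')


def extract_image_urls(html_text, base_url):
    """Return absolute .jpg URLs found in html_text (hypothetical helper)."""
    # urljoin turns relative paths such as /img/a.jpg into full URLs
    return [urljoin(base_url, src) for src in IMG_PATTERN.findall(html_text)]


def download_all(img_urls, fetch=urllib.request.urlretrieve, max_workers=4):
    """Save every URL as '<index>.jpg' using a pool of worker threads.

    fetch is called as fetch(url, filename); it defaults to
    urllib.request.urlretrieve, the same call getImg uses.
    Returns the local filenames in the same order as img_urls.
    """
    filenames = ['%s.jpg' % i for i in range(len(img_urls))]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() runs fetch(url, filename) on the worker threads; consuming
        # the result iterator re-raises any exception raised by a worker.
        list(pool.map(fetch, img_urls, filenames))
    return filenames


# Offline demonstration: made-up HTML and a stand-in fetch that only records calls
sample = ('<img src="/img/a.jpg" alt="first">'
          '<img src="http://example.com/b.jpg" alt="second">')
urls = extract_image_urls(sample, "http://example.com/list/index.html")
calls = []
names = download_all(urls, fetch=lambda url, name: calls.append((url, name)))
print(urls)
print(names)
```

To download for real, leave out the fetch argument so that urllib.request.urlretrieve is used. The worker count of 4 is an arbitrary starting point, not a tuned value.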

--------------------- 
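Author: 热心市民大G
Source: CSDN
Original article: https://blog.csdn.net/tyt_xiaotao/article/details/80209398
Copyright notice: This is an original article by the blogger. Please include a link to the original post when reposting.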

Original: https://www.cnblogs.com/xiaohe520/p/10821679.html
