Python code snippet: scraping image URLs from a web page

Date: 2020-02-17 22:55:33

import requests
import re
import pymysql

# Connect to the database
db = pymysql.connect(host='127.0.0.1', port=3306, db='db', user='root', passwd='root', charset='utf8')
cursor = db.cursor()
# cursor.execute('select * from table1')
# print(cursor.fetchall())

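def getImagesList(page=1):
    '''Fetch one list page and save each image's title and URL to the database.'''
    html = requests.get("http://www.abc.com/photo/list/?page={}".format(page)).text

    # Regular expression: capture the lazy-load image URL and the alt text
    reg = r'data-original="(.*?)".*?alt="(.*?)"'
    # Precompile the pattern; re.S lets "." match newlines as well
    reg = re.compile(reg, re.S)
    imagesList = reg.findall(html)
    for i in imagesList:
        # print(i)
        image_url = i[0]
        image_title = i[1]
        # Bound parameters let the driver escape quotes in titles and URLs
        cursor.execute("insert into tablea(`name`,`url`) values(%s,%s)", (image_title, image_url))
        print('saving')
        db.commit()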

for i in range(1,101):
    getImagesList(i)
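
The extraction and insert steps can be tried offline before pointing the script at a live site and a MySQL server. The following is only a minimal sketch under stated assumptions: `SAMPLE_HTML` is made-up markup (the real page structure is not shown in the post), the helper names `extract_images`, `save_images` and `demo` are hypothetical, and the standard-library `sqlite3` module stands in for `pymysql`. Note that `sqlite3` uses `?` placeholders where `pymysql` uses `%s`.

```python
import re
import sqlite3

# Hypothetical sample markup; the real page structure is an assumption.
SAMPLE_HTML = '''
<img class="lazy" data-original="http://www.abc.com/img/1.jpg"
     alt="First photo">
<img class="lazy" data-original="http://www.abc.com/img/2.jpg" alt="It's second">
'''

# re.S lets "." match newlines, so an attribute on the next line is still found.
IMAGE_PATTERN = re.compile(r'data-original="(.*?)".*?alt="(.*?)"', re.S)


def extract_images(html):
    """Return a list of (url, title) tuples found in html."""
    return IMAGE_PATTERN.findall(html)


def save_images(conn, images):
    """Insert (url, title) pairs using bound parameters instead of string formatting."""
    conn.executemany(
        "INSERT INTO tablea(name, url) VALUES (?, ?)",
        [(title, url) for url, title in images],
    )
    conn.commit()


def demo():
    """Run the whole pipeline against an in-memory database and return the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablea(name TEXT, url TEXT)")
    save_images(conn, extract_images(SAMPLE_HTML))
    rows = conn.execute("SELECT name, url FROM tablea ORDER BY url").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo())
```

The second sample title contains an apostrophe on purpose: a statement built with `str.format` would be broken by it, while bound parameters store it unchanged.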

Original: https://www.cnblogs.com/freeliver54/p/12323792.html
