
[Crawler] Downloading Bing images by keyword

Posted: 2021-02-28 00:09:52

 

 1 """
 2 关键是找到正确的url,然后判断组成url的参数的规律,多观察几个url即可得出规律
 3 """
 4 
 5 import requests
 6 from fake_useragent import UserAgent
 7 import os
 8 from lxml import etree
 9 import json
10 import time
11 
12 headers = {User-Agent: UserAgent().random}
13 
14 keyword = "周星驰"
15 url = https://cn.bing.com/images/async?q={}&first={}&count=35&relp={}&tsc=ImageHoverTitle&mmasync=1
16 
17 if not os.path.exists(keyword):
18     os.mkdir(keyword)
19 os.chdir(keyword)
20 
21 first = 35
22 relp = 35
23 count = 1
24 
25 while True:
26     try:
27         html_str = requests.get(url.format(keyword, first, relp), headers=headers).content.decode()
28         html = etree.HTML(html_str)
29         pics_url = html.xpath("//a[@class=‘iusc‘]/@m")
30 
31         for pic_url in pics_url:
32             pic_url = json.loads(pic_url)[turl]
33             res = requests.get(pic_url, headers=headers).content
34 
35             with open({}.jpg.format(count), "wb") as f:
36                 f.write(res)
37 
38             print(第{}张已下载完成.format(count))
39 
40             count += 1
41             time.sleep(0.5)
42 
43         first += 35
44         relp += 35
45 
46         if first > 35 * 5:
47             break
48 
49     except Exception as e:
50         print(e)
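The script above saves the `turl` from each result's `m` attribute, which normally points at Bing's thumbnail. If the full-resolution original is wanted instead, the same JSON usually also carries a `murl` field. The helper below (a hypothetical `pick_image_url`, not part of the original script) is a minimal sketch under that assumption; Bing's markup can change, so treat the field names as illustrative rather than guaranteed.

import json

def pick_image_url(m_attr: str) -> str:
    """Prefer the full-resolution URL if present, fall back to the thumbnail.

    Assumes the JSON in the "m" attribute contains "murl" (original image)
    and "turl" (thumbnail); both field names are assumptions about Bing's
    current markup, not guarantees.
    """
    info = json.loads(m_attr)
    return info.get("murl") or info["turl"]

In the loop above, replacing json.loads(pic_url)["turl"] with pick_image_url(pic_url) would download the originals where available.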

 


Original: https://www.cnblogs.com/waterr/p/14456307.html
