Day 5

Posted: 2019-06-29 12:14:49

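Scraping JD product information with the selenium library:
    Import the selenium library.

    Wrap the work in try/except (Python's equivalent of try-catch) so that connection failures and other exceptions are detected and caught.

    Send a GET request to the JD home page.

    Find the search input box by its id.

    Use send_keys to type the search term into that element.

    Use send_keys again with Keys.ENTER to press Enter and run the search.

    Collect every product with find_elements_by_class_name (written find_elements(By.CLASS_NAME, ...) since Selenium 4.3, which removed the find_element(s)_by_* helpers).

    Loop over the products and read each one's name, URL (via the element's get_attribute() method), price, and review count. The name, for example, comes from:

    find_element_by_css_selector('.p-name em').text

    Finally, append the results to the file jd.txt.

    Close the driver.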
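The try/except/finally flow in the steps above can be tried without a browser. The following is a minimal sketch, not the post's own code: FakeDriver is a hypothetical stand-in for webdriver.Chrome() whose get() always fails, and its quit() stands in for shutting the browser down.

```python
class FakeDriver:
    """Hypothetical stand-in for webdriver.Chrome(), for illustration only."""

    def __init__(self):
        self.closed = False

    def get(self, url):
        # Simulate a failed connection to the site.
        raise ConnectionError(f"cannot reach {url}")

    def quit(self):
        # Record that the (fake) browser was shut down.
        self.closed = True


def run(driver):
    """Make one request attempt and return the error message, if any."""
    error = None
    try:
        driver.get("https://www.jd.com/")
    except Exception as e:
        # The failure is caught and recorded instead of crashing the script.
        error = str(e)
    finally:
        # Runs on success and on failure, so the browser is always released.
        driver.quit()
    return error


driver = FakeDriver()
message = run(driver)
print(message)
print(driver.closed)
```

Because the cleanup sits in the finally clause, the driver is shut down even though get() raised, which is the reason the script closes the browser there rather than at the end of the try block.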

from selenium import webdriver
from selenium.webdriver.common.by import By
# Keys provides special keys such as ENTER
from selenium.webdriver.common.keys import Keys
import time

driver = webdriver.Chrome()

# Code that may raise goes inside the try block
try:
    # Implicit wait: give elements up to 10 seconds to appear
    driver.implicitly_wait(10)

    # Send a request to the JD home page
    driver.get('https://www.jd.com/')

    # Find the search input box by its id.
    # find_element(By...) works in Selenium 3 and 4; the find_element_by_*
    # helpers used when this was first written were removed in Selenium 4.3.
    input_tag = driver.find_element(By.ID, 'key')

    # Type the search term into the input box
    input_tag.send_keys('中华字典')

    # Press the Enter key to run the search
    input_tag.send_keys(Keys.ENTER)

    time.sleep(3)

    # Scrape the following for every product in the results:
    #     name, url, price, review count
    # find_element returns one match; find_elements returns a list of all matches
    good_list = driver.find_elements(By.CLASS_NAME, 'gl-item')
    # print(good_list)

    # Loop over each product
    for good in good_list:
        # URL of the product detail page, found with a CSS selector
        good_url = good.find_element(By.CSS_SELECTOR, '.p-img a').get_attribute('href')
        print(good_url)

        # Name
        good_name = good.find_element(By.CSS_SELECTOR, '.p-name em').text
        print(good_name)

        # Price
        good_price = good.find_element(By.CLASS_NAME, 'p-price').text
        print(good_price)

        # Review count
        good_commit = good.find_element(By.CLASS_NAME, 'p-commit').text
        print(good_commit)

        str1 = f'''
        url: {good_url}
        name: {good_name}
        price: {good_price}
        reviews: {good_commit}
        \n
        '''
        # Append the product information to a text file
        with open('jd.txt', 'a', encoding='utf-8') as f:
            f.write(str1)

    time.sleep(10)

# Catch and report any exception
except Exception as e:
    print(e)

# Always shut the browser and driver down, whether or not an error occurred
finally:
    driver.quit()
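The price and review count are stored above as raw text. If numbers are wanted later, they can be parsed out afterwards. The sketch below is an illustration only: the helper names parse_price and parse_review_count are hypothetical, and the text formats ('￥39.80' for a price, '2万+条评价' or '5000+条评价' for a review count, where 万 means 10,000) are assumptions about what the page returns, not something the listing confirms.

```python
import re
from typing import Optional

# Assumed text formats (not confirmed by the scraper listing):
#   price:   "￥39.80"
#   reviews: "2万+条评价" or "5000+条评价"  (万 = 10,000)


def parse_price(text: str) -> Optional[float]:
    """Return the first decimal number in the price text, or None if absent."""
    # Drop thousands separators so "1,299.00" is read as one number.
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None


def parse_review_count(text: str) -> Optional[int]:
    """Return the approximate review count, expanding a trailing 万 to 10,000."""
    match = re.search(r"(\d+(?:\.\d+)?)(万)?", text)
    if not match:
        return None
    value = float(match.group(1))
    if match.group(2):
        value *= 10_000
    return int(round(value))


print(parse_price("￥39.80"))
print(parse_review_count("2万+条评价"))
```

Both helpers return None when no number is present, so a product with missing or unexpected text does not stop the loop. The counts are lower bounds at best, since the page text ends in "+".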


Original post: https://www.cnblogs.com/feijigege/p/11106139.html
