
My first crawler program: scraping images

Posted: 2020-09-17 18:31:24

from typing import Dict

import os
import requests
from bs4 import BeautifulSoup

myheaders: Dict[str, str] = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36'}
url = 'http://www.cntour.cn/'
strtext = requests.get(url, headers=myheaders)
soup = BeautifulSoup(strtext.text, 'lxml')
# Links in the front-page news list (selected here but not used below)
data = soup.select('#main>div>div.mtop.firstMod.clearfix>div.centerBox>ul.newsList>li>a')

fileDir = 'tupian'
dataList = soup.find_all('img')

if not os.path.exists(fileDir):
    print('create a filepath')
    os.mkdir(os.path.join(os.getcwd(), fileDir))

fileDir = os.path.join(os.getcwd(), fileDir)
j = 1
for b in dataList:
    picName = str(j) + '.jpg'
    # Assumes every src is a site-relative path; see the note below on absolute URLs
    picUrl = url + b.get('src')
    fPath = os.path.join(fileDir, picName)
    print(fPath)
    result = requests.get(url=picUrl)
    with open(fPath, 'wb') as f:  # write each image to disk
        f.write(result.content)
    j = j + 1

print('success')
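The loop above builds each image URL by string-concatenating the page URL with the img tag's src attribute, which produces a broken URL whenever src is already absolute (e.g. hosted on a CDN) or protocol-relative. A safer way to resolve the src is the standard library's urllib.parse.urljoin; this is a minimal sketch, and the sample paths here are made up for illustration:

```python
from urllib.parse import urljoin

base = 'http://www.cntour.cn/'

# A relative src is resolved against the base URL
print(urljoin(base, 'images/banner.jpg'))    # http://www.cntour.cn/images/banner.jpg
print(urljoin(base, '/images/banner.jpg'))   # http://www.cntour.cn/images/banner.jpg

# An absolute src is returned unchanged instead of being mangled
print(urljoin(base, 'http://cdn.example.com/pic.jpg'))  # http://cdn.example.com/pic.jpg

# A protocol-relative src inherits the base URL's scheme
print(urljoin(base, '//cdn.example.com/pic.jpg'))       # http://cdn.example.com/pic.jpg
```

In the loop this would replace the concatenation with `picUrl = urljoin(url, b.get('src'))`.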


Original: https://www.cnblogs.com/my85016629/p/13686326.html
