
Python crawler: a simple way to scrape job listings with urllib.request and regular expressions

Date: 2020-04-08 10:05:59

1: Scraping job listings with urllib.request and regular expressions

 

# coding:utf-8

import re
# import requests  # only needed for the commented-out requests variant below
import urllib.request

# Extract data from a web page with urllib and re

'''
url = 'https://search.51job.com/list/020000,000000,0124,01,9,99,%2520,2,1.html?lang=c&stype=&postchannel=0000&workyear=99&cotype=99&degreefrom=99&jobterm=99&companysize=99&providesalary=99&lonlat=0%2C0&radius=-1&ord_field=0&confirmdate=9&fromType=&dibiaoid=0&address=&line=&specialarea=00&from=&w'
# response = requests.get(url)
# response.encoding='gbk'
# wbdata =response.text

wbdata=urllib.request.urlopen(url).read().decode('gbk')
# print(len(wbdata))

pat ='<a target="_blank" title="(.*?)"'
data = re.compile(pat).findall(wbdata)
# print(data)

# Write the results to a file
# with open('jobs.txt','w') as f:
#     for k in range(len(data)):
#         print(data[k])
#         f.write(data[k]+'\n')

# Print the results to the console
for k in range(len(data)):
    print(data[k])
'''
print("--"*20)
# Timeout setting
# for i in range(0,20):
#     try:
#         file=urllib.request.urlopen("http://baidu.com",timeout=0.2).read().decode('gbk')
#         print(len(file))
#     except Exception as err:
#         print("Exception: the page may have timed out! "+str(err))

# GET request in practice: fetch job listings from 51job
keywd="Python"
# pat1 captures (title, link, salary) for each result row; pat2 captures the salary only
pat1='<div class="el">.*?title="(.*?)" href="(http.*?)".*?<span class="t4">(.*?)</span>.*?</div>'
pat2='<span class="t4">(.*?)</span>'

# keywd=urllib.request.quote(keywd)
for i in range(1,11):
    url="https://search.51job.com/list/020000,000000,0000,00,9,99,"+keywd+",2,"+str(i)+".html"
    file=urllib.request.urlopen(url)
    # print(file.geturl())
    data=file.read().decode('gbk')
    # Page header ("第N页" means "page N")
    print("----------------第"+str(i)+"页-----------------")
    # re.S lets "." match newlines, so one pattern can span a whole result row
    rst1=re.compile(pat1,re.S).findall(data)
    # rst2 = re.compile(pat2).findall(data)
    # rst=list(zip(rst1,rst2))
    for j in range(0,len(rst1)):
        print(rst1[j])
        # Append each result tuple to jobs.txt
        with open('jobs.txt','a') as f:
            f.write(str(rst1[j]) + '\n')

    # rst2 = re.compile(pat2).findall(data)
    # for z in range(0, len(rst2)):
    #     print(rst2[z])
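The commented-out timeout test in the listing can be turned into a small reusable helper. The following is only a minimal sketch, not the author's code: `fetch_text`, its `retries` parameter, and the injectable `opener` argument are hypothetical names introduced for illustration. Only the `urllib.request.urlopen(url, timeout=...)` call and the broad `except Exception` come from the listing.

```python
import urllib.request


def fetch_text(url, encoding="gbk", timeout=5.0, retries=3,
               opener=urllib.request.urlopen):
    """Return the decoded body of `url`, retrying when a request fails.

    `opener` is a hypothetical injection point so the helper can be
    exercised without network access; it defaults to urlopen.
    """
    last_err = None
    for attempt in range(1, retries + 1):
        try:
            # timeout is in seconds, as in the listing's timeout=0.2
            with opener(url, timeout=timeout) as resp:
                return resp.read().decode(encoding)
        except Exception as err:  # same broad catch as the listing
            last_err = err
            print("attempt %d failed: %s" % (attempt, err))
    raise RuntimeError("giving up on " + url) from last_err
```

A call such as `fetch_text("http://baidu.com", timeout=0.2)` would mirror the original experiment. With a timeout that short, some attempts are expected to fail and be retried.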
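To see what `pat1` captures without requesting the site, the pattern can be run against a hand-written fragment. This is a sketch under stated assumptions: the HTML in `SAMPLE` is invented and only imitates the structure the pattern expects (the post does not show the real 51job markup), the `example.com` links are placeholders, and `build_url` is a hypothetical helper. It percent-encodes the keyword with `urllib.parse.quote`, the documented home of the `quote` call that the listing leaves commented out; whether the site expects UTF-8 percent-encoding is also an assumption.

```python
import re
import urllib.parse

PAT1 = ('<div class="el">.*?title="(.*?)" href="(http.*?)".*?'
        '<span class="t4">(.*?)</span>.*?</div>')

# Invented sample markup (assumption): two result rows, the second without a salary.
SAMPLE = '''
<div class="el">
  <p class="t1"><a target="_blank" title="Python开发工程师" href="https://example.com/job/1.html">Python开发工程师</a></p>
  <span class="t4">1-1.5万/月</span>
</div>
<div class="el">
  <p class="t1"><a target="_blank" title="硬件工程师(python)" href="https://example.com/job/2.html">硬件工程师</a></p>
  <span class="t4"></span>
</div>
'''


def build_url(keyword, page):
    """Hypothetical helper: percent-encode the keyword, then fill in the search URL."""
    kw = urllib.parse.quote(keyword)
    return ("https://search.51job.com/list/020000,000000,0000,00,9,99,"
            + kw + ",2," + str(page) + ".html")


# re.S lets "." cross line breaks, so each match spans one whole <div class="el"> row.
rows = re.compile(PAT1, re.S).findall(SAMPLE)
for row in rows:
    print(row)  # one (title, link, salary) tuple per row

print(build_url("Python 爬虫", 1))
```

Because every quantifier is lazy, each match stops at the first closing `</div>` after the salary span. A row with an empty salary span yields an empty string as the third element, which is how the empty salaries in the results arise.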

 

2: The scraped results

----------------第1页-----------------
('自动化测试工程师Selenium', 'https://jobs.51job.com/shanghai-ypq/114603381.html?s=01&t=5', '1-1.5万/月')
('大数据研发工程师', 'https://jobs.51job.com/shanghai/67963188.html?s=01&t=6', '')
('Python爬虫工程师', 'https://jobs.51job.com/shanghai-pdxq/121129060.html?s=01&t=0', '1-1.5万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-pdxq/114332244.html?s=01&t=0', '2-4万/月')
('Python爬虫工程师', 'https://jobs.51job.com/shanghai/120028078.html?s=01&t=0', '1-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/119981428.html?s=01&t=0', '1-1.5万/月')
('python开发工程师/大数据建模', 'https://jobs.51job.com/shanghai-hpq/114718480.html?s=01&t=0', '6-8千/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai/120395604.html?s=01&t=0', '1.2-1.5万/月')
('C/C++/Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/120909208.html?s=01&t=0', '15-20万/年')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/89716807.html?s=01&t=0', '1-1.5万/月')
('erlang/python服务器开发工程师', 'https://jobs.51job.com/shanghai-xhq/98416948.html?s=01&t=0', '1.5-3.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120657799.html?s=01&t=0', '1.5-2万/月')
('Python开发工程师 (MJ000231)', 'https://jobs.51job.com/shanghai-jaq/117808653.html?s=01&t=0', '0.6-1万/月')
('python开发工程师-A0122', 'https://jobs.51job.com/shanghai-jaq/119864919.html?s=01&t=0', '1-1.5万/月')
('高级python后端工程师(AI平台)', 'https://jobs.51job.com/shanghai-ypq/120959109.html?s=01&t=0', '1.5-2万/月')
('初级Python工程师', 'https://jobs.51job.com/shanghai-pdxq/116357032.html?s=01&t=0', '10-15万/年')
('Python开发工程师', 'https://jobs.51job.com/shanghai/115980776.html?s=01&t=0', '')
('python数据分析', 'https://jobs.51job.com/shanghai-bsq/120583326.html?s=01&t=0', '1.5-2.5万/月')
('Python开发工程师', 'https://jobs.51job.com/hefei/121179694.html?s=01&t=0', '0.8-1.2万/月')
('Python开发专家', 'https://jobs.51job.com/shanghai-mhq/121173046.html?s=01&t=0', '3-3.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-jaq/120911076.html?s=01&t=0', '1-1.5万/月')
('python Web开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120283479.html?s=01&t=0', '1-1.3万/月')
('python web后台开发', 'https://jobs.51job.com/shanghai-cnq/119461330.html?s=01&t=0', '0.6-1.5万/月')
('Python开发工程师(金融科技)', 'https://jobs.51job.com/shanghai-xhq/117251350.html?s=01&t=0', '0.7-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/107196999.html?s=01&t=0', '0.8-1.5万/月')
('Python高级软件工程师', 'https://jobs.51job.com/shanghai-mhq/120966514.html?s=01&t=0', '21-35万/年')
('P0076-python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/121163335.html?s=01&t=0', '1-2万/月')
('Python 应用开发工程师', 'https://jobs.51job.com/shanghai-cnq/121160736.html?s=01&t=0', '0.7-1.3万/月')
('软件开发工程师(GO/Lua/python)', 'https://jobs.51job.com/shanghai-jaq/119529417.html?s=01&t=0', '1.5-2.5万/月')
('运维开发工程师(Python开发)', 'https://jobs.51job.com/shanghai/119187608.html?s=01&t=0', '2.5-4万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/98528369.html?s=01&t=0', '1-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-jaq/120386104.html?s=01&t=0', '1-2.2万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-mhq/118338654.html?s=01&t=0', '0.8-1万/月')
('PFS-Python Developer', 'http://durrgroup.51job.com/jobinfo1.html?id=120187451', '')
('python数据分析师', 'https://jobs.51job.com/shanghai-pdxq/112471902.html?s=01&t=0', '1-1.5万/月')
('25923-Python高级工程师(深圳)', 'https://jobs.51job.com/shanghai-hpq/118208611.html?s=01&t=0', '')
('硬件工程师(python)', 'https://jobs.51job.com/shanghai/120159337.html?s=01&t=0', '')
('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/121106633.html?s=01&t=0', '1.3-1.9万/月')
('Python后端开发工程师', 'https://jobs.51job.com/shanghai-xhq/118007556.html?s=01&t=0', '1.2-1.8万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120944265.html?s=01&t=0', '1-1.5万/月')
('Python全栈工程师', 'https://jobs.51job.com/shanghai-mhq/109925368.html?s=01&t=0', '0.8-1.6万/月')
('python开发', 'https://jobs.51job.com/shanghai-hkq/117623816.html?s=01&t=0', '2-3万/月')
('高级Python/Django后端软件工程师', 'https://jobs.51job.com/shanghai-cnq/109764789.html?s=01&t=0', '1.5-2万/月')
('软件工程师(汽车行业优先)精通python', 'https://jobs.51job.com/shanghai-pdxq/120189871.html?s=01&t=0', '0.8-2万/月')
('Python 爬虫工程师(薪智)', 'https://jobs.51job.com/shanghai-mhq/119329837.html?s=01&t=0', '1.5-2万/月')
('高级Python开发工程师', 'https://jobs.51job.com/shanghai/105294644.html?s=01&t=0', '2.5-5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai/101535573.html?s=01&t=0', '1.5-2万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/115644674.html?s=01&t=0', '1.5-2.2万/月')
('Python开发经理', 'https://jobs.51job.com/shanghai-pdxq/114043106.html?s=01&t=0', '2-2.5万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-pdxq/118673386.html?s=01&t=0', '1.8-3万/月')
('高级Python开发工程师', 'https://jobs.51job.com/shanghai-mhq/98639401.html?s=01&t=0', '1.5-2.5万/月')
('python开发', 'https://jobs.51job.com/shanghai-hkq/117622468.html?s=01&t=0', '2-3万/月')
----------------第2页-----------------
('python工程师', 'https://jobs.51job.com/shanghai-sjq/120620177.html?s=01&t=0', '0.8-2万/月')
('Python(Odoo)工程师', 'https://jobs.51job.com/shanghai-pdxq/119746269.html?s=01&t=0', '1-1.5万/月')
('Python/Odoo高级开发工程师', 'https://jobs.51job.com/shanghai/116881344.html?s=01&t=0', '2.5-3万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/116927659.html?s=01&t=0', '0.8-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-cnq/120895491.html?s=01&t=0', '1-1.5万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120349450.html?s=01&t=0', '1.5-2万/月')
('中级后端工程师(python/odoo)', 'https://jobs.51job.com/shanghai-sjq/107147430.html?s=01&t=0', '1.5-2万/月')
('实习生(Python开发)', 'https://jobs.51job.com/shanghai/119951678.html?s=01&t=0', '1-5千/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/120474003.html?s=01&t=0', '0.8-1.2万/月')
('Python后端开发', 'https://jobs.51job.com/shanghai-xhq/121023917.html?s=01&t=0', '1.5-2.5万/月')
('Python软件工程师', 'https://jobs.51job.com/shanghai-xhq/117038800.html?s=01&t=0', '1-1.5万/月')
('初级python/R 工程师', 'https://jobs.51job.com/shanghai-xhq/116055667.html?s=01&t=0', '0.5-1万/月')
('python', 'https://jobs.51job.com/shanghai-pdxq/120778793.html?s=01&t=0', '6.5-9.5千/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120941553.html?s=01&t=0', '0.8-1.5万/月')
('python开发项目经理', 'https://jobs.51job.com/shanghai-ypq/117944790.html?s=01&t=0', '2.6-4万/月')
('Python/PHP后端程序员', 'https://jobs.51job.com/shanghai-xhq/114897348.html?s=01&t=0', '1-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/119493296.html?s=01&t=0', '15-25万/年')
('Python开发工程师(外汇岗位)', 'https://jobs.51job.com/shanghai-pdxq/120797741.html?s=01&t=0', '1-1.6万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/112330285.html?s=01&t=0', '1-1.7万/月')
('Python开发工程师(***)', 'https://jobs.51job.com/shanghai/120694247.html?s=01&t=0', '1000元/天')
('python开发', 'https://jobs.51job.com/shanghai-hkq/117614339.html?s=01&t=0', '2-3万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/118308129.html?s=01&t=0', '1.1-2万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/119981957.html?s=01&t=0', '1.5-2万/月')
('Python开发工程师', 'https://jobs.51job.com/shenzhen/118443903.html?s=01&t=0', '3-4万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-sjq/120173104.html?s=01&t=0', '1.5-2万/月')
('python 工程师', 'https://jobs.51job.com/shanghai-xhq/120570442.html?s=01&t=0', '1.5-2万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-mhq/120105386.html?s=01&t=0', '1.5-3万/月')
('软件开发工程师(Python)', 'https://jobs.51job.com/shenzhen-nsq/118492627.html?s=01&t=0', '0.6-1.2万/月')
('python开发', 'https://jobs.51job.com/shanghai-pdxq/118924590.html?s=01&t=0', '6-8千/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/115691023.html?s=01&t=0', '1.3-1.8万/月')
('Python工程师', 'https://jobs.51job.com/shanghai-cnq/119488897.html?s=01&t=0', '1.5-2万/月')
('python开发', 'https://jobs.51job.com/shanghai-xhq/115310459.html?s=01&t=0', '6-8千/月')
('大数据算法开发/Python开发', 'https://jobs.51job.com/shanghai-pdxq/120055740.html?s=01&t=0', '1.5-2万/月')
('Python开发(09)', 'https://jobs.51job.com/shanghai-pdxq/120786708.html?s=01&t=0', '0.8-1.1万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120472220.html?s=01&t=0', '1-1.5万/月')
('Python工程师', 'https://jobs.51job.com/shanghai/119281603.html?s=01&t=0', '2-3.5万/月')
('Python高级开发工程师', 'https://jobs.51job.com/shanghai-hkq/119893009.html?s=01&t=0', '1.5-2万/月')
('Python运维开发工程师', 'https://jobs.51job.com/shanghai-mhq/119681490.html?s=01&t=0', '1.5-2万/月')
('Python工程师', 'https://jobs.51job.com/shanghai/102367533.html?s=01&t=0', '1.5-2万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-ypq/120946896.html?s=01&t=0', '1.5-2万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-hpq/109016702.html?s=01&t=0', '2.3-2.8万/月')
('Senior Python Software Engineer', 'https://jobs.51job.com/shanghai/119066163.html?s=01&t=0', '1.5-3万/月')
('高级软件工程师  Golang/Python', 'https://jobs.51job.com/shanghai-cnq/120220700.html?s=01&t=0', '2-6万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai/114088389.html?s=01&t=0', '1-1.5万/月')
('Python 开发工程师', 'https://jobs.51job.com/shanghai-pdxq/120139836.html?s=01&t=0', '0.8-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/115129775.html?s=01&t=0', '1.5-2万/月')
('python开发', 'https://jobs.51job.com/shanghai-hkq/117611952.html?s=01&t=0', '2-3万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-pdxq/110125924.html?s=01&t=0', '1-1.5万/月')
('Python开发工程师', 'https://jobs.51job.com/shanghai-xhq/118610557.html?s=01&t=0', '1-2万/月')
('Python 架构', 'https://jobs.51job.com/shanghai-pdxq/120317481.html?s=01&t=0', '2.5-3.5万/月')
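Since each result is a `(title, link, salary)` tuple, a structured file may be easier to reuse later than the `str(tuple)` lines appended to `jobs.txt`. The following is a minimal sketch using the standard `csv` module, not something the original post does: `save_rows` and the file name `jobs.csv` are hypothetical, and the two sample rows are copied from the results above.

```python
import csv
import os
import tempfile


def save_rows(rows, path="jobs.csv"):
    """Append (title, link, salary) tuples to a CSV file (hypothetical helper)."""
    # newline="" is what the csv module expects; utf-8 keeps the Chinese titles intact
    with open(path, "a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)


rows = [
    ("Python爬虫工程师", "https://jobs.51job.com/shanghai-pdxq/121129060.html?s=01&t=0", "1-1.5万/月"),
    ("大数据研发工程师", "https://jobs.51job.com/shanghai/67963188.html?s=01&t=6", ""),
]

# Write to a temporary directory so the demo leaves the working directory untouched
path = os.path.join(tempfile.mkdtemp(), "jobs.csv")
save_rows(rows, path)

# Read the file back to confirm the rows round-trip
with open(path, newline="", encoding="utf-8") as f:
    for record in csv.reader(f):
        print(record)
```

In the scraping loop, `save_rows(rst1)` could replace the inner `open`/`write` pair, which would also open the file once per page instead of once per row.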

 


Original: https://www.cnblogs.com/Jeffrey-xu/p/12657909.html
