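Last night I was bored and wanted to practice some Python, so I wrote a small crawler that fetches the latest updates from 小刀娱乐网 (www.xiaodao.la).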
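
    import re
    import urllib.request

    head = "www.xiaodao.la"


    def get():
        """Download the home page and return the slice that holds the update list."""
        data = urllib.request.urlopen("http://www.xiaodao.la").read()
        # Strip styling and whitespace so the pattern below can match one compact string.
        # The original repeats the " " replacement three times; the targets may have been
        # different whitespace characters that were flattened when the post was rendered.
        html = (
            data.decode("gbk")
            .replace("font-weight:bold;", "")
            .replace(" ", "")
            .replace("\r\n", "")
            # The original line is cut off after one more `.replace("`; its arguments are lost.
        )
        # Keep only the text between the two markers that surround the update list.
        return html[html.find("好卡售"):html.find("20160303184868786878.gif")]


    html = get()

    # Capture groups: link, title attribute, link text, update time.
    reg = r'href="(.*?)"style="color:#000000;"title="(.*?)"target="_blank">(.*?)</a></div></td><tdwidth=12.5%align=rightnowrap=nowrapstyle="color:#F00;">(.*?)</td>'
    pattern = re.compile(reg)
    matches = pattern.findall(html)

    print("Matched %d items in total" % len(matches))

    for i, item in enumerate(matches):
        print("Item %d:" % (i + 1))
        print("Title: %s\nURL: %s\nUpdated: %s\n" % (item[1], head + item[0], item[3]))
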
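
The script removes whitespace from the page before matching, which is why its pattern contains runs such as `"style=` and `<tdwidth=` with no spaces. The sketch below tries that idea offline. `SAMPLE_HTML` is a made-up fragment (the real page markup is not shown in this post), `strip_whitespace` and `extract_rows` are hypothetical helper names, and `re.sub(r"\s+", "", ...)` is swapped in for the chained `.replace()` calls.

```python
import re

# Hypothetical fragment for illustration; the real page markup may differ.
SAMPLE_HTML = (
    '<td><div><a href="/a/1.html" style="color:#000000;" '
    'title="First post" target="_blank">First post</a></div></td>\r\n'
    '<td width=12.5% align=right nowrap=nowrap style="color:#F00;">03-04</td>'
)

# Same pattern as in the script: link, title attribute, link text, update time.
PATTERN = re.compile(
    r'href="(.*?)"style="color:#000000;"title="(.*?)"target="_blank">(.*?)'
    r'</a></div></td><tdwidth=12.5%align=rightnowrap=nowrapstyle="color:#F00;">(.*?)</td>'
)


def strip_whitespace(text):
    """Remove every whitespace character so that attributes run together."""
    return re.sub(r"\s+", "", text)


def extract_rows(html):
    """Return a list of (href, title, link text, update time) tuples."""
    return PATTERN.findall(strip_whitespace(html))


rows = extract_rows(SAMPLE_HTML)
for href, title, text, updated in rows:
    print(title, href, updated)
```

One side effect to keep in mind: because all whitespace is removed before matching, spaces inside titles disappear as well, so "First post" comes back as "Firstpost". If that matters, a pattern that tolerates whitespace (for example `\s*` between attributes) applied to the unmodified page would avoid it.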
A simple Python crawler that fetches update data from a certain "Dao" site
Original post: https://www.cnblogs.com/zxtceq/p/8985732.html