
Scraping WeChat Official Account Articles with Python

Posted: 2021-09-02 14:18:21
# coding:utf-8
import requests
import json
import urllib3
import time

urllib3.disable_warnings()
# Request URL
url = 'https://mp.weixin.qq.com/mp/profile_ext?'
# Request headers
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36'
}
# Request parameters: each iteration fetches one page of 10 entries
for page in range(0, 100):
    print('Fetching page %s' % (page + 1))
    params = {
        '__biz': 'MzkyMzE3NDE0Mw==',
        # Pass the raw value: requests URL-encodes params itself, so a
        # pre-encoded '%3D%3D' would be encoded a second time
        'uin': 'MTM0MDY0NDYzNw==',
        'key': 'fb814ad12288401f8a3a5728ee746dac164904e28802395f07198592d01ab4f1301b254253dafc981102cbe7b41a0633c98dd007262124ba7bfdf590c0e7c629ca5860466d041a6156f8e0484ed971338fa5a65b10c86568bbf3e50c23692d0ad26a7ccad641290847dc1e601bada55dfab5c3441c873c6d9b5daf1b0465bbd2',
        'offset': page * 10,
        'count': 10,
        'action': 'getmsg',
        'f': 'json'
    }
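    # Send the request and decode the JSON response
    r = requests.get(url, headers=headers, params=params, verify=False).json()
    # general_msg_list is itself a JSON string, so decode it a second time
    msg_list = json.loads(r['general_msg_list'])
    messages = msg_list.get('list') or []
    if not messages:
        # Assumes an empty list means there are no more pages
        break
    for msg in messages:
        info = msg.get('app_msg_ext_info')
        if info is None:
            # Skip entries that carry no article info
            continue
        # Title
        title = info['title']
        print(title)
        # Link
        content_url = info['content_url']
        print(content_url)
        # Publish time (Unix timestamp converted to local time)
        timestamp = msg['comm_msg_info']['datetime']
        published = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(timestamp))
        print(published)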

 


Original: https://www.cnblogs.com/luweiweicode/p/15217609.html
