
13 Basic Elements of the Beautiful Soup Library

Date: 2020-06-04 22:32:36


Example:


"""Basic elements of the Beautiful Soup library"""


import requests
from bs4 import BeautifulSoup

url = "https://python123.io/ws/demo.html"
r = requests.get(url)
demo = r.text
soup = BeautifulSoup(demo, "html.parser")
#print(soup.prettify())

# <title>This is a python demo page</title>
print(soup.title)

tag = soup.a
# <a class="py1" href="http://www.icourse163.org/course/BIT-268001" id="link1">Basic Python</a>
print(tag)
# a
print(soup.a.name)
# p
print(soup.a.parent.name)
# body
print(soup.a.parent.parent.name)
# html
print(soup.a.parent.parent.parent.name)
# [document]
print(soup.a.parent.parent.parent.parent.name)
# {'href': 'http://www.icourse163.org/course/BIT-268001', 'class': ['py1'], 'id': 'link1'}
print(tag.attrs)
# ['py1']
print(tag.attrs['class'])
# http://www.icourse163.org/course/BIT-268001
print(tag.attrs['href'])
# <class 'dict'>
print(type(tag.attrs))
# <class 'bs4.element.Tag'>
print(type(tag))


# Basic Python
print(soup.a.string)
# <p class="title"><b>The demo python introduces several python courses.</b></p>
print(soup.p)
# The demo python introduces several python courses.
print(soup.p.string)
# <class 'bs4.element.NavigableString'>
print(type(soup.p.string))
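
The script above needs network access and indexes `tag.attrs` directly, which raises `KeyError` when an attribute is absent. The following is a minimal offline sketch of the same elements (tag, name, attributes, string). The inline HTML is a hypothetical stand-in for the demo page, not its real content, and `tag.get()` is used so that a missing attribute yields `None` instead of an exception.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the demo page, so the example runs offline.
html = (
    '<html><head><title>Demo</title></head><body>'
    '<p class="title"><b>Hello</b></p>'
    '<p class="course">See '
    '<a href="http://example.com/a" class="py1" id="link1">Basic Python</a>'
    '</p></body></html>'
)
soup = BeautifulSoup(html, "html.parser")

tag = soup.a                    # first <a> tag in the document
name = tag.name                 # tag name as a str
parent_name = tag.parent.name   # name of the enclosing tag
href = tag.get("href")          # single-valued attribute: str
classes = tag.get("class")      # multi-valued attribute: list of str
missing = tag.get("title")      # absent attribute: None, no KeyError
text = tag.string               # NavigableString holding the tag's text

print(name, parent_name, href, classes, missing, text)
```

Because `class` can hold several values, Beautiful Soup returns it as a list, while `href` and `id` come back as plain strings.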


# Type of an HTML comment (Comment)
"""Basic elements of the Beautiful Soup library"""


import requests
from bs4 import BeautifulSoup

# Type of an HTML comment (Comment)
newHTML = "<b><!--This is a comment--></b><p>This is not a comment</p>"
newsoup = BeautifulSoup(newHTML, "html.parser")
# This is a comment
print(newsoup.b.string)
# This is not a comment
print(newsoup.p.string)
# <class 'bs4.element.Comment'>
print(type(newsoup.b.string))
# <class 'bs4.element.NavigableString'>
print(type(newsoup.p.string))
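
As the output above shows, `.string` returns the comment text without the `<!--` and `-->` markers, so printed comment text looks the same as ordinary text. One way to tell them apart is a type check. The sketch below assumes a hypothetical helper named `visible_text`, which is not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment

html = "<b><!--This is a comment--></b><p>This is not a comment</p>"
soup = BeautifulSoup(html, "html.parser")


def visible_text(tag):
    """Return tag.string as a str, or None if it is missing or a comment.

    Hypothetical helper for illustration only.
    """
    s = tag.string
    if s is None or isinstance(s, Comment):
        return None
    return str(s)


b_text = visible_text(soup.b)   # the <b> tag holds only a comment
p_text = visible_text(soup.p)   # the <p> tag holds ordinary text

print(b_text)
print(p_text)
```

`Comment` is a subclass of `NavigableString`, so the check has to test for `Comment` specifically. Testing for `NavigableString` would match both kinds of string.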

 


Source: https://www.cnblogs.com/sruzzg/p/13046881.html
