Fetching page content with Python

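Use urllib.request to fetch the HTML content, then use BeautifulSoup to extract data from it, which completes a simple crawl. getone.find_all retrieves the a.mnav tags (the <a> elements with class "mnav"), as shown in the figure.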

 

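from urllib.request import urlopen

from bs4 import BeautifulSoup

# Fetch the Baidu home page
html = urlopen('http://www.baidu.com')

# Parse the response body with Python's built-in html.parser
getone = BeautifulSoup(html.read(), 'html.parser')

# Find every <a> tag whose CSS class is "mnav"
test_list = getone.find_all('a', 'mnav')

# Print the visible text of each matching link
for test in test_list:
    print(test.get_text())

html.close()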
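
To make explicit what find_all('a', 'mnav') matches, here is a rough sketch of the same extraction using only the standard library's html.parser.HTMLParser instead of BeautifulSoup. This is a swapped-in technique for illustration, not how BeautifulSoup works internally. The class ClassLinkText, the helper link_texts, and the SAMPLE_HTML snippet are all hypothetical; the snippet only imitates a navigation bar and is not Baidu's real markup. Running on a local string also means the example needs no network access.

```python
from html.parser import HTMLParser


class ClassLinkText(HTMLParser):
    """Collect the text of every <a> tag whose class list contains css_class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.texts = []        # text of each matching link, in document order
        self._inside = False   # True while we are inside a matching <a>
        self._buffer = []      # text fragments of the current link

    def handle_starttag(self, tag, attrs):
        if tag != 'a':
            return
        # The class attribute may hold several space-separated names
        classes = (dict(attrs).get('class') or '').split()
        if self.css_class in classes:
            self._inside = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._inside:
            self.texts.append(''.join(self._buffer))
            self._inside = False


def link_texts(markup, css_class):
    """Hypothetical helper: return the text of all <a class=css_class> links."""
    parser = ClassLinkText(css_class)
    parser.feed(markup)
    parser.close()
    return parser.texts


# Made-up sample markup, assumed for illustration only
SAMPLE_HTML = '''
<div id="u1">
  <a class="mnav" href="http://news.baidu.com">News</a>
  <a class="mnav c-font" href="http://map.baidu.com">Maps</a>
  <a class="lb" href="#">Log in</a>
</div>
'''

for text in link_texts(SAMPLE_HTML, 'mnav'):
    print(text)
```

The second link still matches because "mnav" is one of its two class names, and BeautifulSoup's class filter treats a multi-valued class attribute the same way.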

About sun 83 Articles
Born after 1985, a self-described pseudo-artsy young person
