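I am scraping the Brightscope ratings pages, starting at https://www.brightscope.com/ratings/a. Each letter after ratings (a, b, c, ...) has multiple pages. I am trying to write a while loop that goes to each page and, while a certain condition holds, collects all the hrefs (I have not written that part yet). However, when I run the code, the while loop never stops.

How can I fix it so that it visits each page, checks for the condition, and moves on to the next letter when the condition is not found? Before anyone asks: I have searched the page source and do not see any li tags at the point where it keeps running. For example, https://www.brightscope.com/ratings/A/18 is the highest page for A, but the loop keeps going past it.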
import requests
from bs4 import BeautifulSoup
url = "https://www.brightscope.com/ratings/"
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')
hrefs = []
ratings = []
ks = []
pages_scrape = []
for href in soup.findAll('a'):
    if 'href' in href.attrs:
        hrefs.append(href.attrs['href'])
for good_ratings in hrefs:
    if good_ratings.startswith('/ratings/'):
        ratings.append(url[:-9]+good_ratings)
del ratings[0]
del ratings[27:]
count = 1
# So it runs each letter a, b, c, ...
for each_rating in ratings:
    # Pulls the page
    page = requests.get(each_rating)
    # Does its soup thing
    soup = BeautifulSoup(page.text, 'html.parser')
    # Supposed to stay in A, B, C,... until it can't find the 'li' tag
    while soup.find('li'):
        page = requests.get(each_rating+str(count))
        print(page.url)
        count = count+1
        # Keeps running this and never breaks
    else:
        count = 1
        break
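
One thing that stands out in the loop above is that soup is parsed once before the while and never rebuilt inside it, so soup.find('li') keeps returning the same result no matter which page was just requested. The following is only a sketch of one way to restructure it, not a tested fix for this site. The names iter_pages, fetch_html, page_has_items and max_pages are hypothetical, and it assumes that page URLs look like .../ratings/A/18 and that a page past the last one really contains no li tag.

```python
from typing import Callable, Iterator


def fetch_html(url: str) -> str:
    """Download a page and return its HTML text."""
    import requests  # imported here so the paging logic runs without it
    return requests.get(url, timeout=10).text


def page_has_items(html: str) -> bool:
    """Assumed stop condition: the page still contains an <li> tag."""
    from bs4 import BeautifulSoup
    return BeautifulSoup(html, "html.parser").find("li") is not None


def iter_pages(
    base_url: str,
    fetch: Callable[[str], str] = fetch_html,
    has_items: Callable[[str], bool] = page_has_items,
    max_pages: int = 500,  # hypothetical safety cap against endless loops
) -> Iterator[str]:
    """Yield page URLs for one letter until a page fails the condition."""
    for count in range(1, max_pages + 1):
        page_url = f"{base_url.rstrip('/')}/{count}"
        html = fetch(page_url)   # fetch the new page on every iteration
        if not has_items(html):  # and re-check the condition on that page
            break
        yield page_url
```

With the ratings list from the question, it could be used as: for each_rating in ratings: for page_url in iter_pages(each_rating): print(page_url). Because the counter lives inside iter_pages, it starts again at 1 for every letter, and the loop moves on to the next letter instead of leaving the for loop.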