
Web scraping: unable to store all items in a dictionary

  • Sayeed  · Tech Community  · 7 years ago

    I am scraping 9 items from a website. The scraping itself works, but when I try to store the entries in a dictionary, only the last entry ends up in it.

    import requests
    from bs4 import BeautifulSoup
    base_url="http://cbcs.fastvturesults.com/student/1sp15me00"
    d={}
    for page in range(1,10,1):
        r=requests.get(base_url+str(page))
        c=r.content
        soup=BeautifulSoup(c,"html.parser")
        items=soup.find(class_="text-muted")
        if items:
            d["Name"]=items.previous_sibling
            d["USN"]=items.text.replace("(","").replace(")","")
    d
    

    How can I keep all of the items instead of only the last one?

    2 replies  |  up to 7 years ago
        1
  •   Rakesh    7 years ago

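    Use a list of dictionaries to store the data. Your loop writes to the same two keys ("Name" and "USN") on every pass, so each page overwrites the previous one.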
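
    To see the difference in isolation, here is a minimal sketch that runs without any network access. The `scraped` list is a hypothetical stand-in for the values pulled from each page (the names are taken from the output further down); it is not part of the original script.

```python
# Hypothetical stand-in for the (name, USN) pairs scraped from each page.
scraped = [
    ("Agnello Fernandes A ", "1sp15me001"),
    ("Ajay Kumar V ", "1sp15me002"),
    ("Ajay Rajendiran ", "1sp15me003"),
]

# Reusing one dict: every pass assigns to the same two keys,
# so the previous record is replaced.
d = {}
for name, usn in scraped:
    d["Name"] = name
    d["USN"] = usn

# Appending a new dict per pass keeps every record.
res = []
for name, usn in scraped:
    res.append({"Name": name, "USN": usn})

print(d)         # only the last record is left
print(len(res))  # 3
```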

    Demo:

    import requests
    from bs4 import BeautifulSoup
    base_url="http://cbcs.fastvturesults.com/student/1sp15me00"
    res = []
    for page in range(1,10,1):
        r=requests.get(base_url+str(page))
        c=r.content
        soup=BeautifulSoup(c,"html.parser")
        items=soup.find(class_="text-muted")
        if items:
            res.append({"Name": items.previous_sibling, "USN": items.text.replace("(","").replace(")","")})
    print(res)
    

    Output:

    [{'USN': u'1sp15me001', 'Name': u'Agnello Fernandes A '}, {'USN': u'1sp15me002', 'Name': u'Ajay Kumar V '}, {'USN': u'1sp15me003', 'Name': u'Ajay Rajendiran '}, {'USN': u'1sp15me004', 'Name': u'Amit Singh Yadav '}, {'USN': u'1sp15me006', 'Name': u'Ankit Mahato '}, {'USN': u'1sp15me008', 'Name': u'Antony Levin Fernandez D '}, {'USN': u'1sp15me009', 'Name': u'Ashish S '}]
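
    If a single dictionary is really what you want, one possible variation (an assumption about the goal, not something the question states) is to key it by USN, since each student has a distinct one. The sketch below starts from a few records copied from the output above rather than hitting the site, and also strips the trailing space from each name.

```python
# A few records copied from the output above; in the real script
# they would come from the scraping loop.
res = [
    {"USN": "1sp15me001", "Name": "Agnello Fernandes A "},
    {"USN": "1sp15me002", "Name": "Ajay Kumar V "},
    {"USN": "1sp15me003", "Name": "Ajay Rajendiran "},
]

# One key per student, so no entry overwrites another.
by_usn = {row["USN"]: row["Name"].strip() for row in res}

print(by_usn["1sp15me002"])  # Ajay Kumar V
```

    This only works as long as the USN values are unique; a repeated USN would overwrite the earlier entry, just as in the original code.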
    
        2
  •   SIM    7 years ago

    Alternatively, staying closer to the way you started, you could end up with something like this:

    import requests
    from bs4 import BeautifulSoup
    
    base_url="http://cbcs.fastvturesults.com/student/1sp15me00{}"
    
    data = []
    
    for page in range(1,10,1):
        d = {}
        r = requests.get(base_url.format(page))
        soup = BeautifulSoup(r.content,"html.parser")
        items = soup.find(class_="text-muted")
        if items:
            d["Name"] = items.previous_sibling
            d["USN"] = items.text.replace("(","").replace(")","")
            data.append(d)
    
    print(data)
    

    Output:

    [{'Name': 'Agnello Fernandes A ', 'USN': '1sp15me001'}, {'Name': 'Ajay Kumar V ', 'USN': '1sp15me002'}, {'Name': 'Ajay Rajendiran ', 'USN': '1sp15me003'}, {'Name': 'Amit Singh Yadav ', 'USN': '1sp15me004'}, {'Name': 'Ankit Mahato ', 'USN': '1sp15me006'}, {'Name': 'Antony Levin Fernandez D ', 'USN': '1sp15me008'}, {'Name': 'Ashish S ', 'USN': '1sp15me009'}]
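
    Note that `d = {}` sits inside the loop in the code above. The sketch below, which uses two hypothetical `rows` taken from the output instead of live requests, shows what would likely happen if the dictionary were created once outside the loop: the list would hold several references to the same object.

```python
# Hypothetical stand-in for two scraped pages.
rows = [("Agnello Fernandes A ", "1sp15me001"), ("Ajay Kumar V ", "1sp15me002")]

# Dict created once: each append stores a reference to the same object,
# so later assignments change every element of the list.
shared = {}
aliased = []
for name, usn in rows:
    shared["Name"] = name
    shared["USN"] = usn
    aliased.append(shared)

# Dict created per iteration: each element is an independent record.
fresh = []
for name, usn in rows:
    d = {}
    d["Name"] = name
    d["USN"] = usn
    fresh.append(d)

print(aliased[0]["USN"])  # 1sp15me002
print(fresh[0]["USN"])    # 1sp15me001
```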