
Trying to select link elements based on a class list object - BeautifulSoup

  •  0
  • sayth  · Tech community  · 7 years ago

    I'm using Beautiful Soup 4.4 and Python 3.6.6. I have extracted all the links, but I can't print only the ones that contain

    'class': ['_self']

    Here is a complete link as taken from the list of links.

    {'href': 'https://www.racingnsw.com.au/news/latest-racing-news/highway-sixtysix-on-right-route/', 'class': ['_self'], 'target': '_self'}
    

    Although it looks like I'm following the BS4 documentation on attributes, I can't get the syntax right.

    import requests as req
    import json
    from bs4 import BeautifulSoup
    
    url = req.get(
        'https://www.racingnsw.com.au/media-news-premierships/latest-news/')
    
    data = url.content
    
    soup = BeautifulSoup(data, "html.parser")
    
    links = soup.find_all('a')
    
    for item in links:
        print(item['class']='self')
    
    1 reply  |  as of 7 years ago
  •  3
  •   Pruthvi Kumar    7 years ago

    import requests as req
    from bs4 import BeautifulSoup
    
    url = req.get(
        'https://www.racingnsw.com.au/media-news-premierships/latest-news/')
    
    data = url.content
    
    soup = BeautifulSoup(data, "html.parser")
    
    # Select every <a> tag whose class attribute contains the substring "_self"
    for items in soup.select('a[class*="_self"]'):
        print(items)
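
    For context on why the attempt in the question fails: `print(item['class']='self')` is a syntax error, because an assignment cannot appear inside a call, and `item['class']` raises a `KeyError` for any link that has no class at all. Note also that `[class*="_self"]` is a substring match, so it would also pick up a class such as `not_self`. The sketch below shows three ways to match the class exactly. It runs against a small inline HTML sample instead of the live page; the sample markup and the variable names are assumptions made for illustration only.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
html = """
<a href="https://www.racingnsw.com.au/news/latest-racing-news/highway-sixtysix-on-right-route/" class="_self" target="_self">Story</a>
<a href="/about/" class="nav-link">About</a>
<a href="/contact/">Contact</a>
"""

soup = BeautifulSoup(html, "html.parser")

# class is a multi-valued attribute, so Beautiful Soup stores it as a list.
# class_ matches a tag when any one of its classes equals the given string.
by_keyword = soup.find_all("a", class_="_self")

# CSS selector form: tag name followed by .classname.
by_selector = soup.select("a._self")

# Manual filter; .get() avoids a KeyError on links without a class.
by_loop = [a for a in soup.find_all("a") if "_self" in a.get("class", [])]

# Pull out just the URLs of the matching links.
hrefs = [a["href"] for a in by_keyword]
print(hrefs)
```

    Any of the three should work on the real page once `html` is replaced with `url.content` from the code above; `class_` and the `.classname` selector are usually the shortest to read.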