How do I loop over a list of URLs and print the BeautifulSoup output for each one?

  • austin0896 · Tech Community · 4 years ago

    If it worked correctly, the output would be:

    Greg Oden C  #20
    Born: Jan 22, 1988 (33 years old)
    Birthplace/Hometown: Buffalo, New York
    Nationality: United States
    Height: 7-0 (213cm)     Weight: 273 (124kg)
    Website: http://www.gregoden52.com/
    Current NBA Status: Unrestricted Free Agent
    Agent: Bill Duffy
    Draft Entry: 2007 NBA Draft
    Early Entry Info: 2007 Early Entrant
    Drafted: Round 1, Pick 1, Portland Trail Blazers
    Pre-Draft Team: Ohio State (Fr)
    High School: Lawrence North High School [Indianapolis, Indiana]
    AAU Team: Spiece Indy Heat
    
    Carl Landry F
    Current Team: N/A
    Born: Sep 19, 1983 (37 years old)
    Birthplace/Hometown: Milwaukee, Wisconsin  
    Nationality: United States
    Height: 6-9 (206cm)     Weight: 248 (112kg)
    Hand: Right
    Website: https://carllandry.com/
    @CarlLandry
    Current NBA Status: Unrestricted Free Agent
    Agent: Mark Bartelstein, Reggie Brown
    Draft Entry: 2007 NBA Draft
    Drafted: Round 2, Pick 1, Seattle SuperSonics
    Draft Rights Trade: SEA to HOU, Jun 28, 2007
    Pre-Draft Team: Purdue (Sr)
    High School: Vincent High School [Milwaukee, Wisconsin]
    

    import csv
    import re

    import requests
    from bs4 import BeautifulSoup
    
    url_list = ['https://basketball.realgm.com/player/player/Summary/1',
                'https://basketball.realgm.com/player/player/Summary/2']
    
    for url in url_list:
        r = requests.get(url)
        soup = BeautifulSoup(r.text, 'html.parser')
    
    player = soup.find_all('div', class_= 'wrapper clearfix container')[0]
    
    
    playerprofile = re.sub(r'\n\s*\n', r'\n', player.get_text().strip(), flags=re.M)
    
    print(playerprofile)
    
    1 answer
  • Dharman Aman Gojariya · 4 years ago
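    import csv
    import re

    import requests
    from bs4 import BeautifulSoup

    url_list = ['https://basketball.realgm.com/player/player/Summary/1',
                'https://basketball.realgm.com/player/player/Summary/2']

    for url in url_list:
        # Fetch and parse one player page per iteration.
        r = requests.get(url)
        soup = BeautifulSoup(r.text, 'html.parser')

        # Take the first div whose class attribute is exactly
        # 'wrapper clearfix container'; it holds the profile text.
        player = soup.find_all('div', class_='wrapper clearfix container')[0]

        # Collapse runs of blank lines in the extracted text.
        playerprofile = re.sub(
            r'\n\s*\n', r'\n', player.get_text().strip(), flags=re.M)

        # Print each profile followed by a blank line.
        print(playerprofile + "\n")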
    

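    This code produces the output you want. In your code, the parsing and printing of the player happen after the loop has finished, so by then `soup` holds only the last page fetched and only that player is printed. Those steps need to run on every iteration, so indent them into the loop body.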
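
    To see the effect of the indentation without hitting the network, here is a minimal offline sketch. It is not the code from the question: `FAKE_PAGES`, the example.com URLs, and the helper names `clean`, `last_only`, and `every_page` are hypothetical stand-ins invented for illustration, and the page text is shortened. Only the `re.sub` call is taken from the code above.

```python
import re

# Hypothetical stand-in for the downloaded pages, so the sketch runs offline.
FAKE_PAGES = {
    "https://example.com/player/1": "Greg Oden C\n\n\nBorn: Jan 22, 1988\n",
    "https://example.com/player/2": "Carl Landry F\n\n   \nBorn: Sep 19, 1983\n",
}


def clean(text):
    """Collapse runs of blank lines, using the same re.sub as the answer."""
    return re.sub(r'\n\s*\n', r'\n', text.strip(), flags=re.M)


def last_only(urls):
    """Mimics the question: the processing sits after the loop."""
    for url in urls:
        text = FAKE_PAGES[url]  # overwritten on every iteration
    # Runs once, after the loop, so it only sees the final page.
    return [clean(text)]


def every_page(urls):
    """Mimics the answer: the processing sits inside the loop."""
    results = []
    for url in urls:
        text = FAKE_PAGES[url]
        results.append(clean(text))  # runs once per URL
    return results


if __name__ == "__main__":
    urls = list(FAKE_PAGES)
    print(last_only(urls))   # one profile: the last URL only
    print(every_page(urls))  # one profile per URL
```

    With two URLs, `last_only` returns a single profile and `every_page` returns two, which matches the difference between the question's code and the answer's.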