
AttributeError in web scraping

  • Reonard1  ·  1 year ago
    import requests
    from bs4 import BeautifulSoup
    import csv
    
    url = "https://"
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')
    
    data = []
    for tag in soup.find_all(['div', 'table']):
        if tag.name == 'div' and len(tag.find_all('table')) > 0:
            for subtag in tag.find_all('table'):
                data.append(subtag.text.strip())
    
    with open('output.csv', 'w') as file:
        writer = csv.writer(file)
        for item in data:
            writer.writerow([item])
            writer.writerow([item])
            writer.writerow([item])
    
    try:
        with open('output.txt', 'r') as file:
            contents = file.read()
            print(contents)
    except FileNotFoundError:
        print("File not found")
    except Exception as e:
        print(f"An error occurred: {e}")
    

    The code tries to access the name attribute of every tag, but it raises an error when it encounters a NavigableString.

    1 answer

  •   Karol    1 year ago

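    The error occurs because the tag variable is a NavigableString object, a special type of BeautifulSoup object that represents a text string in the HTML document. Objects of this type do not have a name attribute, which is why the error is raised. To fix this, check that the tag variable is an actual HTML tag before trying to access its name attribute. You can do this with an if statement: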

    import csv

    import requests
    from bs4 import BeautifulSoup

    url = "https://www.example.com"
    response = requests.get(url)
    soup = BeautifulSoup(response.content, 'html.parser')

    data = []
    for tag in soup.find_all(['div', 'table']):
        # NavigableString subclasses str, so this skips text nodes
        if isinstance(tag, str):
            continue
        if tag.name == 'table':
            # Collect the text of each table row
            for subtag in tag.find_all('tr'):
                data.append(subtag.text.strip())

    # newline='' avoids blank lines between rows on Windows
    with open('output.csv', 'w', newline='') as file:
        writer = csv.writer(file)
        for item in data:
            writer.writerow([item])  # one CSV row per table row

    try:
        # Read back the file that was just written
        with open('output.csv', 'r') as file:
            contents = file.read()
            print(contents)
    except FileNotFoundError:
        print("File not found")
    except Exception as e:
        print(f"An error occurred: {e}")
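
    A complementary sketch, not part of the original answer: find_all() called with tag names returns only Tag objects, so text nodes usually show up when walking .children, .contents, or .descendants. The example below assumes a small inline HTML snippet instead of a real page, and the helper name table_rows is hypothetical. It checks isinstance(child, Tag) before using any tag-only attribute or method.

```python
from bs4 import BeautifulSoup, Tag

# Inline HTML used only for illustration (an assumption, not from the question).
html = (
    "<div>intro text"
    "<table><tr><td>a</td><td>b</td></tr><tr><td>c</td><td>d</td></tr></table>"
    "trailing text</div>"
)

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div")

# Direct children of the div mix text nodes and tags.
kinds = [type(child).__name__ for child in div.children]


def table_rows(parent):
    """Return the cell texts of every table row found under parent's direct children."""
    rows = []
    for child in parent.children:
        # Skip NavigableString (and other non-Tag nodes) before using Tag-only APIs.
        if not isinstance(child, Tag):
            continue
        if child.name == "table":
            for tr in child.find_all("tr"):
                cells = tr.find_all(["td", "th"])
                rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


rows = table_rows(div)
print(kinds)
print(rows)
```

    Checking against Tag rather than str keeps the intent explicit: only nodes that really are tags reach the name comparison and the find_all() call.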
    