
How do I read an HTML table into a DataFrame (urllib.error.URLError: <urlopen error unknown url type: https>)?

  • ebrahimi  · Tech Community  · 6 years ago

    I would appreciate it if someone could tell me how to convert an HTML table into a DataFrame.

    import pandas as pd
    df = pd.read_html('https://www.iasplus.com/en/resources/ifrs-topics/use-of-ifrs', header = None)
    

    Error:

    C:\Users\t\Anaconda3\python.exe C:/Users/t/Downloads/hyperopt12.py
    Traceback (most recent call last):
      File "C:/Users/t/Downloads/hyperopt12.py", line 12, in <module>
        df = pd.read_html('https://www.iasplus.com/en/resources/ifrs-topics/use-of-ifrs', header = None)
      File "C:\Users\t\Anaconda3\lib\site-packages\pandas\io\html.py", line 1094, in read_html
        displayed_only=displayed_only)
      File "C:\Users\t\Anaconda3\lib\site-packages\pandas\io\html.py", line 916, in _parse
        raise_with_traceback(retained)
      File "C:\Users\t\Anaconda3\lib\site-packages\pandas\compat\__init__.py", line 420, in raise_with_traceback
        raise exc.with_traceback(traceback)
    urllib.error.URLError: <urlopen error unknown url type: https>
    

    1 answer  |  as of 6 years ago
  •   run-out    6 years ago

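    You need to pick out the right table on the page. read_html returns a list of DataFrame objects, one per table it finds, so you have to index into that list. See the pandas documentation for read_html.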

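    import pandas as pd

    # read_html returns a list with one DataFrame per <table> found on the page
    tables = pd.read_html('https://www.iasplus.com/en/resources/ifrs-topics/use-of-ifrs', header=None)
    # The table of interest was at index 2 when this answer was written
    df = tables[2]
    print(df.head())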
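
The snippet above does not address the URLError itself. "unknown url type: https" usually means that this Python interpreter could not import its ssl module, so urllib never registered an HTTPS handler. With Anaconda on Windows, one frequently cited cause is launching python.exe directly without activating the conda environment, which may leave the OpenSSL DLLs off PATH; that is an assumption about this setup, not something the traceback confirms. A minimal diagnostic sketch, where the helper name https_supported is made up for illustration:

```python
import urllib.request


def https_supported() -> bool:
    """Return True if this interpreter's urllib can open https:// URLs.

    Hypothetical helper for illustration; it is not part of pandas or urllib.
    """
    try:
        # urllib only defines HTTPSHandler when the ssl module imports cleanly.
        import ssl  # noqa: F401
    except ImportError:
        return False
    return hasattr(urllib.request, "HTTPSHandler")


if __name__ == "__main__":
    if https_supported():
        print("HTTPS is available; the URLError likely has another cause.")
    else:
        print("ssl could not be imported; check the Python/OpenSSL installation.")
```

If this reports that ssl is missing, running the script from an activated conda environment (for example from the Anaconda Prompt) or repairing the installation is worth trying before changing the pandas call.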