
csv.reader returns an "OrderedDict" value in the field names

  •  2
  • Dalton Cézane  · Tech community  · 7 years ago

    I am writing a script that reads some values from a .csv file and then writes them to another .csv file.

    import csv
    import json
    
    header = ["Title", "Authors", "Year", "Abstract", "Keywords"]
    
    fields_number = int(input("Enter the number of fields you want to get: "))
    
    field_names = list()
    field_values = list()
    for i in range(0, fields_number):
        field_name = input("Enter the field name: ")
        field_names.append(field_name)
    
    try:
        with open(filename) as csvfile:
            rowsreader = csv.DictReader(csvfile)
            for row in rowsreader:
                print(row)
                json_row = '{'
                for i in range(0, len(field_names)):
                    field = field_names[i]
                    json_row += '"{}":"{}"'.format(header[i], row[field])
                    json_row += ',' if (i < len(field_names) - 1) else '}'
                field_values.append(json.loads(json_row))
    except IOError:
        print("Could not open csv file: {}.".format(filename))
    

    I get the following output:

     Traceback (most recent call last):
      File "slr_helper.py", line 58, in <module>
        main()
      File "slr_helper.py", line 37, in main
        json_row += '"{}":"{}"'.format(header[i], row[field])
    KeyError: 'Authors'
    
    The .csv file starts like this:

    Authors,Author Ids,Title,Year,Source title,Volume,Issue,Art. No.,Page start,Page end,Page count,Cited by,DOI,Link,Abstract,Author Keywords,Index Keywords,Sponsors,Publisher,Conference name,Conference date,Conference location,Conference code,Document Type,Access Type,Source,EID
    "AlHogail A., AlShahrani M.","51060982200;57202888364;","Building consumer trust to improve Internet of Things (IoT) technology adoption",2019,
    

    However, when reading the csv file, the code prints the following:

    OrderedDict([('\ufeffAuthors', 'AlHogail A., AlShahrani M.'), ('Author Ids', '51060982200;57202888364;'),...
    

    I would like to know how to avoid this OrderedDict([('\ufeff prefix, since it is what causes the error I get.

    1 answer  |  7 years ago
  •  4
  •   Sasha Tsukanov    7 years ago

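    As juanpa.arrivillaga pointed out, \ufeff is the byte order mark (BOM). It sits at the very beginning of the file, which is permitted for UTF-8. Because the BOM is read as part of the first header, that column ends up named '\ufeffAuthors' rather than 'Authors', hence the KeyError.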

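    By default, Python 3 opens files with the platform's preferred encoding, which is evidently UTF-8 here. The 'utf-8' codec does not treat the BOM differently from any other code point and reads it as part of the text content. To change this, specify the encoding as 'utf-8-sig':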

    with open(filename, encoding='utf-8-sig') as csvfile:
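
    To see the difference in isolation, here is a minimal sketch that reads the same file with both encodings. The sample file contents and the helper name read_header are made up for illustration; only the two-column header mirrors the question.

```python
import csv
import os
import tempfile


def read_header(path, encoding):
    """Return the field names csv.DictReader sees for the given encoding."""
    with open(path, encoding=encoding, newline="") as csvfile:
        return csv.DictReader(csvfile).fieldnames


# Hypothetical sample file: a UTF-8 BOM followed by a tiny two-column CSV.
sample = b"\xef\xbb\xbfAuthors,Title\r\nAlHogail A.,Building consumer trust\r\n"
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

plain_header = read_header(sample_path, "utf-8")    # BOM kept as text
sig_header = read_header(sample_path, "utf-8-sig")  # BOM stripped
os.remove(sample_path)

print(plain_header)  # ['\ufeffAuthors', 'Title']
print(sig_header)    # ['Authors', 'Title']
```

    With 'utf-8-sig' the first key is plain 'Authors', so row['Authors'] no longer raises a KeyError. This codec also reads files that have no BOM, so it should be safe to use either way.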
    

    By the way, if you are on Linux, you can use file ${filename} to check whether the file has a BOM.
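
    For a check that does not depend on the file command, the first bytes of the file can be compared against the UTF-8 BOM from Python. This is only a sketch: the helper names starts_with_utf8_bom and write_temp and the temporary sample files are hypothetical, and only the UTF-8 BOM is checked, not the UTF-16 or UTF-32 ones.

```python
import codecs
import os
import tempfile


def starts_with_utf8_bom(path):
    """Return True if the file's first three bytes are the UTF-8 BOM."""
    with open(path, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def write_temp(data):
    """Write bytes to a temporary .csv file and return its path."""
    with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
        tmp.write(data)
        return tmp.name


# Two hypothetical sample files, one with a BOM and one without.
path_with_bom = write_temp(codecs.BOM_UTF8 + b"Authors,Title\n")
path_without_bom = write_temp(b"Authors,Title\n")

has_bom = starts_with_utf8_bom(path_with_bom)
has_no_bom = starts_with_utf8_bom(path_without_bom)

os.remove(path_with_bom)
os.remove(path_without_bom)

print(has_bom)     # True
print(has_no_bom)  # False
```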