
pd.read_csv parsing error

  • Paul Reiners  · 7 years ago

    I have a CSV file that looks like this:

    Date,Time,Mood,Tags,Medications,Notes
    "Jul 25, 2018",9:41 PM,8,,,"",
    "Jul 26, 2018",10:05 AM,4,,,"",
    "Jul 26, 2018",12:00 PM,3,,,"",
    "Jul 26, 2018",7:00 PM,8,,,"",
    "Jul 27, 2018",12:01 PM,8,,,"",
    

    I run the following code:

    import pandas as pd
    
    df = pd.read_csv("./data/MoodLog_2018_09_14.csv", 
                     dtype={'Date': str, 'Time': str, 'Mood': str, 'Tags': str, 
                            'Medications': str, 'Notes': str})
    
    print(df['Time'].head(5))
    

    and it prints the following:

    Jul 25, 2018    8
    Jul 26, 2018    4
    Jul 26, 2018    3
    Jul 26, 2018    8
    Jul 27, 2018    8
    Name: Time, dtype: object
    

    The Time column contains the values from the Mood column.

    Why?

    1 Answer

  • ALollz  · 7 years ago

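    Every data row ends with a trailing `,` but the header does not, so each row has one more field than there are column names. In that case pandas uses the first field of every row as the index and shifts the header names one field to the right, which is why `Time` shows the `Mood` values. Change the header to `Date,Time,Mood,Tags,Medications,Notes,` and you will get an extra column, which you can then drop.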
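    To see the mismatch without touching the file, here is a minimal sketch that rebuilds the first two rows of the question's data in memory. The variable name `raw` and the use of `io.StringIO` are choices made for this illustration; they are not part of the original post.

```python
import io
import pandas as pd

# Same shape as the question's file: 6 header names, 7 fields per data row
# (the trailing comma adds an empty seventh field).
raw = (
    'Date,Time,Mood,Tags,Medications,Notes\n'
    '"Jul 25, 2018",9:41 PM,8,,,"",\n'
    '"Jul 26, 2018",10:05 AM,4,,,"",\n'
)

df = pd.read_csv(io.StringIO(raw), dtype=str)

# With one more field per row than header names, pandas takes the first
# field as the index, so every header name lands one field to the right.
print(df.index.tolist())    # the dates end up in the index
print(df["Date"].tolist())  # holds the time strings
print(df["Time"].tolist())  # holds the mood values
```

    The dates move into the index, `Date` holds the times, and `Time` holds the mood values, which matches the output in the question.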

    Input: test.csv

    Date,Time,Mood,Tags,Medications,Notes,
    "Jul 25, 2018",9:41 PM,8,,,"",
    "Jul 26, 2018",10:05 AM,4,,,"",
    "Jul 26, 2018",12:00 PM,3,,,"",
    "Jul 26, 2018",7:00 PM,8,,,"",
    "Jul 27, 2018",12:01 PM,8,,,"",
    

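    import pandas as pd

    # The trailing comma produces an extra unnamed last column; .iloc[:, :-1] drops it.
    df = pd.read_csv("test.csv", 
                     dtype={'Date': str, 'Time': str, 'Mood': str, 'Tags': str, 
                            'Medications': str, 'Notes': str}).iloc[:, :-1]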
    

    df

               Date      Time Mood Tags Medications Notes
    0  Jul 25, 2018   9:41 PM    8  NaN         NaN   NaN
    1  Jul 26, 2018  10:05 AM    4  NaN         NaN   NaN
    2  Jul 26, 2018  12:00 PM    3  NaN         NaN   NaN
    3  Jul 26, 2018   7:00 PM    8  NaN         NaN   NaN
    4  Jul 27, 2018  12:01 PM    8  NaN         NaN   NaN
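
    If editing the header is not an option, one alternative (not part of the answer above) is to pass `index_col=False`, which the pandas documentation describes as a way to handle files with a delimiter at the end of each line. The sketch below assumes the same two sample rows, held in a hypothetical in-memory string `raw`, instead of the real file.

```python
import io
import pandas as pd

# Original, unmodified header: 6 names, while each row has 7 fields.
raw = (
    'Date,Time,Mood,Tags,Medications,Notes\n'
    '"Jul 25, 2018",9:41 PM,8,,,"",\n'
    '"Jul 26, 2018",10:05 AM,4,,,"",\n'
)

# index_col=False tells pandas not to use the first column as the index,
# so the six header names line up with the first six fields of each row
# and the empty trailing field is discarded.
df = pd.read_csv(io.StringIO(raw), dtype=str, index_col=False)

print(df.columns.tolist())
print(df["Time"].tolist())
```

    With this option `Time` contains the time strings and no `.iloc` slicing is needed, since the trailing empty field is discarded at read time.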