
Python 3: How do I print groupby.last()?

  •  0
  • AG1  · Tech Community  · 6 years ago
    $ cat n2.txt
    apn,date
    3704-156,11/04/2019
    3704-156,11/22/2019
    5515-004,10/23/2019
    3732-231,10/07/2019
    3732-231,11/15/2019
    
    $ python3
    Python 3.7.5 (default, Oct 25 2019, 10:52:18) 
    [Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import pandas as pd 
    >>> df = pd.read_csv("n2.txt")
    >>> df
            apn        date
    0  3704-156  11/04/2019
    1  3704-156  11/22/2019
    2  5515-004  10/23/2019
    3  3732-231  10/07/2019
    4  3732-231  11/15/2019
    >>> g = df.groupby('apn')
    >>> g.last()
                    date
    apn                 
    3704-156  11/22/2019
    3732-231  11/15/2019
    5515-004  10/23/2019
    >>> f = g.last()
    
    >>> for r in f.itertuples(index=True, name='Pandas'):
    ...     print(getattr(r,'apn'), getattr(r,'date'))
    ... 
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    AttributeError: 'Pandas' object has no attribute 'apn'
    
    >>> for r in f.itertuples(index=True, name='Pandas'):
    ...     print(getattr(r,"apn"), getattr(r,"date"))
    ... 
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    AttributeError: 'Pandas' object has no attribute 'apn'
    
    

    How do I correctly print this to a file?

    apn, date
    3704-156,11/22/2019
    3732-231,11/15/2019
    5515-004,10/23/2019
    
        1
  •  1
  •   Travis    6 years ago
    df = pd.read_csv("n2.txt")
    g = df.groupby('apn').last()
    print(g.to_csv())
    

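    If you call g.to_csv() without a path, it returns the CSV as a string such as 'apn,date\r\n...' (the exact line terminator depends on the platform). The print function starts a new line at each line terminator, so the final output is what you want.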
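
    To make the point above concrete, here is a minimal sketch. It assumes the same rows as n2.txt, inlined through io.StringIO so the example is self-contained; the variable names raw and text are illustrative only.

```python
import io

import pandas as pd

# Same rows as n2.txt in the question, inlined for a self-contained example.
raw = """apn,date
3704-156,11/04/2019
3704-156,11/22/2019
5515-004,10/23/2019
3732-231,10/07/2019
3732-231,11/15/2019
"""

df = pd.read_csv(io.StringIO(raw))
g = df.groupby("apn").last()

# Called without a path, to_csv() returns the CSV text as a str.
text = g.to_csv()

# print() renders each line terminator as a line break; end="" avoids
# adding a second newline after the one to_csv() already emitted.
print(text, end="")
```

    Passing a path instead, for example g.to_csv("out.csv") with a file name of your choice, writes the same text to disk rather than returning it.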

        2
  •  0
  •   jezrael    6 years ago

    Your code should be changed to:

    df = pd.read_csv("n2.txt")
    g = df.groupby('apn')
    f = g.last()
    

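    Use DataFrame.to_csv. Here f is a pandas DataFrame with a single date column (not a Series), and its apn index is written as the first column: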

    f.to_csv(file)
    

    Or convert the index to a regular column with reset_index, then call DataFrame.to_csv with index=False:

    f.reset_index().to_csv(file, index=False)
    

    Or use a solution based on DataFrame.drop_duplicates:

    df = pd.read_csv("n2.txt")
    df = df.drop_duplicates('apn', keep='last')
    df.to_csv(file, index=False)
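
    The groupby and drop_duplicates routes select the same rows for this data but are not identical in general. The sketch below compares them on the sample from the question (inlined via io.StringIO; the names by_group and by_dedup are illustrative only). As far as I know, groupby sorts the result by key by default, while drop_duplicates keeps the original row order.

```python
import io

import pandas as pd

# Same rows as n2.txt in the question, inlined for a self-contained example.
raw = """apn,date
3704-156,11/04/2019
3704-156,11/22/2019
5515-004,10/23/2019
3732-231,10/07/2019
3732-231,11/15/2019
"""
df = pd.read_csv(io.StringIO(raw))

# Route 1: last row per group, with the apn index turned back into a column.
by_group = df.groupby("apn").last().reset_index()

# Route 2: keep only the last occurrence of each apn.
by_dedup = df.drop_duplicates("apn", keep="last")

# Same rows, different order: by_group is sorted by apn, by_dedup is not.
print(by_group.to_csv(index=False), end="")
print(by_dedup.to_csv(index=False), end="")
```

    One further difference to keep in mind: GroupBy.last returns the last non-null value in each column, whereas drop_duplicates keeps the last row as it is, missing values included. With complete data like this sample the two agree.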
    

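    In your own loop, use Index to read the index value, because itertuples exposes the index of f under the attribute Index rather than apn: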

    for r in f.itertuples(index=True, name='Pandas'):
        print(getattr(r,'Index'), getattr(r,'date'))
    3704-156 11/22/2019
    3732-231 11/15/2019
    5515-004 10/23/2019
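
    If you prefer to keep the explicit loop and still write to a file, print accepts a file argument. This is only a sketch: the file name last.csv and the temporary directory are assumptions made so the example cleans up after itself, and the data is the sample from the question inlined via io.StringIO.

```python
import io
import os
import tempfile

import pandas as pd

# Same rows as n2.txt in the question, inlined for a self-contained example.
raw = """apn,date
3704-156,11/04/2019
3704-156,11/22/2019
5515-004,10/23/2019
3732-231,10/07/2019
3732-231,11/15/2019
"""
f = pd.read_csv(io.StringIO(raw)).groupby("apn").last()

# A temporary directory stands in for wherever the real output should go.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "last.csv")  # hypothetical file name
    with open(path, "w") as out:
        print("apn,date", file=out)
        for r in f.itertuples(index=True, name="Pandas"):
            # r.Index holds the apn value; r.date holds the column value.
            print(r.Index, r.date, sep=",", file=out)

    # Read the file back to check what was written.
    with open(path) as src:
        lines = src.read().splitlines()

print(lines)
```

    For anything beyond a quick script, f.to_csv(file) is usually the simpler choice, since it also takes care of quoting values that contain commas.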