
Pandas transform('unique'): output as a comma-separated string instead of a list

  •  2
  • It_is_Chris  · 6 years ago

    I have a DataFrame that looks like this:

    df = pd.DataFrame({'ID':[1,1,2,2,3,4],'Name':['John Doe','Jane Doe','John Smith','Jane Smith','Jack Hill','Jill Hill']})
    
        ID  Name
    0   1   John Doe
    1   1   Jane Doe
    2   2   John Smith
    3   2   Jane Smith
    4   3   Jack Hill
    5   4   Jill Hill
    

    I then added another column that groups by ID and collects the unique values of Name:

    df['Multi Name'] = df.groupby('ID')['Name'].transform('unique')
    
        ID  Name    Multi Name
    0   1   John Doe    [John Doe, Jane Doe]
    1   1   Jane Doe    [John Doe, Jane Doe]
    2   2   John Smith  [John Smith, Jane Smith]
    3   2   Jane Smith  [John Smith, Jane Smith]
    4   3   Jack Hill   [Jack Hill]
    5   4   Jill Hill   [Jill Hill]
    

    How can I remove the brackets from Multi Name?

    I tried:

    df['Multi Name'] = df['Multi Name'].str.strip('[]')
    
    
        ID  Name    Multi Name
    0   1   John Doe    NaN
    1   1   Jane Doe    NaN
    2   2   John Smith  NaN
    3   2   Jane Smith  NaN
    4   3   Jack Hill   NaN
    5   4   Jill Hill   NaN
    

    Desired output:

        ID  Name    Multi Name
    0   1   John Doe    John Doe, Jane Doe
    1   1   Jane Doe    John Doe, Jane Doe
    2   2   John Smith  John Smith, Jane Smith
    3   2   Jane Smith  John Smith, Jane Smith
    4   3   Jack Hill   Jack Hill
    5   4   Jill Hill   Jill Hill
    
    3 answers  |  as of 6 years ago
        1
  •  5
  •   piRSquared    6 years ago

    transform

    df.join(df.groupby('ID').Name.transform('unique').rename('Multi Name'))
    
       ID        Name                Multi Name
    0   1    John Doe      [John Doe, Jane Doe]
    1   1    Jane Doe      [John Doe, Jane Doe]
    2   2  John Smith  [John Smith, Jane Smith]
    3   2  Jane Smith  [John Smith, Jane Smith]
    4   3   Jack Hill               [Jack Hill]
    5   4   Jill Hill               [Jill Hill]
    

    df.join(df.groupby('ID').Name.transform('unique').str.join(', ').rename('Multi Name'))
    
       ID        Name              Multi Name
    0   1    John Doe      John Doe, Jane Doe
    1   1    Jane Doe      John Doe, Jane Doe
    2   2  John Smith  John Smith, Jane Smith
    3   2  Jane Smith  John Smith, Jane Smith
    4   3   Jack Hill               Jack Hill
    5   4   Jill Hill               Jill Hill
    

    map

    df.join(df.ID.map(df.groupby('ID').Name.unique().str.join(', ')).rename('Multi Name'))
    
       ID        Name              Multi Name
    0   1    John Doe      John Doe, Jane Doe
    1   1    Jane Doe      John Doe, Jane Doe
    2   2  John Smith  John Smith, Jane Smith
    3   2  Jane Smith  John Smith, Jane Smith
    4   3   Jack Hill               Jack Hill
    5   4   Jill Hill               Jill Hill
    

    itertools.groupby

    from itertools import groupby
    
    d = {
        k: ', '.join(x[1] for x in v)
        for k, v in groupby(sorted(set(zip(df.ID, df.Name))), key=lambda x: x[0])
    }
    
    df.join(df.ID.map(d).rename('Multi Name'))
    
       ID        Name              Multi Name
    0   1    John Doe      Jane Doe, John Doe
    1   1    Jane Doe      Jane Doe, John Doe
    2   2  John Smith  Jane Smith, John Smith
    3   2  Jane Smith  Jane Smith, John Smith
    4   3   Jack Hill               Jack Hill
    5   4   Jill Hill               Jill Hill
    
        2
  •  5
  •   cs95 abhishek58g    6 years ago

    It looks like unique is the wrong choice of function here. I would suggest using str.join instead:

    df['Multi Name'] = df.groupby('ID')['Name'].transform(lambda x: ', '.join(set(x)))
    

    df
       ID        Name              Multi Name
    0   1    John Doe      John Doe, Jane Doe
    1   1    Jane Doe      John Doe, Jane Doe
    2   2  John Smith  Jane Smith, John Smith
    3   2  Jane Smith  Jane Smith, John Smith
    4   3   Jack Hill               Jack Hill
    5   4   Jill Hill               Jill Hill
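
Because a set has no guaranteed iteration order, the names within a group can come out in either order, as the output above shows. If first-seen order matters, one possible variation (a sketch, not part of the original answer) is to deduplicate with dict.fromkeys, which keeps insertion order. The frame below is rebuilt from the question so the snippet runs on its own.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 1, 2, 2, 3, 4],
    'Name': ['John Doe', 'Jane Doe', 'John Smith',
             'Jane Smith', 'Jack Hill', 'Jill Hill'],
})

# dict.fromkeys drops duplicates but keeps the first-seen order of each name,
# so the joined string is deterministic (unlike joining a set).
df['Multi Name'] = (
    df.groupby('ID')['Name']
      .transform(lambda names: ', '.join(dict.fromkeys(names)))
)

print(df)
```

With the sample data, ID 1 gives "John Doe, Jane Doe" and ID 2 gives "John Smith, Jane Smith", matching the desired output in the question.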
    
        3
  •  3
  •   Scott Boston    6 years ago

    Use map with join:

    df['Multi Name'] = df.groupby('ID')['Name'].transform('unique').map(', '.join)
    

    Output:

       ID        Name              Multi Name
    0   1    John Doe      John Doe, Jane Doe
    1   1    Jane Doe      John Doe, Jane Doe
    2   2  John Smith  John Smith, Jane Smith
    3   2  Jane Smith  John Smith, Jane Smith
    4   3   Jack Hill               Jack Hill
    5   4   Jill Hill               Jill Hill
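
For context on why the str.strip('[]') attempt in the question produced NaN while joining works: the brackets are only the printed representation of an array of names, not characters in a string, and the .str methods return NaN for elements that are not strings. The sketch below illustrates this under one assumption of its own: it builds the per-ID arrays with groupby(...).unique() and maps them back by ID, rather than relying on transform('unique'), which may not be accepted by every pandas version.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 1, 2, 2, 3, 4],
    'Name': ['John Doe', 'Jane Doe', 'John Smith',
             'Jane Smith', 'Jack Hill', 'Jill Hill'],
})

# One array of unique names per ID (a Series indexed by ID).
uniques = df.groupby('ID')['Name'].unique()

# The elements are arrays, not strings, so there are no '[' or ']'
# characters to strip; .str.strip yields NaN for every element.
stripped = uniques.str.strip('[]')
print(stripped.isna().all())

# Joining each array produces real strings; map them back onto rows by ID.
df['Multi Name'] = df['ID'].map(uniques.map(', '.join))
print(df)
```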