
How to move values within a column based on a condition in a dataframe

  •  2
  • Pyd  · Tech Community  · 7 years ago

    Hi, I have a df like this:

        Name    sl no   details                 score
    0   Ram     1       ram is going to ooty    NaN
    1   Ram     2       ram sings well          1.5
    2   Ravi    1       ravi play cricket       1.0
    3   Ravi    2       ravi is in chennai      NaN
    4   Kumar   1       kumar passed the exam   NaN
    5   Kumar   2       kumar is in town        NaN
    6   Kumar   3       he left                 3.0
    

    I am trying to move the values in the score column. Within each Name, the value should move to the row where df['sl no'] == 1 (that is, the first occurrence of that Name).

    My expected output is:

        Name    sl no   details                 score
    0   Ram     1       ram is going to ooty    1.5
    1   Ram     2       ram sings well          NaN
    2   Ravi    1       ravi play cricket       1.0
    3   Ravi    2       ravi is in chennai      NaN
    4   Kumar   1       kumar passed the exam   3.0
    5   Kumar   2       kumar is in town        NaN
    6   Kumar   3       he left                 NaN
    

    Please help.

    2 answers  |  as of 7 years ago
        1
  •  1
  •   cs95 abhishek58g    7 years ago

    next in a list comprehension

    Conditionally call next on an iterator inside a list comprehension.

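    # Sanity check: one non-null score for every row where 'sl no' == 1.
    assert df['sl no'].eq(1).sum() == df['score'].notna().sum()
    
    # Hand out the non-null scores, in order, to the 'sl no' == 1 rows.
    it = iter(df['score'].dropna().tolist())
    df['score'] = [
        next(it) if i else np.nan for i in df['sl no'].eq(1)
    ]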
    

    df
        Name  sl no                details  score
    0    Ram      1   ram is going to ooty    1.5
    1    Ram      2         ram sings well    NaN
    2   Ravi      1      ravi play cricket    1.0
    3   Ravi      2     ravi is in chennai    NaN
    4  Kumar      1  kumar passed the exam    3.0
    5  Kumar      2       kumar is in town    NaN
    6  Kumar      3                he left    NaN
    

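    If the assert fails, the number of non-null scores does not match the number of rows where sl no is 1, so your data cannot be rearranged this way.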
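
    The iterator approach above can be tried end to end with the sketch below. The sample frame is rebuilt from the question, and the helper name move_scores_to_first_row is hypothetical. Note that scores are matched to the sl no == 1 rows purely by order, so this assumes every Name contributes exactly one non-null score.

```python
import numpy as np
import pandas as pd


def move_scores_to_first_row(df):
    """Return a copy with the non-null scores placed, in order, on 'sl no' == 1 rows."""
    is_first = df['sl no'].eq(1)
    scores = df['score'].dropna().tolist()
    # The technique only works with exactly one score per 'sl no' == 1 row.
    if is_first.sum() != len(scores):
        raise ValueError("number of scores must equal number of 'sl no' == 1 rows")
    it = iter(scores)
    out = df.copy()
    # Consume one score for each first row; every other row becomes NaN.
    out['score'] = [next(it) if first else np.nan for first in is_first]
    return out


# Sample data rebuilt from the question.
df = pd.DataFrame({
    'Name': ['Ram', 'Ram', 'Ravi', 'Ravi', 'Kumar', 'Kumar', 'Kumar'],
    'sl no': [1, 2, 1, 2, 1, 2, 3],
    'details': ['ram is going to ooty', 'ram sings well', 'ravi play cricket',
                'ravi is in chennai', 'kumar passed the exam',
                'kumar is in town', 'he left'],
    'score': [np.nan, 1.5, 1.0, np.nan, np.nan, np.nan, 3.0],
})

result = move_scores_to_first_row(df)
print(result)
```

    Raising a ValueError instead of using assert keeps the check active even when Python runs with assertions disabled.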


    loc-based assignment

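    # Collect the non-null scores in their current order.
    v = df['score'].dropna().tolist()
    
    # Blank the column, then write the scores onto the 'sl no' == 1 rows.
    df['score'] = np.nan
    df.loc[df['sl no'].eq(1), 'score'] = v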
    

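    df
        Name  sl no                details  score
    0    Ram      1   ram is going to ooty    1.5
    1    Ram      2         ram sings well    NaN
    2   Ravi      1      ravi play cricket    1.0
    3   Ravi      2     ravi is in chennai    NaN
    4  Kumar      1  kumar passed the exam    3.0
    5  Kumar      2       kumar is in town    NaN
    6  Kumar      3                he left    NaN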
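
    Both snippets in this answer pair scores with rows by position. If that ordering cannot be trusted, a possible variant is to look each score up by Name before the loc assignment. This is only a sketch: the names per_name, mask and out are illustrative, and it assumes each Name has at most one non-null score (Series.map needs a unique index for the lookup).

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    'Name': ['Ram', 'Ram', 'Ravi', 'Ravi', 'Kumar', 'Kumar', 'Kumar'],
    'sl no': [1, 2, 1, 2, 1, 2, 3],
    'details': ['ram is going to ooty', 'ram sings well', 'ravi play cricket',
                'ravi is in chennai', 'kumar passed the exam',
                'kumar is in town', 'he left'],
    'score': [np.nan, 1.5, 1.0, np.nan, np.nan, np.nan, 3.0],
})

# Lookup table Name -> score, built from the rows that hold a score.
per_name = df.dropna(subset=['score']).set_index('Name')['score']

mask = df['sl no'].eq(1)
out = df.copy()
out['score'] = np.nan
# Fill only the 'sl no' == 1 rows, matching each row to its score by Name.
out.loc[mask, 'score'] = out.loc[mask, 'Name'].map(per_name)
print(out)
```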
    
        2
  •  1
  •   Joe    7 years ago

    You can do it like this:

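    # Treat empty strings as missing, then spread each Name's score over its whole group.
    df['score'] = (df['score'].replace('', np.nan).groupby(df['Name']).transform(lambda x: x.bfill().ffill()))
    # Keep the score only on the 'sl no' == 1 rows (np.NaN was removed in NumPy 2.0, so use np.nan).
    df.loc[df['sl no'] != 1, 'score'] = np.nan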
    

    First, fill the score column so that every row of a Name carries the same value:

        Name  sl no               details  score
    0    Ram   1     ram is going to ooty    1.5
    1    Ram   2           ram sings well    1.5
    2   Ravi   1        ravi play cricket    1.0
    3   Ravi   2       ravi is in chennai    1.0
    4  Kumar   1    kumar passed the exam    3.0
    5  Kumar   2         kumar is in town    3.0
    6  Kumar   3                  he left    3.0
    

    Then set score to NaN wherever sl no is not 1:

        Name  sl no              details  score
    0    Ram   1    ram is going to ooty    1.5
    1    Ram   2          ram sings well    NaN
    2   Ravi   1       ravi play cricket    1.0
    3   Ravi   2      ravi is in chennai    NaN
    4  Kumar   1   kumar passed the exam    3.0
    5  Kumar   2        kumar is in town    NaN
    6  Kumar   3                 he left    NaN
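
    The two steps above (spread the score across each Name, then blank every row except sl no == 1) can also be written without a lambda. The sketch below is an alternative under the same assumption of one non-null score per Name, not the answerer's original code; it relies on groupby(...).transform('first'), which broadcasts the first non-null value of each group, and on Series.where to keep values only on the sl no == 1 rows.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    'Name': ['Ram', 'Ram', 'Ravi', 'Ravi', 'Kumar', 'Kumar', 'Kumar'],
    'sl no': [1, 2, 1, 2, 1, 2, 3],
    'details': ['ram is going to ooty', 'ram sings well', 'ravi play cricket',
                'ravi is in chennai', 'kumar passed the exam',
                'kumar is in town', 'he left'],
    'score': [np.nan, 1.5, 1.0, np.nan, np.nan, np.nan, 3.0],
})

# Step 1: broadcast each Name's first non-null score to all of its rows.
filled = df.groupby('Name')['score'].transform('first')

# Step 2: keep the value only where 'sl no' == 1; other rows become NaN.
df_out = df.assign(score=filled.where(df['sl no'].eq(1)))
print(df_out)
```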