
Pandas: duplicating rows based on a value

  •  1
  • Binyamin Even  · Tech Community  · 7 years ago

    I have a DataFrame in which I want to duplicate each row N times, where N is the value in another column.

    For example, if this is my DataFrame:

       Company  WEEK_DAYS   SEPTEMBER 15    SEPTEMBER 22   SEPTEMBER 29    value      
    0  google    MON-FRI         0                0              5          0.5
    1  google       TUE          3                2              0          0.7
    

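    The columns whose values give the number of copies are SEPTEMBER 15, SEPTEMBER 22 and SEPTEMBER 29, and each copy should keep the row's value in the value column.

    So the final output should look like this: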

         Company  WEEK_DAYS     WEEK       value
    0    google    MON-FRI    SEPTEMBER 29  0.5
    1    google    MON-FRI    SEPTEMBER 29  0.5
    2    google    MON-FRI    SEPTEMBER 29  0.5
    3    google    MON-FRI    SEPTEMBER 29  0.5
    4    google    MON-FRI    SEPTEMBER 29  0.5   
    5    google      TUE      SEPTEMBER 15  0.7 
    6    google      TUE      SEPTEMBER 15  0.7 
    7    google      TUE      SEPTEMBER 15  0.7
    8    google      TUE      SEPTEMBER 22  0.7 
    9    google      TUE      SEPTEMBER 22  0.7  
    

    I tried using stack and pivot, but I did not get the desired output.

    Any help would be greatly appreciated!

    2 answers  |  as of 7 years ago
        1
  •  3
  •   jezrael    7 years ago

    You can use:


    s = df.set_index(['Company','WEEK_DAYS','value']).stack()
    df = (s.loc[s.index.repeat(s)]
           .reset_index()
           .drop(0, axis=1)
           .rename(columns={'level_3':'WEEK'})
           .reindex(columns=['Company','WEEK_DAYS','WEEK','value'])
          )
    print (df)
      Company WEEK_DAYS          WEEK  value
    0  google   MON-FRI  SEPTEMBER 29    0.5
    1  google   MON-FRI  SEPTEMBER 29    0.5
    2  google   MON-FRI  SEPTEMBER 29    0.5
    3  google   MON-FRI  SEPTEMBER 29    0.5
    4  google   MON-FRI  SEPTEMBER 29    0.5
    5  google       TUE  SEPTEMBER 15    0.7
    6  google       TUE  SEPTEMBER 15    0.7
    7  google       TUE  SEPTEMBER 15    0.7
    8  google       TUE  SEPTEMBER 22    0.7
    9  google       TUE  SEPTEMBER 22    0.7
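
    As a self-contained alternative sketch (not part of the original answer), the same reshaping can be written with DataFrame.melt followed by Index.repeat. The sample frame is rebuilt from the question; the names id_cols, long_df, expanded and the helper column n are only illustrative. The final sort is a plain string sort that happens to reproduce the row order shown in the question for this sample, so adjust it for real data.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    'Company': ['google', 'google'],
    'WEEK_DAYS': ['MON-FRI', 'TUE'],
    'SEPTEMBER 15': [0, 3],
    'SEPTEMBER 22': [0, 2],
    'SEPTEMBER 29': [5, 0],
    'value': [0.5, 0.7],
})

# Columns that identify a row and are carried over unchanged
id_cols = ['Company', 'WEEK_DAYS', 'value']

# Wide to long: one row per (original row, week column), count stored in 'n'
long_df = df.melt(id_vars=id_cols, var_name='WEEK', value_name='n')

# Repeat each long row 'n' times; rows with n == 0 drop out
expanded = (long_df.loc[long_df.index.repeat(long_df['n'])]
            .drop(columns='n')
            .sort_values(['WEEK_DAYS', 'WEEK'])  # string sort, fits this sample only
            .reset_index(drop=True)
            [['Company', 'WEEK_DAYS', 'WEEK', 'value']])
print(expanded)
```

    Without the sort_values step the rows come out grouped by week column rather than by original row, which may or may not matter for your use case.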
    
        2
  •  0
  •   CermakM    7 years ago

    You can make use of the pandas.DataFrame.from_records function.

    I suggest building a list of records for the new DataFrame, i.e.

    import pandas as pd

    # one template record; value kept numeric to match the expected output
    records = [('google', 'MON-FRI', 'SEPTEMBER 29', 0.5)]
    # assumes only one record for MON-FRI .. you might need to do some handling here otherwise
    count = int(df.loc[df['WEEK_DAYS'] == 'MON-FRI', 'SEPTEMBER 29'].iloc[0])
    records *= count  # repeat the record `count` times
    labels = ['Company', 'WEEK_DAYS', 'WEEK', 'value']

    new_df = pd.DataFrame.from_records(records, columns=labels)
    

    I guess this is not the most elegant solution ever, but it should work.
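
    The snippet above handles a single row and a single week column. A minimal sketch of how the same from_records idea might be extended to every row and every week column follows; the sample frame is rebuilt from the question, and week_cols is an assumed, hand-written list of the count columns.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    'Company': ['google', 'google'],
    'WEEK_DAYS': ['MON-FRI', 'TUE'],
    'SEPTEMBER 15': [0, 3],
    'SEPTEMBER 22': [0, 2],
    'SEPTEMBER 29': [5, 0],
    'value': [0.5, 0.7],
})

# Assumption: the count columns are known up front and listed by hand
week_cols = ['SEPTEMBER 15', 'SEPTEMBER 22', 'SEPTEMBER 29']
labels = ['Company', 'WEEK_DAYS', 'WEEK', 'value']

records = []
for row in df.to_dict('records'):
    for week in week_cols:
        record = (row['Company'], row['WEEK_DAYS'], week, row['value'])
        # Add the record as many times as the count in this week column
        records.extend([record] * int(row[week]))

new_df = pd.DataFrame.from_records(records, columns=labels)
print(new_df)
```

    This loops in Python, so it will likely be slower than a vectorized reshape on large frames, but it keeps the rows in their original order.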