
Creating a new DataFrame after performing calculations from a groupby

  •  1
  • iBeMeltin  · 1 month ago

    I have a DataFrame that looks like this:

    ID  TradeDate    party    Deal   Asset  Start       Expire      Fixed   Quantity  MTM      Float
    1   04/11/2024   party1   Sell   HO     01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    1   04/11/2024   party1   Sell   HO     01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    1   04/11/2024   party1   Sell   HO     01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    1   04/11/2024   party1   Buy    HO     01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    1   04/11/2024   party1   Buy    HO     01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    1   04/11/2024   party1   Sell   WTI    01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    1   04/11/2024   party1   Sell   WTI    01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    1   04/11/2024   party1   Buy    WTI    01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    

    I group the data by Start, Asset, and Deal, then run calculations that turn the DataFrame into:

    groups = df.groupby(['Start', 'Asset', 'Deal'])
    
    ID  TradeDate    party    Deal   Asset  Start       Expire      Fixed   Quantity  MTM      Float
    1   04/11/2024   party1   Sell   HO     01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    1   04/11/2024   party1   Sell   HO     01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    1   04/11/2024   party1   Sell   HO     01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    total                                                                   3000     7500.00   
    
    ID  TradeDate    party    Deal   Asset  Start       Expire      Fixed   Quantity  MTM      Float
    1   04/11/2024   party1   Buy    HO     01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    1   04/11/2024   party1   Buy    HO     01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    total                                                                   2000     5000.00   
    
    ID  TradeDate    party    Deal   Asset  Start       Expire      Fixed   Quantity  MTM      Float
    1   04/11/2024   party1   Sell   WTI    01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    1   04/11/2024   party1   Sell   WTI    01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    total                                                                   2000     5000.00   
    
    ID  TradeDate    party    Deal   Asset  Start       Expire      Fixed   Quantity  MTM      Float
    1   04/11/2024   party1   Buy    WTI    01/01/2024  02/01/2024  10.00   1000     2500.00   10.00
    total                                                                   1000     2500.00   
    

    My goal is to transform these groups next so that I output only the data I need. The expected output of this step should look like this:

    party   Deal  Asset Start        MTM       Float
    party1  Sell  HO    01/01/2024   7500.00   10.00 
    party1  Buy   HO    01/01/2024   5000.00   10.00
    party1  Sell  WTI   01/01/2024   5000.00   10.00
    party1  Buy   WTI   01/01/2024   2500.00   10.00
    

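    Do I need to do another groupby of some kind, or is there another function that can accomplish this? Any suggestions would be helpful.

    Note: in the second step, these are individual DataFrames held in a list. An intermediate step may be needed to concatenate them before producing the final output.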

    1 answer  |  as of 1 month ago

  •  1
  •   Andrej Kesely    1 month ago

    I think you just need df.groupby + .agg:

    out = df.groupby(["Start", "Asset", "Deal"], sort=False, as_index=False).agg(
        {"party": "first", "MTM": "sum", "Float": "first"}
    )
    print(out)
    

    Prints:

            Start Asset  Deal   party     MTM  Float
    0  01/01/2024    HO  Sell  party1  7500.0   10.0
    1  01/01/2024    HO   Buy  party1  5000.0   10.0
    2  01/01/2024   WTI  Sell  party1  5000.0   10.0
    3  01/01/2024   WTI   Buy  party1  2500.0   10.0
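
    If the starting point is the list of per-group DataFrames mentioned in the question's note, a rough sketch of the intermediate step might look like the following. The names rows, frames and combined are made up for illustration, the sample data is cut down to the relevant columns, and the list is rebuilt from df here only to make the example self-contained. It uses pd.concat to join the pieces, the same groupby + agg (written with named aggregation), and a final column selection to match the order in the expected output.

```python
import pandas as pd

# Sample data mirroring the question, reduced to the columns used below.
rows = [
    ("party1", "Sell", "HO",  "01/01/2024", 1000, 2500.0, 10.0),
    ("party1", "Sell", "HO",  "01/01/2024", 1000, 2500.0, 10.0),
    ("party1", "Sell", "HO",  "01/01/2024", 1000, 2500.0, 10.0),
    ("party1", "Buy",  "HO",  "01/01/2024", 1000, 2500.0, 10.0),
    ("party1", "Buy",  "HO",  "01/01/2024", 1000, 2500.0, 10.0),
    ("party1", "Sell", "WTI", "01/01/2024", 1000, 2500.0, 10.0),
    ("party1", "Sell", "WTI", "01/01/2024", 1000, 2500.0, 10.0),
    ("party1", "Buy",  "WTI", "01/01/2024", 1000, 2500.0, 10.0),
]
df = pd.DataFrame(
    rows,
    columns=["party", "Deal", "Asset", "Start", "Quantity", "MTM", "Float"],
)

keys = ["Start", "Asset", "Deal"]

# Assumption: the second step left a list of per-group frames.
# It is rebuilt here (without any "total" rows) to stand in for that list.
frames = [group for _, group in df.groupby(keys, sort=False)]

# Intermediate step: join the individual frames back into one DataFrame.
combined = pd.concat(frames, ignore_index=True)

# Aggregate per group, then select columns in the desired output order.
out = (
    combined.groupby(keys, sort=False, as_index=False)
    .agg(party=("party", "first"), MTM=("MTM", "sum"), Float=("Float", "first"))
    [["party", "Deal", "Asset", "Start", "MTM", "Float"]]
)
print(out)
```

    If the frames in the list already carry an appended total row, as in the question, those rows would presumably have to be filtered out before aggregating; otherwise the sums would be counted twice.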