
Conditional replacement across two dataframes

  •  0
  • Amin Jebeli  · Tech Community  · 3 years ago

    I have two dataframes with thousands of rows each; they look like the following two:

    [image: first dataframe] [image: second dataframe]

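    I want to move the values of the goal column from the first dataframe into the second dataframe wherever the campaign name is the same in both dataframes. In other words, I want to end up with the following dataframe: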

    [image: desired result dataframe]

    3 replies  |  as of 3 years ago
        1
  •  -1
  •   Sreeram TP    3 years ago

    You have to do a left join with df_1 on the left, then fill the nulls produced by the join with the existing goal column from df_1.

    import pandas as pd
    
    df_1 = pd.DataFrame()
    df_2 = pd.DataFrame()
    
    df_1['campaign'] = ['a', 'b', 'c', 'd']
    df_1['goal'] =['order', 'order', 'off', 'order']
    
    df_2['campaign'] = ['a', 'b', 'c']
    df_2['goal'] = ['Subscription', 'order', 'Subscription']
    
    # left join
    df = df_1.merge(df_2.rename(columns={'goal': 'new_goal'}), on=['campaign'], how='left')
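    # replace nulls (campaigns missing from df_2) with df_1's existing goal;
    # assigning the result back avoids the unreliable chained inplace=True
    df['new_goal'] = df['new_goal'].fillna(df['goal'])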
    
    df
    
    +---+----------+-------+--------------+
    |   | campaign | goal  |   new_goal   |
    +---+----------+-------+--------------+
    | 0 |    a     | order | Subscription |
    | 1 |    b     | order |    order     |
    | 2 |    c     |  off  | Subscription |
    | 3 |    d     | order |    order     |
    +---+----------+-------+--------------+
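    Before filling the nulls, it can help to see which campaigns had no match in df_2 at all. The following is a minimal sketch, assuming the same sample data as above; the names checked and unmatched are illustrative and not part of the original answer. It relies on the indicator parameter of merge, which adds a _merge column to the result.

```python
import pandas as pd

# Sample data copied from the answer above.
df_1 = pd.DataFrame({'campaign': ['a', 'b', 'c', 'd'],
                     'goal': ['order', 'order', 'off', 'order']})
df_2 = pd.DataFrame({'campaign': ['a', 'b', 'c'],
                     'goal': ['Subscription', 'order', 'Subscription']})

# indicator=True adds a `_merge` column: 'both' for matched campaigns,
# 'left_only' for campaigns that exist only in df_1.
checked = df_1.merge(df_2.rename(columns={'goal': 'new_goal'}),
                     on='campaign', how='left', indicator=True)

# Campaigns whose new_goal has to fall back to df_1's goal.
unmatched = checked.loc[checked['_merge'] == 'left_only', 'campaign'].tolist()
print(unmatched)
```

    With the sample data, only campaign d is missing from df_2, so it is the only row that the fillna step touches.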
    

    You can then select the columns you need and rename them as required:

    df_final = df[['campaign', 'new_goal']].rename(columns={'new_goal': 'goal'})
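    The same lookup-then-fallback idea can probably also be written without a merge. This is a sketch rather than the answerer's method: it assumes campaign values are unique in df_2 (otherwise the map call raises an error), and the name lookup is introduced here only for illustration.

```python
import pandas as pd

# Sample data copied from the answer above.
df_1 = pd.DataFrame({'campaign': ['a', 'b', 'c', 'd'],
                     'goal': ['order', 'order', 'off', 'order']})
df_2 = pd.DataFrame({'campaign': ['a', 'b', 'c'],
                     'goal': ['Subscription', 'order', 'Subscription']})

# Build a campaign -> goal lookup from df_2.
# Assumption: each campaign appears at most once in df_2.
lookup = df_2.set_index('campaign')['goal']

# map() yields NaN for campaigns absent from the lookup;
# fillna() then restores df_1's original goal for those rows.
df_final = df_1.copy()
df_final['goal'] = df_final['campaign'].map(lookup).fillna(df_final['goal'])

print(df_final)
```

    This avoids the temporary new_goal column, at the cost of requiring unique keys on the lookup side.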
    
        2
  •  -1
  •   RSale    3 years ago

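    This gives df2's values precedence over those in df1. Note that with no "on" argument, merge joins on every shared column (here both campaign and goal), so the result is just the rows of df2; df1 itself is not modified.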

    import pandas as pd
    
    df1 = pd.DataFrame({'campaign':['a','b','c','d'],'goal':['order','order','off','order',]})
    df2 = pd.DataFrame({'campaign':['a','b','c','d'],'goal':['Subscription','order','Subscription','order',]})
    
    df2.merge(df1, how='left')
    
    >>  campaign    goal
        0   a   Subscription
        1   b   order
        2   c   Subscription
        3   d   order
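    To make the join-key behaviour of this answer visible, here is a small sketch comparing the call without a key to one with an explicit key. The names implicit and explicit, and the suffixes chosen, are illustrative assumptions.

```python
import pandas as pd

df1 = pd.DataFrame({'campaign': ['a', 'b', 'c', 'd'],
                    'goal': ['order', 'order', 'off', 'order']})
df2 = pd.DataFrame({'campaign': ['a', 'b', 'c', 'd'],
                    'goal': ['Subscription', 'order', 'Subscription', 'order']})

# No `on`: pandas joins on every shared column, here campaign AND goal,
# so a left join simply returns df2's rows with no extra columns.
implicit = df2.merge(df1, how='left')

# Explicit key: both goal columns survive, distinguished by suffixes.
explicit = df2.merge(df1, on='campaign', how='left', suffixes=('_df2', '_df1'))

print(list(implicit.columns))
print(list(explicit.columns))
```

    Joining on campaign alone keeps both goal columns side by side, which is usually what is needed when one of them has to be chosen per row.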
    
        3
  •  -1
  •   Gabriele    3 years ago

    You can use pandas.DataFrame.merge for this:

    https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html

    import pandas as pd
    
    df1 = pd.DataFrame({'campaign': ['a','b','c','d'], 'goal': ['order','order','off','order']})
    df2 = pd.DataFrame({'campaign': ['a','b','c'], 'goal': ['subscription','order','subscription']})
    
    # left join on campaign; the suffixes tell the two goal columns apart
    df_out = pd.merge(df2, df1, on='campaign', how='left', suffixes=('_df2', '_df1'))
    

    Result:

      campaign      goal_df2 goal_df1
    0        a  subscription    order
    1        b         order    order
    2        c  subscription      off
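    The merged frame still has two goal columns. One way to collapse them into a single column might look like the sketch below. It assumes that df1's goal should win wherever the campaign exists in both frames, which is one reading of the question; swap the two column names in the fillna line for the opposite precedence. The name result is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'campaign': ['a', 'b', 'c', 'd'],
                    'goal': ['order', 'order', 'off', 'order']})
df2 = pd.DataFrame({'campaign': ['a', 'b', 'c'],
                    'goal': ['subscription', 'order', 'subscription']})

df_out = pd.merge(df2, df1, on='campaign', how='left', suffixes=('_df2', '_df1'))

# Assumption: df1's goal takes precedence for matching campaigns;
# df2's own goal is kept only where df1 has no matching campaign.
df_out['goal'] = df_out['goal_df1'].fillna(df_out['goal_df2'])

# Keep only the key and the combined column.
result = df_out[['campaign', 'goal']]
print(result)
```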