
Mapping keywords to a DataFrame column using pandas in Python

  •  1
  • Pyd  · Tech Community  · 7 years ago

    I have a DataFrame,

    DF,
    Name    Stage   Description
    Sri     1       Sri is one of the good singer in this two
            2       Thanks for reading
    Ram     1       Ram is one of the good cricket player
    ganesh  1       good driver
    

    and a list,

    my_list=["one"]
    
    I tried,

    mask = df["Description"].str.contains('|'.join(my_list), na=False)
    

    but it gives,

    output_DF,
    Name    Stage   Description
    Sri     1       Sri is one of the good singer in this two
    Ram     1       Ram is one of the good cricket player
    
    My desired output is,
    desired_DF,
    Name    Stage   Description
    Sri     1       Sri is one of the good singer in this two
            2       Thanks for reading
    Ram     1       Ram is one of the good cricket player
    

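    It has to take the Stage column into account: I want every row that belongs to a name whose description matches, not only the matching row itself.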

    2 replies  |  latest 6 years ago
        1
  •  1
  •   jezrael    7 years ago

    I think you need:

    print (df)
         Name  Stage                                Description
    0     Sri      1  Sri is one of the good singer in this two
    1              2                         Thanks for reading
    2     Ram      1      Ram is one of the good cricket player
    3  ganesh      1                                good driver
    
    # replace empty or whitespace-only names with the previous non-empty name
    df['Name'] = df['Name'].mask(df['Name'].str.strip() == '').ffill()
    print (df)
         Name  Stage                                Description
    0     Sri      1  Sri is one of the good singer in this two
    1     Sri      2                         Thanks for reading
    2     Ram      1      Ram is one of the good cricket player
    3  ganesh      1                                good driver
    
    # get the names of all rows whose Description matches any keyword
    my_list = ["one"]
    names=df.loc[df["Description"].str.contains("|".join(my_list),na=False), 'Name']
    print (names)
    0    Sri
    2    Ram
    Name: Name, dtype: object
    
    # select all rows whose Name is among the matched names
    df = df[df['Name'].isin(names)]
    print (df)
      Name  Stage                                Description
    0  Sri      1  Sri is one of the good singer in this two
    1  Sri      2                         Thanks for reading
    2  Ram      1      Ram is one of the good cricket player
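
    For reference, the same idea can be written as one self-contained script. This is only a sketch under a few assumptions: the sample frame is rebuilt by hand from the question, `group_key` and `hit` are names chosen here for illustration, and `re.escape` is added so that keywords containing regex characters are matched literally. Unlike the steps above, it keeps the filled names in a separate Series and uses `groupby(...).transform("any")`, so the blank Name in the second row is preserved as in the desired output.

```python
import re

import pandas as pd

# Sample data rebuilt from the question; the blank Name belongs to "Sri".
df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})
my_list = ["one"]

# Fill blank names from the row above, without modifying df itself.
group_key = df["Name"].mask(df["Name"].str.strip() == "").ffill()

# Escape each keyword so regex metacharacters are treated as plain text.
pattern = "|".join(re.escape(word) for word in my_list)
hit = df["Description"].str.contains(pattern, na=False)

# Keep a row if any row sharing its (filled) name matched a keyword.
result = df[hit.groupby(group_key).transform("any")]
print(result)
```

    Both rows for Sri and the row for Ram are kept, while ganesh is dropped because none of his descriptions match.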
    
        2
  •  0
  •   Calvin Taylor    7 years ago

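    It looks like it is finding "one" in the Description field of the DataFrame and returning only the descriptions that match.

    If you also want the second row ("Thanks for reading"), you have to add a list element that matches it,

    for example "Thanks", as in my_list = ["one", "Thanks"]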
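
    A minimal sketch of that suggestion, assuming the sample frame from the question is rebuilt by hand (the name `matched` is chosen here for illustration):

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Name": ["Sri", "", "Ram", "ganesh"],
    "Stage": [1, 2, 1, 1],
    "Description": [
        "Sri is one of the good singer in this two",
        "Thanks for reading",
        "Ram is one of the good cricket player",
        "good driver",
    ],
})

# Add a second keyword that matches the text of the extra row.
my_list = ["one", "Thanks"]
mask = df["Description"].str.contains("|".join(my_list), na=False)
matched = df[mask]
print(matched)
```

    This only works when the text of every related row is known in advance; it does not use the Name or Stage columns to tie rows together.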