
Replace values in Python based on an index position

  •  1
  • C_psy  · Tech Community  · 7 years ago

    I tried something like this, but I know the syntax is incorrect:

    df2[df1['ID'].iloc[0]] = 0
    

    DF1:

    ID   Name X
    6539 CM   20
    9999 FM   30
    

    DF2:

    Out 1  Out 2  Out 3  Out 4  Out 5
    7000   8000   6539   6539   6539
    

    Expected output:

    Out 1  Out 2  Out 3  Out 4  Out 5
    7000   8000   0      0      0
    
    2 Answers  |  as of 7 years ago
        1
  •  2
  •   Anton vBR    7 years ago

    I think you need:

    df2.replace(df1.iloc[0,0], 0)
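
    As a rough illustration using the data from the question: the attempted line is not actually a syntax error. Writing df2[key] = 0 assigns to a column labelled key, so it adds a new column instead of changing matching cells, whereas replace matches on cell values. The names target, attempt and result below are only illustrative.

```python
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame({"ID": [6539, 9999], "Name": ["CM", "FM"], "X": [20, 30]})
df2 = pd.DataFrame(
    [[7000, 8000, 6539, 6539, 6539]],
    columns=["Out 1", "Out 2", "Out 3", "Out 4", "Out 5"],
)

target = df1["ID"].iloc[0]  # first ID value, 6539

# The attempted form: df2[key] = value writes to a *column* labelled key.
attempt = df2.copy()
attempt[target] = 0
# A new column labelled 6539 is appended; the existing cells are untouched.
print(attempt.columns.tolist())

# replace() compares cell values instead and returns a new DataFrame.
result = df2.replace(target, 0)
print(result)
```

    Note that replace returns a new frame by default, so the result has to be assigned back (or inplace=True passed) for df2 itself to change.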
    

    import pandas as pd

    df1 = pd.DataFrame({
        'a': [1,2],
        'b': [3,4]
    })
    
    df2 = pd.DataFrame({
        'a': [1,1],
        'b': [1,1]
    })
    
    df2.replace(df1.iloc[0,0],0)
    

       a  b
    0  0  0
    1  0  0
    

    df1 = pd.concat([df1]*10000)
    df2 = pd.concat([df2]*10000)
    
    %timeit df2.replace(df1.iloc[0,0],0, inplace=True)
    %timeit df2[df2 == df1.iloc[0, 0]] = 0
    

    341 µs ± 9.36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    1.24 ms ± 39 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
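
    The comparison can also be reproduced outside IPython with the standard timeit module. The sketch below is only an approximation of the setup above: the helper names with_replace and with_boolean_mask are made up for illustration, ignore_index=True is added to the concat, and each call works on fresh data so that every timed run still has values to replace. Absolute numbers will depend on the machine and the pandas version.

```python
import timeit

import pandas as pd

df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
small = pd.DataFrame({"a": [1, 1], "b": [1, 1]})
# ignore_index=True is an assumption added here to get a unique index.
df2 = pd.concat([small] * 10000, ignore_index=True)

target = df1.iloc[0, 0]


def with_replace(frame):
    # Returns a new frame with every cell equal to target set to 0.
    return frame.replace(target, 0)


def with_boolean_mask(frame):
    # Works on a copy so that every timed run starts from unmodified data.
    out = frame.copy()
    out[out == target] = 0
    return out


# Both approaches should agree before their speed is compared.
assert (with_replace(df2).to_numpy() == with_boolean_mask(df2).to_numpy()).all()

n = 20
t_replace = timeit.timeit(lambda: with_replace(df2), number=n) / n
t_mask = timeit.timeit(lambda: with_boolean_mask(df2), number=n) / n
print(f"replace:      {t_replace * 1e3:.3f} ms per call")
print(f"boolean mask: {t_mask * 1e3:.3f} ms per call")
```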
    
        2
  •  2
  •   jezrael    7 years ago

    df = df2.mask(df2 == df1.iloc[0,0], 0)
    

    df2[df2 == df1.iloc[0, 0]] = 0
    

    Or:

    df = pd.DataFrame(np.where(df2 == df1.iloc[0,0], 0, df2),index=df2.index,columns=df2.columns)
    

    print (df)
       Out 1  Out 2  Out 3  Out 4  Out 5
    0   7000   8000      0      0      0
    

    print (df2 == df1.iloc[0,0])
       Out 1  Out 2  Out 3  Out 4  Out 5
    0  False  False   True   True   True
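
    Putting the three variants side by side on the question's data, as a minimal self-contained sketch (the variable names target, masked, assigned and rebuilt are only illustrative):

```python
import numpy as np
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame({"ID": [6539, 9999], "Name": ["CM", "FM"], "X": [20, 30]})
df2 = pd.DataFrame(
    [[7000, 8000, 6539, 6539, 6539]],
    columns=["Out 1", "Out 2", "Out 3", "Out 4", "Out 5"],
)

target = df1.iloc[0, 0]  # 6539

# 1) mask: returns a new frame with 0 wherever the condition is True.
masked = df2.mask(df2 == target, 0)

# 2) boolean-frame assignment: modifies the frame in place (a copy here).
assigned = df2.copy()
assigned[assigned == target] = 0

# 3) np.where: builds a plain array, then wraps it back into a DataFrame.
rebuilt = pd.DataFrame(
    np.where(df2 == target, 0, df2),
    index=df2.index,
    columns=df2.columns,
)

for name, frame in [("mask", masked), ("assign", assigned), ("where", rebuilt)]:
    print(name, frame.iloc[0].tolist())
```

    Only the second variant changes the frame it is applied to; mask and np.where leave df2 untouched and return new objects.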
    

    Timings:

    import numpy as np
    import pandas as pd

    np.random.seed(100)
    
    df1 = pd.DataFrame({
        'a': [1,2],
        'b': [3,4]
    })
    
    
    df2 = pd.DataFrame(np.random.randint(10, size=(1000,1000)))
    
    
    In [106]: %timeit df2.replace(df1.iloc[0,0],0, inplace=True)
    2.99 ms ± 324 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    In [107]: %timeit df2.mask(df2 == df1.iloc[0,0], 0)
    22.8 ms ± 1.71 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
    
    In [108]: %timeit df2[df2 == df1.iloc[0, 0]] = 0
    19.6 ms ± 497 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
    
    In [109]: %timeit df = pd.DataFrame(np.where(df2 == df1.iloc[0,0], 0, df2),index=df2.index,columns=df2.columns)
    5.81 ms ± 91.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    

    np.random.seed(100)
    
    df1 = pd.DataFrame({
        'a': [1,2],
        'b': [3,4]
    })
    
    df2 = pd.DataFrame(np.random.randint(5, size=(1000,10)))
    
    In [116]: %timeit df2.replace(df1.iloc[0,0],0, inplace=True)
    856 µs ± 12.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
    In [117]: %timeit df2.mask(df2 == df1.iloc[0,0], 0)
    1.23 ms ± 25.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
    In [118]: %timeit df2[df2 == df1.iloc[0, 0]] = 0
    1.21 ms ± 4.26 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
    
    In [119]: %timeit df = pd.DataFrame(np.where(df2 == df1.iloc[0,0], 0, df2),index=df2.index,columns=df2.columns)
    445 µs ± 13.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)