
Return all values of column A and put them in column B until a specific value is reached

numpy pandas python

EEPBAH · 7 years ago

    df (input), max_val = 8

    A
    1
    2
    2
    3 
    4
    5
    1

    df (desired output), max_val = 8

    A    B
    1    1
    2    2
    2    2
    3    3
    4    0
    5    0
    1    0

I was thinking of something like this:

    def func(x):
        if df['A'].cumsum() <= max_val:
            return x
        else:
            return 0

    df['B'] = df['A'].apply(func, axis =1 )

    df['B'] = func(df['A'])

7 answers | as of 7 years ago

jezrael · 7 years ago

Use Series.where:

df['B'] = df['A'].where(df['A'].cumsum() <= max_val, 0)
print (df)
   A  B
0  1  1
1  2  2
2  2  2
3  3  3
4  4  0
5  5  0
6  1  0
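
For reference, a self-contained sketch of the same idea. The construction of `df` is an assumption, since the question only shows the frame in printed form. It also hints at why the attempt in the question fails: an `if` on a whole Series has no single truth value, whereas `where` applies the condition element by element.

```python
import pandas as pd

# Sample frame rebuilt by hand from the question (assumed construction)
df = pd.DataFrame({'A': [1, 2, 2, 3, 4, 5, 1]})
max_val = 8

running_total = df['A'].cumsum()      # 1, 3, 5, 8, 12, 17, 18
keep = running_total <= max_val       # boolean mask, one entry per row
df['B'] = df['A'].where(keep, 0)      # keep A where the mask is True, else 0
print(df)
```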

Divakar · 7 years ago

Approach #1: using np.where

df['B']= np.where((df.A.cumsum()<=max_val), df.A ,0)

In [145]: df
Out[145]: 
   A  B
0  1  1
1  2  2
2  2  2
3  3  3
4  4  0
5  5  0
6  1  0

Approach #2: another one, using array initialization

def app2(df,max_val):
    a = df.A.values
    colB = np.zeros(df.shape[0],dtype=a.dtype)
    idx = np.searchsorted(a.cumsum(),max_val, 'right')
    colB[:idx] = a[:idx]
    df['B'] = colB
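
A note on when this shortcut applies: `np.searchsorted` expects a sorted array, so the sketch below assumes column A holds no negative values, which makes the cumulative sum non-decreasing. Under that assumption it should agree with the `np.where` version. The helper name `prefix_until` and the hand-built sample frame are hypothetical, used only for illustration.

```python
import numpy as np
import pandas as pd

def prefix_until(values, max_val):
    """Copy the leading values whose running total stays <= max_val; zero the rest."""
    values = np.asarray(values)
    out = np.zeros_like(values)
    # With side='right', the result is the number of cumulative sums <= max_val
    idx = np.searchsorted(values.cumsum(), max_val, side='right')
    out[:idx] = values[:idx]
    return out

df = pd.DataFrame({'A': [1, 2, 2, 3, 4, 5, 1]})
df['B'] = prefix_until(df['A'].to_numpy(), 8)
print(df)
```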

Runtime test

Timings against @jezrael's Series.where-based solution on a larger dataset:

In [293]: df = pd.DataFrame({'A':np.random.randint(0,9,(1000000))})

In [294]: max_val = 1000000

# @jezrael's soln
In [295]: %timeit df['B1'] = df['A'].where(df['A'].cumsum() <= max_val, 0)
100 loops, best of 3: 8.22 ms per loop

# Proposed in this post
In [296]: %timeit df['B2']= np.where((df.A.cumsum()<=max_val), df.A ,0)
100 loops, best of 3: 6.45 ms per loop

# Proposed in this post
In [297]: %timeit app2(df, max_val)
100 loops, best of 3: 4.47 ms per loop
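
Before reading much into the timings, it may be worth checking that the three variants return the same column. A small sketch: the fixed seed and the smaller frame are assumptions made here for illustration and are not part of the timing run above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)                       # assumed seed, for repeatability
df = pd.DataFrame({'A': rng.integers(0, 9, size=10_000)})
max_val = 10_000

# Series.where variant
b1 = df['A'].where(df['A'].cumsum() <= max_val, 0).to_numpy()

# np.where variant
b2 = np.where(df['A'].cumsum() <= max_val, df['A'], 0)

# Array-initialization variant (relies on non-negative values in A)
a = df['A'].to_numpy()
b3 = np.zeros_like(a)
idx = np.searchsorted(a.cumsum(), max_val, side='right')
b3[:idx] = a[:idx]

all_equal = np.array_equal(b1, b2) and np.array_equal(b2, b3)
print(all_equal)
```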

BENY · 7 years ago

df['B']=[a if c<=8 else 0 for a, c in zip(df['A'], df['A'].cumsum())]
df
Out[7]: 
   A  B
0  1  1
1  2  2
2  2  2
3  3  3
4  4  0
5  5  0
6  1  0

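Nenri · 7 years ago

Why not accumulate the values in a variable, like this:

total = 0                # running sum of column A
B = []
for x in df['A']:
    total += x
    B.append(x if total <= max_val else 0)   # keep x while the sum is within max_val
df['B'] = B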

00__00__00 · 7 years ago

import pandas as pd
A=[1,2,2,3,4,5,1]
MAXVAL=8
df=pd.DataFrame(data=A,columns=['A'])
df['cumsumA']=df['A'].cumsum()
df['B']=df['A']*(df['cumsumA']<=MAXVAL).astype(int)

You can then drop the 'cumsumA' column.
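
Dropping the helper column could look like the sketch below. It assumes `DataFrame.drop(columns=...)` is acceptable here; that call returns a new frame, so the result is assigned back to `df`.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3, 4, 5, 1]})
MAXVAL = 8

df['cumsumA'] = df['A'].cumsum()
# Multiply A by a 0/1 mask: 1 while the running total is within MAXVAL
df['B'] = df['A'] * (df['cumsumA'] <= MAXVAL).astype(int)

# Remove the helper column once B has been computed
df = df.drop(columns='cumsumA')
print(df)
```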

Brij · 7 years ago

The following works fine:

import numpy as np
max_val = 8
df['B'] = np.where(df['A'].cumsum() <= max_val , df['A'],0)

Forrains_459 · 7 years ago

Just one way, using .loc:

df['c'] = df['A'].cumsum()
df['B'] = df['A']
df.loc[df['c'] > 8, 'B'] = 0
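
A self-contained sketch of the `.loc` route that skips the helper column by passing the row mask and the column label in a single `.loc` call. The sample frame is rebuilt from the question, and the column names `A` and `B` follow the question rather than this answer.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3, 4, 5, 1]})
max_val = 8

df['B'] = df['A']
# Rows where the running total exceeds max_val get B set to 0
df.loc[df['A'].cumsum() > max_val, 'B'] = 0
print(df)
```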
