How to apply a condition to a Pandas column of lists (checking every item in each list)

pandas python

SantoshGupta7 · Tech Community · 5 years ago

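Suppose I have a column whose values are lists. If a row's list contains at least one item that is in a given set, I want to keep that row; otherwise I want to drop it.

Here is a simple example: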

# create the df
import pandas as pd

df = pd.DataFrame({'range': list(range(0, 3))})
l = [1, 2, 3]
m = [4, 5, 6]
n = [1, 7, 8]
# assign the lists as one object column (avoids chained assignment)
df['var_list'] = [l, m, n]
df.head(3)

Result:

range   var_list
0   0   [1, 2, 3]
1   1   [4, 5, 6]
2   2   [1, 7, 8]

This is the set I want to test against:

setS = {1, 2}

What I want is: if a row's list has any item that is in the set, keep the row; otherwise drop it.

So this is the desired result:

range   var_list
0   0   [1, 2, 3]
2   2   [1, 7, 8]

What I tried:

df2 = df[df['var_list'].isin(setS)]

This is the error I got:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
TypeError: unhashable type: 'list'

The above exception was the direct cause of the following exception:

SystemError                               Traceback (most recent call last)
<ipython-input-56-90ea3b42ebf3> in <module>()
----> 1 df2 = df[df['var_list'].isin(setS)]

2 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/series.py in isin(self, values)
   4512         Name: animal, dtype: bool
   4513         """
-> 4514         result = algorithms.isin(self, values)
   4515         return self._constructor(result, index=self.index).__finalize__(self)
   4516 

/usr/local/lib/python3.6/dist-packages/pandas/core/algorithms.py in isin(comps, values)
    478             comps = comps.astype(object)
    479 
--> 480     return f(comps, values)
    481 
    482 

/usr/local/lib/python3.6/dist-packages/pandas/core/algorithms.py in <lambda>(x, y)
    454 
    455     # faster for larger cases to use np.in1d
--> 456     f = lambda x, y: htable.ismember_object(x, values)
    457 
    458     # GH16012

pandas/_libs/hashtable_func_helper.pxi in pandas._libs.hashtable.ismember_object()

SystemError: <built-in method view of numpy.ndarray object at 0x7fcc893844e0> returned a result with an error set

3 replies | as of 5 years ago

Henry Yik · 5 years ago

A column of lists is not how pandas is normally meant to be used. You have to check the items of each list explicitly:

print(df[df["var_list"].transform(lambda x: bool(set(x) & setS))])

#
   range   var_list
0      0  [1, 2, 3]
2      2  [1, 7, 8]
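
For readers who want to run the idea end to end, here is a minimal self-contained sketch of the same per-row check. It is not part of the original answer: the helper name `shares_item` is hypothetical, and it swaps the set intersection for `set.isdisjoint`, which accepts any iterable and so avoids building a temporary set for every row.

```python
import pandas as pd

# Rebuild the question's frame; each cell of var_list holds a Python list.
df = pd.DataFrame({
    "range": [0, 1, 2],
    "var_list": [[1, 2, 3], [4, 5, 6], [1, 7, 8]],
})
setS = {1, 2}

def shares_item(items, wanted=setS):
    """Return True when at least one element of items is in wanted."""
    # isdisjoint is True only when the two collections share nothing
    return not wanted.isdisjoint(items)

# One boolean per row, then ordinary boolean indexing
mask = df["var_list"].apply(shares_item)
kept = df[mask]
print(kept)
```

An empty list gives `False` here, so such rows would be dropped, which matches the "at least one item" requirement in the question.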

Andy L. · 5 years ago

Use a list comprehension with Python set intersection to build a boolean mask, then slice with it:

m = [len(setS & x) > 0 for x in df.var_list.map(set)]
df[m]

Out[21]:
   range   var_list
0      0  [1, 2, 3]
2      2  [1, 7, 8]
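
As an alternative sketch that is not from the original answers, the mask can also be built without a Python-level loop over sets, assuming pandas 0.25 or later where `Series.explode` is available. Exploding turns every list item into its own scalar row, so `isin` works again, and grouping by the original index collapses the result back to one boolean per row.

```python
import pandas as pd

df = pd.DataFrame({
    "range": [0, 1, 2],
    "var_list": [[1, 2, 3], [4, 5, 6], [1, 7, 8]],
})
setS = {1, 2}

# One row per list item; the original index label is repeated for each item
exploded = df["var_list"].explode()
# isin works here because the exploded values are scalars, not lists
hits = exploded.isin(setS)
# Collapse back to one boolean per original row: True if any item matched
mask = hits.groupby(level=0).any()
result = df[mask]
print(result)
```

Whether this is faster than the list comprehension depends on the data; for short lists the difference is likely small.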

oppressionslayer · 5 years ago

You can do this with apply and the set union operator (|), by converting each list to a set and checking the size of the union:

df[df.var_list.apply(lambda x: False if len(setS | set(x)) > 4 else True)]
Out[3343]: 
   range   var_list
0      0  [1, 2, 3]
2      2  [1, 7, 8]
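
Note that the threshold of 4 in the snippet above only works because every list in the example has three distinct items and the set has two. A hedged generalization, not from the original answer, compares the union size with the sum of the two sizes instead; the helper name `overlaps_by_union_size` and the extra fourth row are assumptions added for illustration.

```python
import pandas as pd

# Same data as the question, plus a hypothetical shorter list in row 3
df = pd.DataFrame({
    "range": [0, 1, 2, 3],
    "var_list": [[1, 2, 3], [4, 5, 6], [1, 7, 8], [9, 10]],
})
setS = {1, 2}

def overlaps_by_union_size(items, wanted=setS):
    """True when items and wanted share at least one element."""
    unique_items = set(items)
    # With nothing in common, the union has exactly
    # len(wanted) + len(unique_items) members; any overlap makes it smaller.
    return len(wanted | unique_items) < len(wanted) + len(unique_items)

general_mask = df.var_list.apply(overlaps_by_union_size)
# The fixed threshold from the answer, for comparison
fixed_mask = df.var_list.apply(lambda x: len(setS | set(x)) <= 4)

print(df[general_mask])
```

On this data the fixed threshold also keeps row 3, because `{1, 2} | {9, 10}` has only four members, while the size comparison drops it.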
