
Computing summary statistics for each group in a DataFrame

numpy pandas python-3.x

user288609 · 6 years ago

I have a DataFrame with the following columns:

ID      Time          Price
1002    1998-01-02    34
2001    1998-02-03    45
1002    1998-04-05    23
2003    1998-02-03    30
1002    1998-02-03    60

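Based on this DataFrame, I want to create another DataFrame with three columns: ID, period-1, and period-2. Each entry is the average Price for that ID within the corresponding period.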

ID      period-1 (1998-01-01:1998-02-01)    period-2 (1998-02-02:1998-05-02)
1002
2001
2003

This is the code I ended up with after following the suggestion, but it raises an error:

import pandas as pd

df=pd.DataFrame({"ID": ["1002", "2001", "1002", "2003", "1002"],
                "Time": ["1998-01-02", "1998-02-03", "1998-04-05", "1998-02-03", "1998-02-03"],
                 "Price": ["34", "45", "23", "30","60"]})


df.Time=pd.to_datetime(df.Time)
period2=pd.Interval(pd.Timestamp('1998-02-02'), pd.Timestamp('1998-05-02'), closed='both')
df['Price'].apply(pd.to_numeric)
df['New']='period1'


df.loc[df.Time.apply(lambda x : x in period2),'New']='period2'


df.pivot_table(index='ID',columns='New',values='Price',aggfunc='mean')


 306             # people may try to aggregate on a non-callable attribute

~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in mean(self, *args, **kwargs)
   1304         nv.validate_groupby_func('mean', args, kwargs, ['numeric_only'])
   1305         try:
-> 1306             return self._cython_agg_general('mean', **kwargs)
   1307         except GroupByError:
   1308             raise

~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_general(self, how, alt, numeric_only, min_count)
   3972                             min_count=-1):
   3973         new_items, new_blocks = self._cython_agg_blocks(
-> 3974             how, alt=alt, numeric_only=numeric_only, min_count=min_count)
   3975         return self._wrap_agged_blocks(new_items, new_blocks)
   3976 

~\Anaconda3\lib\site-packages\pandas\core\groupby\groupby.py in _cython_agg_blocks(self, how, alt, numeric_only, min_count)
   4044 
   4045         if len(new_blocks) == 0:
-> 4046             raise DataError('No numeric types to aggregate')
   4047 
   4048         # reset the locs in the blocks to correspond to our

DataError: No numeric types to aggregate

1 reply | as of 6 years ago

BENY · 6 years ago

By using Interval and pivot_table:

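# Run this first if Time is still stored as strings:
#df.Time=pd.to_datetime(df.Time)

# Closed interval covering period 2 (both endpoints included)
period2=pd.Interval(pd.Timestamp('1998-02-02'), pd.Timestamp('1998-05-02'), closed='both')

# Label every row period1 by default, then overwrite the rows that fall inside period2
df['New']='period1'

df.loc[df.Time.apply(lambda x : x in period2),'New']='period2'

# Mean Price per ID and period (Price must already be numeric here)
df.pivot_table(index='ID',columns='New',values='Price',aggfunc='mean')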
Out[881]: 
New   period1  period2
ID                    
1002     34.0     41.5
2001      NaN     45.0
2003      NaN     30.0
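
The DataError in the question most likely comes from the Price column: it is built from strings, and the result of df['Price'].apply(pd.to_numeric) is never assigned back, so the column keeps its object dtype and pivot_table finds nothing numeric to average. A minimal sketch of the whole script with that one change, assuming the same sample data as in the question (the variable name result is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "ID": ["1002", "2001", "1002", "2003", "1002"],
    "Time": ["1998-01-02", "1998-02-03", "1998-04-05", "1998-02-03", "1998-02-03"],
    "Price": ["34", "45", "23", "30", "60"],
})

# Convert both columns and assign the results back;
# neither conversion modifies the column in place.
df["Time"] = pd.to_datetime(df["Time"])
df["Price"] = pd.to_numeric(df["Price"])

# Closed interval for period 2, as in the answer above
period2 = pd.Interval(pd.Timestamp("1998-02-02"), pd.Timestamp("1998-05-02"), closed="both")

# Default label is period1; rows whose Time falls inside period2 are relabeled
df["New"] = "period1"
df.loc[df["Time"].apply(lambda t: t in period2), "New"] = "period2"

# Mean Price per ID and period
result = df.pivot_table(index="ID", columns="New", values="Price", aggfunc="mean")
print(result)
```

With Price converted to a numeric dtype, this should give the same table as shown above. Note that any row outside period2 is labeled period1, including dates that fall outside both ranges; with this sample data that makes no difference.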
