
Updating a multi-index dataframe with a list (or series) of values

multi-index dataframe pandas python

rahlf23 · 7 years ago

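I would like to update the values in a multi-index dataframe using the output of a separate function that performs calculations on another existing dataframe.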

For example, say I have the following:

import numpy as np, pandas as pd

names = ['Johnson','Jackson','Smith']
attributes = ['x1','x2','x3','x4','x5']
categories = ['y1','y2','y3','y4','y5','y6']

index = pd.MultiIndex.from_product([names, attributes])
placeholders = np.zeros((len(names)*len(attributes), len(categories)), dtype=int)

df = pd.DataFrame(placeholders, index=index, columns=categories)

This produces the following dataframe:

            y1  y2  y3  y4  y5  y6
Johnson x1   0   0   0   0   0   0
        x2   0   0   0   0   0   0
        x3   0   0   0   0   0   0
        x4   0   0   0   0   0   0
        x5   0   0   0   0   0   0
Jackson x1   0   0   0   0   0   0
        x2   0   0   0   0   0   0
        x3   0   0   0   0   0   0
        x4   0   0   0   0   0   0
        x5   0   0   0   0   0   0
Smith   x1   0   0   0   0   0   0
        x2   0   0   0   0   0   0
        x3   0   0   0   0   0   0
        x4   0   0   0   0   0   0
        x5   0   0   0   0   0   0

Now, I have another function that generates a series of values, which I then want to use to update this dataframe. For example:

x1 = pd.Series([2274, 556, 1718, 1171, 183, 194], index=categories)
x2 = pd.Series([627, 154, 473, 215, 68, 77], index=categories)

How would I update the row ('Johnson','x1') with these series values?

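The vectors x1 and x2 are generated by calling a function inside two nested for loops. I can't figure out how to update the dataframe; with the code below, the values all stay zero: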

for i in names:
    for j in attributes:
        x1 = generate_data_list('x1')
        df.loc[i,j].update(x1)

Any help is appreciated!

2 answers | latest 7 years ago

akuiper · 7 years ago

Just assign x1 to df.loc[i, j]:

df.loc['Johnson', 'x1'] = x1

Or:

df.loc[('Johnson', 'x1')] = x1

df
#              y1   y2    y3    y4   y5   y6
#Johnson x1  2274  556  1718  1171  183  194
#        x2     0    0     0     0    0    0
#        x3     0    0     0     0    0    0
#        x4     0    0     0     0    0    0
#        x5     0    0     0     0    0    0
#Jackson x1     0    0     0     0    0    0
#        x2     0    0     0     0    0    0
#        x3     0    0     0     0    0    0
#        x4     0    0     0     0    0    0
#        x5     0    0     0     0    0    0
#Smith   x1     0    0     0     0    0    0
#        x2     0    0     0     0    0    0
#        x3     0    0     0     0    0    0
#        x4     0    0     0     0    0    0
#        x5     0    0     0     0    0    0
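
Applied to the nested loops from the question, direct assignment could look like the sketch below. The `generate_data_list` here is a hypothetical stand-in (its real signature and values are not shown in the question); it is assumed to return a Series indexed by the category names. The likely reason the original attempt left zeros is that `df.loc[i, j]` can hand back a temporary copy of the row, so calling `.update()` on it may never reach `df`; assigning through `df.loc` avoids that.

```python
import pandas as pd

names = ['Johnson', 'Jackson', 'Smith']
attributes = ['x1', 'x2', 'x3', 'x4', 'x5']
categories = ['y1', 'y2', 'y3', 'y4', 'y5', 'y6']

index = pd.MultiIndex.from_product([names, attributes])
df = pd.DataFrame(0, index=index, columns=categories)


def generate_data_list(name, attribute):
    # Hypothetical stand-in for the asker's function: returns one value
    # per category, labelled by category name. The numbers are made up.
    base = (names.index(name) + 1) * 100 + (attributes.index(attribute) + 1) * 10
    return pd.Series([base + k for k in range(len(categories))], index=categories)


for i in names:
    for j in attributes:
        # Assign on the frame itself; the tuple key selects exactly one row,
        # and the Series is aligned to the columns by label.
        df.loc[(i, j), :] = generate_data_list(i, j)

print(df)
```

Because the Series is aligned by its index, the order of the values does not matter as long as the labels match the column names.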

BENY · 7 years ago

You can build the data in the matching format (same MultiIndex and columns) and then use update:

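# One-row frame whose MultiIndex and columns match the target row in df
x1 = pd.DataFrame(
    data=[[2274, 556, 1718, 1171, 183, 194]],
    index=pd.MultiIndex.from_arrays([['Johnson'], ['x1']]),
    columns=categories,
)
x1
              y1   y2    y3    y4   y5   y6
Johnson x1  2274  556  1718  1171  183  194

# update modifies df in place, aligning on index and columns
df.update(x1)
df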
                y1     y2      y3      y4     y5     y6
Johnson x1  2274.0  556.0  1718.0  1171.0  183.0  194.0
        x2     0.0    0.0     0.0     0.0    0.0    0.0
        x3     0.0    0.0     0.0     0.0    0.0    0.0
        x4     0.0    0.0     0.0     0.0    0.0    0.0
        x5     0.0    0.0     0.0     0.0    0.0    0.0
Jackson x1     0.0    0.0     0.0     0.0    0.0    0.0
        x2     0.0    0.0     0.0     0.0    0.0    0.0
        x3     0.0    0.0     0.0     0.0    0.0    0.0
        x4     0.0    0.0     0.0     0.0    0.0    0.0
        x5     0.0    0.0     0.0     0.0    0.0    0.0
Smith   x1     0.0    0.0     0.0     0.0    0.0    0.0
        x2     0.0    0.0     0.0     0.0    0.0    0.0
        x3     0.0    0.0     0.0     0.0    0.0    0.0
        x4     0.0    0.0     0.0     0.0    0.0    0.0
        x5     0.0    0.0     0.0     0.0    0.0    0.0
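
As the output above shows, `update` can turn the integer columns into floats. A minimal sketch of one way to handle that, and to update several rows in one call, follows. Pairing the question's second vector with ('Johnson', 'x2') is an assumption made for illustration, and whether the float conversion happens at all may depend on the pandas version, so the final cast is only a safeguard.

```python
import pandas as pd

names = ['Johnson', 'Jackson', 'Smith']
attributes = ['x1', 'x2', 'x3', 'x4', 'x5']
categories = ['y1', 'y2', 'y3', 'y4', 'y5', 'y6']

index = pd.MultiIndex.from_product([names, attributes])
df = pd.DataFrame(0, index=index, columns=categories)

# New rows keyed by (name, attribute). The second key is assumed here;
# the question does not say which row x2 belongs to.
rows = {
    ('Johnson', 'x1'): [2274, 556, 1718, 1171, 183, 194],
    ('Johnson', 'x2'): [627, 154, 473, 215, 68, 77],
}

updates = pd.DataFrame(
    list(rows.values()),
    index=pd.MultiIndex.from_tuples(list(rows.keys())),
    columns=categories,
)

# In-place update, aligned on both index levels and on the columns.
df.update(updates)

# Cast back to integers in case update produced float columns.
df = df.astype(int)

print(df.head(3))
```

Rows that are absent from `updates` are left untouched, so the remaining entries stay zero.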