代码之家  ›  专栏  ›  技术社区  ›  tandem

pandas dataframe-具有选定开始字符串和指定列的列

  •  1
  • tandem  · 技术社区  · 6 年前

    在下面给定的数据帧中,我想清理计数器(例如MEM\u TRANS\u RETIRED)并将列添加到数据帧中 load,rps,th95

           load        rps     th95    energy     ...       MEM_TRANS_RETIRED-34  PERF_COUNT_HW_CACHE_L1D-34  PERF_COUNT_HW_CACHE_L1I-34     map_freq
    0    500.0k  346222.62    12.62   7270.22     ...                  154287.14                   591053.74                2.918521e+07   6C-1.70GHz
    1    400.0k  402628.34     2.25  12026.40     ...                  189915.07                   627043.91                2.867945e+07  10C-2.10GHz
    2    500.0k  283508.27    15.52   5662.74     ...                  140790.31                  1431892.98                4.253950e+07   6C-1.30GHz
    

    我就是这样做的

    self.unique_counters = [x[:-2] for x in self.dfile_keys[6:] if x.endswith('-0')]
    for counter in self.unique_counters:
     new = self.dfile.loc[:, self.dfile.columns.str.startswith(counter)]
    

         PERF_COUNT_HW_CPU_CYCLES-0  PERF_COUNT_HW_CPU_CYCLES-2             ...               PERF_COUNT_HW_CPU_CYCLES-32  PERF_COUNT_HW_CPU_CYCLES-34
    0                  6.020913e+08                6.021277e+08             ...                              5.109342e+06                 2.556039e+06
    1                  4.781879e+08                4.783621e+08             ...                              3.095814e+06                 2.795868e+06
    2                  4.841784e+08                4.844846e+08             ...                              2.389396e+06                 5.550159e+06
    

    如何获得具有选定开始字符串和某些指定列的计数器 这是预期的输出

           load        rps     th95    energy   PERF_COUNT_HW_CPU_CYCLES-0  PERF_COUNT_HW_CPU_CYCLES-2             ...               PERF_COUNT_HW_CPU_CYCLES-32  PERF_COUNT_HW_CPU_CYCLES-34
    0    500.0k  346222.62    12.62   7270.22   6.020913e+08                6.021277e+08             ...                              5.109342e+06                 2.556039e+06
    1    400.0k  402628.34     2.25  12026.40   4.781879e+08                4.783621e+08             ...                              3.095814e+06                 2.795868e+06
    2    500.0k  283508.27    15.52   5662.74   4.841784e+08                4.844846e+08             ...                              2.389396e+06                 5.550159e+06
    
    1 回复  |  直到 6 年前
        1
  •  1
  •   jezrael    6 年前

    我相信你需要一个新的 DataFrame :

    L = [x[:-2] for x in self.dfile_keys[6:] if x.endswith('-0')]
    
    new = self.dfile.loc[:, self.dfile.columns.str.startswith(tuple(L))]
    
    df = pd.concat([self.dfile[['load','rps','th95','energy']], new], axis=1)
    

    list of DataFrame 学生:

    self.unique_counters = [x[:-2] for x in self.dfile_keys[6:] if x.endswith('-0')]
    dfs = [self.dfile.loc[:, self.dfile.columns.str.startswith(counter)] 
           for counter in self.unique_counters]
    
    df = pd.concat([self.dfile[['load','rps','th95','energy']], dfs], axis=1)