
Reshape a long dataframe to wide using one column as a prefix, and rename the new columns

tidyverse data.table dplyr dataframe r

ah bon · 4 years ago

Given a data frame df as follows:

df <- structure(list(code = c("M0000273", "M0000357", "M0000545", "M0000273", 
"M0000357", "M0000545"), name = c("industry", "agriculture", 
"service", "industry", "agriculture", "service"), act_value = c(16.78, 
9.26, 49.38, 35.74, 88.42, 68.26), pred_value = c(17.78, 10.26, 
50.38, 36.74, 89.42, 69.26), year = c(2019L, 2019L, 2019L, 2020L, 
2020L, 2020L)), class = "data.frame", row.names = c(NA, -6L))

      code        name act_value pred_value year
1 M0000273    industry     16.78      17.78 2019
2 M0000357 agriculture      9.26      10.26 2019
3 M0000545     service     49.38      50.38 2019
4 M0000273    industry     35.74      36.74 2020
5 M0000357 agriculture     88.42      89.42 2020
6 M0000545     service     68.26      69.26 2020

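I want to reshape act_value and pred_value from long to wide, keeping code and name as identifier columns, and name the new columns by pasting the year column on as a prefix, like this: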

      code        name  2019_act_value  2019_pred_value  2020_act_value  2020_pred_value
1 M0000273    industry           16.78            17.78           35.74            36.74
2 M0000357 agriculture            9.26            10.26           88.42            89.42
3 M0000545     service           49.38            50.38           68.26            69.26

My attempt:

reshape(df, idvar = c('code', 'name'), timevar = 'year', direction = 'wide')

How can I do this correctly in R? Thanks.

2 replies | as of 4 years ago

caldwellst 4 years ago

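We can do this with tidyr::pivot_wider. I wouldn't recommend your naming convention, though: if you drop names_glue you get the same values, but with the tidier year-as-suffix column names.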

library(tidyr)

pivot_wider(df,
            names_from = year,
            names_glue = "{year}_{.value}",
            values_from = ends_with("value"))
#> # A tibble: 3 × 6
#>   code     name        `2019_act_value` `2020_act_value` `2019_pred_value`
#>   <chr>    <chr>                  <dbl>            <dbl>             <dbl>
#> 1 M0000273 industry               16.8              35.7              17.8
#> 2 M0000357 agriculture             9.26             88.4              10.3
#> 3 M0000545 service                49.4              68.3              50.4
#> # … with 1 more variable: 2020_pred_value <dbl>
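
To illustrate the remark about dropping names_glue, here is a minimal sketch of the same call without it. The data frame is rebuilt from the question so the block runs on its own; wide_suffix is just an illustrative name. With no glue template, pivot_wider falls back to its default naming, which puts the value column first and the year last.

```r
library(tidyr)

# Same data as in the question, rebuilt here so the example is self-contained
df <- data.frame(
  code = rep(c("M0000273", "M0000357", "M0000545"), times = 2),
  name = rep(c("industry", "agriculture", "service"), times = 2),
  act_value = c(16.78, 9.26, 49.38, 35.74, 88.42, 68.26),
  pred_value = c(17.78, 10.26, 50.38, 36.74, 89.42, 69.26),
  year = rep(c(2019L, 2020L), each = 3)
)

# No names_glue: column names default to <value column>_<year>
wide_suffix <- pivot_wider(df,
                           names_from = year,
                           values_from = ends_with("value"))

names(wide_suffix)
#> [1] "code"            "name"            "act_value_2019"  "act_value_2020"
#> [5] "pred_value_2019" "pred_value_2020"
```

These names are syntactic in R, so they can be used without backticks, unlike names that start with a digit such as 2019_act_value.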

TarJae 4 years ago

A slight variation that also uses names_glue, but selects the value columns by index:

library(tidyr)

pivot_wider(df, names_from = year, values_from = c(3:4), names_glue = "{year}_{.value}")

  code     name        `2019_act_value` `2020_act_value` `2019_pred_value` `2020_pred_value`
  <chr>    <chr>                  <dbl>            <dbl>             <dbl>             <dbl>
1 M0000273 industry               16.8              35.7              17.8              36.7
2 M0000357 agriculture             9.26             88.4              10.3              89.4
3 M0000545 service                49.4              68.3              50.4              69.3
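
Both results above group the columns by value (act first, then pred), while the layout requested in the question groups them by year. A possible way to get that order, assuming tidyr 1.2.0 or later where pivot_wider has a names_vary argument, is sketched below. This is not part of either answer, and wide_by_year is only an illustrative name.

```r
library(tidyr)

# Same data as in the question, rebuilt here so the example is self-contained
df <- data.frame(
  code = rep(c("M0000273", "M0000357", "M0000545"), times = 2),
  name = rep(c("industry", "agriculture", "service"), times = 2),
  act_value = c(16.78, 9.26, 49.38, 35.74, 88.42, 68.26),
  pred_value = c(17.78, 10.26, 50.38, 36.74, 89.42, 69.26),
  year = rep(c(2019L, 2020L), each = 3)
)

# names_vary = "slowest" (assumes tidyr >= 1.2.0) makes the year vary slowest,
# so all value columns for 2019 come before those for 2020
wide_by_year <- pivot_wider(df,
                            names_from = year,
                            values_from = c(act_value, pred_value),
                            names_glue = "{year}_{.value}",
                            names_vary = "slowest")

names(wide_by_year)
#> [1] "code"            "name"            "2019_act_value"  "2019_pred_value"
#> [5] "2020_act_value"  "2020_pred_value"
```

On older tidyr versions without names_vary, reordering the columns afterwards (for example with dplyr::select) would be needed instead.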