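In my code, which uses dplyr, I often perform some operation on a data frame variable (here simply multiplying it by 2, to keep the MRE simple), optionally group by another variable, and then select only some of the resulting variables. To avoid code duplication, I would like to write a function.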
library(dplyr)    # provides %>% and the data verbs used below
library(ggplot2)  # provides the msleep dataset
msleep_mini <- msleep[1:20, ]  # 20 rows, matching the "20 x ..." tibble headers in the outputs
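The function must reproduce the following behaviour. If it is given a single argument, say sleep_total, it just multiplies sleep_total by 2 and returns a data frame containing the columns name, vore, order and sleep_total: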
# test_1
msleep_mini %>%
group_double_select(sleep_total)
#> # A tibble: 20 x 4
#>    name                       vore  order        sleep_total
#>    <chr>                      <chr> <chr>              <dbl>
#>  1 Cheetah                    carni Carnivora           24.2
#>  2 Owl monkey                 omni  Primates            34
#>  3 Mountain beaver            herbi Rodentia            28.8
#>  4 Greater short-tailed shrew omni  Soricomorpha        29.8
#>  5 Cow                        herbi Artiodactyla         8
#>  6 Three-toed sloth           herbi Pilosa              28.8
#>  7 Northern fur seal          carni Carnivora           17.4
#>  8 Vesper mouse               <NA>  Rodentia            14
#>  9 Dog                        carni Carnivora           20.2
#> 10 Roe deer                   herbi Artiodactyla         6
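If a second argument is also given, say vore, the data frame is grouped by that variable and an id column (containing the progressive row number within each group) is added to the data frame. In other words, the output would be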
# test_2
msleep_mini %>%
group_double_select(sleep_total, vore)
#> # A tibble: 20 x 5
#> # Groups:   vore [4]
#>    vore  name                 order        sleep_total    id
#>    <chr> <chr>                <chr>              <dbl> <int>
#>  1 carni Cheetah              Carnivora           24.2     1
#>  2 carni Northern fur seal    Carnivora           17.4     2
#>  3 carni Dog                  Carnivora           20.2     3
#>  4 carni Long-nosed armadillo Cingulata           34.8     4
#>  5 herbi Mountain beaver      Rodentia            28.8     1
#>  6 herbi Cow                  Artiodactyla         8       2
#>  7 herbi Three-toed sloth     Pilosa              28.8     3
#>  8 herbi Roe deer             Artiodactyla         6       4
#>  9 herbi Goat                 Artiodactyla        10.6     5
#> 10 herbi Guinea pig           Rodentia            18.8     6
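Spelled out by hand for this one case, the grouped behaviour might look roughly like the sketch below. It is only an illustration of the kind of pipeline the function is meant to replace: the name by_hand is hypothetical, the 20-row subset is an assumption taken from the "20 x 5" header, and the arrange() call is an assumption made because the expected output appears sorted by group.

```r
library(dplyr)
library(ggplot2)  # only needed for the msleep dataset

# Assumption: 20 rows, to match the "20 x 5" header in the expected output
msleep_mini <- msleep[1:20, ]

# Hypothetical name; hard-coded version of the grouped case for sleep_total and vore
by_hand <- msleep_mini %>%
  group_by(vore) %>%
  mutate(
    sleep_total = sleep_total * 2,  # the example "operation"
    id = row_number()               # running row number within each vore group
  ) %>%
  arrange(vore) %>%                 # assumption: rows sorted by the grouping variable
  select(vore, name, order, sleep_total, id)

by_hand
```

Every new combination of variable and grouping column would need another copy of this pipeline with the names changed, which is the duplication a function would remove.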
Of course, the function must handle arbitrary variables (as long as they can be found in the data frame):
# test_3
msleep_mini %>%
group_double_select(sleep_rem, order)
#> # A tibble: 20 x 5
#> # Groups:   order [9]
#>    order           name                   vore  sleep_rem    id
#>    <chr>           <chr>                  <chr>     <dbl> <int>
#>  1 Artiodactyla    Cow                    herbi       1.4     1
#>  2 Artiodactyla    Roe deer               herbi      NA       2
#>  3 Artiodactyla    Goat                   herbi       1.2     3
#>  4 Carnivora       Cheetah                carni      NA       1
#>  5 Carnivora       Northern fur seal      carni       2.8     2
#>  6 Carnivora       Dog                    carni       5.8     3
#>  7 Cingulata       Long-nosed armadillo   carni       6.2     1
#>  8 Didelphimorphia North American Opossum omni        9.8     1
#>  9 Hyracoidea      Tree hyrax             herbi       1       1
#> 10 Pilosa          Three-toed sloth       herbi       4.4     1
I guess that the right way to write group_double_select in a robust and maintainable way is to use tidy evaluation, but I may be wrong. Can you help me?
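For concreteness, here is one possible shape such a function could take. It is a sketch under stated assumptions, not a definitive answer: it assumes dplyr 1.0 or later (for across() and the {{ }} embrace operator), the argument names df, var and group are hypothetical, missing() is used to detect whether a grouping variable was supplied, and arrange() is added only because the expected outputs appear sorted by group.

```r
library(dplyr)
library(ggplot2)  # only needed for the msleep dataset

# Assumption: 20 rows, to match the "20 x ..." headers in the expected outputs
msleep_mini <- msleep[1:20, ]

# Sketch only; argument names df, var and group are hypothetical
group_double_select <- function(df, var, group) {
  if (missing(group)) {
    # Ungrouped case: double the chosen column, keep the three fixed columns
    out <- df %>%
      mutate(across({{ var }}, ~ .x * 2)) %>%
      select(name, vore, order, {{ var }})
    return(out)
  }

  # Grouped case: double the column and number the rows within each group
  df %>%
    group_by({{ group }}) %>%
    mutate(across({{ var }}, ~ .x * 2)) %>%
    mutate(id = row_number()) %>%
    arrange({{ group }}) %>%  # assumption: output sorted by the grouping column
    # Listing the grouping column first; a repeated column keeps its first position
    select({{ group }}, name, vore, order, {{ var }}, id)
}

msleep_mini %>% group_double_select(sleep_total)
msleep_mini %>% group_double_select(sleep_total, vore)
msleep_mini %>% group_double_select(sleep_rem, order)
```

Embracing the arguments with {{ }} lets the caller pass bare column names, as in the three tests, without any quoting or unquoting in the function body.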