
Remove columns that start with XXX and meet a condition

startswith dplyr select r


David Robie · 1 year ago

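Goal: remove a column if its name starts with XXX and its rows meet a condition. For example, in the dataset below, remove every "fillerX" column that contains only zeros.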

Data:

iris %>% 
    tibble() %>% 
    slice(1:5) %>% 
    mutate(
        fillerQ = rep(0,5),
        fillerW = rep(0,5),
        fillerE = rep(0,5),
        fillerR = c(0,0,1,0,0),
        fillerT = rep(0,5),
        fillerY = rep(0,5),
        fillerU = rep(0,5),
        fillerI = c(0,0,0,0,1),
        fillerO = rep(0,5),
        fillerP = rep(0,5),
    )
# A tibble: 5 × 15
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species fillerQ fillerW fillerE fillerR fillerT fillerY fillerU fillerI fillerO fillerP
         <dbl>       <dbl>        <dbl>       <dbl> <fct>     <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1          5.1         3.5          1.4         0.2 setosa        0       0       0       0       0       0       0       0       0       0
2          4.9         3            1.4         0.2 setosa        0       0       0       0       0       0       0       0       0       0
3          4.7         3.2          1.3         0.2 setosa        0       0       0       1       0       0       0       0       0       0
4          4.6         3.1          1.5         0.2 setosa        0       0       0       0       0       0       0       0       0       0
5          5           3.6          1.4         0.2 setosa        0       0       0       0       0       0       0       1       0       0

Problem: we can use starts_with("filler") to refer to the filler columns, and we can use select_if(~ sum(abs(.)) != 0) to keep the non-zero columns, but we cannot use starts_with() inside select_if(), because we get this error:

Error:
! `starts_with()` must be used within a *selecting* function.
ℹ See ?tidyselect::faq-selection-context for details.
Run `rlang::last_trace()` to see where the error occurred.

Question: how can starts_with() and select_if() be combined?

2 answers


Darren Tsai · 1 year ago

select_if() has been superseded. Use where() inside select() instead.

library(dplyr)

df %>%
  select(!(starts_with("filler") & where(~ all(.x == 0))))

# # A tibble: 5 × 7
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species fillerR fillerI
#          <dbl>       <dbl>        <dbl>       <dbl> <fct>     <dbl>   <dbl>
# 1          5.1         3.5          1.4         0.2 setosa        0       0
# 2          4.9         3            1.4         0.2 setosa        0       0
# 3          4.7         3.2          1.3         0.2 setosa        1       0
# 4          4.6         3.1          1.5         0.2 setosa        0       0
# 5          5           3.6          1.4         0.2 setosa        0       1
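
The snippet above assumes the question's data is already stored in `df`. A minimal self-contained sketch of the same idea follows; the `is.numeric()` guard and the `result` name are additions for illustration and are not part of the original answer.

```r
library(dplyr)

# Rebuild the question's example data: first five iris rows plus filler columns.
# Scalar zeros are recycled by mutate() to the length of the data.
df <- iris %>%
  as_tibble() %>%
  slice(1:5) %>%
  mutate(
    fillerQ = 0, fillerW = 0, fillerE = 0,
    fillerR = c(0, 0, 1, 0, 0),
    fillerT = 0, fillerY = 0, fillerU = 0,
    fillerI = c(0, 0, 0, 0, 1),
    fillerO = 0, fillerP = 0
  )

# Drop a column only when BOTH conditions hold:
#   1. its name starts with "filler"
#   2. it is numeric and every value is zero
# The is.numeric() guard (an assumption added here) keeps the value test
# away from non-numeric columns such as Species.
result <- df %>%
  select(!(starts_with("filler") & where(~ is.numeric(.x) && all(.x == 0))))

names(result)
```

Because `select()` with a negated selection keeps the remaining columns in their original order, `fillerR` and `fillerI` should come out right after `Species`. If a column could contain `NA`, `all(.x == 0)` may return `NA` and `where()` would reject it, so a real pipeline would probably need an explicit `na.rm` decision.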


David Robie · 1 year ago

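Answer: nest the whole table with !starts_with(), run select_if() with is.list() || as part of the predicate, then unnest the data. Telling the select statement to accept list columns is necessary because sum() raises an error when it is attempted on the nested list column. This looks like: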

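library(tidyr)  # nest(), unnest()
df %>%
  nest(data = !starts_with("filler")) %>%
  # keep the nested list column plus any filler column with a non-zero entry
  select_if(~ is.list(.) || sum(abs(.)) != 0) %>%
  relocate(data) %>%
  unnest(data)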

This gives the expected result:

# A tibble: 5 × 7
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species fillerR fillerI
         <dbl>       <dbl>        <dbl>       <dbl> <fct>     <dbl>   <dbl>
1          5.1         3.5          1.4         0.2 setosa        0       0
2          4.9         3            1.4         0.2 setosa        0       0
3          4.6         3.1          1.5         0.2 setosa        0       0
4          4.7         3.2          1.3         0.2 setosa        1       0
5          5           3.6          1.4         0.2 setosa        0       1
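
One detail visible in the printed result: the row with Sepal.Length 4.6 now comes before the row with 4.7, because nest() groups rows that share the same filler values. The sketch below compares the nesting pipeline with a select()/where() pipeline on the same data; the names `nested` and `direct` are hypothetical and used only for this comparison.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question.
df <- iris %>%
  as_tibble() %>%
  slice(1:5) %>%
  mutate(
    fillerQ = 0, fillerW = 0, fillerE = 0,
    fillerR = c(0, 0, 1, 0, 0),
    fillerT = 0, fillerY = 0, fillerU = 0,
    fillerI = c(0, 0, 0, 0, 1),
    fillerO = 0, fillerP = 0
  )

# Nesting approach from this answer: rows with identical filler values
# are collapsed into one nested row, then expanded again by unnest().
nested <- df %>%
  nest(data = !starts_with("filler")) %>%
  select_if(~ is.list(.) || sum(abs(.)) != 0) %>%
  relocate(data) %>%
  unnest(data)

# select()/where() approach: columns are dropped, rows are never touched.
direct <- df %>%
  select(!(starts_with("filler") & where(~ all(.x == 0))))

identical(names(nested), names(direct))  # same columns in both results
nested$Sepal.Length                      # order as printed above: 5.1 4.9 4.6 4.7 5.0
direct$Sepal.Length                      # original order:         5.1 4.9 4.7 4.6 5.0
```

If the original row order matters, the select()/where() form is probably the safer choice, since it never regroups rows.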