
How to get species richness and abundance for sites with multiple samples using dplyr

  •  2
  • Grubbmeister  · 6 years ago

    Question:

    I have a number of sites, each with 10 sample points.

    Site Time Sample Species1 Species2 Species3 etc
    Home    A      1        1        0        4 ...
    Home    A      2        0        0        2 ...
    Work    A      1        0        1        1 ...
    Work    A      2        1        0        1 ...
    Home    B      1        1        0        4 ...
    Home    B      2        0        0        2 ...
    Work    B      1        0        1        1 ...
    Work    B      2        1        0        1 ...
    ...
    

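    I want to get the richness and abundance for each site at each time. Richness is the total number of species present at a site, and abundance is the total number of individuals of all species at a site, like so: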

    Site Time Richness Abundance
    Home    A        2         7
    Work    A        3         4
    Home    B        2         7
    Work    B        3         4
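
As a quick check of these definitions, here is a minimal sketch for the Home/A rows of the sample table above. The vector name `home_a` is hypothetical, and only the three species columns shown are used.

```r
# Per-species totals for Site = Home, Time = A (samples 1 and 2 added together)
home_a <- c(Species1 = 1 + 0, Species2 = 0 + 0, Species3 = 4 + 2)

richness  <- sum(home_a > 0)  # number of species with at least one individual
abundance <- sum(home_a)      # total individuals across all species

richness   # 2
abundance  # 7
```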
    

    I can get both values in two separate steps:

    df1 <- df %>% mutate(Abundance = rowSums(.[,4:30])) %>%
    group_by(Site,Time) %>%   
        summarise_all(sum)
    
    df1$Richness <- apply(df1[,4:30]>0, 1, sum)
    

    If I try to do both in a single pipeline, I get the following error:

    df1 <- df  %>% mutate(Abundance = rowSums(.[,4:30]) ) %>%
       group_by(Site, Time) %>%   
       summarise_all(sum) %>% 
       mutate(Richness = apply(.[,4:30]>0, 1, sum))
    
    Error in mutate_impl(.data, dots) : 
      Column `Richness` must be length 5 (the group size) or one, not 19
    

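    How do I get this to work?

    (Note: this question was previously marked as a duplicate of this question: Manipulating seperated species quantity data into a species abundance matrix. This question, however, is about all species across columns (multiple columns). Besides, I think the answer to this question is very useful: ecologists like me calculate richness and abundance all the time, and I'm sure they would appreciate a dedicated question.)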

    1 answer
  •  2
  •   akrun    6 years ago

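    After the summarise step, ungroup. summarise drops only the last grouping level, so the result is still grouped by Site; mutate then works group by group while `.` still refers to the whole data frame, which causes the length mismatch in the error.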

    library(tidyverse)
    df %>% 
      mutate(Abundance = rowSums(.[4:ncol(.)])) %>% 
      group_by(Site, Time) %>% 
      summarise_all(sum) %>%
      ungroup %>% 
      mutate(Richness = apply(.[4:(ncol(.)-1)] > 0, 1, sum)) %>%
      #or
      #mutate(Richness = rowSums(.[4:(ncol(.)-1)] > 0)) %>%
      select(Site, Time, Abundance, Richness)
    # A tibble: 4 x 4
    #  Site  Time  Abundance Richness
    #  <chr> <chr>     <dbl>    <int>
    #1 Home  A             7        2
    #2 Home  B             7        2
    #3 Work  A             4        3
    #4 Work  B             4        3
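
To see why the ungroup step matters, one can inspect the grouping that is left after summarise_all. This is a minimal sketch on the sample data; the object name `summed` is hypothetical.

```r
library(dplyr)

# Sample data from the question (first three species columns only)
df <- data.frame(
  Site     = c("Home", "Home", "Work", "Work", "Home", "Home", "Work", "Work"),
  Time     = c("A", "A", "A", "A", "B", "B", "B", "B"),
  Sample   = c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L),
  Species1 = c(1L, 0L, 0L, 1L, 1L, 0L, 0L, 1L),
  Species2 = c(0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L),
  Species3 = c(4L, 2L, 1L, 1L, 4L, 2L, 1L, 1L),
  stringsAsFactors = FALSE
)

summed <- df %>%
  group_by(Site, Time) %>%
  summarise_all(sum)

# summarise_all peels off only the last grouping variable (Time)
group_vars(summed)           # "Site"

# After ungroup, no grouping is left, so mutate sees all rows at once
group_vars(ungroup(summed))  # character(0)
```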
    

    Alternatively, do the group_by and sum first, then transmute:

    df %>% 
      group_by(Site, Time) %>%
      summarise_at(vars(matches("Species")), sum)  %>% 
      ungroup %>%
      transmute(Site, Time, Abundance = rowSums(.[3:ncol(.)]), 
                            Richness =  rowSums(.[3:ncol(.)] > 0))
    

    Or another option is to sum with map:

    df %>% 
       group_by(Site, Time) %>%
       summarise_at(vars(matches("Species")), sum) %>% 
       group_by(Time, add = TRUE) %>%
       nest %>% 
       mutate(data = map(data, ~ 
                     tibble(Richness = sum(.x > 0), 
                            Abundance = sum(.x)))) %>% 
       unnest
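
The scoped verbs used above (summarise_all, summarise_at) are superseded in dplyr 1.0.0 and later. Below is a sketch of the same idea written with across(); it is not part of the original answer and assumes dplyr >= 1.0.0 and that every species column name starts with "Species". The object name `result` is hypothetical.

```r
library(dplyr)

# Sample data from the question (first three species columns only)
df <- data.frame(
  Site     = c("Home", "Home", "Work", "Work", "Home", "Home", "Work", "Work"),
  Time     = c("A", "A", "A", "A", "B", "B", "B", "B"),
  Sample   = c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L),
  Species1 = c(1L, 0L, 0L, 1L, 1L, 0L, 0L, 1L),
  Species2 = c(0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L),
  Species3 = c(4L, 2L, 1L, 1L, 4L, 2L, 1L, 1L),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(Site, Time) %>%
  # Sum each species over the samples; .groups = "drop" returns an ungrouped tibble
  summarise(across(starts_with("Species"), sum), .groups = "drop") %>%
  mutate(
    Abundance = rowSums(across(starts_with("Species"))),     # total individuals
    Richness  = rowSums(across(starts_with("Species")) > 0)  # species present
  ) %>%
  select(Site, Time, Abundance, Richness)

result
```

Using .groups = "drop" removes the need for a separate ungroup call, because the summarised result comes back without any grouping.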
    

    Data

    df <- structure(list(Site = c("Home", "Home", "Work", "Work", "Home", 
    "Home", "Work", "Work"), Time = c("A", "A", "A", "A", "B", "B", 
    "B", "B"), Sample = c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), Species1 = c(1L, 
    0L, 0L, 1L, 1L, 0L, 0L, 1L), Species2 = c(0L, 0L, 1L, 0L, 0L, 
    0L, 1L, 0L), Species3 = c(4L, 2L, 1L, 1L, 4L, 2L, 1L, 1L)), 
    class = "data.frame", row.names = c(NA, 
     -8L))