
tbl_summary(), but stratified by group, using long tidy data

  • plover  ·  2 years ago

    This is an extension of the question here (unfortunately, I don't have enough reputation to comment).

    I am also trying to stratify by group, but the percentages are not computed against the N in each group; instead, every group shows 100%.

    Example data:

    library(tidyverse)
    library(gtsummary)

    df <- tibble(ID = c("A","A", "A", "B", "B", "B", "C", "C", "C", "D", "E"), 
                 fruit = c("apples", "bananas", "cherries", "cherries", "dill", "eclairs","figs" ,"apples", "cherries", "figs", "cherries"), 
                 ID_discount = c("senior","senior", "senior","none", "none", "none", "senior","senior", "senior", "none" , "none"))
    
    # flag every observed ID/fruit pair
    df$present <- TRUE
    

    This is the approach from the earlier question. It adds the two ID_discount groups at the bottom, so it works for the overall counts:

    df |>  complete(ID, fruit, fill = list(present = FALSE)) %>%
      select(-ID) %>% 
      # summarizing data
      tbl_summary(by = present, percent = "row") %>%
      modify_header(stat_2 ~ "**Overall**") %>%
      modify_column_hide(stat_1)
    

    But when I stratify by group, it doesn't work:

    df |> complete(ID, fruit, fill = list(present = FALSE)) %>%
      select(-ID) %>%
      tbl_strata(
        strata = ID_discount,
        # summarizing data
        ~ .x |>
          tbl_summary(by = present, percent = "row")
      )
      #modify_header(stat_2 ~ "**Overall**") %>%
      #modify_column_hide(stat_1)
    

    The output looks like this:

    Characteristic  none            senior          NA
                    TRUE, N = 5     TRUE, N = 6     FALSE, N = 19
    fruit           
        cherries    2 (100%)        2 (100%)        1 (100%)
        dill        1 (100%)                        4 (100%)
    

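    I would like it to give the output below, with only the "none" and "senior" columns. I can manually remove the NA column and the TRUE and FALSE rows, but the actual N should be 2 people in the senior group and 3 in the none group. My actual dataset is too large to reshape into wide format. Here N is the number of people in each group (N1 for none, N2 for senior, N1 + N2 = total number of people observed).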

    Characteristic  none            senior          
                    N = 3           N = 2   
    fruit           
        cherries    2 (66%)         2 (100%)        
        dill        1 (33%)                                         
    

    Thanks

    1 answer

  •   Friede  ·  2 years ago

    I think you want

    library(tidyverse)
    library(gtsummary)
    df |> complete(ID, fruit, fill = list(present = FALSE)) |>
      select(-ID) |>
      na.omit() |>
      tbl_strata(
        strata = ID_discount, ~ .x |>
          tbl_summary(by = present, percent = "column")) 
    

    which gives

    [Image: the resulting stratified table]

    Note that I see 11 rows rather than 3:

    > df |> complete(ID, fruit, fill = list(present = FALSE)) %>%
    +   select(-ID) |>
    +   group_by(present) |>
    +   na.omit()
    # A tibble: 11 × 3
    # Groups:   present [1]
       fruit    ID_discount present
       <chr>    <chr>       <lgl>  
     1 apples   senior      TRUE   
     2 bananas  senior      TRUE   
     3 cherries senior      TRUE   
     4 cherries none        TRUE   
     5 dill     none        TRUE   
     6 eclairs  none        TRUE   
     7 apples   senior      TRUE   
     8 cherries senior      TRUE   
     9 figs     senior      TRUE   
    10 figs     none        TRUE   
    11 cherries none        TRUE 
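
    The 11 rows above are the original observations. The rows that `complete()` adds have `present = FALSE` but no `ID_discount`, because only `present` is listed in `fill`. Within each stratum `present` therefore takes a single value, which is why `percent = "row"` showed 100% everywhere. A minimal check of this, assuming the same example data (the names `completed` and `strata_counts` are only illustrative):

```r
library(dplyr)
library(tidyr)
library(tibble)

df <- tibble(
  ID = c("A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "E"),
  fruit = c("apples", "bananas", "cherries", "cherries", "dill",
            "eclairs", "figs", "apples", "cherries", "figs", "cherries"),
  ID_discount = c("senior", "senior", "senior", "none", "none", "none",
                  "senior", "senior", "senior", "none", "none")
)
df$present <- TRUE

# Expand to every ID/fruit pair; only `present` is filled on the new rows,
# so ID_discount stays NA there
completed <- df |>
  complete(ID, fruit, fill = list(present = FALSE))

# Tabulate stratum by present: each stratum contains one value of `present`
strata_counts <- completed |>
  count(ID_discount, present)

print(strata_counts)
```

    The three rows of `strata_counts` correspond to the three column headers in the question's output (N = 5, N = 6 and N = 19).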
    

    Data:

    df <- tibble(ID = c("A","A", "A", "B", "B", "B", "C", "C", "C", "D", "E"), 
                 fruit= c("apples", "bananas", "cherries", "cherries", "dill", 
                          "eclairs","figs" ,"apples", "cherries", "figs", "cherries"), 
                 ID_discount = c("senior","senior", "senior","none", "none", "none", 
                                 "senior","senior", "senior", "none" , "none"))
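
    If the denominator should be the number of people in each group rather than the number of rows, one option is to compute the counts directly on the long data with dplyr before building any table. This is only a sketch under that assumption, not a gtsummary feature; `group_n` and `fruit_pct` are illustrative names:

```r
library(dplyr)
library(tibble)

df <- tibble(
  ID = c("A", "A", "A", "B", "B", "B", "C", "C", "C", "D", "E"),
  fruit = c("apples", "bananas", "cherries", "cherries", "dill",
            "eclairs", "figs", "apples", "cherries", "figs", "cherries"),
  ID_discount = c("senior", "senior", "senior", "none", "none", "none",
                  "senior", "senior", "senior", "none", "none")
)

# Denominator: number of distinct people in each discount group
group_n <- df |>
  distinct(ID, ID_discount) |>
  count(ID_discount, name = "N")

# Numerator: distinct people per group who have each fruit,
# then the percentage of the group's people
fruit_pct <- df |>
  distinct(ID, ID_discount, fruit) |>
  count(ID_discount, fruit, name = "n") |>
  left_join(group_n, by = "ID_discount") |>
  mutate(pct = round(100 * n / N))

print(fruit_pct)
```

    With the example data this gives N = 3 for none and N = 2 for senior, and cherries come out as 2 of 3 people in none and 2 of 2 in senior, without reshaping to wide format.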