
Summing randomly timed observations into weekly totals in R

  •  0
  • John Conor  · Tech community  · 3 years ago

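    I have a dataset of random, sometimes infrequent events that I want to count as weekly totals. Because the events occur at random, they are not evenly spaced in time, so the other examples I have tried so far do not apply.

    The data look like this: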

    
    df_date <- data.frame( Name = c("Jim","Jim","Jim","Jim","Jim","Jim","Jim","Jim","Jim","Jim",
                                    "Sue","Sue","Sue","Sue","Sue","Sue","Sue","Sue","Sue","Sue"),
                           Dates = c("2010-1-1", "2010-1-2", "2010-01-5","2010-01-17","2010-01-20",
                                     "2010-01-29","2010-02-6","2010-02-9","2010-02-16","2010-02-28",
                                     "2010-1-1", "2010-1-2", "2010-01-5","2010-01-17","2010-01-20",
                                     "2010-01-29","2010-02-6","2010-02-9","2010-02-16","2010-02-28"),
                           Event = c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1) )
    

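    What I want to do is create a new table that contains the total number of events for each week of the calendar year.

    In this case, that would produce something like the following: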

    Name   Week   Events
    Jim    1      3
    Sue    1      3
    Jim    2      0
    Sue    x ...  x 
    
    and so on...
    
    2 answers  |  as of 3 years ago
        1
  •  5
  •   TarJae    3 years ago

    Update for multiple years, at the OP's request:

    We can use isoweek, which is also from lubridate, instead of week.
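
    As a minimal sketch of the isoweek variant: lubridate's week() numbers 7-day blocks starting on January 1, whereas isoweek() follows ISO 8601 (weeks start on Monday), so early-January dates can belong to the last week of the previous ISO year. Pairing isoweek() with isoyear() keeps those weeks distinct. The small sample data frame and the column names IsoYear, IsoWeek and Events are assumptions made for illustration.

```r
library(dplyr)
library(lubridate)

# Small sample in the same shape as the question's df_date (illustrative only)
df_date <- data.frame(
  Name  = c("Jim", "Jim", "Jim", "Sue"),
  Dates = c("2010-1-1", "2010-01-5", "2010-01-17", "2010-1-2"),
  Event = 1
)

iso_counts <- df_date %>%
  mutate(Dates   = ymd(Dates),        # character -> Date
         IsoYear = isoyear(Dates),    # ISO 8601 year, may differ from the calendar year
         IsoWeek = isoweek(Dates)) %>%
  count(Name, IsoYear, IsoWeek, name = "Events")

# 2010-01-01 and 2010-01-02 fall in ISO week 53 of ISO year 2009
print(iso_counts)
```

    Grouping by isoyear() rather than year() matters here: combining year() with isoweek() would label 2010-01-01 as week 53 of 2010.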

    Or:

    We can add the year as follows:

    df_date %>% 
      as_tibble() %>% 
      mutate(Week = week(ymd(Dates))) %>% 
      mutate(Year = year(ymd(Dates))) %>% 
      count(Name, Year, Week)
    

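    We can use lubridate's week function after converting the character Dates column to date format with lubridate's ymd function. Then we can use count, which is shorthand for group_by(Name, Week) %>% summarise(Count = n()) :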

    library(dplyr)
    library(lubridate)
    df_date %>% 
      as_tibble() %>% 
      mutate(Week = week(ymd(Dates))) %>% 
      count(Name, Week)
    
      Name   Week     n
       <chr> <dbl> <int>
     1 Jim       1     3
     2 Jim       3     2
     3 Jim       5     1
     4 Jim       6     2
     5 Jim       7     1
     6 Jim       9     1
     7 Sue       1     3
     8 Sue       3     2
     9 Sue       5     1
    10 Sue       6     2
    11 Sue       7     1
    12 Sue       9     1
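
    Note that count() tallies rows, which equals the event total only because every Event in the sample data is 1. A hedged sketch of the difference, using count()'s wt argument to sum the Event column instead; the sample data below, including the row with Event = 2, is hypothetical:

```r
library(dplyr)
library(lubridate)

df_small <- data.frame(
  Name  = c("Jim", "Jim", "Sue"),
  Dates = c("2010-1-1", "2010-1-2", "2010-01-17"),
  Event = c(1, 2, 1)   # hypothetical: one row records two events
)

with_week <- df_small %>% mutate(Week = week(ymd(Dates)))

# Number of rows per Name/Week
row_counts <- with_week %>% count(Name, Week)

# Sum of Event per Name/Week, via the wt argument
event_sums <- with_week %>% count(Name, Week, wt = Event, name = "Events")

# The same sum written out with group_by() and summarise()
long_form <- with_week %>%
  group_by(Name, Week) %>%
  summarise(Events = sum(Event), .groups = "drop")

print(row_counts)
print(event_sums)
```

    If each row always represents exactly one event, the plain count(Name, Week) from the answer is sufficient.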
    
        2
  •  2
  •   langtang    3 years ago

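    Here is an approach that gives you every week for each individual, with zero where that individual had no events in that week. (The code uses lubridate's week, so these are weeks counted from January 1 rather than strict ISO weeks.)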

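    library(dplyr)
    library(lubridate)
    
    # Every (Year, Week) pair between the first and last observed date
    get_dates_df <- function(d) {
      data.frame(date = seq(min(d, na.rm = TRUE), max(d, na.rm = TRUE), by = 1)) %>% 
        mutate(Year = year(date), Week = week(date)) %>% 
        distinct(Year, Week)    
    }
    
    df_date <- df_date %>% mutate(Dates = ymd(Dates))
    
    left_join(
      # Cross join: each Name paired with each (Year, Week).
      # dplyr >= 1.1.0 prefers cross_join() over by = character().
      full_join(distinct(df_date, Name), get_dates_df(df_date$Dates), by = character()),
      df_date %>% 
        group_by(Name, Year = year(Dates), Week = week(Dates)) %>% 
        summarize(Events = sum(Event), .groups = "drop"),
      by = c("Name", "Year", "Week")
    ) %>% 
      mutate(Events = if_else(is.na(Events), 0, Events))  # weeks with no events become 0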
    

    Output:

       Name Year Week Events
    1   Jim 2010    1      3
    2   Jim 2010    2      0
    3   Jim 2010    3      2
    4   Jim 2010    4      0
    5   Jim 2010    5      1
    6   Jim 2010    6      2
    7   Jim 2010    7      1
    8   Jim 2010    8      0
    9   Jim 2010    9      1
    10  Sue 2010    1      3
    11  Sue 2010    2      0
    12  Sue 2010    3      2
    13  Sue 2010    4      0
    14  Sue 2010    5      1
    15  Sue 2010    6      2
    16  Sue 2010    7      1
    17  Sue 2010    8      0
    18  Sue 2010    9      1
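
    The same zero-filling can also be sketched with tidyr's complete(), which is a different technique from the join-based answer above. This is a minimal sketch that assumes all dates fall within a single calendar year, so the week number alone identifies a week; the name weekly_filled is made up for illustration.

```r
library(dplyr)
library(tidyr)
library(lubridate)

# Rebuild the question's sample data
dates <- c("2010-1-1", "2010-1-2", "2010-01-5", "2010-01-17", "2010-01-20",
           "2010-01-29", "2010-02-6", "2010-02-9", "2010-02-16", "2010-02-28")
df_date <- data.frame(Name  = rep(c("Jim", "Sue"), each = 10),
                      Dates = rep(dates, times = 2),
                      Event = 1)

weekly_filled <- df_date %>%
  mutate(Week = week(ymd(Dates))) %>%
  group_by(Name, Week) %>%
  summarise(Events = sum(Event), .groups = "drop") %>%
  # Add a row for every Name and every week between the first and last
  # observed week; weeks without events get Events = 0
  complete(Name, Week = full_seq(Week, period = 1), fill = list(Events = 0))

print(weekly_filled)
```

    For data spanning several years, the week sequence would need to be built per year (for example from a date sequence, as in the answer above) rather than with full_seq() on the week number.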