
lubridate: selecting the first non-Monday of each week

  •  3
  • Franky  · Tech Community  · 9 years ago

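    I have a large amount of financial data and want to filter it down to the first non-Monday of each week. That is usually a Tuesday, but it can be a Wednesday when the Tuesday is a holiday.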

    Here is my code, which works in most cases:

    XLF <- quantmod::getSymbols("XLF", from = "2000-01-01", auto.assign = FALSE)
    
    library(tibble)
    library(lubridate)
    library(dplyr)
    xlf <- as_tibble(XLF) %>% rownames_to_column(var = "date") %>% 
             select(date, XLF.Adjusted)  
    xlf$date <- ymd(xlf$date)
    
    # We create Month, Week number and Days of the week columns
    # Then we remove all the Mondays
    xlf <- xlf %>% mutate(Year = year(date), Month = month(date), 
                          IsoWeek = isoweek(date), WDay = wday(date)) %>% 
                   filter(WDay != 2)
    
    # Creating another tibble just for ease of comparison
    xlf2 <- xlf %>% 
              group_by(Year, IsoWeek) %>% 
              filter(row_number() == 1) %>% 
              ungroup()
    

    That said, there are a few problems I have not been able to solve so far.

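    For example, it skips "2002-12-31", which is a Tuesday, because that date is treated as part of the first ISO week of 2003. Its calendar year is still 2002, so grouping by Year and IsoWeek puts it in the same group as the first days of January 2002, where it is never the first row. There are a few similar cases.
    My question is: within the tidyverse (i.e. without having to use the xts/zoo classes), how can I select the first non-Monday of each week without running into this kind of problem?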
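    The skipped date can be reproduced in isolation. The following is a minimal check, assuming only that lubridate is loaded; the variable names `d` and `jan` are made up for illustration. It shows that the calendar year and the ISO week of 2002-12-31 belong to different years, while an early January 2002 date carries the same (Year, IsoWeek) pair.

```r
library(lubridate)

d <- ymd("2002-12-31")

wday(d, label = TRUE)  # a Tuesday
year(d)                # calendar year: 2002
isoweek(d)             # ISO week number: 1
isoyear(d)             # ISO year that week belongs to: 2003

# An early January 2002 date has the same (year, isoweek) pair,
# so group_by(Year, IsoWeek) merges the two into one group.
jan <- ymd("2002-01-02")
c(year(jan), isoweek(jan))
```

    Because the group is keyed on `year()` rather than `isoyear()`, 2002-12-31 sits behind the January 2002 rows and `row_number() == 1` never selects it.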

    1 answer  |  as of 9 years ago
        1
  •  3
  •   Jeroen Boeye    9 years ago

    You can create your own continuously increasing week number. It may not be the most elegant solution, but it works well for me.

    as_tibble(XLF) %>% 
      rownames_to_column(var = "date")%>% 
      select(date, XLF.Adjusted)%>%
      mutate(date = ymd(date),
             Year = year(date),
             Month = month(date),
             WDay = wday(date),
             WDay_label = wday(date, label = TRUE)) %>% 
      # if the weekday number is lower than in the row above, or 
      # if the date in the previous row is more than 6 days ago,
      # a new week has started and the week number should be incremented
      mutate(week_increment  = (WDay < lag(WDay) | difftime(date, lag(date), units = 'days') > 6))%>%
      # the previous line causes the first element to be NA due to 
      # the fact that the lag function can't find a line above
      # we correct this here by setting the first element to TRUE
      mutate(week_increment = ifelse(row_number() == 1,
                                     TRUE,
                                     week_increment))%>%
      # we can sum the boolean elements in a cumulative way to get a week number
      mutate(week_number = cumsum(week_increment))%>%
      filter(WDay != 2)%>%
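      # group by the running week number only: also grouping by Year would
      # split a week that spans New Year into two groups
      group_by(week_number) %>% 
      filter(row_number() == 1) %>% 
      ungroup()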
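    As an alternative to a hand-built week counter, the Monday that opens each week can serve as the grouping key, since it is unique across year boundaries. The following is a rough sketch rather than a tested drop-in: the `prices` tibble is made-up sample data around New Year 2003 (weekends, 2002-12-25 and 2003-01-01 removed), `week_of` is a hypothetical column name, and `wday()` is assumed to use its default numbering (Sunday = 1, Monday = 2).

```r
library(tibble)
library(dplyr)
library(lubridate)

# Hypothetical sample data: weekdays from 2002-12-23 to 2003-01-10,
# with two holidays removed to mimic a trading calendar
prices <- tibble(date = seq(ymd("2002-12-23"), ymd("2003-01-10"), by = "day")) %>%
  filter(!wday(date) %in% c(1, 7),           # drop Sundays (1) and Saturdays (7)
         date != ymd("2002-12-25"),
         date != ymd("2003-01-01"))

first_non_monday <- prices %>%
  arrange(date) %>%
  filter(wday(date) != 2) %>%                # drop Mondays
  # key each row by the Monday that starts its week
  group_by(week_of = floor_date(date, "week", week_start = 1)) %>%
  slice(1) %>%                               # first remaining day of that week
  ungroup()

first_non_monday$date
```

    In this sample the week starting 2002-12-30 yields 2002-12-31, the date the original grouping skipped. Grouping by `isoyear(date)` together with `isoweek(date)` should give an equivalent key, since ISO weeks also start on Monday.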