lubridate: selecting the first non-Monday of each week

tidyverse lubridate dplyr r

Franky · Tech Community · 9 years ago

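I have a large amount of financial data and want to filter it down to the first non-Monday of each week. Usually that is a Tuesday, but it can sometimes be a Wednesday if the Tuesday is a holiday.

Here is my code, which works in most cases: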

XLF <- quantmod::getSymbols("XLF", from = "2000-01-01", auto.assign = FALSE)

library(tibble)
library(lubridate)
library(dplyr)

# Convert via as.data.frame() first: as_tibble() drops row names by default
# in current tibble versions, which would lose the dates.
xlf <- as.data.frame(XLF) %>%
  rownames_to_column(var = "date") %>%
  as_tibble() %>%
  select(date, XLF.Adjusted)
xlf$date <- ymd(xlf$date)

# Create Year, Month, ISO week number and day-of-week columns,
# then remove all the Mondays (wday() counts from Sunday = 1, so Monday = 2)
xlf <- xlf %>%
  mutate(Year = year(date), Month = month(date),
         IsoWeek = isoweek(date), WDay = wday(date)) %>%
  filter(WDay != 2)

# Create another tibble just for ease of comparison
xlf2 <- xlf %>%
  group_by(Year, IsoWeek) %>%
  filter(row_number() == 1) %>%
  ungroup()

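That said, there are a few issues I have not been able to solve so far.

For example, it skips "2002-12-31", which is a Tuesday, because that date is counted as part of the first ISO week of 2003. There are a few similar cases.
My question: how can I select the first non-Monday of each week without such issues, while staying in the tidyverse (i.e. without having to use the xts/zoo classes)?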
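
The skipped date can be reproduced in isolation. The following is a minimal sketch that uses only lubridate's `year()`, `isoyear()` and `isoweek()`; the two dates are hand-picked for illustration (2002-01-02 is a Wednesday in ISO week 1 of 2002).

```r
library(lubridate)

# Two dates that end up with the same (Year, IsoWeek) grouping key
d <- ymd(c("2002-01-02", "2002-12-31"))

# Calendar year next to the ISO 8601 week-numbering year and week
data.frame(
  date    = d,
  Year    = year(d),     # 2002, 2002
  IsoYear = isoyear(d),  # 2002, 2003
  IsoWeek = isoweek(d)   # 1, 1
)
```

Both rows carry the key (Year = 2002, IsoWeek = 1), so 2002-12-31 falls into the same group as the first days of January 2002, and `row_number() == 1` keeps only the earliest of them. Pairing `isoweek()` with `isoyear()` rather than `year()` may be one way to keep the key consistent, although that is an assumption that has not been checked against the full series here.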

1 reply | as of 9 years ago

Jeroen Boeye · 9 years ago

You can create an ever-increasing week number yourself. It may not be the most elegant solution, but it works well for me.

# convert via as.data.frame() so the dates survive as row names
as.data.frame(XLF) %>%
  rownames_to_column(var = "date") %>%
  as_tibble() %>%
  select(date, XLF.Adjusted) %>%
  mutate(date = ymd(date),
         Year = year(date),
         Month = month(date),
         WDay = wday(date),
         WDay_label = wday(date, label = TRUE)) %>%
  # if the weekday number is lower than in the row above, or
  # if the date in the previous row is more than 6 days earlier,
  # the week number should be incremented
  mutate(week_increment = (WDay < lag(WDay) |
                             difftime(date, lag(date), units = "days") > 6)) %>%
  # lag() has no row above the first one, so the first element is NA;
  # we correct this here by setting the first element to TRUE
  mutate(week_increment = ifelse(row_number() == 1,
                                 TRUE,
                                 week_increment)) %>%
  # cumulatively summing the logical column gives a running week number
  mutate(week_number = cumsum(week_increment)) %>%
  filter(WDay != 2) %>%
  # week_number never resets, so it identifies a week on its own; also grouping
  # by Year would split a week that straddles New Year into two groups
  group_by(week_number) %>%
  filter(row_number() == 1) %>%
  ungroup()
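
The running week counter can be checked on a small example without downloading any data. This is a sketch under stated assumptions: `toy`, `gap_days` and `first_non_monday` are names made up for illustration, the dates are hand-picked weekdays around New Year 2003 rather than the actual XLF series, and `coalesce()` and `slice()` stand in for the `ifelse()` and `filter(row_number() == 1)` steps.

```r
library(dplyr)
library(lubridate)

# Hand-picked weekdays around New Year 2003 (toy data, not the XLF series)
toy <- tibble(date = ymd(c(
  "2002-12-23", "2002-12-24", "2002-12-26", "2002-12-27",
  "2002-12-30", "2002-12-31", "2003-01-02", "2003-01-03",
  "2003-01-06", "2003-01-07"
)))

first_non_monday <- toy %>%
  mutate(
    WDay = wday(date),  # Sunday = 1, Monday = 2, ...
    gap_days = as.numeric(difftime(date, lag(date), units = "days")),
    # a new week starts when the weekday number drops or 7+ days have passed;
    # the first row has no predecessor, so coalesce() turns its NA into TRUE
    week_increment = coalesce(WDay < lag(WDay) | gap_days > 6, TRUE),
    week_number = cumsum(week_increment)
  ) %>%
  filter(WDay != 2) %>%   # drop Mondays
  group_by(week_number) %>%
  slice(1) %>%            # first remaining day of each week
  ungroup()

first_non_monday$date
# "2002-12-24" "2002-12-31" "2003-01-07"
```

Here the rows are grouped by `week_number` alone. Adding `Year` to the grouping would also return 2003-01-02, because the week starting 2002-12-30 straddles New Year.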
