
Add rows with dates up to January of the next year

r
  •  1
  • Mal_a  · Tech community  · 6 years ago

    I have a fairly complex case to solve. Let me explain it with an example, starting from the following table:

        Datum Urlaub_geplannt
    1 2018-10            1410
    2 2018-11             940
    3 2018-12             470
    
    
    structure(list(Datum = structure(1:3, .Label = c("2018-10", "2018-11", 
    "2018-12"), class = "factor"), Urlaub_geplannt = c(1410, 940, 
    470)), .Names = c("Datum", "Urlaub_geplannt"), row.names = c(NA, 
    -3L), class = "data.frame")
    

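    I would like to add new rows to this table (in the Datum column) up to January of the next year, with all other columns filled with 0. In this case, the final table should look like this: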

        Datum Urlaub_geplannt
    1 2018-10            1410
    2 2018-11             940
    3 2018-12             470
    4 2019-01               0
    

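    However, my data changes over time (it actually lives in a Shiny app), so this needs to happen automatically.

    I mean that if I had new rows of data from 2019, the "end date" should automatically become January 2020. Thanks for your help!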

    3 answers  |  as of 6 years ago
        1
  •  1
  •   Ronak Shah    6 years ago

    A base R approach

    get_date_till_Jan <- function(df) {
      #Convert the character dates to actual Date objects
      max_Date <- max(as.Date(paste0(df$Datum, "-01")))
    
      #Get the date for next year January
      next_Jan <- as.Date(paste0(as.numeric(format(max_Date, "%Y")) + 1, "-01-01"))
    
      #Create a monthly sequence from the max date to next Jan date
      new_date <- format(seq(max_Date, next_Jan, by = "month")[-1], "%Y-%m")
    
      #Create a new dataframe with all values as 0 and change only the Datum 
      #column with new_date and rbind it to original dataframe
      rbind(df, transform(data.frame(matrix(0, nrow = length(new_date), 
          ncol = ncol(df), dimnames = list(NULL, names(df)))), 
          Datum = new_date))
    }
    
    df <- get_date_till_Jan(df)
    df
    
    #    Datum Urlaub_geplannt
    #1 2018-10            1410
    #2 2018-11             940
    #3 2018-12             470
    #4 2019-01               0
    

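    This works for any number of columns. Note that `df` now ends in 2019-01, so the rows are extended to January 2020: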

    df['another_col'] = 1:4
    get_date_till_Jan(df)
    
    
    #     Datum Urlaub_geplannt another_col
    #1  2018-10            1410           1
    #2  2018-11             940           2
    #3  2018-12             470           3
    #4  2019-01               0           4
    #5  2019-02               0           0
    #6  2019-03               0           0
    #7  2019-04               0           0
    #8  2019-05               0           0
    #9  2019-06               0           0
    #10 2019-07               0           0
    #11 2019-08               0           0
    #12 2019-09               0           0
    #13 2019-10               0           0
    #14 2019-11               0           0
    #15 2019-12               0           0
    #16 2020-01               0           0
    
        2
  •  1
  •   RLave    6 years ago

    A solution with dplyr and a full_join:

    library(dplyr)
    library(lubridate) # for ymd() function
    
    
    d <- d %>% 
      mutate(Datum = paste0(Datum,"-01"),
             Datum = ymd(Datum)) # correct Date format
    
    # use the latest year so the end date moves forward as new rows arrive
    max_year <- year(max(d$Datum))
    min_date <- min(d$Datum)
    
    # create a data.frame of possible dates
    fill_dates <- data.frame(Datum = seq.Date(
      min_date, # earliest date available
      as.Date(paste0(max_year+1,"-01-01")), # until 1 January of the year after the latest date
      by = "month"))
    

    Now we can join the two data.frames:

    d %>% 
      full_join(fill_dates, by="Datum") %>% # full_join of the two tables
      # full_join adds every row not originally present in d, filled with NA
      mutate(Urlaub_geplannt = ifelse(is.na(Urlaub_geplannt), 0, Urlaub_geplannt))
    
    #       Datum Urlaub_geplannt
    # 1 2018-10-01            1410
    # 2 2018-11-01             940
    # 3 2018-12-01             470
    # 4 2019-01-01               0
    
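    The result above keeps Datum as full dates (2018-10-01), whereas the question uses "YYYY-MM" labels. A minimal base R sketch of converting back, assuming the joined result is stored in a data frame called `res` (a hypothetical name, rebuilt here by hand so the snippet runs on its own):

```r
# Assumed stand-in for the joined result shown above (Datum is already a Date).
res <- data.frame(
  Datum = as.Date(c("2018-10-01", "2018-11-01", "2018-12-01", "2019-01-01")),
  Urlaub_geplannt = c(1410, 940, 470, 0)
)

# Drop the day component so Datum matches the question's "YYYY-MM" labels.
res$Datum <- format(res$Datum, "%Y-%m")
res
```

    After this step Datum is a character column again, so it should only be applied once no further date arithmetic is needed.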

    Data:

    d <- structure(list(Datum = structure(c("2018-10", "2018-11", "2018-12"),
                                          class = "character"),
                        Urlaub_geplannt = c(1410, 940, 470)),
                   .Names = c("Datum", "Urlaub_geplannt"),
                   row.names = c(NA, -3L), class = "data.frame")
    
        3
  •  1
  •   amrrs    6 years ago

    df <- structure(list(Datum = structure(1:3, .Label = c("2018-10", "2018-11", "2018-12"),
                                           class = "factor"),
                         Urlaub_geplannt = c(1410, 940, 470)),
                    .Names = c("Datum", "Urlaub_geplannt"),
                    row.names = c(NA, -3L), class = "data.frame")
    
    
    
    
    # Last month present in the data, as a Date
    last_date <- as.Date(paste0(as.character(df$Datum[nrow(df)]), "-01"))
    
    # 1 January of the year after the last month
    next_jan <- as.Date(paste0(as.numeric(format(last_date, "%Y")) + 1, "-01-01"))
    
    # Monthly sequence up to next January; [-1] drops the month already in df
    Datum <- format(seq.Date(last_date, next_jan, by = "month")[-1], "%Y-%m")
    
    
    new_df <- data.frame(Datum  = Datum, Urlaub_geplannt = rep(0,length(Datum)))
    
    
    total_df <- rbind(df,new_df)
    
    total_df
    #>     Datum Urlaub_geplannt
    #> 1 2018-10            1410
    #> 2 2018-11             940
    #> 3 2018-12             470
    #> 4 2019-01               0