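For each MRN ID, I want to select the date (in OBSERVATION_DATE) associated with each of several columns (i.e., HDL, LDL, and VLDL) that is closest to the dates listed in several other columns (i.e., Base, SixMonths, TwelveMonths, and TwentyFourMonths). So, for example, for the rows that report HDL data, I want to evaluate the dates in OBSERVATION_DATE to find the ones closest to Base, SixMonths, etc. Four new columns should be created for that example (one for each of the 4 time points listed). In addition, for that example, I want another set of 4 columns holding the HDL values associated with the dates identified in the first 4 new columns. So, for my example data, there should be 24 new columns in total (see below).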
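The algorithm should only consider observations that have data in the given column as candidates for the date closest to Base, SixMonths, TwelveMonths, and TwentyFourMonths. For example, if a row has no HDL data, it should not be part of the pool from which the HDL dates are chosen. Not every date available for a person has data in all of the columns I care about. So an algorithm that simply picks the closest date for a person, regardless of whether that date has data for the variable in question, is not useful.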
HDL OBSERVATION_DATE date closest to Base,
HDL OBSERVATION_DATE date closest to SixMonths,
HDL OBSERVATION_DATE date closest to TwelveMonths,
HDL OBSERVATION_DATE date closest to TwentyFourMonths,
HDL value associated with OBSERVATION_DATE date closest to Base,
HDL value associated with OBSERVATION_DATE date closest to SixMonths,
HDL value associated with OBSERVATION_DATE date closest to TwelveMonths,
HDL value associated with OBSERVATION_DATE date closest to TwentyFourMonths,
LDL OBSERVATION_DATE date closest to Base,
LDL OBSERVATION_DATE date closest to SixMonths,
LDL OBSERVATION_DATE date closest to TwelveMonths,
LDL OBSERVATION_DATE date closest to TwentyFourMonths,
LDL value associated with OBSERVATION_DATE date closest to Base,
LDL value associated with OBSERVATION_DATE date closest to SixMonths,
LDL value associated with OBSERVATION_DATE date closest to TwelveMonths,
LDL value associated with OBSERVATION_DATE date closest to TwentyFourMonths,
VLDL OBSERVATION_DATE date closest to Base,
VLDL OBSERVATION_DATE date closest to SixMonths,
VLDL OBSERVATION_DATE date closest to TwelveMonths,
VLDL OBSERVATION_DATE date closest to TwentyFourMonths,
VLDL value associated with OBSERVATION_DATE date closest to Base,
VLDL value associated with OBSERVATION_DATE date closest to SixMonths,
VLDL value associated with OBSERVATION_DATE date closest to TwelveMonths,
VLDL value associated with OBSERVATION_DATE date closest to TwentyFourMonths,
Here is my data:
structure(list(MRN = c(15842, 15842, 15842, 19463, 19463, 19463,
19463, 19463, 19463, 19463, 19463, 19463, 19463, 19463, 19463,
19463, 19463, 19463, 19463, 19463, 19463, 34025, 34025, 34025,
34025, 34025, 34025, 37465, 37465, 37465, 68874, 68874, 68874,
76133, 76133, 76133, 76133, 76133, 76133, 76133, 76133, 76133,
76133, 76133, 76133, 76133, 76133, 76133, 76133, 76133), OBSERVATION_DATE = structure(c(18289,
18289, 18289, 16073, 16073, 16073, 16434, 16434, 16434, 16536,
16536, 16536, 16821, 16821, 16821, 17196, 17196, 17196, 17604,
17604, 17604, 19114, 19114, 19114, 19338, 19338, 19338, 19060,
19060, 19060, 19730, 19730, 19730, 17326, 17326, 17326, 17331,
17331, 17331, 17333, 17333, 17333, 17336, 17336, 17336, 17339,
17339, 17339, 17347, 17347), class = "Date"), HDL = c(NA, 47,
NA, 40, NA, NA, NA, 43, NA, 38, NA, NA, NA, 41, NA, NA, 48, NA,
NA, 45, NA, NA, 44, NA, NA, 42, NA, NA, NA, 56, 16, NA, NA, NA,
34, NA, 34, NA, NA, 31, NA, NA, 33, NA, NA, NA, 32, NA, NA, NA
), LDL = c(NA, NA, 83, NA, 92, NA, 107, NA, NA, NA, NA, 112,
93, NA, NA, 96, NA, NA, 109, NA, NA, NA, NA, 76, 56, NA, NA,
141, NA, NA, NA, NA, 49, 55, NA, NA, NA, NA, 57, NA, 53, NA,
NA, NA, 59, 55, NA, NA, NA, 55), VLDL = c(14, NA, NA, NA, NA,
46, NA, NA, 30, NA, 30, NA, NA, NA, 28, NA, NA, 20, NA, NA, 28,
17, NA, NA, NA, NA, 21, NA, 35, NA, NA, 15, NA, NA, NA, 24, NA,
24, NA, NA, NA, 20, NA, 23, NA, NA, NA, 26, 22, NA), Base = structure(c(17647,
17647, 17647, 17032, 17032, 17032, 17032, 17032, 17032, 17032,
17032, 17032, 17032, 17032, 17032, 17032, 17032, 17032, 17032,
17032, 17032, 18577, 18577, 18577, 18577, 18577, 18577, 18894,
18894, 18894, 19431, 19431, 19431, 16751, 16751, 16751, 16751,
16751, 16751, 16751, 16751, 16751, 16751, 16751, 16751, 16751,
16751, 16751, 16751, 16751), class = "Date"), SixMonths = structure(c(17830,
17830, 17830, 17215, 17215, 17215, 17215, 17215, 17215, 17215,
17215, 17215, 17215, 17215, 17215, 17215, 17215, 17215, 17215,
17215, 17215, 18760, 18760, 18760, 18760, 18760, 18760, 19077,
19077, 19077, 19614, 19614, 19614, 16934, 16934, 16934, 16934,
16934, 16934, 16934, 16934, 16934, 16934, 16934, 16934, 16934,
16934, 16934, 16934, 16934), class = "Date"), TwelveMonths = structure(c(18012,
18012, 18012, 17397, 17397, 17397, 17397, 17397, 17397, 17397,
17397, 17397, 17397, 17397, 17397, 17397, 17397, 17397, 17397,
17397, 17397, 18942, 18942, 18942, 18942, 18942, 18942, 19259,
19259, 19259, 19796, 19796, 19796, 17116, 17116, 17116, 17116,
17116, 17116, 17116, 17116, 17116, 17116, 17116, 17116, 17116,
17116, 17116, 17116, 17116), class = "Date"), TwentyFourMonths = structure(c(18377,
18377, 18377, 17762, 17762, 17762, 17762, 17762, 17762, 17762,
17762, 17762, 17762, 17762, 17762, 17762, 17762, 17762, 17762,
17762, 17762, 19307, 19307, 19307, 19307, 19307, 19307, 19624,
19624, 19624, 20161, 20161, 20161, 17481, 17481, 17481, 17481,
17481, 17481, 17481, 17481, 17481, 17481, 17481, 17481, 17481,
17481, 17481, 17481, 17481), class = "Date")), row.names = c(NA,
-50L), class = "data.frame")
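
To make the requirement concrete, here is a minimal base R sketch of one way it could be approached. It is only an illustration under some assumptions: it uses just the rows for MRN 15842 and MRN 34025 from the data above, the names `labs`, `time_points`, `closest_for_patient`, and `result`, and the output column naming scheme (e.g. `HDL_date_Base`, `HDL_value_Base`), are made up for this example, ties are resolved by taking the first matching row, and OBSERVATION_DATE and the time-point columns are assumed to contain no missing dates.

```r
# Subset of the data above: the rows for MRN 15842 and MRN 34025.
to_date <- function(x) as.Date(x, origin = "1970-01-01")
df <- data.frame(
  MRN              = rep(c(15842, 34025), times = c(3, 6)),
  OBSERVATION_DATE = to_date(c(rep(18289, 3), rep(19114, 3), rep(19338, 3))),
  HDL              = c(NA, 47, NA, NA, 44, NA, NA, 42, NA),
  LDL              = c(NA, NA, 83, NA, NA, 76, 56, NA, NA),
  VLDL             = c(14, NA, NA, 17, NA, NA, NA, NA, 21),
  Base             = to_date(rep(c(17647, 18577), times = c(3, 6))),
  SixMonths        = to_date(rep(c(17830, 18760), times = c(3, 6))),
  TwelveMonths     = to_date(rep(c(18012, 18942), times = c(3, 6))),
  TwentyFourMonths = to_date(rep(c(18377, 19307), times = c(3, 6)))
)

labs        <- c("HDL", "LDL", "VLDL")
time_points <- c("Base", "SixMonths", "TwelveMonths", "TwentyFourMonths")

# Build one output row for a single patient's rows `d`.
closest_for_patient <- function(d) {
  out <- data.frame(MRN = d$MRN[1])
  for (lab in labs) {
    # Candidate pool: only rows that actually have a value for this lab.
    pool <- d[!is.na(d[[lab]]), , drop = FALSE]
    # For each time point, index of the pool row with the smallest
    # absolute gap in days (NA if the pool is empty).
    idx <- vapply(time_points, function(pt) {
      if (nrow(pool) == 0) return(NA_integer_)
      which.min(abs(as.numeric(pool$OBSERVATION_DATE - pool[[pt]])))
    }, integer(1))
    # Four date columns first, then the four matching value columns.
    for (pt in time_points) {
      out[[paste(lab, "date", pt, sep = "_")]] <- pool$OBSERVATION_DATE[idx[[pt]]]
    }
    for (pt in time_points) {
      out[[paste(lab, "value", pt, sep = "_")]] <- pool[[lab]][idx[[pt]]]
    }
  }
  out
}

# One row per MRN, with 24 new columns (3 labs x 4 time points x date/value).
result <- do.call(rbind, lapply(split(df, df$MRN), closest_for_patient))
```

This gives one row per MRN rather than repeating the new columns on every original row. If the 24 columns are wanted on the original 50-row layout, `merge(df, result, by = "MRN")` would be one way to attach them.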