I want to group my data by `mcode` and create two different types of rows for each group.
Here is the sample data.
   Cat1 Cat2 Cat3 mcode key   pcode needed
1  C1   C2   C31  B3100 TRUE  P001  P001
2  C1   C2   C31  B3100 FALSE P002  P002
3  C1   C2   C31  B5500 TRUE  P003  P003
4  C1   C2   C31  B5500 FALSE P004  NA
5  C1   C2   C31  B5500 FALSE P005  NA
6  C1   C2   C32  B1000 TRUE  P006  NA
7  C1   C2   C32  B1000 FALSE P007  P007
8  C1   C2   C32  B1000 FALSE P008  NA
9  C1   C2   C32  B1000 FALSE P009  P009
10 C1   C2   C32  B1000 FALSE P010  P010
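For each group, I want to take the category values (`Cat1`, `Cat2`, `Cat3`) from the row where `key` is `TRUE`.
In addition, I need to create Python-style list strings that combine all the values in the `pcode` and `needed` columns separately, excluding `NA` values.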
Note that the `key` column is `TRUE` on the row where `mcode` takes a distinct value for the first time.
Here is the expected output.
  mcode Cat1 Cat2 Cat3 type   extended_info
1 B1000 C1   C2   C32  pcode  ['P006','P007','P008','P009','P010']
2 B1000 C1   C2   C32  needed ['P007','P009','P010']
3 B3100 C1   C2   C31  pcode  ['P001','P002']
4 B3100 C1   C2   C31  needed ['P001','P002']
5 B5500 C1   C2   C31  pcode  ['P003','P004','P005']
6 B5500 C1   C2   C31  needed ['P003']
Here are tribbles for reproducing the data and the expected output.
library(tibble)  # provides tribble()

df <- tribble(
~Cat1, ~Cat2, ~Cat3, ~mcode, ~key, ~pcode, ~needed,
"C1", "C2", "C31", "B3100", TRUE, "P001", "P001",
"C1", "C2", "C31", "B3100", FALSE, "P002", "P002",
"C1", "C2", "C31", "B5500", TRUE, "P003", "P003",
"C1", "C2", "C31", "B5500", FALSE, "P004", NA,
"C1", "C2", "C31", "B5500", FALSE, "P005", NA,
"C1", "C2", "C32", "B1000", TRUE, "P006", NA,
"C1", "C2", "C32", "B1000", FALSE, "P007", "P007",
"C1", "C2", "C32", "B1000", FALSE, "P008", NA,
"C1", "C2", "C32", "B1000", FALSE, "P009", "P009",
"C1", "C2", "C32", "B1000", FALSE, "P010", "P010"
)
expected_output <- tribble(
~mcode, ~Cat1, ~Cat2, ~Cat3, ~type, ~extended_info,
"B1000", "C1", "C2", "C32", "pcode", "['P006','P007','P008','P009','P010']",
"B1000", "C1", "C2", "C32", "needed", "['P007','P009','P010']",
"B3100", "C1", "C2", "C31", "pcode", "['P001','P002']",
"B3100", "C1", "C2", "C31", "needed", "['P001','P002']",
"B5500", "C1", "C2", "C31", "pcode", "['P003','P004','P005']",
"B5500", "C1", "C2", "C31", "needed", "['P003']"
)
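
For reference, the transformation described above could be sketched roughly as follows. This is only a minimal sketch, assuming dplyr and tidyr are available; `to_py_list` is a hypothetical helper name introduced here for illustration, and the data is rebuilt compactly with `tibble()` so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Same sample data as above, rebuilt compactly so the snippet is self-contained.
df <- tibble(
  Cat1   = "C1",
  Cat2   = "C2",
  Cat3   = rep(c("C31", "C32"), each = 5),
  mcode  = rep(c("B3100", "B5500", "B1000"), times = c(2, 3, 5)),
  key    = c(TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE),
  pcode  = sprintf("P%03d", 1:10),
  needed = c("P001", "P002", "P003", NA, NA, NA, "P007", NA, "P009", "P010")
)

# Hypothetical helper: drop NAs and format the rest as a Python-style list string.
to_py_list <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return("[]")
  paste0("[", paste0("'", x, "'", collapse = ","), "]")
}

result <- df %>%
  group_by(mcode) %>%
  summarise(
    # Category values come from the row where key is TRUE.
    across(c(Cat1, Cat2, Cat3), ~ .x[key][1]),
    # One list string per source column, NAs excluded.
    pcode  = to_py_list(pcode),
    needed = to_py_list(needed),
    .groups = "drop"
  ) %>%
  # Turn the two list-string columns into two rows per group.
  pivot_longer(c(pcode, needed), names_to = "type", values_to = "extended_info")

print(result)
```

The sketch summarises to one row per `mcode` first and only then pivots, so each group ends up with exactly one `pcode` row and one `needed` row. Returning `"[]"` for a group whose values are all `NA` is an assumption; the question does not say what should happen in that case.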