I have many data.frames (448) that all share the same column names (9 in total), as shown below:
V1 V2 V4 ... V9
ENSG00000000003.15 TSPAN6 7095
ENSG00000000005.6 TNMD 4355
. . .
. . .
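I would like to create one more data.frame that keeps the first 2 columns (V1 and V2), which are identical in every data.frame, and combines the V4 column (which is different in every data.frame) from all of the data.frames. The remaining columns should be excluded.
If possible, I would like to rename the V4 columns to "sample1", "sample2", and so on, up to 448.
So the final data.frame should look like this: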
V1 V2 V4_1 V4_2 ... V4_448
ENSG00000000003.15 TSPAN6 7095 3856 .
ENSG00000000005.6 TNMD 4355 2976 .
. . . . .
. . . . .
This is what I have done so far:
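# Read one file: tab-separated, no header row, skipping the first 6 lines
reader <- function(f) {
  read.table(f, sep = "\t", skip = 6, header = FALSE)
}
# `path` is the directory that contains the input files
files <- list.files(path, recursive = TRUE, full.names = TRUE)
myfilelist <- lapply(files, reader)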
But I don't know how to combine only these selected columns.
Here is the output of dput(lapply(myfilelist[1:2], head)):
myfilelist <- list(structure(list(V1 = c("ENSG00000000003.15", "ENSG00000000005.6",
"ENSG00000000419.13", "ENSG00000000457.14", "ENSG00000000460.17",
"ENSG00000000938.13"), V2 = c("TSPAN6", "TNMD", "DPM1", "SCYL3",
"C1orf112", "FGR"), V3 = c("protein_coding", "protein_coding",
"protein_coding", "protein_coding", "protein_coding", "protein_coding"
), V4 = c(7094L, 2L, 4355L, 1149L, 372L, 585L), V5 = c(3573L,
1L, 2201L, 953L, 553L, 281L), V6 = c(3521L, 1L, 2154L, 883L,
579L, 308L), V7 = c(59.9764, 0.052, 138.3704, 6.4018, 2.3896,
6.6335), V8 = c(20.5827, 0.0178, 47.4859, 2.197, 0.8201, 2.2765
), V9 = c(22.2037, 0.0192, 51.2256, 2.37, 0.8847, 2.4558)), row.names = c(NA,
6L), class = "data.frame"), structure(list(V1 = c("ENSG00000000003.15",
"ENSG00000000005.6", "ENSG00000000419.13", "ENSG00000000457.14",
"ENSG00000000460.17", "ENSG00000000938.13"), V2 = c("TSPAN6",
"TNMD", "DPM1", "SCYL3", "C1orf112", "FGR"), V3 = c("protein_coding",
"protein_coding", "protein_coding", "protein_coding", "protein_coding",
"protein_coding"), V4 = c(2616L, 23L, 3746L, 1288L, 510L, 1578L
), V5 = c(1369L, 9L, 1876L, 1015L, 681L, 797L), V6 = c(1250L,
14L, 1871L, 984L, 693L, 782L), V7 = c(16.8063, 0.4541, 90.4417,
5.4531, 2.4895, 13.5969), V8 = c(4.8615, 0.1314, 26.1617, 1.5774,
0.7201, 3.9331), V9 = c(6.0158, 0.1625, 32.3733, 1.9519, 0.8911,
4.867)), row.names = c(NA, 6L), class = "data.frame"))
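
To make the goal concrete, here is a minimal sketch of the kind of result described above, not a definitive solution. It assumes every data.frame has the same rows in the same order, so the V4 columns can simply be bound side by side. The helper name `combine_v4` and the toy data are hypothetical and exist only for illustration.

```r
# Toy stand-ins for the data.frames returned by reader(); the values are made up.
toy <- list(
  data.frame(V1 = c("g1", "g2"), V2 = c("A", "B"), V3 = "x", V4 = c(10L, 20L)),
  data.frame(V1 = c("g1", "g2"), V2 = c("A", "B"), V3 = "x", V4 = c(30L, 40L))
)

# Hypothetical helper: keep V1 and V2 from the first data.frame and
# append the V4 column of every data.frame as sample1, sample2, ...
combine_v4 <- function(dfs) {
  # Assumption: all data.frames list the same IDs in the same order.
  same_ids <- vapply(dfs, function(d) identical(d$V1, dfs[[1]]$V1), logical(1))
  stopifnot(all(same_ids))

  # Extract V4 from each data.frame and bind the vectors into a matrix,
  # one column per input data.frame.
  counts <- do.call(cbind, lapply(dfs, `[[`, "V4"))
  colnames(counts) <- paste0("sample", seq_along(dfs))

  cbind(dfs[[1]][, c("V1", "V2")], counts)
}

combined <- combine_v4(toy)
print(combined)
```

With the real data this would be called as `combined <- combine_v4(myfilelist)`. If the row order could differ between files, the side-by-side binding would be wrong, and joining on V1 and V2 (for example with `merge()`) would be the safer choice.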