
How can I make the apply() function faster?

apply r

user2079550 · 9 years ago

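I have two matrices. I want to use each column of the first matrix as a logical filter on the rows and columns of the second, and then sum the filtered submatrix. I used the following code, and it works correctly.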

apply(firstMat,2,function(x) sum(secondMat[x,x]))

However, the dataset is large, and I would like to find an alternative that makes the process faster.

Here is a small reproducible example:

firstMat<-matrix(c(T,F,T,F,F,T,T,F,F,F),nrow=5,ncol=2)
secondMat<-matrix(c(1,0,0,0,1,0,0,0,1,1,1,0,1,0,1,1,1,0,0,0,1,1,1,0,1),nrow=5,ncol=5)

I would be very grateful for any help.

2 answers | as of 9 years ago

Matthew Lundberg · 9 years ago

Perhaps your BLAS is faster than an explicit loop:

diag( t(firstMat) %*% secondMat %*% firstMat )
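The identity behind this one-liner can be checked on the example from the question. The sketch below is only an illustration: the variable names `F1`, `via_apply`, `via_diag`, and `via_colsums` are made up here, and the `colSums` variant is an additional suggestion rather than part of the answer. It relies on the fact that, for a 0/1 column x, sum(secondMat[x, x]) equals t(x) %*% secondMat %*% x.

```r
firstMat  <- matrix(c(T,F,T,F,F,T,T,F,F,F), nrow = 5, ncol = 2)
secondMat <- matrix(c(1,0,0,0,1,0,0,0,1,1,1,0,1,0,1,1,1,0,0,0,1,1,1,0,1),
                    nrow = 5, ncol = 5)

# Convert the logical filter to 0/1 so the matrix products are plainly numeric
F1 <- firstMat * 1

# Original approach: one submatrix sum per column of firstMat
via_apply <- apply(firstMat, 2, function(x) sum(secondMat[x, x]))

# Matrix-product form from the answer: the diagonal holds t(x) %*% S %*% x
via_diag <- diag(t(F1) %*% secondMat %*% F1)

# Assumed variant: computes only the diagonal entries, skipping the
# off-diagonal terms of the full ncol x ncol product
via_colsums <- colSums(F1 * (secondMat %*% F1))

all.equal(via_apply, via_diag)      # TRUE
all.equal(via_apply, via_colsums)   # TRUE
```

Whether either matrix form actually beats `apply` depends on the matrix sizes and the BLAS in use, so it is worth timing on real data.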

Swapnil · 9 years ago

You can run apply in parallel across the workers of a cluster:

firstMat<-matrix(c(T,F,T,F,F,T,T,F,F,F),nrow=5,ncol=2)
secondMat<-matrix(c(1,0,0,0,1,0,0,0,1,1,1,0,1,0,1,1,1,0,0,0,1,1,1,0,1),nrow=5,ncol=5)

# create a cluster
library(doSNOW)
cl <- makeCluster(2, type = "SOCK") # creates a cluster with 2 workers
# detectCores() from package parallel reports the number of cores on your machine
registerDoSNOW(cl)
clusterExport(cl,list("secondMat")) # export secondMat to each worker, since the function uses it there

# Option 1: Using parApply from package `parallel`
library(parallel)
parApply(cl,firstMat,2,function(x) sum(secondMat[x,x]))

# Option 2: Using aaply from package `plyr`
library(plyr)    
aaply(firstMat,2,function(x) sum(secondMat[x,x]),.parallel=T)

stopCluster(cl)

On this small reproducible example it shows no speed improvement, but I expect both options to be faster than apply for large matrices.
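One way to check such expectations is to time the alternatives on larger input. The following is a minimal sketch, assuming randomly generated matrices: the sizes `n` and `m`, the seed, and the names `bigFirst`, `bigSecond`, `t_apply`, and `t_blas` are arbitrary choices for illustration, not values from the question. It compares plain `apply` with the matrix-product form from Matthew Lundberg's answer; the parallel options could be timed the same way once a cluster is running.

```r
set.seed(1)        # arbitrary seed, for repeatability only
n <- 400           # hypothetical number of rows (and size of secondMat)
m <- 60            # hypothetical number of filter columns

bigFirst  <- matrix(runif(n * m) < 0.5, nrow = n, ncol = m)     # logical filters
bigSecond <- matrix(rbinom(n * n, 1, 0.5), nrow = n, ncol = n)  # 0/1 values

# Time the original apply-based version
t_apply <- system.time(
  res_apply <- apply(bigFirst, 2, function(x) sum(bigSecond[x, x]))
)

# Time the matrix-product version (logical filter converted to 0/1 first)
t_blas <- system.time(
  res_blas <- diag(t(bigFirst * 1) %*% bigSecond %*% (bigFirst * 1))
)

all.equal(res_apply, res_blas)   # both approaches should agree
c(apply = t_apply[["elapsed"]], blas = t_blas[["elapsed"]])
```

The timings vary by machine and BLAS library, so no particular outcome is implied here.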
