
How do I apply a function to every element of a matrix?

mapply matrix function r

RobertF · 7 months ago

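I'm struggling to apply a function to every element of a matrix, specifically a lower-triangular Jaccard similarity matrix.

The function should keep matrix values that are > .7 and reassign all other elements to NA, which makes it easier to spot highly similar binary variables. Ideally, the matrix structure is preserved.

I created a simple 3x3 sample matrix filled with random values for testing: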

N <- 3 # Observations
N_vec <- 3 # Number of vectors
set.seed(123)
x1 <- runif(N * N_vec)
mat_x1 <- matrix(x1, ncol = N_vec)
mat_x1[upper.tri(mat_x1)] <- NA
diag(mat_x1) <- NA
mat_x1
              [,1]      [,2] [,3]
    [1,]        NA        NA   NA
    [2,] 0.7883051        NA   NA
    [3,] 0.4089769 0.0455565   NA

How can I apply the following function, which returns values > 0.7, to each element of the matrix?

y = (function(x) if (x > .7) { return(x) } else { return(NA) })

After applying the function, I would like to see the following:

mat_x2
              [,1] [,2] [,3]
    [1,]        NA   NA   NA
    [2,] 0.7883051   NA   NA
    [3,]        NA   NA   NA

5 Answers

Ronak Shah · 7 months ago

In this case, you can do:

mat_x1[mat_x1 <= .7] <- NA

#          [,1] [,2] [,3]
#[1,]        NA   NA   NA
#[2,] 0.7883051   NA   NA
#[3,]        NA   NA   NA

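In case this is just an example and you want to apply some variant of the function y, you can do the following. First, make sure your function is vectorized so that it can handle multiple values; in this case that is as simple as changing if to ifelse. Then apply the function to the matrix.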

y = function(x) ifelse(x > .7, x, NA)
y(mat_x1)
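
To illustrate why the switch from if to ifelse matters, here is a minimal sketch. It rebuilds the question's sample matrix under the name mat, and the names y_scalar and y_vec are made up for this illustration. if expects a single TRUE/FALSE, so the scalar version fails on a whole matrix, while ifelse works element by element and returns a result with the same dimensions as its test.

```r
# Rebuild the question's sample matrix (same seed, same 9 random values).
set.seed(123)
mat <- matrix(runif(9), ncol = 3)
mat[upper.tri(mat, diag = TRUE)] <- NA

# Scalar version from the question: `if` needs exactly one TRUE/FALSE.
y_scalar <- function(x) if (x > .7) x else NA

# Vectorized version from this answer.
y_vec <- function(x) ifelse(x > .7, x, NA)

# Passing the whole matrix to the scalar version raises an error,
# which is caught here so the script keeps running.
scalar_result <- tryCatch(y_scalar(mat), error = function(e) "error")
print(scalar_result)

# The vectorized version keeps the 3x3 shape; NA inputs stay NA.
res <- y_vec(mat)
print(dim(res))
print(res)
```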

Ben Bolker · 7 months ago

@RonakShah's answer is better, but for completeness (e.g., if you have a function that is hard to vectorize), you can use apply() over both margins of the matrix:

f <- function(x) if (!is.na(x) && x > .7) x else NA
apply(mat_x1, MARGIN = c(1,2), FUN = f)
          [,1] [,2] [,3]
[1,]        NA   NA   NA
[2,] 0.7883051   NA   NA
[3,]        NA   NA   NA
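
As a possible alternative for a function that only accepts one value at a time, here is a sketch that uses vapply() and writes the results back into a copy of the matrix. The names mat, keep_high and res_vapply are invented for this example; assigning with `[] <-` is what keeps the dimensions, since vapply() on its own returns a plain vector.

```r
# Rebuild the question's sample matrix.
set.seed(123)
mat <- matrix(runif(9), ncol = 3)
mat[upper.tri(mat, diag = TRUE)] <- NA

# Hypothetical scalar-only function; NA_real_ keeps the return type double.
keep_high <- function(x) if (!is.na(x) && x > .7) x else NA_real_

# Copy the matrix, then overwrite its contents element by element.
# `res_vapply[] <- ...` replaces the values but keeps dim (and dimnames).
res_vapply <- mat
res_vapply[] <- vapply(mat, keep_high, numeric(1))

# Same computation with apply() over both margins, for comparison.
res_apply <- apply(mat, MARGIN = c(1, 2), FUN = keep_high)

print(res_vapply)
print(isTRUE(all.equal(res_vapply, res_apply)))
```

vapply() also checks that every call returns exactly one numeric value, which can catch mistakes in the element function early.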

Frederi ROSE · 7 months ago

N <- 3 # Observations
N_vec <- 3 # Number of vectors
set.seed(123)
x1 <- runif(N * N_vec)
mat_x1 <- matrix(x1, ncol = N_vec)
mat_x1[upper.tri(mat_x1)] <- NA
diag(mat_x1) <- NA
mat_x1

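new_function <- function(mat_x1) {
  # TRUE where the value exceeds 0.7, FALSE otherwise, NA where the input is NA
  truth_mat <- mat_x1 > .7
  # Multiplying by the logical matrix zeroes out values at or below the threshold
  newmat <- mat_x1 * truth_mat
  # Turn those zeros into NA
  newmat[newmat == 0] <- NA
  return(newmat)
}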

new_function (mat_x1)

This returns:

          [,1]      [,2] [,3]
[1,]        NA        NA   NA
[2,] 0.7883051        NA   NA
[3,] 0.4089769 0.0455565   NA
          [,1] [,2] [,3]
[1,]        NA   NA   NA
[2,] 0.7883051   NA   NA
[3,]        NA   NA   NA

jpsmith · 7 months ago

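The other answers are all good. As a variant aimed at the goal of this particular question: if your end goal is to identify the row/column combinations whose values exceed some threshold (here 0.7), you can use which to return the indices of those combinations, which removes the need to inspect the matrix by eye: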

which(mat_x1 > 0.7, arr.ind = TRUE)
#      row col
# [1,]   2   1

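If your matrix has row and column names and you want to get fancy, you can write a small helper function to make everything pretty. Here is an example using a new matrix with fruits as row names and animals as column names: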

mat2 <- matrix(c(0.75, 0.75, 0.2, 
                 0.3, 0.4, 0.75, 
                 0.5, 0.3, 0.9), 
               nrow = 3, 
               dimnames = list(c("Apples", "Oranges", "Bananas"), 
                               c("Dog", "Cat", "Hampster")))

thresh <- 0.70

myFun <- function(mtrx, indcs){
  data.frame(row = rownames(mtrx)[indcs[, "row"]],
             column = colnames(mtrx)[indcs[, "col"]],
             value = mtrx[indcs])
  }


myFun(mat2, which(mat2 >= thresh, arr.ind = TRUE))

#      row   column value
# 1  Apples      Dog  0.75
# 2 Oranges      Dog  0.75
# 3 Bananas      Cat  0.75
# 4 Bananas Hampster  0.90
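
A different route to the same kind of table, sketched here as an assumption rather than as part of the answer above, is to reshape the named matrix into long form with as.table() and as.data.frame(), then filter the rows. The names long and hits are made up for this example.

```r
# Same named matrix and threshold as above.
mat2 <- matrix(c(0.75, 0.75, 0.2,
                 0.3, 0.4, 0.75,
                 0.5, 0.3, 0.9),
               nrow = 3,
               dimnames = list(c("Apples", "Oranges", "Bananas"),
                               c("Dog", "Cat", "Hampster")))
thresh <- 0.70

# One row per matrix cell: row name, column name, value.
long <- as.data.frame(as.table(mat2),
                      responseName = "value",
                      stringsAsFactors = FALSE)
names(long)[1:2] <- c("row", "column")

# which() drops NA comparisons, so this also works on matrices with NA cells.
hits <- long[which(long$value >= thresh), ]
print(hits)
```

The filtered rows keep their original row numbers from the long table, so call `rownames(hits) <- NULL` if you want them renumbered from 1.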

Rui Barradas · 7 months ago

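Although there is an accepted answer, I think an is.na<- solution is worth posting.
On the right-hand side you supply an index vector marking the values to be turned into NA; in this case, that vector is the logical condition for the values you want to discard (that is, the negation of the condition in the question).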

# this is the question's condition, negated
# is.na(mat_x1) <- !(mat_x1 > 0.7)
#
is.na(mat_x1) <- mat_x1 <= 0.7
mat_x1
#>           [,1] [,2] [,3]
#> [1,]        NA   NA   NA
#> [2,] 0.7883051   NA   NA
#> [3,]        NA   NA   NA

Created on 2024-12-16 with reprex v2.1.1
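
For comparison, here is a small sketch showing that the is.na<- form gives the same result as plain logical-index assignment, and that the right-hand side can also be a vector of positions. The names mat, a, b and m_idx are invented for this example.

```r
# Rebuild the question's sample matrix.
set.seed(123)
mat <- matrix(runif(9), ncol = 3)
mat[upper.tri(mat, diag = TRUE)] <- NA

# Replacement-function form with a logical matrix on the right-hand side.
a <- mat
is.na(a) <- a <= 0.7

# Logical-index assignment, as in the first answer.
b <- mat
b[b <= 0.7] <- NA

# is.na<- also accepts positions, here the linear indices from which().
m_idx <- mat
is.na(m_idx) <- which(m_idx <= 0.7)

print(identical(a, b))
print(identical(a, m_idx))
```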