代码之家 › 专栏 › 技术社区 › CA Burns

贝叶斯统计:在R中模拟上述实验10000次,每次记录最长运行的长度

naivebayes bayesian r

CA Burns · 技术社区 · 7 年前

我试图找到10000次模拟中掷硬币30次的最长距离的平均值。我需要在R中模拟上述10000次实验,每次记录最长运行的长度。

coin <- sample(c("H", "T"), 10000, replace = TRUE)
table(coin) 
head(coin, n = 30)
rle(c("H", "T", "T", "H", "H", "H", "H", "H", "T", "H"))
coin.rle <- rle(coin)
str(coin.rle)

如何找到10000次模拟中最长运行的平均值?

2 回复 | 直到 7 年前

kangaroo_cliff 7 年前

我认为以下就是你想要的。

n_runs <- 10000
max_runs <- numeric(n_runs)
for (j in 1:n_runs) {
 coin <- sample(c("H", "T"), 30, replace = TRUE) 
 max_runs[j] <- max(rle(coin)$length)
}
mean(max_runs)

coin (如 coin[20] )和 rle 它的( rle(coin[20]) max(rle(coin)$length) 提供最大跑步距离。

编辑:以下可能更快

len <- 30
times <- 10000

flips <- sample(c("H", "T"), len * times, replace = TRUE) 
runs <- sapply(split(flips, ceiling(seq_along(flips)/len)),
                    function(x) max(rle(x)$length))
mean(runs) # average of max runs
sum(runs >= 7)/ times # number of runs >= 7

mfidino 7 年前

所有的抛硬币都是相互独立的(即一次抛硬币的结果不会影响另一次抛硬币)。因此,我们可以一次翻转所有模拟的所有硬币,然后以这样的方式格式化,使总结每个30次翻转试验变得更简单。下面是我将如何处理这个问题。

# do all of the flips at once, this is okay because each flip
# is independent
coin_flips <- sample(c("heads", "tails"), 30 * 10000, replace = TRUE)

# put them into a 10000 by 30 matrix, each row
# indicates one 'simulation'
coin_matrix <- matrix(coin_flips, ncol = 30, nrow = 10000)

# we now want to iterate through each row using apply,
# to do so we need to make a function to apply to each
# row. This gets us the longest run over a single
# simulation
get_long_run <- function(x) {
  max(rle(x)$length)
}

# apply this function to each row
longest_runs <- apply(coin_matrix, 1, get_long_run)

# get the number of simulations that had a max run >= 7. Divide this
# by the number of simulations to get the probability of this occuring.
sum(longest_runs >= 7)/nrow(coin_matrix)

你应该得到18-19%之间的值,但每次你尝试这个模拟时,这个值都会有一点变化。

推荐文章

Amp · 使用R ggplot2删除geom_radial中axis.line和panel.border之间的空格

5 月前

Hard_Course · 用另一列中的值替换行的最后一个非NA条目

5 月前

Mark R · 使用geom_sf()删除地球仪上不需要的网格线

5 月前

Joe · 根据对工作日和本周早些时候的日期的了解,找到一个日期

5 月前

Ben · 统计向量中的单词在字符串中出现的频率

5 月前

TheCodeNovice · R中符号格式的尾随零和其他问题[重复]

5 月前

katefull06 · 在R中使用terra修改范围时,会为单独的SpatRaster重写范围

5 月前

dez93_2000 · 在R管道子功能中引用管道对象的当前状态

5 月前

accibio · 在ggplot2中为同一变量创建两个连续的颜色渐变比例

6 月前

Mankka · 如何在Ggplot2中绘制均匀的径向图

6 月前