First, your `data.frame`, built so that the `minus` and `plus` columns stay numeric; otherwise they are converted to `factor`:
species <- c("a","a","a","b","b","b","c","c","c","d","d","d","e","e","e","f","f","f","g","h","h","h","i","i","i")
category <- c("h","l","m","h","l","m","h","l","m","h","l","m","h","l","m","h","l","m","l","h","l","m","h","l","m")
minus <- c(31,14,260,100,70,200,91,152,842,16,25,75,60,97,300,125,80,701,104,70,7,124,24,47,251)
plus <- c(2,0,5,0,1,1,4,4,30,1,0,0,2,0,5,0,0,3,0,0,0,0,0,0,4)
df <- data.frame(species=species, category=category, minus=minus, plus=plus)
Then, I'm not sure there is a pure `dplyr` approach (I'd be glad to be shown otherwise), but I think this gets you part of the way:
library(dplyr)

df_combinations <-
  # create a df with all interactions
  expand.grid(df$species, df$category, df$category) %>%
  # rename columns
  `colnames<-`(c("species", "category1", "category2")) %>%
  # 3 lines below:
  # drop duplicate rows, then keep only the rows where
  # category1 and category2 differ
  unique() %>%
  group_by(species) %>%
  filter(category1 != category2) %>%
  # cosmetics
  arrange(species, category1, category2) %>%
  ungroup() %>%
  # prepare an empty column
  mutate(p.value=NA)
# now we loop to fill your result data.frame
for (i in seq_len(nrow(df_combinations))){
  # filter appropriate lines
  cat1 <- filter(df,
                 species==df_combinations$species[i],
                 category==df_combinations$category1[i])
  cat2 <- filter(df,
                 species==df_combinations$species[i],
                 category==df_combinations$category2[i])
  # calculate the chisq.test and assign its p-value to the right line
  df_combinations$p.value[i] <- chisq.test(c(cat1$minus, cat2$minus,
                                             cat1$plus, cat2$plus))$p.value
}
The resulting data.frame:
head(df_combinations)
# A tibble: 6 x 4
# Groups: species [1]
  species category1 category2       p.value
   <fctr>    <fctr>    <fctr>         <dbl>
1       a         h         l  3.290167e-11
2       a         h         m 1.225872e-134
3       a         l         h  3.290167e-11
4       a         l         m 5.824842e-150
5       a         m         h 1.225872e-134
6       a         m         l 5.824842e-150
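Each pair shows up twice in that output (h/l and l/h carry the same p-value). If one row per unordered pair is enough, base R's `combn()` can build the pairs per species. This is only a sketch on a toy subset of the data, not part of the approach above; the names `pairs_list` and `pairs` are made up for illustration.

```r
# Toy subset of the data: species "a" has three categories, "g" only one.
df <- data.frame(species  = c("a", "a", "a", "g"),
                 category = c("h", "l", "m", "l"),
                 minus    = c(31, 14, 260, 104),
                 plus     = c(2, 0, 5, 0))

# For each species, list every unordered pair of its own categories once.
pairs_list <- lapply(split(df$category, df$species), function(cats) {
  cats <- sort(unique(as.character(cats)))
  if (length(cats) < 2) return(NULL)   # a single category: nothing to compare
  t(combn(cats, 2))                    # one row per pair
})

# Stack the per-species pairs into one data.frame (hypothetical name: pairs).
pairs <- do.call(rbind, lapply(names(pairs_list), function(sp) {
  p <- pairs_list[[sp]]
  if (is.null(p)) return(NULL)
  data.frame(species = sp, category1 = p[, 1], category2 = p[, 2])
}))
print(pairs)
```

Building the pairs per species also skips combinations a species does not have, such as species `g`, which only has category `l` in the data.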
Checking the first row of `df_combinations`:
chisq.test(c(31,14,2,0))$p.value
Is this what you were after?
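One thing worth checking before you rely on these numbers: when `chisq.test()` receives a plain vector, it runs a goodness-of-fit test against equal proportions. If you actually want to test whether `minus`/`plus` is independent of category, you would pass a 2 x 2 table instead. A small sketch for the first row, where the layout of `tab` is my assumption about what you want to compare:

```r
# Row 1 of the result compares categories "h" and "l" for species "a".
counts <- c(31, 14, 2, 0)  # minus_h, minus_l, plus_h, plus_l

# A bare vector is tested against equal expected proportions (goodness of fit).
p_gof <- chisq.test(counts)$p.value

# Assumed alternative: a 2 x 2 table (rows = category, columns = minus/plus)
# is tested for independence instead. Small expected counts trigger a warning,
# which is silenced here only to keep the output short.
tab <- matrix(counts, nrow = 2,
              dimnames = list(category = c("h", "l"),
                              sign = c("minus", "plus")))
p_ind <- suppressWarnings(chisq.test(tab)$p.value)

print(c(goodness_of_fit = p_gof, independence = p_ind))
```

With counts this small, `fisher.test(tab)` may be the safer choice for the 2 x 2 case.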