I have a dataset in R:
library(dplyr)  # provides tibble(), %>%, group_by() and summarise()
vec = c(200,300,400,500,600,100)
char1 = c("a","a","a","b","b","a")
char2 = c("c","c","d","c","d","d")
df2 = tibble(vec,char1,char2)
df2
# A tibble: 6 × 3
    vec char1 char2
  <dbl> <chr> <chr>
1   200 a     c
2   300 a     c
3   400 a     d
4   500 b     c
5   600 b     d
6   100 a     d
If I want to compute the mean of the vec column for each value of char1, I can do it in either of the following ways:
df2%>%group_by(char1)%>%
summarise(mean(vec))
lm(df2$vec~df2$char1-1)
And for the char2 variable:
df2%>%group_by(char2)%>%
summarise(mean(vec))
lm(df2$vec~df2$char2-1)
In both cases, the group means match the coefficients of the corresponding linear regression.
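As a rough sketch of why they match (the symbols $y_i$, $g(i)$, $\beta_g$ and $n_g$ are notation I am introducing here, they do not appear in the code): dropping the intercept with `-1` gives the model one indicator column per level of the grouping variable, and least squares on indicator columns reduces to a separate mean for each group.

```latex
y_i = \sum_{g} \beta_g \, \mathbf{1}[g(i) = g] + \varepsilon_i,
\qquad
\hat{\beta}_g
  = \arg\min_{\beta_g} \sum_{i \,:\, g(i) = g} (y_i - \beta_g)^2
  = \frac{1}{n_g} \sum_{i \,:\, g(i) = g} y_i .
```

For char1 in the data above this gives (200 + 300 + 400 + 100) / 4 = 250 for a and (500 + 600) / 2 = 550 for b.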
But if I want to compute the mean of vec for each combination of char1 and char2, I do this in R:
df2%>%group_by(char1,char2)%>%
summarise(mean(vec))
What is the equivalent linear regression when grouping by both variables?
Any help would be appreciated.