
Different polynomial regression coefficients in R and Python

  • MYaseen208  · Tech Community  · 7 years ago

    I am getting different polynomial regression coefficients in R and Python.

    R

        X <- c(0,0, 10, 10, 20, 20)
        Y <- c(5, 7, 15, 17, 9, 11)
        fm1 <- lm(Y~X+I(X^2))
        summary(fm1)
        Call:
        lm(formula = Y ~ X + I(X^2))
    
        Residuals:
         1  2  3  4  5  6 
        -1  1 -1  1 -1  1 
    
        Coefficients:
                    Estimate Std. Error t value Pr(>|t|)   
        (Intercept)  6.00000    1.00000   6.000  0.00927 **
        X            1.80000    0.25495   7.060  0.00584 **
        I(X^2)      -0.08000    0.01225  -6.532  0.00729 **
        ---
        Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    
        Residual standard error: 1.414 on 3 degrees of freedom
        Multiple R-squared:  0.9441,    Adjusted R-squared:  0.9068 
        F-statistic: 25.33 on 2 and 3 DF,  p-value: 0.01322
    
        anova(fm1)
        Analysis of Variance Table
    
        Response: Y
                  Df Sum Sq Mean Sq F value   Pr(>F)   
        X          1 16.000  16.000   8.000 0.066276 . 
        I(X^2)     1 85.333  85.333  42.667 0.007292 **
        Residuals  3  6.000   2.000                    
        ---
        Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    

    Python

    Nitro = [0, 0, 10, 10, 20, 20]
    Yield = [5, 7, 15, 17, 9, 11]
    import pandas as pd
    
    df3 = pd.DataFrame(
    {
        "Nitrogen": Nitro,
         "Yield": Yield
    }
    )
    
    from statsmodels.formula.api import ols
    from statsmodels.stats.anova import anova_lm
    
    Reg3 = ols("Yield ~ Nitrogen + I(Nitrogen^2)", data = df3)
    Fit3 = Reg3.fit()
    print(Fit3.summary())
    
                               OLS Regression Results                            
    ==============================================================================
    Dep. Variable:                  Yield   R-squared:                       0.944
    Model:                            OLS   Adj. R-squared:                  0.907
    Method:                 Least Squares   F-statistic:                     25.33
    Date:                Fri, 27 Jul 2018   Prob (F-statistic):             0.0132
    Time:                        19:25:22   Log-Likelihood:                -8.5136
    No. Observations:                   6   AIC:                             23.03
    Df Residuals:                       3   BIC:                             22.40
    Df Model:                           2                                         
    Covariance Type:            nonrobust                                         
    ===================================================================================
                          coef    std err          t      P>|t|      [0.025      0.975]
    -----------------------------------------------------------------------------------
    Intercept          10.0000      0.935     10.690      0.002       7.023      12.977
    Nitrogen            2.2000      0.314      7.001      0.006       1.200       3.200
    I(Nitrogen ^ 2)    -2.0000      0.306     -6.532      0.007      -2.974      -1.026
    ==============================================================================
    Omnibus:                          nan   Durbin-Watson:                   3.333
    Prob(Omnibus):                    nan   Jarque-Bera (JB):                1.000
    Skew:                           0.000   Prob(JB):                        0.607
    Kurtosis:                       1.000   Cond. No.                         30.4
    ==============================================================================
    
    
    print(anova_lm(Fit3))
                      df     sum_sq    mean_sq          F    PR(>F)
    Nitrogen         1.0  16.000000  16.000000   8.000000  0.066276
    I(Nitrogen ^ 2)  1.0  85.333333  85.333333  42.666667  0.007292
    Residual         3.0   6.000000   2.000000        NaN       NaN
    

    Question

    • Why do I get different regression coefficients in R and Python?

    1 answer
  • MrFlick  · 7 years ago

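    In Python, ^ is the bitwise XOR operator, not exponentiation. The code inside I(...) is evaluated as ordinary Python, so I(Nitrogen^2) XORs each value with 2 instead of squaring it. You want the exponentiation operator, **. Try: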

    Reg3 = ols("Yield ~ Nitrogen + I(Nitrogen**2)", data = df3)
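
To see why the two fits differ, here is a minimal sketch that reproduces both sets of coefficients using NumPy only. It is an illustration rather than the code from the question: the helper `fit` and the variable names are made up for this example, and it solves the least-squares problem directly instead of going through statsmodels.

```python
import numpy as np

nitrogen = np.array([0, 0, 10, 10, 20, 20])
yield_ = np.array([5, 7, 15, 17, 9, 11], dtype=float)

# On integers, ^ is bitwise XOR and ** is exponentiation.
xor_term = nitrogen ^ 2   # what I(Nitrogen^2) actually computed
squared = nitrogen ** 2   # what was intended

def fit(second_term):
    """Least-squares fit of yield ~ 1 + nitrogen + second_term (hypothetical helper)."""
    design = np.column_stack([np.ones(len(nitrogen)), nitrogen, second_term])
    coef, *_ = np.linalg.lstsq(design, yield_, rcond=None)
    return coef

coef_xor = fit(xor_term)
coef_sq = fit(squared)

print(xor_term.tolist())      # [2, 2, 8, 8, 22, 22]
print(squared.tolist())       # [0, 0, 100, 100, 400, 400]
print(np.round(coef_xor, 4))  # approximately 10, 2.2, -2 (the Python output in the question)
print(np.round(coef_sq, 4))   # approximately 6, 1.8, -0.08 (the R output in the question)
```

Both models reproduce the same three group means, which is why R-squared, the F-statistic, and the ANOVA table agree between the two outputs even though the coefficients do not.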