
Python: A super dictionary of simple regressions

  • 78282219  · Tech community  · 7 years ago

    I am trying to build a super dictionary that holds a number of lower-level dictionaries.

    Concept

    Regression formula

    Y_i - Y_i-1 = A + B(X_i - X_i-1) + E
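
The differences in this formula can be computed with pandas, either by subtracting a shifted copy or with `diff`. A minimal sketch, assuming a toy series (the values and the names `y`, `dy_shift`, `dy_diff`, `dy_clean` are illustrative and not from the original post):

```python
import pandas as pd

# Toy series standing in for 'Historic Rate' (illustrative values only)
y = pd.Series([3.0, 5.0, 4.0, 8.0], name="Historic Rate")

# First difference written out explicitly: Y_i - Y_{i-1}
dy_shift = y - y.shift(1)

# The same quantity via the built-in helper
dy_diff = y.diff(1)

# Both leave a NaN in the first row, which must be dropped before fitting
dy_clean = dy_diff.dropna()
print(dy_clean.tolist())
```

Either form leaves one leading NaN per unit of lag, which is why the regression inputs have to be trimmed before they are passed to a fit.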
    

    Data

    Note: Y = Historic Rate
    
    df = pd.DataFrame(np.random.randint(low=0, high=10, size=(100,17)), 
                  columns=['Historic Rate', 'Overnight', '1M', '3M', '6M','1Y','2Y','3Y','4Y','5Y','6Y','7Y','8Y','9Y','10Y','12Y','15Y'])
    

    Code so far

    #Import packages required for the analysis
    
    import pandas as pd
    import numpy as np
    import statsmodels.api as sm
    
    def Simulation(TotalSim,j):
        #super dictionary to hold all iterations of the loop
        Super_fit_d = {}
        for i in range(1,TotalSim):
            #Create a introductory loop to run the first set of regressions
            #Each loop produces a univariate regression
            #Each loop has a fixed lag of i
    
            fit_d = {}  # This will hold all of the fit results and summaries
            for col in [x for x in df.columns if x != 'Historic Rate']:
                Y = df['Historic Rate'] - df['Historic Rate'].shift(1)
                # Need to remove the NaN for fit
                Y = Y[Y.notnull()]
    
                X = df[col] - df[col].shift(i)
                X = X[X.notnull()]
                #Y now has more observations than X due to lag, drop rows to match
                Y = Y.drop(Y.index[0:i-1])
    
                if j = 1:
                    X = sm.add_constant(X)  # Add a constant to the fit
    
                fit_d[col] = sm.OLS(Y,X).fit()
            #append the dictionary for each lag onto the super dictionary
            Super_fit_d[lag_i] = fit_d
    
    #Check the output for one column
    fit_d['Overnight'].summary()
    
    #Check the output for one column in one segment of the super dictionary
    Super_fit_d['lag_5'].fit_d['Overnight'].summary()
    
    Simulation(11,1)
    

    I seem to be overwriting my dictionary on every loop, and I am not evaluating i correctly to index the iterations as lag_1, lag_2, lag_3, and so on. How can I fix this?

    1 answer

  •   Andrew    7 years ago

    There are a few issues here:

    1. You sometimes use i and sometimes lag_i, but only i is defined. I changed them all to lag_i for consistency.
    2. if j = 1 is a syntax error. You need if j == 1.
    3. You need to return Super_fit_d from the function so that it persists after the loop finishes.
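
The three points above come down to a general pattern: create the inner dictionary fresh on each outer iteration, store it under a key derived from the loop variable, and return the outer dictionary. A minimal sketch of that pattern without any regression code; `build_super_dict` and the placeholder tuple are hypothetical, and the `lag_N` string keys follow the naming the question asks for (the code in this answer uses plain integer keys instead):

```python
def build_super_dict(total_sim, columns):
    """Return {"lag_1": {col: value, ...}, "lag_2": {...}, ...}."""
    super_d = {}
    for lag in range(1, total_sim):
        inner = {}  # fresh dictionary for every lag, so nothing is overwritten
        for col in columns:
            # Placeholder for a fitted model such as sm.OLS(Y, X).fit()
            inner[col] = (lag, col)
        # Assign once per lag, after the inner loop has finished
        super_d[f"lag_{lag}"] = inner
    return super_d


result = build_super_dict(4, ["Overnight", "1M"])
print(list(result))
print(result["lag_2"]["1M"])
```

Note that `range(1, total_sim)` stops at `total_sim - 1`, so `Simulation(11, 1)` covers lags 1 through 10.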

    Applying these changes, I ended up with:

    import pandas as pd
    import numpy as np
    import statsmodels.api as sm
    
    df = pd.DataFrame(np.random.randint(low=0, high=10, size=(100,17)), 
                  columns=['Historic Rate', 'Overnight', '1M', '3M', '6M','1Y','2Y','3Y','4Y','5Y','6Y','7Y','8Y','9Y','10Y','12Y','15Y'])
    
    def Simulation(TotalSim,j):
        Super_fit_d = {}
        for lag_i in range(1,TotalSim):
            # Outer loop: one pass per lag
            # Each inner iteration fits a univariate regression
            # X is differenced with a fixed lag of lag_i
    
            fit_d = {}  # This will hold all of the fit results and summaries
            for col in [x for x in df.columns if x != 'Historic Rate']:
                Y = df['Historic Rate'] - df['Historic Rate'].shift(1)
                # Need to remove the NaN for fit
                Y = Y[Y.notnull()]
    
                X = df[col] - df[col].shift(lag_i)
                X = X[X.notnull()]
                #Y now has more observations than X due to lag, drop rows to match
                Y = Y.drop(Y.index[0:lag_i-1])
    
                if j == 1:
                    X = sm.add_constant(X)  # Add a constant to the fit
    
                fit_d[col] = sm.OLS(Y,X).fit()
            # Store this lag's dictionary of fits in the super dictionary,
            # once per lag (after the inner column loop has finished)
            Super_fit_d[lag_i] = fit_d
        return Super_fit_d
    
    
    
    test_dict = Simulation(11,1)
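
The manual `Y.drop(...)` step in the function exists only to give Y and X the same rows. An alternative is to let pandas align the two differenced series by index and drop incomplete rows in one step. A sketch under stated assumptions: `aligned_differences` is a hypothetical helper, the two-column frame is a cut-down stand-in for the data in the post, and this is not the code the answer actually ran:

```python
import numpy as np
import pandas as pd

# Synthetic data in the same spirit as the post (column subset is illustrative)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(100, 2)),
                  columns=["Historic Rate", "Overnight"])


def aligned_differences(frame, col, lag):
    """Return (Y, X) differences restricted to rows where both are defined."""
    dy = frame["Historic Rate"].diff(1).rename("dY")  # Y_i - Y_{i-1}
    dx = frame[col].diff(lag).rename("dX")            # X_i - X_{i-lag}
    both = pd.concat([dy, dx], axis=1).dropna()       # keep complete rows only
    return both["dY"], both["dX"]


Y, X = aligned_differences(df, "Overnight", lag=2)
print(len(Y), len(X))
```

With 100 rows and a lag of 2, both series keep 98 rows, which matches the observation count reported for the second lag in the output that follows.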
    

    First lag

    test_dict[1]['Overnight'].summary()
    
    Out[76]: 
    <class 'statsmodels.iolib.summary.Summary'>
    """
                                OLS Regression Results                            
    ==============================================================================
    Dep. Variable:          Historic Rate   R-squared:                       0.042
    Model:                            OLS   Adj. R-squared:                  0.033
    Method:                 Least Squares   F-statistic:                     4.303
    Date:                Fri, 28 Sep 2018   Prob (F-statistic):             0.0407
    Time:                        11:15:13   Log-Likelihood:                -280.39
    No. Observations:                  99   AIC:                             564.8
    Df Residuals:                      97   BIC:                             570.0
    Df Model:                           1                                         
    Covariance Type:            nonrobust                                         
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    const         -0.0048      0.417     -0.012      0.991      -0.833       0.823
    Overnight      0.2176      0.105      2.074      0.041       0.009       0.426
    ==============================================================================
    Omnibus:                        1.449   Durbin-Watson:                   2.756
    Prob(Omnibus):                  0.485   Jarque-Bera (JB):                1.180
    Skew:                           0.005   Prob(JB):                        0.554
    Kurtosis:                       2.465   Cond. No.                         3.98
    ==============================================================================
    
    Warnings:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
    """
    

    test_dict[2]['Overnight'].summary()
    
    Out[77]: 
    <class 'statsmodels.iolib.summary.Summary'>
    """
                                OLS Regression Results                            
    ==============================================================================
    Dep. Variable:          Historic Rate   R-squared:                       0.001
    Model:                            OLS   Adj. R-squared:                 -0.010
    Method:                 Least Squares   F-statistic:                   0.06845
    Date:                Fri, 28 Sep 2018   Prob (F-statistic):              0.794
    Time:                        11:15:15   Log-Likelihood:                -279.44
    No. Observations:                  98   AIC:                             562.9
    Df Residuals:                      96   BIC:                             568.0
    Df Model:                           1                                         
    Covariance Type:            nonrobust                                         
    ==============================================================================
                     coef    std err          t      P>|t|      [0.025      0.975]
    ------------------------------------------------------------------------------
    const          0.0315      0.428      0.074      0.941      -0.817       0.880
    Overnight      0.0291      0.111      0.262      0.794      -0.192       0.250
    ==============================================================================
    Omnibus:                        2.457   Durbin-Watson:                   2.798
    Prob(Omnibus):                  0.293   Jarque-Bera (JB):                1.735
    Skew:                           0.115   Prob(JB):                        0.420
    Kurtosis:                       2.391   Cond. No.                         3.84
    ==============================================================================
    
    Warnings:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
    """
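
To compare lags without reading each summary in turn, the nested dictionary can be reduced to one statistic per fit. A sketch, assuming every stored result exposes an `rsquared` attribute (statsmodels OLS results do); `collect_rsquared` is a hypothetical helper, and the stand-in objects replace real fits so the snippet runs without statsmodels:

```python
from types import SimpleNamespace


def collect_rsquared(super_fit_d):
    """Map {lag: {col: result}} to {lag: {col: result.rsquared}}."""
    return {
        lag: {col: res.rsquared for col, res in fits.items()}
        for lag, fits in super_fit_d.items()
    }


# Stand-in objects carrying only `rsquared`; the values mirror the
# two summaries printed above
fake_fits = {
    1: {"Overnight": SimpleNamespace(rsquared=0.042)},
    2: {"Overnight": SimpleNamespace(rsquared=0.001)},
}
table = collect_rsquared(fake_fits)
print(table)
```

Passing `test_dict` from the answer instead of `fake_fits` should work the same way, and the resulting dictionary can be handed to `pd.DataFrame` if a tabular view is preferred.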