
ValueError: unknown is not supported for multiclass dependent variable

  • Filipe Ferminiano · 8 years ago

    I am trying to fit a random forest with a grid search in sklearn, but the code below raises the error in the title:

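    from sklearn import metrics
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import make_scorer
    from sklearn.model_selection import GridSearchCV
    from sklearn.pipeline import Pipeline

    # df_features and df_train are pandas DataFrames loaded earlier (not shown)
    X = df_features.values
    X = X.reshape((len(X), len(df_features.columns)))
    Y = df_train['action'].values
    Y = Y.reshape((len(Y),))

    pipeline = Pipeline([
        ('clf', RandomForestClassifier())
    ])

    parameters = {
        'clf__max_depth': [5, 7, 9],
        'clf__max_features': [3, 4, 5],
        'clf__min_samples_leaf': [3, 4, 5, 6, 7],
        'clf__bootstrap': [True]
    }

    score_func = make_scorer(metrics.f1_score, average='weighted')

    grid_search = GridSearchCV(pipeline, parameters, n_jobs=3,
                               verbose=1, scoring=score_func)

    grid_search.fit(X, Y)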
    

    Here is a sample of Y:

    ["NOTHING", "NOTHING", "SELL", "SELL", "NOTHING",

    How can I fix this?
    Thanks.

    1 answer

  • seralouk · 8 years ago

    Check the type and shape of x and y. Also, do you have enough samples for the max_depth and min_samples_leaf values you are searching over?

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import f1_score, make_scorer
    from sklearn.model_selection import GridSearchCV, LeaveOneOut
    from sklearn.pipeline import Pipeline

    # Toy data: the first 14 rows of iris stand in for the real features
    data = load_iris()
    x = data.data[0:14, :]
    print(x.shape)  # (14, 4)

    y = ['NOTHING', 'NOTHING', 'SELL', 'SELL', 'NOTHING', 'NOTHING', 'SELL',
         'SELL', 'NOTHING', 'SELL', 'SELL', 'NOTHING', 'NOTHING', 'NOTHING']
    # Keep y one-dimensional, shape (14,), and make sure it holds plain strings
    y = np.asarray(y).astype('str')

    pipeline = Pipeline([('clf', RandomForestClassifier())])

    parameters = {
        'clf__max_depth': [1, 2, 3],
        'clf__max_features': [1, 2, 3],
        'clf__min_samples_leaf': [1, 2, 3],
        'clf__bootstrap': [True],
    }

    score_func = make_scorer(f1_score, average='weighted')

    # Leave-one-out cross-validation, since there are only 14 samples
    loo = LeaveOneOut()
    grid_search = GridSearchCV(pipeline, parameters, n_jobs=1, verbose=1,
                               scoring=score_func, cv=loo)

    grid_search.fit(x, y)
    

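    Result, as posted (the 45 candidates do not match the 27 combinations in the grid above, so it was presumably produced with a larger grid):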

    Fitting 14 folds for each of 45 candidates, totalling 630 fits
    [Parallel(n_jobs=1)]: Done 630 out of 630 | elapsed:   33.7s finished
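
The advice about checking types and shapes can be made concrete with `sklearn.utils.multiclass.type_of_target`, which reports how scikit-learn classifies a label array. The sketch below is only an illustration: `X_raw` and `y_raw` are hypothetical stand-ins, and the idea that a non-string value in the label column caused the original error is an assumption, since the question does not show the dtype of Y.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-ins for the asker's data: an object-dtype label array
# with a missing value in the first row, and a matching feature matrix.
y_raw = np.array([np.nan, "NOTHING", "SELL", "SELL", "NOTHING"], dtype=object)
X_raw = np.arange(10, dtype=float).reshape(5, 2)

print(X_raw.shape, y_raw.shape, y_raw.dtype)
print(type_of_target(y_raw))  # 'unknown': object array whose first entry is not a str

# Keep only the rows whose label is a real string, then cast to a str dtype.
mask = np.array([isinstance(label, str) for label in y_raw])
X_clean = X_raw[mask]
y_clean = y_raw[mask].astype(str)

print(X_clean.shape, y_clean.shape, y_clean.dtype)
print(type_of_target(y_clean))  # 'binary': two distinct string labels
```

If `type_of_target` reports `'unknown'` for the real Y, printing `set(type(v) for v in Y)` is a quick way to see which values are not plain strings.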
    

    Hope this helps.