
How do I train xgboost with an evaluation set?

  • Rocketq  · 6 years ago

    With the sklearn wrapper, this is easy for me to do:

    import xgboost as xgb
    from sklearn.model_selection import train_test_split

    clf = xgb.XGBClassifier(n_estimators=1500, learning_rate=0.015, gamma=0.3, min_child_weight=3, nthread=15, max_depth=150,
                            subsample=0.9, colsample_bytree=0.8, seed=2100, eval_metric="rmse")

    VALID = True
    if VALID:
        X_train, X_valid, y_train, y_valid = train_test_split(
            X, y, test_size=0.19, random_state=23)
        # The sklearn wrapper takes raw arrays and an eval_set of (X, y) pairs.
        # Recent xgboost releases expect early_stopping_rounds in the constructor instead of fit().
        model = clf.fit(X_train, y_train,
                        eval_set=[(X_valid, y_valid)],
                        verbose=50,
                        early_stopping_rounds=50)
    

    However, I can't set up the same thing with xgboost's standard (native) API:

    params =   {
        'objective' : 'gpu:reg:linear',
        'learning_rate': 0.02, 
        'gamma' : 0.3, 
        'min_child_weight' : 3,
        'nthread' : 15,
        'max_depth' : 30,
        'subsample' : 0.9, 
        'colsample_bytree' : 0.8, 
        'seed':2100, 
        'eval_metric' : "rmse",
        'num_boost_round' : 300
    }
    
    VALID = True
    if VALID == True:
        X_train, X_valid, y_train, y_valid = train_test_split(
            X, y, test_size = 0.19, random_state=23)
        model = xgb.train(X_train, y_train,  params,
                          evallist = [(X_valid, y_valid)], 
                          verbose_eval = 50, 
                early_stopping_rounds=50)
    
    #error TypeError: train() got an unexpected keyword argument 'evallist'
    
    1 answer
  • Rocketq  · 6 years ago

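    You just need to pass the arguments correctly. xgb.train has no evallist keyword: it expects DMatrix objects, and the evaluation sets go in its evals argument as a list of (DMatrix, name) pairs: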

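    import xgboost as xgb
    from sklearn.model_selection import train_test_split

    params = {
        #'objective' : 'gpu:reg:linear',
        'tree_method': 'gpu_hist',  # GPU training; newer releases use tree_method='hist' with device='cuda'
        'learning_rate': 0.02,
        'gamma': 0.3,
        'min_child_weight': 3,
        'nthread': 15,
        'max_depth': 30,
        'subsample': 0.9,
        'colsample_bytree': 0.8,
        'seed': 2100,
        'eval_metric': "rmse",
        'max_leaves': 300
        # 'num_boost_round' and 'n_estimators' are dropped from params: xgb.train
        # takes the number of rounds as its num_boost_round argument (300 below).
    }

    VALID = True
    if VALID:
        X_train, X_valid, y_train, y_valid = train_test_split(
            X, y, test_size=0.19, random_state=23)

        # The native API works on DMatrix objects, not raw arrays.
        tr_data = xgb.DMatrix(X_train, label=y_train)
        va_data = xgb.DMatrix(X_valid, label=y_valid)

        #del X_train, X_valid, y_train, y_valid  ; gc.collect()

        # Evaluation sets are (DMatrix, name) pairs; early stopping uses the last one.
        watchlist = [(tr_data, 'train'), (va_data, 'valid')]

        model = xgb.train(params, tr_data, num_boost_round=300, evals=watchlist, maximize=False,
                          early_stopping_rounds=30, verbose_eval=50)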