
Huge error in a regression exercise

  • Schnurrberto  ·  8 years ago

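    I am currently trying to understand how to solve regression problems with TensorFlow. Unfortunately, as soon as I introduce a second dimension to the input data, the error (the loss) becomes extremely large.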

    X1 = [2.167,3.1,3.3,4.168,4.4,5.313,5.5,5.654,6.182,6.71,6.93,7.042,7.59,7.997,9.27,9.779,10.791]
    X2 = [3.167,4.1,4.3,5.168,5.4,6.313,6.5,6.654,7.182,7.71,7.93,8.042,8.59,8.997,10.27,10.779,11.791]
    y = [1.221,1.3,1.573,1.65,1.694,1.7,2.09,2.42,2.53,2.596,2.76,2.827,2.904,2.94,3.19,3.366,3.465]
    

    I tried to fit the data with a linear regression:

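    import pandas as pd
    import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)
    from sklearn.model_selection import train_test_split
    
    numbers = pd.DataFrame({'x1': X1, 'x2':X2})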
    
    X_train, X_test, y_train, y_test = train_test_split(numbers,y,test_size=0.3,random_state=101)
    
    X_data = tf.placeholder(shape=[None,2], dtype=tf.float32)
    y_target = tf.placeholder(shape=[None], dtype=tf.float32)
    
    w1 = tf.Variable(tf.random_normal(shape=[2,1])) 
    b1 = tf.Variable(tf.random_normal(shape=[1]))
    
    final_output = tf.add(tf.matmul(X_data, w1), b1)
    
    loss = tf.reduce_sum(tf.square(final_output-y_target))
    
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
    train = optimizer.minimize(loss)
    
    init = tf.global_variables_initializer()
    
    steps = 5000
    
    with tf.Session() as sess:
    
        sess.run(init)
    
        for i in range(steps):
    
            sess.run(train,feed_dict={X_data:X_train,y_target:y_train})
    
            # PRINT OUT A MESSAGE EVERY 500 STEPS
            if i%500 == 0:
    
                print('Currently on step {}'.format(i))
    
                training_cost = sess.run(loss, feed_dict={X_data:X_test,y_target:y_test})
                print("Training cost=", training_cost/5)
    
        training_cost = sess.run(loss, feed_dict={X_data:X_test,y_target:y_test})
        print("Training cost=", training_cost)
    

    This gives me the following output:

    Currently on step 0
    Training cost= 12376958566.4
    Currently on step 500
    Training cost= nan
    Currently on step 1000
    Training cost= nan
    Currently on step 1500
    Training cost= nan
    Currently on step 2000
    Training cost= nan
    Currently on step 2500
    Training cost= nan
    Currently on step 3000
    Training cost= nan
    Currently on step 3500
    Training cost= nan
    Currently on step 4000
    Training cost= nan
    Currently on step 4500
    Training cost= nan
    Training cost= nan
    

    I got better results with the Adagrad optimizer, which gave me an error of 5, but I still think there should be room for improvement.

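    Would it be possible to add hidden layers here? I have tried that before as well, but with relu as the activation function in the layers and f(x)=x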

    1 answer  |  as of 8 years ago
  •   Stephen    8 years ago

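    There are two problems. First, you are using tf.reduce_sum instead of tf.reduce_mean, so the more data you have, the larger your loss becomes. This sends large gradients to GradientDescentOptimizer, so the learned weights make huge jumps and your model diverges.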
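
    To make the scaling explicit, the two losses and their gradients can be written out for the linear model in the question. The symbols below (N for the number of rows fed in, x_i for one input row, \hat{y}_i for its prediction) are notation introduced here for illustration and assume one residual per row:

```latex
L_{\text{sum}}  = \sum_{i=1}^{N} (\hat{y}_i - y_i)^2,
\qquad
L_{\text{mean}} = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2

\nabla_w L_{\text{sum}}
  = 2 \sum_{i=1}^{N} (\hat{y}_i - y_i)\, x_i
  = N \, \nabla_w L_{\text{mean}}
```

    At the same learning rate, a gradient descent step on the summed loss is therefore N times larger than a step on the mean loss.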

    Second, lower the learning rate: try learning_rate=0.001 with your code.
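
    The effect can be reproduced without TensorFlow. What follows is a minimal plain-Python sketch, not the original graph: it runs full-batch gradient descent on all 17 rows from the question, starts from zero weights rather than tf.random_normal so the run is repeatable, and uses a hypothetical train(reduction, learning_rate) helper that exists only for this illustration.

```python
import math

# Data copied from the question.
X1 = [2.167, 3.1, 3.3, 4.168, 4.4, 5.313, 5.5, 5.654, 6.182, 6.71,
      6.93, 7.042, 7.59, 7.997, 9.27, 9.779, 10.791]
X2 = [3.167, 4.1, 4.3, 5.168, 5.4, 6.313, 6.5, 6.654, 7.182, 7.71,
      7.93, 8.042, 8.59, 8.997, 10.27, 10.779, 11.791]
y = [1.221, 1.3, 1.573, 1.65, 1.694, 1.7, 2.09, 2.42, 2.53, 2.596,
     2.76, 2.827, 2.904, 2.94, 3.19, 3.366, 3.465]


def predict_errors(w1, w2, b):
    """Residuals (prediction minus target) for every row."""
    return [w1 * a + w2 * c + b - t for a, c, t in zip(X1, X2, y)]


def train(reduction, learning_rate, steps=5000):
    """Fit y ~ w1*x1 + w2*x2 + b by full-batch gradient descent.

    reduction is "sum" or "mean" (a switch invented for this sketch).
    Returns the final mean squared error, or math.inf if training diverged.
    """
    n = len(y)
    scale = 1.0 if reduction == "sum" else 1.0 / n
    w1 = w2 = b = 0.0  # zero init (assumption) instead of random init
    for _ in range(steps):
        residuals = predict_errors(w1, w2, b)
        # Gradient of scale * sum(residual**2) for each parameter.
        g1 = 2.0 * scale * sum(r * a for r, a in zip(residuals, X1))
        g2 = 2.0 * scale * sum(r * c for r, c in zip(residuals, X2))
        gb = 2.0 * scale * sum(residuals)
        w1 -= learning_rate * g1
        w2 -= learning_rate * g2
        b -= learning_rate * gb
        if not all(math.isfinite(p) for p in (w1, w2, b)):
            return math.inf  # weights overflowed: the run diverged
    residuals = predict_errors(w1, w2, b)
    return sum(r * r for r in residuals) / n


print("sum,  lr=0.1  :", train("sum", 0.1))      # inf (diverged)
print("mean, lr=0.001:", train("mean", 0.001))   # finite, well below 1
```

    One further thing that may be worth checking in the question's graph: final_output has shape [None, 1] while y_target has shape [None], so final_output - y_target broadcasts to an N×N matrix rather than N residuals. Reshaping one side, for example with tf.squeeze(final_output, axis=1), would avoid that.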