
TensorFlow gradients are 0, weights are not updating

  • mathjunkie · 8 years ago

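    After using Keras for a while, I am trying to learn TensorFlow, and I am building a ConvNet for CIFAR-10 classification. However, I think I have misunderstood something in the TensorFlow API, because the weights do not update even in a 1-layer model.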

    The model code is as follows:

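    import numpy as np
    import tensorflow as tf  # TensorFlow 1.x API

    # X_train, y_train and iterate_minibatches are assumed to be defined elsewhere.
    num_epochs = 10
    batch_size = 64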
    
    # Shape of mu and std is correct: (1, 32, 32, 3)
    mu = np.mean(X_train, axis=0, keepdims=True)
    sigma = np.std(X_train, axis=0, keepdims=True)
    
    # Placeholders for data & normalization
    # (normalisation does not help)
    data = tf.placeholder(np.float32, shape=(None, 32, 32, 3), name='data')
    labels = tf.placeholder(np.int32, shape=(None,), name='labels')
    data = (data - mu) / sigma
    
    # flatten
    flat = tf.reshape(data, shape=(-1, 32 * 32 * 3))
    dense1 = tf.layers.dense(inputs=flat, units=10)
    predictions = tf.nn.softmax(dense1)
    
    onehot_labels = tf.one_hot(indices=labels, depth=10)
    
    # Tried sparse_softmax_cross_entropy_with_logits as well
    loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=predictions)
    loss = tf.reduce_mean(loss)
    
    # Learning rate does not matter as the weights are not updating!
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)
    loss_history = []
    
    with tf.Session() as session:
        tf.global_variables_initializer().run()
        tf.local_variables_initializer().run()
    
        for epochs in range(10):
            print("Epoch:", epochs)
            # Load tiny batches
            for batch in iterate_minibatches(X_train.astype(np.float32)[:10], y_train[:10], 5):
                inputs, target = batch
                feed_dict = {data: inputs, labels: target}
                loss_val, _ = session.run([loss, optimizer], feed_dict=feed_dict)
                grads = tf.reduce_sum(tf.gradients(loss, dense1)[0])
                grads = session.run(grads, {data: inputs, labels: target})
                print("Loss:", loss_val, "Grads:", grads)
    

    The code produces the following output:

    Epoch: 0
    Loss: 2.46115 Grads: -1.02031e-17
    Loss: 2.46041 Grads: 0.0
    Epoch: 1
    Loss: 2.46115 Grads: 0.0
    Loss: 2.26115 Grads: 0.0
    Epoch: 2
    Loss: 2.46115 Grads: 0.0
    Loss: 2.26115 Grads: 0.0
    Epoch: 3
    Loss: 2.46115 Grads: 0.0
    Loss: 2.26115 Grads: 0.0
    Epoch: 4
    Loss: 2.46115 Grads: 0.0
    Loss: 2.26115 Grads: 0.0
    Epoch: 5
    Loss: 2.46115 Grads: 0.0
    Loss: 2.26115 Grads: 0.0
    Epoch: 6
    Loss: 2.46115 Grads: 0.0
    Loss: 2.26115 Grads: 0.0
    Epoch: 7
    Loss: 2.46115 Grads: 0.0
    Loss: 2.26115 Grads: 0.0
    Epoch: 8
    Loss: 2.46115 Grads: 0.0
    Loss: 2.26115 Grads: 0.0
    Epoch: 9
    Loss: 2.46115 Grads: 0.0
    Loss: 2.26115 Grads: 0.0
    

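    It looks as if the model is somehow resetting its weights, or has stopped learning entirely. I also tried the sparse softmax cross-entropy loss, but it did not help.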
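
    One caveat about the diagnostic printed above: the gradient of a softmax cross-entropy loss with respect to the logits is `softmax(z) - y` for each example, and its components always sum to zero because both vectors sum to 1. The same cancellation holds for a gradient taken through any softmax input, so a value of `tf.reduce_sum(tf.gradients(loss, dense1)[0])` close to 0 does not by itself show that learning has stopped. A minimal pure-Python sketch follows; the helper names and logit values are illustrative and not taken from the post:

```python
import math

def softmax(z):
    """Numerically stable softmax of a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_grad(logits, target):
    """Gradient of -log(softmax(logits)[target]) with respect to the logits."""
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

# Illustrative logits for a 4-class problem (hypothetical values).
logits = [2.0, -1.0, 0.5, 0.0]
grad = cross_entropy_grad(logits, target=2)

grad_sum = sum(grad)                      # cancels to ~0 by construction
grad_abs_sum = sum(abs(g) for g in grad)  # clearly non-zero

print("sum of gradient:   ", grad_sum)
print("sum of |gradient|: ", grad_abs_sum)
```

    Summing absolute values instead (for example with `tf.abs` before `tf.reduce_sum`) is likely to give a more informative signal when checking whether gradients have really vanished.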

    1 answer  |  last activity 8 years ago
  •   Dr. Snoopy · 8 years ago

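    You are applying softmax to the output twice: once with tf.nn.softmax, and once more inside softmax_cross_entropy, which expects raw logits and applies softmax itself. This may destroy any learning ability in the network. Passing dense1 (the logits) to the loss instead of predictions avoids the double application.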
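
    To make the effect concrete, here is a small pure-Python sketch (not TensorFlow code; the helper names are illustrative) of what the loss sees when it receives probabilities instead of logits. Because the output of the first softmax lies in [0, 1], the softmax inside the loss can never become confident. With 10 classes, even a perfect one-hot prediction gives a loss of about 1.46, and a confidently wrong one gives about 2.46. A batch of five with one correct and four wrong predictions averages to about 2.26. These match most of the loss values in the question's output, which is consistent with the first softmax having saturated, although that reading is an inference and not something stated in the post.

```python
import math

def softmax(z):
    """Numerically stable softmax of a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """-log(softmax(logits)[target]); softmax is applied inside the loss."""
    return -math.log(softmax(logits)[target])

num_classes = 10

# A fully saturated output of the first softmax: one-hot on class 3.
one_hot = [1.0 if i == 3 else 0.0 for i in range(num_classes)]

# Bug pattern: probabilities are passed where logits are expected.
loss_correct = cross_entropy(one_hot, target=3)  # label matches the prediction
loss_wrong = cross_entropy(one_hot, target=7)    # label does not match

# Mean over a batch of five with one correct and four wrong predictions.
batch_mean = (loss_correct + 4 * loss_wrong) / 5

# For comparison: raw logits are unbounded, so the loss can approach 0.
confident_logits = [20.0 if i == 3 else 0.0 for i in range(num_classes)]
loss_from_logits = cross_entropy(confident_logits, target=3)

print(round(loss_correct, 5), round(loss_wrong, 5), round(batch_mean, 5))
print(loss_from_logits)
```

    The lower bound of roughly 1.46 is ln(e + 9) - 1, so with the double softmax the loss cannot get close to 0 no matter how the weights change.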