TensorFlow gradients are 0, weights are not updating

deep-learning computer-vision machine-learning tensorflow python

mathjunkie · 8 years ago

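After using Keras for a while, I am trying to learn TensorFlow by building a ConvNet for CIFAR-10 classification. However, I think I have misunderstood something in the TensorFlow API, because the weights do not update even in a 1-layer model.

The model code is as follows: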

num_epochs = 10 
batch_size = 64

# Shape of mu and std is correct: (1, 32, 32, 3)
mu = np.mean(X_train, axis=0, keepdims=True)
sigma = np.std(X_train, axis=0, keepdims=True)

# Placeholders for data & normalization
# (normalisation does not help)
data = tf.placeholder(np.float32, shape=(None, 32, 32, 3), name='data')
labels = tf.placeholder(np.int32, shape=(None,), name='labels')
data = (data - mu) / sigma

# flatten
flat = tf.reshape(data, shape=(-1, 32 * 32 * 3))
dense1 = tf.layers.dense(inputs=flat, units=10)
predictions = tf.nn.softmax(dense1)

onehot_labels = tf.one_hot(indices=labels, depth=10)

# Tried sparse_softmax_cross_entropy_with_logits as well
loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=predictions)
loss = tf.reduce_mean(loss)

# Learning rate does not matter as the weights are not updating!
optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(loss)
loss_history = []

with tf.Session() as session:
    tf.global_variables_initializer().run()
    tf.local_variables_initializer().run()

    for epochs in range(10):
        print("Epoch:", epochs)
        # Load tiny batches
        for batch in iterate_minibatches(X_train.astype(np.float32)[:10], y_train[:10], 5):
            inputs, target = batch
            feed_dict = {data: inputs, labels: target}
            loss_val, _ = session.run([loss, optimizer], feed_dict=feed_dict)
            grads = tf.reduce_sum(tf.gradients(loss, dense1)[0])
            grads = session.run(grads, {data: inputs, labels: target})
            print("Loss:", loss_val, "Grads:", grads)

This code produces the following output:

Epoch: 0
Loss: 2.46115 Grads: -1.02031e-17
Loss: 2.46041 Grads: 0.0
Epoch: 1
Loss: 2.46115 Grads: 0.0
Loss: 2.26115 Grads: 0.0
Epoch: 2
Loss: 2.46115 Grads: 0.0
Loss: 2.26115 Grads: 0.0
Epoch: 3
Loss: 2.46115 Grads: 0.0
Loss: 2.26115 Grads: 0.0
Epoch: 4
Loss: 2.46115 Grads: 0.0
Loss: 2.26115 Grads: 0.0
Epoch: 5
Loss: 2.46115 Grads: 0.0
Loss: 2.26115 Grads: 0.0
Epoch: 6
Loss: 2.46115 Grads: 0.0
Loss: 2.26115 Grads: 0.0
Epoch: 7
Loss: 2.46115 Grads: 0.0
Loss: 2.26115 Grads: 0.0
Epoch: 8
Loss: 2.46115 Grads: 0.0
Loss: 2.26115 Grads: 0.0
Epoch: 9
Loss: 2.46115 Grads: 0.0
Loss: 2.26115 Grads: 0.0

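It looks as if the model is somehow resetting its weights, or has stopped learning entirely. I also tried the sparse softmax cross-entropy loss, but it did not help.

1 answer | 8 years ago

Dr. Snoopy · 8 years ago

You are applying softmax to the output twice: once with tf.nn.softmax, and again when you apply softmax_cross_entropy. This probably destroys any learning capability in the network.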
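To make the double-softmax effect concrete, here is a minimal pure-Python sketch (no TensorFlow needed). It assumes the first softmax is saturated, meaning one wrong class has a very large logit; the logit values and the helper names `softmax`, `single`, `double` and `grad_norm` are hypothetical and chosen only for illustration. Under that assumption the second softmax sees an almost exactly one-hot input, so the loss stops depending on the logits and a finite-difference gradient comes out as zero. The resulting loss values also happen to match the 2.46115 and 2.26115 printed in the question, which is consistent with this explanation but does not prove it.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def single(z, label):
    """Cross-entropy with softmax applied once (inside the loss only)."""
    return -math.log(softmax(z)[label])

def double(z, label):
    """Cross-entropy where the loss receives already-softmaxed outputs."""
    return -math.log(softmax(softmax(z))[label])

def grad_norm(loss_fn, z, label, eps=1e-4):
    """L1 norm of a central finite-difference gradient w.r.t. the logits."""
    total = 0.0
    for i in range(len(z)):
        up = z[:i] + [z[i] + eps] + z[i + 1:]
        down = z[:i] + [z[i] - eps] + z[i + 1:]
        total += abs(loss_fn(up, label) - loss_fn(down, label)) / (2 * eps)
    return total

# Hypothetical saturated logits: class 3 dominates all others.
logits = [0.0] * 10
logits[3] = 500.0

wrong = double(logits, 0)   # true label 0, prediction 3: log(9 + e)
right = double(logits, 3)   # true label 3, prediction 3: log(9 + e) - 1
batch_mean = (4 * wrong + right) / 5  # batch of 5 with one correct sample

print("double softmax, wrong class: %.5f" % wrong)       # about 2.46115
print("double softmax, 4 wrong + 1 right: %.5f" % batch_mean)  # about 2.26115
print("grad norm, double softmax:", grad_norm(double, logits, 0))
print("grad norm, single softmax:", grad_norm(single, logits, 0))

# In the question's code the corresponding change would be to pass the raw
# layer output to the loss and keep tf.nn.softmax only for predictions:
#   loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels,
#                                          logits=dense1)
```

With a single softmax the gradient with respect to the logits is the predicted probabilities minus the one-hot label, so it stays large whenever the prediction is confidently wrong. That is why the loss function should receive `dense1` rather than `predictions`.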
