Problem implementing a Keras model in TensorFlow

  • Susmit Agrawal Sudip Bolakhe  · Tech Community  · 6 years ago

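    I am new to TensorFlow.

    I am trying to implement a model that classifies the digits in the MNIST dataset.

    I am familiar with Keras, so I used it to build the model first.

    Keras code: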

    from keras.models import Sequential
    from keras.layers import Dense
    from keras.datasets import mnist
    from os import path
    
    import numpy as np
    
    network = Sequential()
    network.add(Dense(700, input_dim=784, activation='tanh'))
    network.add(Dense(500, activation='tanh'))
    network.add(Dense(500, activation='tanh'))
    network.add(Dense(500, activation='tanh'))
    network.add(Dense(10, activation='softmax'))
    
    network.compile(loss='categorical_crossentropy', optimizer='adam')
    
    (x_train, y_temp), (x_test, y_test) = mnist.load_data()
    y_train = vectorize(y_temp)  # I defined this function to create vectors of the labels. It works without issues.
    
    x_train = x_train.reshape(x_train.shape[0], x_train.shape[1]*x_train.shape[2])
    
    network.fit(x_train, y_train, batch_size=100, epochs=3)
    
    x_test = x_test.reshape(x_test.shape[0], x_test.shape[1]*x_test.shape[2])
    
    
    scores = network.predict(x_test)
    
    correct_pred = 0
    for i in range(len(scores)):
        if np.argmax(scores[i]) == y_test[i]:
            correct_pred += 1
    
    print((correct_pred/len(scores))*100)
    

    The code above gives me about 92% accuracy.
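
    The `vectorize` helper called in the Keras code is not shown in the post. A minimal sketch of what it might look like, assuming it simply one-hot encodes the integer labels (the body and the `num_classes` parameter are assumptions, not the author's code):

```python
import numpy as np

def vectorize(labels, num_classes=10):
    """One-hot encode integer labels (hypothetical stand-in for the post's helper)."""
    labels = np.asarray(labels, dtype=np.int64)
    one_hot = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Set a single 1.0 per row, at the column given by that row's label.
    one_hot[np.arange(labels.shape[0]), labels] = 1.0
    return one_hot

encoded = vectorize([3, 0, 9])
print(encoded.shape)           # (3, 10)
print(encoded.argmax(axis=1))  # [3 0 9]
```

    Keras also provides `keras.utils.to_categorical`, which should be usable for the same purpose.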

    I then tried to implement the same model in TensorFlow:

    import sys
    
    import tensorflow as tf
    from tensorflow.examples.tutorials.mnist import input_data
    
    data = input_data.read_data_sets('.', one_hot=True)
    
    sess = tf.InteractiveSession()
    
    x = tf.placeholder(tf.float32, [None, 784])
    y = tf.placeholder(tf.float32, [None, 10])
    
    w = tf.Variable(tf.zeros([784, 700]))
    w2 = tf.Variable(tf.zeros([700, 500]))
    w3 = tf.Variable(tf.zeros([500, 500]))
    w4 = tf.Variable(tf.zeros([500, 500]))
    w5 = tf.Variable(tf.zeros([500, 10]))
    
    h1 = tf.nn.tanh(tf.matmul(x, w))
    h2 = tf.nn.tanh(tf.matmul(h1, w2))
    h3 = tf.nn.tanh(tf.matmul(h2, w3))
    h4 = tf.nn.tanh(tf.matmul(h3, w4))
    h = tf.matmul(h4, w5)
    
    loss = tf.math.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=h, labels=y))
    gradient_descent = tf.train.AdamOptimizer().minimize(loss)
    
    correct_mask = tf.equal(tf.argmax(h, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_mask, tf.float32))
    
    sess.run(tf.global_variables_initializer())
    
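    # Build the print op once, outside the loop, so the graph does not grow on every step.
    loss_print = tf.print(loss, output_stream=sys.stdout)
    
    # Note: each iteration processes one mini-batch of 100 images, not a full epoch.
    for i in range(3):
        batch_x, batch_y = data.train.next_batch(100)
        sess.run([gradient_descent, loss_print], feed_dict={x: batch_x, y: batch_y})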
    
    ans = sess.run(accuracy, feed_dict={x: data.test.images, y: data.test.labels})
    
    print(ans)
    

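    However, this code gives me only about 11% accuracy. I tried increasing the number of epochs to 1000, but the result did not change. In addition, the loss is the same at every epoch (2.30).

    Am I missing something in the TensorFlow code?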

    1 answer  |  as of 6 years ago
  •   Susmit Agrawal Sudip Bolakhe    6 years ago

    As it turned out, the problem was that I had initialized the weights to zero!
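
    Why zero initialization stalls this particular network can be checked without TensorFlow. The sketch below is a NumPy re-creation, not the code from the post: `forward_backward` is a hypothetical helper, and the random batch is a stand-in for MNIST data. It assumes the same bias-free tanh layers as the question. With all weights at zero, every hidden activation is tanh(0) = 0, the logits are all zero, the softmax is uniform over the 10 classes, and the loss is ln(10) ≈ 2.30, which matches the value reported in the question. Every gradient is also exactly zero, so no optimizer step can move the weights.

```python
import numpy as np

def forward_backward(weights, x, y_onehot):
    """Loss and weight gradients of a bias-free tanh MLP with softmax cross-entropy."""
    activations = [x]
    for w in weights[:-1]:
        activations.append(np.tanh(activations[-1] @ w))
    logits = activations[-1] @ weights[-1]

    # Numerically stable softmax, then mean cross-entropy over the batch.
    shifted = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
    loss = -np.mean(np.sum(y_onehot * np.log(probs), axis=1))

    # Backpropagation: delta is the gradient w.r.t. the current layer's pre-activation.
    grads = [None] * len(weights)
    delta = (probs - y_onehot) / x.shape[0]
    for i in reversed(range(len(weights))):
        grads[i] = activations[i].T @ delta
        if i > 0:
            # Go back through weights[i] and the tanh that produced activations[i].
            delta = (delta @ weights[i].T) * (1.0 - activations[i] ** 2)
    return loss, grads

rng = np.random.default_rng(0)
x = rng.random((8, 784))                            # stand-in batch, not real MNIST
y_onehot = np.eye(10)[rng.integers(0, 10, size=8)]  # random one-hot labels

sizes = [784, 700, 500, 500, 500, 10]
zero_weights = [np.zeros((m, n)) for m, n in zip(sizes[:-1], sizes[1:])]

loss, grads = forward_backward(zero_weights, x, y_onehot)
max_grad = max(np.abs(g).max() for g in grads)
print(loss, np.log(10))  # the two values agree: loss is ln(10)
print(max_grad)          # 0.0
```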

    Simply changing

    w = tf.Variable(tf.zeros([784, 700]))
    w2 = tf.Variable(tf.zeros([700, 500]))
    w3 = tf.Variable(tf.zeros([500, 500]))
    w4 = tf.Variable(tf.zeros([500, 500]))
    w5 = tf.Variable(tf.zeros([500, 10]))
    
    to

    w = tf.Variable(tf.random_normal([784, 700], seed=42))
    w2 = tf.Variable(tf.random_normal([700, 500], seed=42))
    w3 = tf.Variable(tf.random_normal([500, 500], seed=42))
    w4 = tf.Variable(tf.random_normal([500, 500], seed=42))
    w5 = tf.Variable(tf.random_normal([500, 10], seed=42))
    

    made a big improvement.
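
    One possible further refinement: `tf.random_normal` defaults to a standard deviation of 1.0, whereas Keras `Dense` layers default to the `glorot_uniform` initializer (and also add a bias term, which the TensorFlow version omits). The NumPy sketch below compares how many first-layer tanh units end up saturated under each scheme. The `glorot_uniform` function here is a hand-written stand-in rather than the Keras one, and the uniform random batch is a hypothetical substitute for real images, so treat the numbers as illustrative only.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng):
    """Sample weights uniformly from [-limit, limit], limit = sqrt(6 / (fan_in + fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

rng = np.random.default_rng(42)
x = rng.random((100, 784))  # stand-in batch with values in [0, 1)

w_unit = rng.standard_normal((784, 700))  # standard deviation 1.0
w_glorot = glorot_uniform(784, 700, rng)

# Fraction of first-layer units whose tanh output is essentially pinned at +/-1.
sat_unit = np.mean(np.abs(np.tanh(x @ w_unit)) > 0.99)
sat_glorot = np.mean(np.abs(np.tanh(x @ w_glorot)) > 0.99)

print(sat_unit, sat_glorot)
```

    On this stand-in data most units saturate with unit-variance weights and almost none do with the Glorot-style scaling. Saturated tanh units pass very small gradients, which may explain part of any remaining gap to the Keras result.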