
TensorFlow - adding Dropout layers significantly increases inference time

  • Konstantin Vdovkin  · 5 years ago

    I have a relatively small CNN:

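    # imports needed by the snippets in this post
    import time
    import numpy as np
    import tensorflow as tf

    model = tf.keras.models.Sequential([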
        tf.keras.layers.Conv2D(input_shape=(400,400,3), filters=6, kernel_size=5, padding='same', activation='relu'),
        tf.keras.layers.Conv2D(filters=12, kernel_size=3, padding='same', activation='relu'),
        tf.keras.layers.Conv2D(filters=24, kernel_size=3, strides=2, padding='valid', activation='relu'),
        tf.keras.layers.Conv2D(filters=32, kernel_size=3, strides=2, padding='valid', activation='relu'),
        tf.keras.layers.Conv2D(filters=48, kernel_size=3, strides=2, padding='valid', activation='relu'),
        tf.keras.layers.Conv2D(filters=64, kernel_size=3, strides=2, padding='valid', activation='relu'),
        tf.keras.layers.Conv2D(filters=96, kernel_size=3, strides=2, padding='valid', activation='relu'),
        tf.keras.layers.Conv2D(filters=128, kernel_size=3, strides=2, padding='valid', activation='relu'),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dense(240, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy')
    

    I measure the inference time like this:

    for img_per_batch in [1, 5, 10, 50]:
        # warm up the model
        image = np.random.random(size=(img_per_batch, 400, 400, 3)).astype('float32')
        model(image, training=False)
    
        n_iter = 100
        start_time = time.time()
        for _ in range(n_iter):
            image = np.random.random(size=(img_per_batch, 400, 400, 3)).astype('float32')
            model(image, training=False)
        dt = (time.time() - start_time) * 1000
        print(f'img_per_batch = {img_per_batch}, {dt/n_iter:.2f} ms per iteration, {dt/n_iter/img_per_batch:.2f} ms per image')
    

    My output (Nvidia Jetson Xavier, tensorflow==2.0.0):

    img_per_batch = 1, 21.74 ms per iteration, 21.74 ms per image
    img_per_batch = 5, 42.35 ms per iteration, 8.47 ms per image
    img_per_batch = 10, 68.37 ms per iteration, 6.84 ms per image
    img_per_batch = 50, 312.83 ms per iteration, 6.26 ms per image
    

    I then add a Dropout layer after each hidden dense layer:

    model = tf.keras.models.Sequential([
        # ... convolution layers are same
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dropout(.3),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dropout(.3),
        tf.keras.layers.Dense(256, activation='relu'),
        tf.keras.layers.Dropout(.3),
        tf.keras.layers.Dense(240, activation='softmax')
    ])
    

    After adding the Dropout layers, the output becomes:

    img_per_batch = 1, 31.18 ms per iteration, 31.18 ms per image
    img_per_batch = 5, 76.15 ms per iteration, 15.23 ms per image
    img_per_batch = 10, 127.91 ms per iteration, 12.79 ms per image
    img_per_batch = 50, 513.85 ms per iteration, 10.28 ms per image
    

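    In theory, Dropout layers should not affect inference performance. But in the code above, adding them makes prediction for a single image about 1.5 times slower, and batch prediction for 10 images almost twice as slow as without Dropout. Am I doing something wrong?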
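
    To make the size of the slowdown concrete, the ratios can be recomputed from the two timing outputs above. This is only a sketch: the dictionary names are made up for illustration, and the numbers are the per-iteration times copied from the question.

```python
# Per-iteration times in ms, copied from the two outputs in the question.
without_dropout = {1: 21.74, 5: 42.35, 10: 68.37, 50: 312.83}
with_dropout = {1: 31.18, 5: 76.15, 10: 127.91, 50: 513.85}

# Slowdown factor per batch size: time with Dropout / time without Dropout.
slowdown = {b: with_dropout[b] / without_dropout[b] for b in without_dropout}

for batch, ratio in slowdown.items():
    print(f"img_per_batch = {batch}: {ratio:.2f}x slower")
```

    The factors fall between roughly 1.4 and 1.9 depending on the batch size, which matches the "1.5 times" and "almost twice" figures quoted above.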

  •   Daniele Grattarola    5 years ago

    Apparently, this is a known issue in TensorFlow 2.0.0: see this GitHub comment

    Try using model.predict(x) instead of model(x).

    It should also be fixed by updating to a more recent version of TensorFlow, such as 2.1.0.
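
    To check whether either suggestion helps on a given setup, both call paths can be timed with the same loop as in the question. Below is a minimal sketch of such a helper. The name `ms_per_call` and its parameters are hypothetical, and the stand-in workload is there only so the snippet runs without TensorFlow. The commented lines assume the `model`, `np` and `tf` objects from the question.

```python
import time


def ms_per_call(fn, make_input, n_iter=100, n_warmup=1):
    """Return the mean wall-clock time of fn(make_input()) in milliseconds."""
    # Warm-up calls are excluded from the measurement, as in the question.
    for _ in range(n_warmup):
        fn(make_input())
    start = time.perf_counter()
    for _ in range(n_iter):
        # Like the original loop, input creation is part of the timed region.
        fn(make_input())
    return (time.perf_counter() - start) * 1000 / n_iter


# Stand-in workload so the sketch runs without TensorFlow installed.
demo_ms = ms_per_call(lambda x: sum(x), lambda: list(range(1000)), n_iter=20)
print(f"{demo_ms:.4f} ms per call")

# With the model from the question (assumes `model` and `np` are defined):
# make_batch = lambda: np.random.random(size=(10, 400, 400, 3)).astype('float32')
# print(ms_per_call(lambda x: model(x, training=False), make_batch))
# print(ms_per_call(model.predict, make_batch))
```

    Running the two commented calls on the model with and without Dropout should show whether the gap is specific to calling model(x) directly.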

    Hope this helps.