
Apache MXNet: converting between Gluon and Module (and vice versa)?

  • OneRaynyDay  · 7 years ago

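    I would like to know how to convert between the two APIs, because the quantization features seem to work mainly with syms, arg_params, aux_params.

    Here is a small code snippet that trains a CNN model: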

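    import mxnet as mx
    from mxnet import autograd, gluon

    ctx = mx.cpu()  # assumption: the original post does not show how ctx is defined
    batch_size = 64
    num_inputs = 784
    num_outputs = 10
    # x and y are the training data and labels (not shown in the original post)
    data_iter = mx.io.NDArrayIter(x, y, batch_size=batch_size)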
    
    num_fc = 512
    net = gluon.nn.HybridSequential()
    with net.name_scope():
        net.add(gluon.nn.Conv2D(channels=20, kernel_size=5, activation='relu'))
        net.add(gluon.nn.MaxPool2D(pool_size=2, strides=2))
        net.add(gluon.nn.Conv2D(channels=50, kernel_size=5, activation='relu'))
        net.add(gluon.nn.MaxPool2D(pool_size=2, strides=2))
        net.add(gluon.nn.Flatten())
        net.add(gluon.nn.Dense(num_fc, activation="relu"))
        net.add(gluon.nn.Dense(num_outputs))
    
    net.hybridize()
    # Parameter initialization
    net.collect_params().initialize(mx.init.Xavier(magnitude=2.24), ctx=ctx)
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': .1})
    softmax_cross_entropy = gluon.loss.SoftmaxCrossEntropyLoss()
    for i, batch in enumerate(data_iter):
        data = batch.data[0].as_in_context(ctx)
        label = batch.label[0].as_in_context(ctx)
        with autograd.record():
            output = net(data)
            loss = softmax_cross_entropy(output, label)
        loss.backward()
        trainer.step(data.shape[0])
    

    To quantize the Gluon model, I tried serializing it to disk and then loading it back as a Module. This is where it fails:

    import os
    net.export('mxnet')
    mod = mx.module.Module.load('mxnet', 0) # 0 epoch
    

    mod.bind( data_shapes = data_iter.provide_data, 
              label_shapes = data_iter.provide_label)
    mod.predict(x)
    

    But it doesn't work with predict; it fails with the following stack trace:

    ----------------------------------------------
    KeyError     Traceback (most recent call last)
    <ipython-input-10-f53137bb5e95> in <module>()
          1 mod.bind( data_shapes = data_iter.provide_data, 
    ----> 2           label_shapes = data_iter.provide_label)
          3 mod.predict(x)
    
    ~/anaconda3/envs/idp3/lib/python3.6/site-packages/mxnet/module/module.py in bind(self, data_shapes, label_shapes, for_training, inputs_need_grad, force_rebind, shared_module, grad_req)
        434                                                      fixed_param_names=self._fixed_param_names,
        435                                                      grad_req=grad_req, group2ctxs=self._group2ctxs,
    --> 436                                                      state_names=self._state_names)
        437         self._total_exec_bytes = self._exec_group._total_exec_bytes
        438         if shared_module is not None:
    
    ~/anaconda3/envs/idp3/lib/python3.6/site-packages/mxnet/module/executor_group.py in __init__(self, symbol, contexts, workload, data_shapes, label_shapes, param_names, for_training, inputs_need_grad, shared_group, logger, fixed_param_names, grad_req, state_names, group2ctxs)
        281 
        282         eprint(sys._getframe().f_lineno, data_shapes, label_shapes)
    --> 283         self.bind_exec(data_shapes, label_shapes, shared_group)
        284 
        285     def decide_slices(self, data_shapes):
    
    ~/anaconda3/envs/idp3/lib/python3.6/site-packages/mxnet/module/executor_group.py in bind_exec(self, data_shapes, label_shapes, shared_group, reshape)
        388         if label_shapes is not None:
        389             self.label_names = [i.name for i in self.label_shapes]
    --> 390         self._collect_arrays()
        391 
        392     def reshape(self, data_shapes, label_shapes):
    
    ~/anaconda3/envs/idp3/lib/python3.6/site-packages/mxnet/module/executor_group.py in _collect_arrays(self)
        324             self.label_arrays = [[(self.slices[i], e.arg_dict[name])
        325                                   for i, e in enumerate(self.execs)]
    --> 326                                  for name, _ in self.label_shapes]
        327         else:
        328             self.label_arrays = None
    
    ~/anaconda3/envs/idp3/lib/python3.6/site-packages/mxnet/module/executor_group.py in <listcomp>(.0)
        324             self.label_arrays = [[(self.slices[i], e.arg_dict[name])
        325                                   for i, e in enumerate(self.execs)]
    --> 326                                  for name, _ in self.label_shapes]
        327         else:
        328             self.label_arrays = None
    
    ~/anaconda3/envs/idp3/lib/python3.6/site-packages/mxnet/module/executor_group.py in <listcomp>(.0)
        323                 eprint(323, e.arg_dict.keys())
        324             self.label_arrays = [[(self.slices[i], e.arg_dict[name])
    --> 325                                   for i, e in enumerate(self.execs)]
        326                                  for name, _ in self.label_shapes]
        327         else:
    
    KeyError: 'softmax_label'
    

    It is complaining that I don't have softmax_label in my e.arg_dict .

    e.arg_dict.keys() :

    (['data', 'hybridsequential1_conv0_weight', 'hybridsequential1_conv0_bias', 'hybridsequential1_conv1_weight', 'hybridsequential1_conv1_bias', 'hybridsequential1_dense0_weight', 'hybridsequential1_dense0_bias', 'hybridsequential1_dense1_weight', 'hybridsequential1_dense1_bias'])

    Indeed, softmax_label is not in there. Where does this label come from, and how can I correctly convert a Module to Gluon?

    1 answer

  •   sad-    7 years ago

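    For the first part of your question (where does the label come from?):

    By default, the Module API assumes the label is named softmax_label, and it looks for that input when you pass the label_shapes = data_iter.provide_label arg to the mod.bind call. You can avoid this by explicitly setting label_shapes = None instead. See the answer at https://discuss.mxnet.io/t/gluon-module-what-is-label-name-and-why-do-i-need-labels-for-modules-to-run-bind/1433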
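
As a rough illustration of why the lookup fails, here is a minimal pure-Python sketch. The helper name usable_label_shapes is hypothetical (it is not part of MXNet); it only keeps the label descriptors whose names are actual inputs of the symbol, and returns None otherwise, assuming each descriptor carries its name in the first position as the (name, shape) pairs in the stack trace do.

```python
def usable_label_shapes(arg_names, label_descs):
    """Keep only label descriptors whose name is an input of the symbol.

    Hypothetical helper, not an MXNet function. Returns None when no label
    matches, which corresponds to passing label_shapes=None to bind.
    """
    known = set(arg_names)
    kept = [desc for desc in (label_descs or []) if desc[0] in known]
    return kept or None


# Argument names copied from the e.arg_dict.keys() output in the question
# (abridged to three entries for brevity).
arg_names = [
    'data',
    'hybridsequential1_conv0_weight',
    'hybridsequential1_conv0_bias',
]
# The label name that bind looked up; (64,) matches the batch size used above.
label_descs = [('softmax_label', (64,))]

print(usable_label_shapes(arg_names, label_descs))  # None
```

With a real Module, the same check could feed bind directly, for example mod.bind(data_shapes=data_iter.provide_data, label_shapes=usable_label_shapes(mod.symbol.list_arguments(), data_iter.provide_label), for_training=False). Treat that call as an untested sketch rather than a verified fix.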

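    For the second part of your question (how do I correctly convert a Module to a Gluon model?):

    To convert a symbolic model to a Gluon model, you can:

    • Save the symbolic model to disk using mod.save_checkpoint or mod.save_params
    • Define the same network architecture in Gluon ( net )
    • Load the parameters using net.load_params(filename, ctx=ctx)

    For example: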

    mod.save_params('mxnet.params')
    net2 = gluon.nn.HybridSequential()
    with net2.name_scope():
        net2.add(gluon.nn.Conv2D(channels=20, kernel_size=5, activation='relu'))
        net2.add(gluon.nn.MaxPool2D(pool_size=2, strides=2))
        net2.add(gluon.nn.Conv2D(channels=50, kernel_size=5, activation='relu'))
        net2.add(gluon.nn.MaxPool2D(pool_size=2, strides=2))
        net2.add(gluon.nn.Flatten())
        net2.add(gluon.nn.Dense(num_fc, activation="relu"))
        net2.add(gluon.nn.Dense(num_outputs))
    
    net2.hybridize()
    net2.load_params('mxnet.params', ctx=ctx)
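
If loading complains about parameter names, it can help to inspect what was saved. Module.save_params stores each key with an 'arg:' or 'aux:' prefix, so a minimal sketch for separating the two groups might look like the following. The helper name split_saved_params is hypothetical, and plain strings stand in for the real NDArray values so the snippet runs without MXNet.

```python
def split_saved_params(saved):
    """Split a {'arg:name' / 'aux:name': value} dict into (arg_params, aux_params).

    Hypothetical helper, not an MXNet function.
    """
    arg_params, aux_params = {}, {}
    for key, value in saved.items():
        kind, sep, name = key.partition(':')
        if sep and kind == 'arg':
            arg_params[name] = value
        elif sep and kind == 'aux':
            aux_params[name] = value
        else:
            raise ValueError('unexpected parameter key: %r' % key)
    return arg_params, aux_params


# Stand-in for the dict loaded from disk; strings replace the real arrays.
# The first name comes from the question; the 'aux:' entry is hypothetical.
saved = {
    'arg:hybridsequential1_conv0_weight': 'weight-array',
    'aux:hypothetical_moving_mean': 'mean-array',
}
arg_params, aux_params = split_saved_params(saved)
print(sorted(arg_params))  # ['hybridsequential1_conv0_weight']
print(sorted(aux_params))  # ['hypothetical_moving_mean']
```

With MXNet installed, the input would come from mx.nd.load('mxnet.params'), and the resulting names could be compared against net2.collect_params().keys(). If the auto-generated prefixes differ (hybridsequential1_ versus whatever prefix net2 received), a name mismatch is a likely cause of a loading failure.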
    