
"TypeError: expected bytes, str found" when creating an image dataset with the Dataset API

  •  0
  • Colonder  · 7 years ago

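    I want to create a TensorFlow dataset from my images using the Dataset API. The images are organized in a complex directory hierarchy, but at the lowest level there are always two directories, "False" and "Genuine". I wrote this code: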

    import tensorflow as tf
    from tensorflow.data import Dataset
    import os
    
    def enumerate_all_files(rootdir):
        for subdir, dir, files in os.walk(rootdir):
            for file in files:
                # return path to the file and its label
                # label is simply a 1 or 0 depending on whether an image is in the "Genuine" folder or not
                yield os.path.join(subdir, file), int(subdir.split(os.path.sep)[-1] == "Genuine")
    
    def input_parser(img_path, label):
        # convert the label to one-hot encoding
        one_hot = tf.one_hot(label, 2)
        # read the img from file
        img_file = tf.read_file(img_path)
        img_decoded = tf.image.decode_png(img_file, channels=3)
        return img_decoded, one_hot
    
    def get_dataset():
        generator = lambda: enumerate_all_files("/tmp/images/training/")
        dataset = Dataset.from_generator(generator, (tf.string, tf.int32)).shuffle(1000).batch(100)
        dataset = dataset.map(input_parser)
        return dataset
    

    However, when I run it in the terminal,

    tf.enable_eager_execution()
    # all the code above
    d = get_dataset()
    for f in d.make_one_shot_iterator():
        print(f)
    

    it crashes with this error:

    W tensorflow/core/framework/op_kernel.cc:1306] Unknown: SystemError: <weakref at 0x7ff8232f0620; to 'function' at 0x7ff8233c9048 (generator_py_func)> returned a result with an error set
    TypeError: expected bytes, str found  
    
    The above exception was the direct cause of the following exception:
    
    Traceback (most recent call last):
      File "lcnn.py", line 29, in <module>
        for f in d.make_one_shot_iterator():
      File "/opt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 487, in __next__
        return self.next()
      File "/opt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 518, in next
        return self._next_internal()
      File "/opt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/data/ops/iterator_ops.py", line 508, in _next_internal
        output_shapes=self._flat_output_shapes)
      File "/opt/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 1848, in iterator_get_next_sync
        "output_types", output_types, "output_shapes", output_shapes)
    SystemError: <built-in function TFE_Py_FastPathExecute> returned a result with an error set
    

    What am I doing wrong here?

    EDIT
    I tried running the code without calling map, shuffle, and batch, and with input_parser commented out, but the error still occurs.

    EDIT 2
    I changed Dataset.from_generator to Dataset.from_tensor_slices to check whether the code that opens the image works. The changed code looks like this:

    def input_parser(img_path):
        # convert the label to one-hot encoding
        # one_hot = tf.one_hot(label, 2)
        # read the img from file
        img_file = tf.read_file(img_path)
        img_decoded = tf.image.decode_png(img_file, channels=3)
        return img_decoded
    
    def get_dataset():
        dataset = Dataset.from_tensor_slices(["/tmp/images/training/1000010.png"]).map(input_parser).shuffle(1000).batch(100)
        return dataset  
    

    This version, however, works fine.

    2 replies  |  latest 7 years ago
        1
  •  0
  •   sdcbr    7 years ago

    def get_dataset():
        generator = lambda: enumerate_all_files("/tmp/images/training/")
        dataset = Dataset.from_generator(generator, (tf.string,tf.int32)).map(input_parser)
        dataset = dataset.shuffle(1000).batch(100)
        return dataset
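
    The change in this version is the order of operations: map(input_parser) now runs before shuffle and batch, so input_parser receives one path and one label at a time instead of a batch of 100. The pure-Python sketch below is only an analogy for that ordering and does not use TensorFlow; parse, batched, and the sample records are hypothetical names introduced for illustration.

```python
from itertools import islice

def parse(path, label):
    """Per-element step: stands in for input_parser and expects ONE path."""
    if not isinstance(path, str):
        raise TypeError("parse expects a single path, got %s" % type(path).__name__)
    return path.upper(), label

def batched(iterable, size):
    """Group an iterable into lists of at most `size` items."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# Hypothetical (path, label) records, as the generator would yield them
records = [("a.png", 1), ("b.png", 0), ("c.png", 1)]

# map first, then batch: parse sees one (path, label) pair at a time
map_then_batch = list(batched((parse(p, l) for p, l in records), 2))

# batch first, then map: parse is handed a whole list of paths and fails
def batch_then_map():
    return [parse([p for p, _ in chunk], [l for _, l in chunk])
            for chunk in batched(records, 2)]
```

    Calling batch_then_map() raises the TypeError from parse, which mirrors why a function written for a single file path should be mapped before the elements are batched.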
    
        2
  •  0
  •   Colonder    7 years ago

    I replaced Dataset.from_generator with Dataset.from_tensor_slices:

    import tensorflow as tf
    from tensorflow.data import Dataset
    import os
    tf.enable_eager_execution()
    
    def enumerate_all_files(rootdir):
        for subdir, dir, files in os.walk(rootdir):
            for file in files:
                # return path to the file and its label
                # label is simply a 1 or 0 depending on whether an image is in the "Genuine" folder or not
                yield os.path.join(subdir, file), int(subdir.split(os.path.sep)[-1] == "Genuine")
    
    def input_parser(img_path, label):
        # convert the label to one-hot encoding
        one_hot = tf.one_hot(label, 2)
        # read the img from file
        img_file = tf.read_file(img_path)
        img_decoded = tf.image.decode_png(img_file, channels=3)
        return img_decoded, one_hot
    
    def get_dataset():
    
        file_paths = []
        labels = []
    
        for i in enumerate_all_files("/media/kuba/Seagate Expansion Drive/MGR/Spektrogramy/FFT/training/"):
            file_paths.append(i[0])
            labels.append(i[1])
        dataset = Dataset.from_tensor_slices((file_paths, labels)).map(input_parser).shuffle(1000).batch(100)
        return dataset
    
    d = get_dataset()
    for f in d.make_one_shot_iterator():
        print(type(f))
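
    This workaround collects every path in memory before building the dataset. For anyone who would rather keep Dataset.from_generator, the message "expected bytes, str found" hints that byte strings may be needed for the tf.string component; that is an untested assumption, not a confirmed cause. A minimal sketch of a generator that yields encoded paths, using only the standard library (enumerate_all_files_bytes is a hypothetical name):

```python
import os
import tempfile

def enumerate_all_files_bytes(rootdir):
    """Like enumerate_all_files, but yields each path as UTF-8 bytes (assumption)."""
    for subdir, _dirs, files in os.walk(rootdir):
        # label is 1 when the file sits directly in a "Genuine" folder, else 0
        label = int(os.path.basename(subdir) == "Genuine")
        for name in files:
            yield os.path.join(subdir, name).encode("utf-8"), label

# Small self-check on a throwaway directory tree
with tempfile.TemporaryDirectory() as root:
    for folder in ("Genuine", "False"):
        os.makedirs(os.path.join(root, "speaker1", folder))
        open(os.path.join(root, "speaker1", folder, "img.png"), "wb").close()
    pairs = sorted(enumerate_all_files_bytes(root))
```

    The generator could then be passed to Dataset.from_generator(lambda: enumerate_all_files_bytes(rootdir), (tf.string, tf.int32)) as in the question; whether that avoids the error on a given TensorFlow version would need to be checked.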
    