Question

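I have a TFRecords file containing images together with their labels, names, sizes, and so on. My goal is to extract the labels and the images as numpy arrays.

I load the file as follows: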

def extract_fn(data_record):
    features = {
        # Extract features using the keys set during creation
        "image/class/label":    tf.FixedLenFeature([], tf.int64),
        "image/encoded":        tf.VarLenFeature(tf.string),
    }
    sample = tf.parse_single_example(data_record, features)
    #sample = tf.cast(sample["image/encoded"], tf.float32)
    return sample

filename = r"path\train-00-of-10"  # raw string, so "\t" is not read as a tab
dataset = tf.data.TFRecordDataset(filename)
dataset = dataset.map(extract_fn)
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()

with tf.Session() as sess:
    while True:
        data_record = sess.run(next_element)
        print(data_record)

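The images are stored as strings. How do I convert an image to float32? I tried sample = tf.cast(sample["image/encoded"], tf.float32), but that does not work. I want data_record to be a list containing the image as a numpy array and the label as an np.int32 number. How can I do that?

Right now data_record looks like this: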

{'image/encoded': SparseTensorValue(indices=array([[0]]), values=array([b'\xff\xd8\ ... 8G\xff\xd9'], dtype=object), dense_shape=array([1])), 'image/class/label': 394}

I don't know how to work with this. Any help would be appreciated.

Edit

If I print sample and sample['image/encoded'] inside extract_fn(), I get the following:

print(sample) = {'image/encoded': <tensorflow.python.framework.sparse_tensor.SparseTensor object at 0x7fe41ec15978>, 'image/class/label': <tf.Tensor 'ParseSingleExample/ParseSingleExample:3' shape=() dtype=int64>}

print(sample['image/encoded']) = SparseTensor(indices=Tensor("ParseSingleExample/ParseSingleExample:0", shape=(?, 1), dtype=int64), values=Tensor("ParseSingleExample/ParseSingleExample:1", shape=(?,), dtype=string), dense_shape=Tensor("ParseSingleExample/ParseSingleExample:2", shape=(1,), dtype=int64))

The image appears to be a sparse tensor, and tf.image.decode_image raises an error. What is the correct way to extract the image as a tf.float32 tensor?

Answer 1

I believe the images you stored are encoded as JPEG, PNG, or some other format. So when reading them back, you have to decode them:

def extract_fn(data_record):
    features = {
        # Extract features using the keys set during creation
        "image/class/label":    tf.FixedLenFeature([], tf.int64),
        "image/encoded":        tf.VarLenFeature(tf.string),
    }
    sample = tf.parse_single_example(data_record, features)
    image = tf.image.decode_image(sample['image/encoded'], dtype=tf.float32) 
    label = sample['image/class/label']
    return image, label

...

with tf.Session() as sess:
    while True:
        image, label = sess.run(next_element)
        image = image.reshape(IMAGE_SHAPE)

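Update: it looks like you are getting the data as a single cell value in a sparse tensor. Try converting it back to dense, and inspect it before and after decoding: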

def extract_fn(data_record):
    features = {
        # Extract features using the keys set during creation
        "image/class/label":    tf.FixedLenFeature([], tf.int64),
        "image/encoded":        tf.VarLenFeature(tf.string),
    }
    sample = tf.parse_single_example(data_record, features)
    label = sample['image/class/label']
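    # The VarLenFeature arrives as a SparseTensor; densify it into a 1-D string tensor.
    dense = tf.sparse_tensor_to_dense(sample['image/encoded'], default_value='')

    # decode_image expects a scalar string, so pass the single element.
    # Comment this out if it raises an error, and inspect just dense:
    image = tf.image.decode_image(dense[0], dtype=tf.float32)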

    return dense, image, label
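For reference, here is a minimal end-to-end sketch of the same idea. It rests on several assumptions that are not in the original post: each record holds exactly one encoded image (so the feature can be read as a scalar FixedLenFeature instead of a VarLenFeature, which avoids the SparseTensor altogether), the tensorflow.compat.v1 module is available (TensorFlow 1.14+ or 2.x), and the demo record is made-up data. The helper names write_demo_record and read_all are hypothetical and exist only for illustration.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # keep the graph/session style used in the question


def write_demo_record(path):
    """Write one made-up example: a PNG-encoded 2x3 RGB image with label 394."""
    pixels = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)
    with tf.Graph().as_default(), tf.Session() as sess:
        encoded = sess.run(tf.image.encode_png(tf.constant(pixels)))
    example = tf.train.Example(features=tf.train.Features(feature={
        "image/class/label": tf.train.Feature(
            int64_list=tf.train.Int64List(value=[394])),
        "image/encoded": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[encoded])),
    }))
    with tf.io.TFRecordWriter(path) as writer:
        writer.write(example.SerializeToString())


def extract_fn(data_record):
    features = {
        "image/class/label": tf.FixedLenFeature([], tf.int64),
        # Assumption: one encoded image per example, so a scalar string suffices.
        "image/encoded": tf.FixedLenFeature([], tf.string),
    }
    sample = tf.parse_single_example(data_record, features)
    # Decode JPEG/PNG bytes to a uint8 tensor of shape (height, width, 3).
    image = tf.image.decode_image(sample["image/encoded"], channels=3)
    # Convert to float32 and rescale values to the range [0, 1].
    image = tf.image.convert_image_dtype(image, tf.float32)
    label = tf.cast(sample["image/class/label"], tf.int32)
    return image, label


def read_all(path):
    """Return lists of numpy images and np.int32 labels read from the file."""
    images, labels = [], []
    with tf.Graph().as_default():
        dataset = tf.data.TFRecordDataset(path).map(extract_fn)
        next_element = tf.data.make_one_shot_iterator(dataset).get_next()
        with tf.Session() as sess:
            while True:
                try:
                    image, label = sess.run(next_element)
                except tf.errors.OutOfRangeError:
                    break  # the dataset is exhausted
                images.append(image)
                labels.append(label)
    return images, labels


demo_path = os.path.join(tempfile.mkdtemp(), "demo.tfrecord")
write_demo_record(demo_path)
images, labels = read_all(demo_path)
print(images[0].shape, images[0].dtype, labels[0])
```

Catching tf.errors.OutOfRangeError lets the while True loop end cleanly once every record has been read. If your file really does store a variable number of strings per example, keep VarLenFeature and densify as shown above instead.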

Answer 2

If I understand you correctly, you just need numpy.fromstring:

img_str = example.features.feature['image_raw'].bytes_list.value[0]
image = np.fromstring(img_str, dtype=np.float32)

Or, if you need a TensorFlow function, add tf.decode_raw to your parse function. I think it is better to keep the image and the label separate.
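One caveat worth illustrating: reinterpreting bytes (numpy.fromstring, or tf.decode_raw inside the graph) only applies when the feature holds raw pixel values. Bytes that start with b'\xff\xd8', as in the output shown in the question, carry the JPEG start marker and have to be decoded instead. The sketch below is a hedged illustration of that distinction; the helper names looks_encoded and raw_bytes_to_array and the 2x3 demo image are hypothetical, and np.frombuffer is used in place of np.fromstring because the binary mode of np.fromstring is deprecated in recent NumPy releases.

```python
import numpy as np

JPEG_START = b"\xff\xd8"          # JPEG data begins with the SOI marker
PNG_START = b"\x89PNG\r\n\x1a\n"  # PNG data begins with this 8-byte signature


def looks_encoded(data):
    """Return True if the bytes start with a JPEG or PNG signature."""
    return data.startswith(JPEG_START) or data.startswith(PNG_START)


def raw_bytes_to_array(data, shape, dtype=np.uint8):
    """Reinterpret raw pixel bytes as an array; not valid for encoded images."""
    if looks_encoded(data):
        raise ValueError("bytes look like JPEG/PNG; decode them instead")
    return np.frombuffer(data, dtype=dtype).reshape(shape)


# Made-up 2x3 RGB image stored as raw bytes, the case tf.decode_raw also covers.
pixels = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)
restored = raw_bytes_to_array(pixels.tobytes(), shape=(2, 3, 3))
print(np.array_equal(restored, pixels))
```

Note that the dtype and shape must match whatever was used when the record was written, since raw bytes carry no such metadata.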

TensorFlow: extracting images and labels from a TFRecords file
