random_shift and flow_from_directory in Keras preprocessing

Posted: 2018-06-02 09:12:48

Tags: python keras

I have two questions, both related to image.py in keras/preprocessing.

1) The image shift uses the swapped dimensions to decide tx and ty. Why does tx determine the shift along the height? Shouldn't it operate on the width?

    h, w = x.shape[row_axis], x.shape[col_axis]
    tx = np.random.uniform(-hrg, hrg) * h
    ty = np.random.uniform(-wrg, wrg) * w
    x = apply_affine_transform(x, tx=tx, ty=ty, channel_axis=channel_axis,
                               fill_mode=fill_mode, cval=cval)
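A minimal pure-NumPy sketch (not the Keras implementation) of the convention at play: for an image array of shape (height, width, channels), axis 0 is the row axis, so a translation along it moves the image vertically. That is why tx is scaled by h above.

```python
import numpy as np

# Toy "image": 4 rows (height) x 3 columns (width).
img = np.arange(12).reshape(4, 3)

# Shifting along axis 0 (rows) moves content vertically -- the axis
# whose extent is the image *height*, matching tx being scaled by h.
shifted = np.roll(img, shift=1, axis=0)

print(shifted[1].tolist())  # original row 0, moved down one row
```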

2) I tried two different approaches to extract bottleneck features on the flower17 dataset. There are 17 classes with 80 images per label. The first approach loads each file individually and computes the bottleneck features.

    for i, label in enumerate(train_labels):
        cur_path = train_path + "/" + label
        count = 1
        for image_path in glob.glob(cur_path + "/*.jpg"):
            img = image.load_img(image_path, target_size=image_size)
            x = image.img_to_array(img)
            x = np.expand_dims(x, axis=0)
            x = preprocess_input(x)
            feature = model.predict(x)
            flat = feature.flatten()
            features.append(flat)
            labels.append(label)
            print("[INFO] processed - " + str(count))
            count += 1
        print("[INFO] completed label - " + label)

The second uses ImageDataGenerator with flow_from_directory:

train_datagen = ImageDataGenerator(preprocessing_function = preprocess_input)  
generator  = train_datagen.flow_from_directory(
            train_path,
            target_size=image_size,
            shuffle = "false",
            class_mode='categorical',
            batch_size=batchsize)
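Note that `shuffle = "false"` passes a non-empty string, and any non-empty string is truthy in Python, so this actually leaves shuffling enabled; only the boolean False disables it. A quick check:

```python
# A non-empty string is truthy, so shuffle="false" does NOT
# disable shuffling; only the boolean False (or a falsy value) does.
print(bool("false"))  # -> True
print(bool(False))    # -> False
```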

nb_train_samples = len(generator.filenames)  
num_classes = len(generator.class_indices) 
predict_size_train = int(math.ceil(nb_train_samples / batchsize))
label_map = (generator.class_indices)
label_map = dict((v, k) for k, v in label_map.items())  # invert: index -> class name
for i in range(predict_size_train):
    x, y = next(generator)
    features.append(model.predict(x))
    labels.append(y)
features = np.concatenate(features)

labels = np.argmax(np.concatenate(labels),axis=1)
labels = [label_map[x] for x in labels]
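One detail worth checking in the loop above: if `nb_train_samples / batchsize` is evaluated with integer operands under Python 2 semantics, the division truncates before math.ceil runs, so the last partial batch is silently dropped. A small sketch of the arithmetic (1360 = 17 classes x 80 images comes from the dataset description; batchsize=32 is an assumed value for illustration):

```python
import math

nb_train_samples = 17 * 80  # 1360 images in flower17
batchsize = 32              # assumed batch size for illustration

# Float division: 1360 / 32.0 = 42.5, ceil -> 43 batches covers everything.
full = int(math.ceil(nb_train_samples / float(batchsize)))

# Integer division first (Python 2 "/" on ints): 42, ceil(42) = 42,
# silently dropping the final 16 samples.
truncated = int(math.ceil(nb_train_samples // batchsize))

print(full, truncated)  # -> 43 42
```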

Both approaches use only preprocess_input. I noticed that each class ends up with 80 labels under both methods, but once I compare the feature rows with the function below, I find that fewer than 50% of the features match.

def compareFeatures(A, B):
    nrows, ncols = A.shape
    dtype = {'names': ['f{}'.format(i) for i in range(ncols)],
             'formats': ncols * [A.dtype]}

    C = np.intersect1d(A.view(dtype), B.view(dtype))

    C = C.view(A.dtype).reshape(-1, ncols)
    return C
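For context, a toy, self-contained demonstration of what compareFeatures computes: each row is viewed as a single structured record so that np.intersect1d matches whole rows rather than individual scalars (this assumes both arrays are C-contiguous with the same dtype):

```python
import numpy as np

def compareFeatures(A, B):
    # View each row as one structured record so np.intersect1d
    # compares entire rows at once.
    nrows, ncols = A.shape
    dtype = {'names': ['f{}'.format(i) for i in range(ncols)],
             'formats': ncols * [A.dtype]}
    C = np.intersect1d(A.view(dtype), B.view(dtype))
    return C.view(A.dtype).reshape(-1, ncols)

A = np.array([[1., 2.], [3., 4.], [5., 6.]])
B = np.array([[3., 4.], [7., 8.]])
print(compareFeatures(A, B).tolist())  # -> [[3.0, 4.0]]
```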

The classification results of the two approaches differ by 2-3%. But both methods use load_img and img_to_array under the hood, so there shouldn't be any difference.
