TensorFlow memory allocation on the GPU

Posted: 2018-08-03 14:48:51

Tags: tensorflow

I am using TensorFlow 1.9 to train a simple neural network on some toy data, to try to understand how TensorFlow allocates memory on the GPU. My graphics card is an NVIDIA GeForce GTX 780 Ti, which has 3 GB of GPU memory.

In my code, I create the data and set the batch size such that a single batch takes up 4 GB of memory. This can be verified in the code by printing out the number of bytes of the NumPy arrays that hold this data.
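
As a quick back-of-the-envelope check (my own arithmetic, using the batch_size, input_size and float32 dtype from the code below), the input half of one batch alone accounts for the 4 GB:

# One input batch holds batch_size * input_size float32 values, at 4 bytes per value
batch_size = 1000000
input_size = 1000
print(batch_size * input_size * 4)  # 4000000000 bytes, i.e. roughly 4 GB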

When I run this code, I get the following warning message, which is printed three times per batch:

2018-08-03 14:24:50.021264: W tensorflow/core/framework/allocator.cc:108] Allocation of 4000000000 exceeds 10% of system memory.

I have two questions about this:

1) What does this warning message mean? 10% of which memory? GPU memory?

2) How can a 4 GB batch fit on a GPU with only 3 GB of memory? Is the batch split into sub-batches, each of which is sent through the GPU independently?

In case it is of interest, my full code is below:

# Python imports
import numpy as np

# Tensorflow imports
import tensorflow as tf


# Set some parameters
np.random.seed(0)
num_examples = 2000000
input_size = 1000
num_training_examples = int(0.8 * num_examples)
num_validation_examples = int(0.2 * num_examples)
batch_size = 1000000

# Create the data
print('Creating data')
input_data = np.random.rand(num_examples, input_size).astype(np.float32)
label_data = np.random.rand(num_examples, 1).astype(np.float32)
training_input_data = input_data[:num_training_examples]
training_label_data = label_data[:num_training_examples]
validation_input_data = input_data[num_training_examples:]
validation_label_data = label_data[num_training_examples:]
print('Data created')

# Get the memory for the data
data_memory = training_input_data.nbytes + training_label_data.nbytes + validation_input_data.nbytes + validation_label_data.nbytes
data_memory /= 1e6
print('Dataset memory = ' + str(data_memory) + ' MB')
example_memory = training_input_data[0].nbytes + training_label_data[0].nbytes
batch_memory = example_memory * batch_size
batch_memory /= 1e6
print('Batch memory = ' + str(batch_memory) + ' MB')

# Create the placeholders
input_placeholder = tf.placeholder(dtype=np.float32, shape=[None, input_size])
label_placeholder = tf.placeholder(dtype=np.float32, shape=[None, 1])

# Create the network
x = tf.layers.dense(inputs=input_placeholder, units=input_size, activation=tf.nn.relu)
x = tf.layers.dense(inputs=x, units=50, activation=tf.nn.relu)
x = tf.layers.dense(inputs=x, units=50, activation=tf.nn.relu)
x = tf.layers.dense(inputs=x, units=50, activation=tf.nn.relu)
predictions = tf.layers.dense(inputs=x, units=1)

# Define the loss
loss_op = tf.reduce_mean(tf.square(label_placeholder - predictions))

# Define the optimiser
train_op = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss_op)

# Run a TensorFlow session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Loop over epochs
    num_training_batches = int(num_training_examples / batch_size)
    num_validation_batches = int(num_validation_examples / batch_size)
    training_losses = []
    validation_losses = []
    for epoch_num in range(1000):
        print('epoch ' + str(epoch_num))

        # Training
        batch_loss_sum = 0
        for batch_num in range(num_training_batches):
            print('batch ' + str(batch_num))
            batch_inputs = training_input_data[batch_num * batch_size: (batch_num + 1) * batch_size]
            batch_labels = training_label_data[batch_num * batch_size: (batch_num + 1) * batch_size]
            batch_loss, _ = sess.run([loss_op, train_op], feed_dict={input_placeholder: batch_inputs, label_placeholder: batch_labels})
            batch_loss_sum += batch_loss
        training_loss = batch_loss_sum / num_training_batches
        training_losses.append(training_loss)

        # Validation
        batch_loss_sum = 0
        for batch_num in range(num_validation_batches):
            batch_inputs = validation_input_data[batch_num * batch_size: (batch_num + 1) * batch_size]
            batch_labels = validation_label_data[batch_num * batch_size: (batch_num + 1) * batch_size]
            # Evaluate the loss only on the validation data (no training step here)
            batch_loss = sess.run(loss_op, feed_dict={input_placeholder: batch_inputs, label_placeholder: batch_labels})
            batch_loss_sum += batch_loss
        validation_loss = batch_loss_sum / num_validation_batches
        validation_losses.append(validation_loss)

0 Answers