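What changes should I make to my code?

Time: 2019-08-20 18:30:32

Tags: tensorflow out-of-memory conv-neural-network tf.keras

Question

I am using TensorFlow to train a CNN model on my GPU, but it runs out of memory.

What I have tried

I tried changing batch_size. That helped somewhat, but training still runs out of memory in the end.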

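Code

# Imports added for completeness. X and Y are the image and label arrays
# prepared earlier in main.py (not shown in the question).
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Activation, MaxPooling2D, Flatten, Dense

model = Sequential()

# First convolution block: 64 filters of size 3x3 on the full-resolution input
model.add(Conv2D(64, (3, 3), input_shape=X.shape[1:]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

# Second convolution block
model.add(Conv2D(64, (3, 3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))

# Flatten every feature map into one long vector, then classify
model.add(Flatten())
model.add(Dense(64))

# Single sigmoid unit for binary classification
model.add(Dense(1))
model.add(Activation("sigmoid"))

model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(X, Y, batch_size=32, validation_split=0.1)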

Output

C:\Anaconda3\envs\tutorial\pythonw.exe "C:/Users/roshaan zafar/PycharmProjects/InternshipRiseTech/main.py"
WARNING: Logging before flag parsing goes to stderr.
W0820 13:05:23.726494 24488 deprecation.py:506] From C:\Anaconda3\envs\tutorial\lib\site-packages\tensorflow\python\ops\init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0820 13:05:23.817250 24488 deprecation.py:323] From C:\Anaconda3\envs\tutorial\lib\site-packages\tensorflow\python\ops\nn_impl.py:180: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Train on 360 samples, validate on 40 samples
2019-08-20 13:05:24.028720: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-08-20 13:05:24.030744: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library nvcuda.dll
2019-08-20 13:05:24.976333: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce GTX 1070 major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:01:00.0
2019-08-20 13:05:24.976601: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-08-20 13:05:24.977484: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-08-20 13:05:25.734584: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-08-20 13:05:25.734785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2019-08-20 13:05:25.734905: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2019-08-20 13:05:25.735694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6376 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
2019-08-20 13:05:26.180767: W tensorflow/core/framework/allocator.cc:107] Allocation of 1240006656 exceeds 10% of system memory.
2019-08-20 13:05:26.834340: W tensorflow/core/framework/allocator.cc:107] Allocation of 1240006656 exceeds 10% of system memory.
2019-08-20 13:05:27.476075: W tensorflow/core/framework/allocator.cc:107] Allocation of 1240006656 exceeds 10% of system memory.
2019-08-20 13:05:28.102630: W tensorflow/core/framework/allocator.cc:107] Allocation of 1240006656 exceeds 10% of system memory.
2019-08-20 13:05:28.715843: W tensorflow/core/framework/allocator.cc:107] Allocation of 1240006656 exceeds 10% of system memory.
2019-08-20 13:05:47.982488: W tensorflow/core/common_runtime/bfc_allocator.cc:314] Allocator (GPU_0_bfc) ran out of memory trying to allocate 9.34GiB (rounded to 10029662208).  Current allocation summary follows.
2019-08-20 13:05:47.983224: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (256):   Total Chunks: 47, Chunks in use: 47. 11.8KiB allocated for chunks. 11.8KiB in use in bin. 1.5KiB client-requested in use in bin.
2019-08-20 13:05:47.983956: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (512):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.984651: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (1024):  Total Chunks: 1, Chunks in use: 1. 1.3KiB allocated for chunks. 1.3KiB in use in bin. 1.0KiB client-requested in use in bin.
2019-08-20 13:05:47.985413: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (2048):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.986243: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (4096):  Total Chunks: 2, Chunks in use: 2. 13.5KiB allocated for chunks. 13.5KiB in use in bin. 13.5KiB client-requested in use in bin.
2019-08-20 13:05:47.988224: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (8192):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.988864: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (16384):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.989820: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (32768):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.990495: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (65536):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.991146: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (131072):    Total Chunks: 2, Chunks in use: 2. 288.0KiB allocated for chunks. 288.0KiB in use in bin. 288.0KiB client-requested in use in bin.
2019-08-20 13:05:47.992567: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (262144):    Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.993545: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (524288):    Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.994186: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (1048576):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.994859: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (2097152):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.995569: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (4194304):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.996235: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (8388608):   Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.996924: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (16777216):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.997650: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (33554432):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.998404: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (67108864):  Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.999135: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (134217728):     Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2019-08-20 13:05:47.999876: I tensorflow/core/common_runtime/bfc_allocator.cc:764] Bin (268435456):     Total Chunks: 5, Chunks in use: 3. 6.23GiB allocated for chunks. 2.75GiB in use in bin. 2.75GiB client-requested in use in bin.
2019-08-20 13:05:48.000650: I tensorflow/core/common_runtime/bfc_allocator.cc:780] Bin for 9.34GiB was 256.00MiB, Chunk State: 
2019-08-20 13:05:48.001093: I tensorflow/core/common_runtime/bfc_allocator.cc:786]   Size: 450.00MiB | Requested Size: 450.00MiB | in_use: 0 | bin_num: 20, prev:   Size: 256B | Requested Size: 8B | in_use: 1 | bin_num: -1, next:   Size: 256B | Requested Size: 128B | in_use: 1 | bin_num: -1
2019-08-20 13:05:48.003835: I tensorflow/core/common_runtime/bfc_allocator.cc:786]   Size: 3.04GiB | Requested Size: 0B | in_use: 0 | bin_num: 20, prev:   Size: 256B | Requested Size: 4B | in_use: 1 | bin_num: -1
2019-08-20 13:05:48.004577: I tensorflow/core/common_runtime/bfc_allocator.cc:793] Next region of size 6686052608
2019-08-20 13:05:48.013828: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400000 next 1 of size 1280
2019-08-20 13:05:48.014294: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400500 next 2 of size 256
2019-08-20 13:05:48.014708: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400600 next 3 of size 256
2019-08-20 13:05:48.015131: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400700 next 4 of size 256
2019-08-20 13:05:48.015622: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400800 next 5 of size 256
2019-08-20 13:05:48.016053: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400900 next 6 of size 256
2019-08-20 13:05:48.016492: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400A00 next 7 of size 256
2019-08-20 13:05:48.016914: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400B00 next 8 of size 256
2019-08-20 13:05:48.017347: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400C00 next 9 of size 256
2019-08-20 13:05:48.017774: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400D00 next 10 of size 256
2019-08-20 13:05:48.018202: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400E00 next 11 of size 256
2019-08-20 13:05:48.019604: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705400F00 next 12 of size 256
2019-08-20 13:05:48.020000: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401000 next 13 of size 256
2019-08-20 13:05:48.020407: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401100 next 14 of size 256
2019-08-20 13:05:48.020801: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401200 next 15 of size 256
2019-08-20 13:05:48.021203: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401300 next 16 of size 256
2019-08-20 13:05:48.022177: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401400 next 17 of size 256
2019-08-20 13:05:48.022845: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401500 next 18 of size 256
2019-08-20 13:05:48.023458: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000705401600 next 19 of size 1240006656
2019-08-20 13:05:48.024110: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F291600 next 20 of size 256
2019-08-20 13:05:48.024721: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F291700 next 21 of size 147456
2019-08-20 13:05:48.025371: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F2B5700 next 22 of size 6912
2019-08-20 13:05:48.026024: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F2B7200 next 23 of size 256
2019-08-20 13:05:48.026686: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F2B7300 next 24 of size 256
2019-08-20 13:05:48.027396: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000074F2B7400 next 25 of size 1240006656
2019-08-20 13:05:48.027798: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 0000000799147400 next 26 of size 147456
2019-08-20 13:05:48.028202: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916B400 next 27 of size 6912
2019-08-20 13:05:48.028598: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916CF00 next 28 of size 256
2019-08-20 13:05:48.028990: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D000 next 29 of size 256
2019-08-20 13:05:48.029868: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D100 next 30 of size 256
2019-08-20 13:05:48.030492: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D200 next 31 of size 256
2019-08-20 13:05:48.030887: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D300 next 32 of size 256
2019-08-20 13:05:48.031538: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D400 next 33 of size 256
2019-08-20 13:05:48.031931: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D500 next 34 of size 256
2019-08-20 13:05:48.032327: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D600 next 35 of size 256
2019-08-20 13:05:48.032722: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 000000079916D700 next 36 of size 256
2019-08-20 13:05:48.033116: I tensorflow/core/common_runtime/bfc_allocator.cc:800] Free  at 000000079916D800 next 37 of size 471859200
2019-08-20 13:05:48.034291: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007B536D800 next 38 of size 256
2019-08-20 13:05:48.034879: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007B536D900 next 39 of size 256
2019-08-20 13:05:48.035434: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007B536DA00 next 40 of size 256
2019-08-20 13:05:48.035832: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007B536DB00 next 41 of size 471859200
2019-08-20 13:05:48.036554: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156DB00 next 42 of size 256
2019-08-20 13:05:48.037253: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156DC00 next 43 of size 256
2019-08-20 13:05:48.037949: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156DD00 next 44 of size 256
2019-08-20 13:05:48.038697: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156DE00 next 45 of size 256
2019-08-20 13:05:48.039204: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156DF00 next 46 of size 256
2019-08-20 13:05:48.039676: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E000 next 47 of size 256
2019-08-20 13:05:48.040135: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E100 next 48 of size 256
2019-08-20 13:05:48.041145: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E200 next 49 of size 256
2019-08-20 13:05:48.041535: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E300 next 50 of size 256
2019-08-20 13:05:48.041819: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E400 next 51 of size 256
2019-08-20 13:05:48.042130: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E500 next 52 of size 256
2019-08-20 13:05:48.042426: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E600 next 53 of size 256
2019-08-20 13:05:48.042713: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E700 next 54 of size 256
2019-08-20 13:05:48.043016: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E800 next 55 of size 256
2019-08-20 13:05:48.043276: I tensorflow/core/common_runtime/bfc_allocator.cc:800] InUse at 00000007D156E900 next 56 of size 256
2019-08-20 13:05:48.043572: I tensorflow/core/common_runtime/bfc_allocator.cc:800] Free  at 00000007D156EA00 next 18446744073709551615 of size 3261998848
2019-08-20 13:05:48.043902: I tensorflow/core/common_runtime/bfc_allocator.cc:809]      Summary of in-use Chunks by size: 
2019-08-20 13:05:48.044196: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 47 Chunks of size 256 totalling 11.8KiB
2019-08-20 13:05:48.044466: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 1 Chunks of size 1280 totalling 1.3KiB
2019-08-20 13:05:48.044760: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 2 Chunks of size 6912 totalling 13.5KiB
2019-08-20 13:05:48.045032: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 2 Chunks of size 147456 totalling 288.0KiB
2019-08-20 13:05:48.045250: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 1 Chunks of size 471859200 totalling 450.00MiB
2019-08-20 13:05:48.045553: I tensorflow/core/common_runtime/bfc_allocator.cc:812] 2 Chunks of size 1240006656 totalling 2.31GiB
2019-08-20 13:05:48.045830: I tensorflow/core/common_runtime/bfc_allocator.cc:816] Sum Total of in-use chunks: 2.75GiB
2019-08-20 13:05:48.046120: I tensorflow/core/common_runtime/bfc_allocator.cc:818] total_region_allocated_bytes_: 6686052608 memory_limit_: 6686052843 available bytes: 235 curr_region_allocation_bytes_: 13372105728
2019-08-20 13:05:48.046453: I tensorflow/core/common_runtime/bfc_allocator.cc:824] Stats: 
Limit:                  6686052843
InUse:                  2952194560
MaxInUse:               3424053760
NumAllocs:                      64
MaxAllocSize:           1240006656

2019-08-20 13:05:48.046834: W tensorflow/core/common_runtime/bfc_allocator.cc:319] **************************************______********________________________________________________
2019-08-20 13:05:48.052167: W tensorflow/core/framework/op_kernel.cc:1502] OP_REQUIRES failed at conv_ops.cc:486 : Resource exhausted: OOM when allocating tensor with shape[32,64,1278,958] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc


Traceback (most recent call last):
  File "C:/Users/roshaan zafar/PycharmProjects/InternshipRiseTech/main.py", line 109, in <module>
    model.fit(X, Y, batch_size=32, validation_split=0.1)

2 Answers:

Answer 0 (score: 0)

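Since your network is very small and you are only taking 32 images per batch, the likely cause is that the image resolution is very high. In that case you can try the following:

  • Try reducing the dimensions of the images, but be sure to preserve the original aspect ratio when you do.
  • Alternatively, try extracting smaller random patches from the images at the same resolution.
  • Finally, if the options above do not work, you can try reducing batch_size further, to 8 or 4.

GPU memory does usually fill up quickly compared with a CPU that has the same amount of memory. Hope this helps.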
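The first two suggestions can be sketched with NumPy alone. This is only an illustration: `downsample`, `random_patches`, and the `X_demo` stand-in array are hypothetical names, and the 1280 x 960 x 3 image size is inferred from the error log rather than stated in the question. Plain striding is the crudest way to shrink an image; a proper resize with interpolation from an image library would usually give better quality.

```python
import numpy as np

def downsample(images, factor):
    """Keep every `factor`-th pixel along height and width.

    Both axes shrink by the same factor, so the aspect ratio is preserved.
    """
    return images[:, ::factor, ::factor, :]

def random_patches(images, patch_h, patch_w, rng):
    """Cut one random patch_h x patch_w window out of each full-resolution image."""
    n, h, w, _ = images.shape
    tops = rng.integers(0, h - patch_h + 1, size=n)
    lefts = rng.integers(0, w - patch_w + 1, size=n)
    return np.stack([
        img[t:t + patch_h, l:l + patch_w]
        for img, t, l in zip(images, tops, lefts)
    ])

rng = np.random.default_rng(0)
# Stand-in for the real training array X (assumed layout: N x H x W x C).
X_demo = np.zeros((4, 1280, 960, 3), dtype=np.uint8)

X_small = downsample(X_demo, 4)                  # 1280 x 960 becomes 320 x 240
X_patch = random_patches(X_demo, 256, 256, rng)  # one 256 x 256 crop per image
print(X_small.shape, X_patch.shape)
```

Either array could then be passed to model.fit in place of X, optionally together with a smaller batch_size such as 8.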

Answer 1 (score: 0)

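The 1278 x 958 in the error message is the size of each feature map produced by the first Conv2D layer, not of the flattened vector. With 64 filters and a batch of 32, that one output tensor needs 32 x 64 x 1278 x 958 float32 values, which is the 9.34 GiB allocation that fails. After the two pooling steps the Flatten layer still yields 318 x 238 x 64 features, so Dense(64) holds about 310 million weights (roughly 1.24 GB, not counting biases). These numbers are far too large for your GPU to handle.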
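The sizes reported in the log can be reproduced with a few lines of arithmetic. This is a rough check under stated assumptions: the 1280 x 960 x 3 input shape is inferred from the tensor shape [32,64,1278,958] and the 6912-byte kernel chunks in the log, and `conv_out` and `pool_out` are illustrative helpers for 'valid' 3x3 convolutions and 2x2 pooling, which are the Keras defaults used in the question.

```python
def conv_out(size, kernel=3):
    """Output length of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output length of non-overlapping max pooling."""
    return size // pool

height, width = 1280, 960           # assumed input size (inferred from the log)
batch, filters, dense_units = 32, 64, 64
bytes_per_float = 4                  # float32

# Output of the first Conv2D layer: this is the tensor named in the OOM error.
h1, w1 = conv_out(height), conv_out(width)
first_conv_bytes = batch * filters * h1 * w1 * bytes_per_float

# Spatial size after pool -> conv -> pool, i.e. what Flatten receives.
h2, w2 = conv_out(pool_out(h1)), conv_out(pool_out(w1))
h3, w3 = pool_out(h2), pool_out(w2)
flat_features = h3 * w3 * filters

# Weight matrix of Dense(64) that follows Flatten (biases ignored).
dense_kernel_bytes = flat_features * dense_units * bytes_per_float

print(f"first conv output: {h1} x {w1}, {first_conv_bytes / 2**30:.2f} GiB")
print(f"flattened features: {flat_features}")
print(f"Dense(64) kernel: {dense_kernel_bytes} bytes")
```

Both byte counts match figures in the log: 10029662208 bytes for the failed allocation and 1240006656 bytes for the repeated "exceeds 10% of system memory" warnings. Per image, the first convolution output alone is about 299 MiB, which is why lowering batch_size helps only up to a point.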

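Consider resizing the input images to a smaller size. Otherwise, consider adding more layers (Conv2D followed by max pooling) to the network. A last option is to replace the Flatten layer with global pooling (GlobalMaxPooling2D or GlobalAveragePooling2D in tf.keras).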
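A minimal sketch of the last option, assuming TensorFlow 2.x with tf.keras. `build_model` is a hypothetical helper, and the 1280 x 960 x 3 input shape is inferred from the log, not taken from the question's code. The ReLU activations are folded into the Conv2D layers for brevity; the layer sizes otherwise follow the question.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(input_shape):
    """Same convolutional stack as the question, with Flatten swapped for global average pooling."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D(pool_size=(2, 2)),
        # Averages each of the 64 feature maps down to a single number,
        # so the next Dense layer sees 64 inputs instead of millions.
        layers.GlobalAveragePooling2D(),
        layers.Dense(64),
        layers.Dense(1, activation="sigmoid"),
    ])

model = build_model((1280, 960, 3))  # assumed input shape
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
print(model.count_params())
```

This brings the model down to a few tens of thousands of parameters. It only removes the large Dense kernel, though: the activations of the first convolution still scale with image size and batch size, so this change likely needs to be combined with smaller inputs or a smaller batch.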
