Problem installing TensorFlow with GPU support

Time: 2020-05-14 19:04:03

Tags: python tensorflow

I installed the TensorFlow GPU build on my machine. My TensorFlow version is 2.1.0 and my CUDA version is 10.0.

When I check whether a GPU is available on the machine, TensorFlow does not recognize the graphics card.
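For reference, the check can be written with the non-deprecated call that TensorFlow itself recommends in its warning, `tf.config.list_physical_devices('GPU')`. This is only a minimal sketch: the helper name `visible_gpus` is made up for illustration, and it returns an empty list when TensorFlow cannot be imported at all.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see.

    Hypothetical helper: returns [] when TensorFlow is not importable,
    so the function is safe to call on a machine without TensorFlow.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []
    # An empty list here means TensorFlow registered no GPU device.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


print(visible_gpus())
```

An empty list corresponds to the `False` that `tf.test.is_gpu_available()` returns in my session.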

Here is the output of my nvidia-smi command.

mymachine@titanx:~$ nvidia-smi
Fri May 15 00:08:51 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.78       Driver Version: 410.78       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...  Off  | 00000000:02:00.0  On |                  N/A |
| 22%   52C    P8    19W / 250W |    576MiB / 12211MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1290      G   /usr/lib/xorg/Xorg                           182MiB |
|    0      1982      G   /opt/teamviewer/tv_bin/TeamViewer             19MiB |
|    0      2483      G   compiz                                       157MiB |
|    0      6030      C   python                                       104MiB |
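The two version numbers in the header can be pulled out programmatically, as in the sketch below. The function name `parse_smi_header` and the regular expression are my own, not part of any NVIDIA tool. As far as I understand, the "CUDA Version" field reports the highest CUDA version the driver supports, which is not necessarily the toolkit version installed under /usr/local/cuda.

```python
import re

# Matches the driver and CUDA fields of the nvidia-smi header line.
HEADER_RE = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_smi_header(text):
    """Return (driver_version, cuda_version) as strings, or None if absent."""
    match = HEADER_RE.search(text)
    if match is None:
        return None
    return match.group("driver"), match.group("cuda")


# Header line copied from the output above.
sample = "| NVIDIA-SMI 410.78       Driver Version: 410.78       CUDA Version: 10.0     |"
print(parse_smi_header(sample))  # ('410.78', '10.0')
```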

Here is my error message.

mymachine@titanx:~$ python
Python 3.7.4 (default, Aug 13 2019, 20:35:49) 
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-05-15 00:15:50.850372: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2020-05-15 00:15:50.850454: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2020-05-15 00:15:50.850467: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
>>> tf.test.
tf.test.Benchmark(                  tf.test.benchmark_config(           tf.test.gpu_device_name(            tf.test.is_built_with_rocm(         
tf.test.TestCase(                   tf.test.compute_gradient(           tf.test.is_built_with_cuda(         tf.test.is_gpu_available(           
tf.test.assert_equal_graph_def(     tf.test.create_local_cluster(       tf.test.is_built_with_gpu_support(  tf.test.main(                       
>>> tf.test.is_built_with_gpu_support()
True
>>> tf.test.is_gpu_available()
WARNING:tensorflow:From <stdin>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.config.list_physical_devices('GPU')` instead.
2020-05-15 00:16:45.492044: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-15 00:16:45.526439: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3491695000 Hz
2020-05-15 00:16:45.527074: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55e3d21f7370 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-05-15 00:16:45.527125: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-05-15 00:16:45.532644: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-05-15 00:16:45.623845: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-15 00:16:45.624424: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55e3d22c7d80 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-05-15 00:16:45.624465: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce GTX TITAN X, Compute Capability 5.2
2020-05-15 00:16:45.624784: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-15 00:16:45.625640: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:02:00.0 name: GeForce GTX TITAN X computeCapability: 5.2
coreClock: 1.076GHz coreCount: 24 deviceMemorySize: 11.92GiB deviceMemoryBandwidth: 313.37GiB/s
2020-05-15 00:16:45.625832: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2020-05-15 00:16:45.625967: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2020-05-15 00:16:45.626095: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2020-05-15 00:16:45.626236: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2020-05-15 00:16:45.626389: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2020-05-15 00:16:45.626527: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10'; dlerror: libcusparse.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: :/usr/local/cuda/extras/CUPTI/lib64
2020-05-15 00:16:45.632796: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-15 00:16:45.632837: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1592] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-05-15 00:16:45.632869: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-15 00:16:45.632888: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0 
2020-05-15 00:16:45.632903: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N 
False
>>> tf.__version__
'2.1.0'
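To summarize the session above, the sketch below collects the library names from the "Could not load dynamic library" warnings. The helper names are hypothetical and the sample text is an abbreviated excerpt of the log, not a complete one. The requested `libcudart.so.10.1` suggests that this TensorFlow build expects the CUDA 10.1 runtime rather than the 10.0 one I have installed.

```python
import re

# Captures the library name quoted in a dso_loader warning.
MISSING_RE = re.compile(r"Could not load dynamic library '([^']+)'")


def missing_libraries(log_text):
    """Return the distinct missing library names in first-seen order."""
    seen = []
    for name in MISSING_RE.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


def cudart_version(names):
    """Return the version suffix of libcudart (e.g. '10.1'), or None."""
    for name in names:
        match = re.fullmatch(r"libcudart\.so\.(\d+\.\d+)", name)
        if match:
            return match.group(1)
    return None


# Abbreviated excerpt of the warnings shown above.
sample_log = (
    "Could not load dynamic library 'libcudart.so.10.1'; dlerror: "
    "libcudart.so.10.1: cannot open shared object file\n"
    "Could not load dynamic library 'libcublas.so.10'; dlerror: "
    "libcublas.so.10: cannot open shared object file\n"
)

names = missing_libraries(sample_log)
print(names)                  # ['libcudart.so.10.1', 'libcublas.so.10']
print(cudart_version(names))  # 10.1
```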

So I tried to install libnvinfer6 and the other required libraries (as described in the TensorFlow Official Documentation), but I got a package-not-found error.

mymachine@titanx:~$ sudo apt-get install -y --no-install-recommends libnvinfer6=6.0.1-1+cuda10.1 \
>     libnvinfer-dev=6.0.1-1+cuda10.1 \
>     libnvinfer-plugin6=6.0.1-1+cuda10.1
[sudo] password for dtai: 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
E: Unable to locate package libnvinfer6
E: Unable to locate package libnvinfer-dev
E: Unable to locate package libnvinfer-plugin6

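I can see that the official documentation refers to CUDA 10.1, whereas I have CUDA 10.0 installed. What is the correct command to install these libraries for CUDA 10.0, and how do I fix the "Unable to locate package" error?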
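To see which of the libraries from the warnings the dynamic loader can resolve on this machine, independently of TensorFlow, something like the following sketch could be used. `can_load` is a hypothetical helper built on the standard `ctypes.CDLL`, and the list of names is copied from the log above. My assumption is that `ctypes` resolves the names through the same system loader and `LD_LIBRARY_PATH` that TensorFlow relies on.

```python
import ctypes

# Library names copied from the "Could not load dynamic library" warnings.
REQUIRED = [
    "libcudart.so.10.1",
    "libcublas.so.10",
    "libcufft.so.10",
    "libcurand.so.10",
    "libcusolver.so.10",
    "libcusparse.so.10",
    "libcudnn.so.7",
    "libnvinfer.so.6",
    "libnvinfer_plugin.so.6",
]


def can_load(soname):
    """Return True if the dynamic loader can open the given library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the file is not found or cannot be loaded.
        return False
    return True


for soname in REQUIRED:
    print(soname, "OK" if can_load(soname) else "missing")
```

Running it again after changing packages or `LD_LIBRARY_PATH` shows whether the change had any effect before starting TensorFlow.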

0 answers:

No answers