TensorFlow trains on the CPU instead of the RTX 3000 series GPU

Nov 28 2020

I am trying to train my TensorFlow model on my RTX 3070 GPU. I am using an Anaconda virtual environment, and the prompt shows that the GPU was detected successfully with no errors or warnings, but whenever the model starts training it uses the CPU instead.

My Anaconda Prompt output:

2020-11-28 19:38:17.373117: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2020-11-28 19:38:17.378626: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2020-11-28 19:38:17.378679: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2020-11-28 19:38:17.381802: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2020-11-28 19:38:17.382739: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2020-11-28 19:38:17.389401: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2020-11-28 19:38:17.391830: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2020-11-28 19:38:17.392332: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2020-11-28 19:38:17.392422: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1866] Adding visible gpu devices: 0
2020-11-28 19:38:26.072912: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-11-28 19:38:26.073904: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1724] Found device 0 with properties:
pciBusID: 0000:08:00.0 name: GeForce RTX 3070 computeCapability: 8.6
coreClock: 1.725GHz coreCount: 46 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 417.29GiB/s
2020-11-28 19:38:26.073984: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2020-11-28 19:38:26.074267: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2020-11-28 19:38:26.074535: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2020-11-28 19:38:26.074775: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2020-11-28 19:38:26.075026: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2020-11-28 19:38:26.075275: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2020-11-28 19:38:26.075646: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2020-11-28 19:38:26.075871: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2020-11-28 19:38:26.076139: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1866] Adding visible gpu devices: 0
2020-11-28 19:38:26.738596: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1265] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-28 19:38:26.738680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1271]      0
2020-11-28 19:38:26.739375: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1284] 0:   N
2020-11-28 19:38:26.740149: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1410] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6589 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3070, pci bus id: 0000:08:00.0, compute capability: 8.6)
2020-11-28 19:38:26.741055: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2020-11-28 19:38:28.028828: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:126] None of the MLIR optimization passes are enabled (registered 2)
2020-11-28 19:38:32.428408: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2020-11-28 19:38:33.305827: I tensorflow/stream_executor/cuda/cuda_dnn.cc:344] Loaded cuDNN version 8004
2020-11-28 19:38:33.753275: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2020-11-28 19:38:34.603341: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2020-11-28 19:38:34.610934: I tensorflow/stream_executor/cuda/cuda_blas.cc:1838] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
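
For a long startup log like the one above, the line of interest is the "Created TensorFlow device" entry, which indicates that TensorFlow registered the GPU (it does not by itself prove that training ops ran there). The following is a minimal sketch that pulls those entries out of a log string. The helper name created_gpu_devices and the regular expression are illustrative assumptions based only on the log format shown here, and other TensorFlow versions may word the line differently.

```python
import re

# Pattern for the line TensorFlow logs when it creates a GPU device.
# Derived from the log shown above; other versions may use a different format.
DEVICE_RE = re.compile(
    r"Created TensorFlow device \((?P<tf_device>\S+) with (?P<memory_mb>\d+) MB memory\)"
    r" -> physical GPU \(device: \d+, name: (?P<name>[^,]+),"
)


def created_gpu_devices(log_text):
    """Return one dict per 'Created TensorFlow device' entry found in log_text."""
    return [
        {
            "tf_device": m["tf_device"],
            "memory_mb": int(m["memory_mb"]),
            "name": m["name"],
        }
        for m in DEVICE_RE.finditer(log_text)
    ]


# The relevant line copied from the log above.
sample = (
    "2020-11-28 19:38:26.740149: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1410] "
    "Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6589 MB memory) "
    "-> physical GPU (device: 0, name: GeForce RTX 3070, pci bus id: 0000:08:00.0, "
    "compute capability: 8.6)"
)
print(created_gpu_devices(sample))
```

An empty list would suggest that no GPU device was created in that session.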

My model code:

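from tensorflow import keras
from tensorflow.keras import layers

# max_features, x_train, y_train, x_val and y_val are defined elsewhere (not shown).
inputs = keras.Input(shape=(None,), dtype="int32")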
x = layers.Embedding(max_features, 128)(inputs)
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
x = layers.Bidirectional(layers.LSTM(64))(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)

model.compile("adam", "binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=32, epochs=2, validation_data=(x_val, y_val))

I am using:

  • TensorFlow nightly GPU 2.5.0.dev20201111 (installed in an Anaconda virtual env)
  • CUDA 11.1 (cuda_11.1.1_456.81)
  • cuDNN v8.0.4.30 (for CUDA 11.1)
  • Python 3.8.0

I know that my GPU is not being used because its utilization is 1% while my CPU is at 60%, with python as the top process.
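
System utilization figures are an indirect signal, so it can help to ask TensorFlow directly where it places an op. The sketch below runs a small matrix multiplication and returns the device string TensorFlow reports for the result. The function name matmul_device and the 256x256 size are arbitrary choices for illustration, and the check for an installed tensorflow package is only there so the snippet degrades gracefully in an environment without it.

```python
import importlib.util


def matmul_device():
    """Run a tiny matmul and return the device string TensorFlow placed it on.

    Returns None when TensorFlow is not installed in this environment.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf

    # Log the device chosen for each op from here on (verbose; debugging only).
    tf.debugging.set_log_device_placement(True)

    a = tf.random.uniform((256, 256))
    b = tf.random.uniform((256, 256))
    c = tf.matmul(a, b)

    # For eager tensors, .device names the device that holds the result.
    return c.device


print(matmul_device())
```

A device string ending in GPU:0 would indicate that the op ran on the GPU, while one ending in CPU:0 would indicate a fallback to the CPU.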

Can anyone help me get my model to train on the GPU?

Answers

TarakNathNandi Nov 29 2020 at 01:57

Most likely you are using the CPU build of TensorFlow rather than the GPU build. Run "pip uninstall tensorflow" and then "pip install tensorflow-gpu" to install the build that can use the GPU.
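
Before reinstalling anything, it may be worth confirming which kind of build is active in the environment. The following is a minimal sketch, assuming TensorFlow 2.1 or later; the function name tensorflow_gpu_report and the shape of the returned dictionary are illustrative choices, not part of any TensorFlow API.

```python
import importlib.util


def tensorflow_gpu_report():
    """Summarize whether the installed TensorFlow build can see a GPU."""
    if importlib.util.find_spec("tensorflow") is None:
        return {"installed": False}
    import tensorflow as tf

    return {
        "installed": True,
        "version": tf.__version__,
        # True when this binary was compiled with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Physical GPUs that TensorFlow can see in this process.
        "visible_gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


print(tensorflow_gpu_report())
```

If built_with_cuda is False, the environment holds a CPU-only build. If it is True and visible_gpus is non-empty, the build itself is probably not the cause.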