1. Tensor on GPU
-
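Deep learning algorithms require a large number of numerical operations. By default these operations run on the CPU, but a GPU usually performs the specific kinds of operations neural networks need (for example, matrix multiplication) much faster than a CPU;
-
Note: in this document, GPU means a CUDA-enabled Nvidia GPU. CUDA is a computing platform and API that supports general-purpose computing on the GPU, not just graphics;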
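To make the speed claim concrete, here is a minimal timing sketch for a single matrix multiplication. The helper name time_matmul and the 512 x 512 matrix size are illustrative assumptions, not part of the original notes. The torch.cuda.synchronize() calls are there because CUDA kernels launch asynchronously, so the timer would otherwise stop before the GPU has finished.

```python
import time

import torch


def time_matmul(device: str, size: int = 512) -> float:
    """Return the seconds taken by one (size x size) matrix multiplication."""
    a = torch.rand(size, size, device=device)
    b = torch.rand(size, size, device=device)
    if device == "cuda":
        torch.cuda.synchronize()  # wait until the inputs are ready on the GPU
    start = time.perf_counter()
    _ = a @ b  # the operation being timed
    if device == "cuda":
        torch.cuda.synchronize()  # wait for the GPU kernel to finish
    return time.perf_counter() - start


cpu_seconds = time_matmul("cpu")
print(f"cpu : {cpu_seconds:.4f} s")

# Only time the GPU when one is actually available.
if torch.cuda.is_available():
    time_matmul("cuda")  # warm-up call: the first CUDA call pays one-off setup costs
    gpu_seconds = time_matmul("cuda")
    print(f"cuda: {gpu_seconds:.4f} s")
```

The numbers depend heavily on the hardware and the matrix size; for small matrices the CPU may well be faster.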
2. Getting GPU
-
A: Own machine: https://pytorch.org/get-started/locally
-
B: Google Colab: Edit ☞ Notebook settings, then select a GPU as the hardware accelerator;
!nvidia-smi
Sun May 26 03:26:55 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:00:04.0 Off |                    0 |
| N/A   78C    P0              35W /  70W |   1215MiB / 15360MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
+---------------------------------------------------------------------------------------+
3. Getting PyTorch
-
getting PyTorch to run on GPU;
-
PyTorch can use the torch.cuda package to store data (tensors) on the GPU and compute on it (perform operations on tensors);
import torch
torchVersion = torch.__version__
# True when PyTorch can see a CUDA-capable GPU and driver
cudaAvailable = torch.cuda.is_available()
deviceType = "cuda" if cudaAvailable else "cpu"
# number of visible CUDA devices (0 on a CPU-only machine)
deviceNumber = torch.cuda.device_count()
print('torchVersion:{}\ncudaAvailable:{}\ndeviceType:{}\ndeviceNumber:{}'
      .format(torchVersion, cudaAvailable, deviceType, deviceNumber))
torchVersion:2.3.0+cu121
cudaAvailable:True
deviceType:cuda
deviceNumber:1
4. Putting Tensor and Model
-
putting tensor and model on GPU;
-
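Tensors (and models) can be put on a specific device by calling to(device) on them, where device is the target device you want the tensor or model to move to;
-
GPUs offer much faster numerical computing than CPUs, but if no GPU is available, device-agnostic code still runs, just on the CPU;
https://pytorch.org/docs/master/notes/cuda.html#device-agnostic-code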
import torch
# create tensor,default on CPU
tensor = torch.tensor([1,2,3])
# tensor not on GPU
print(tensor,tensor.device)
# move tensor to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
tensor_on_gpu = tensor.to(device)
tensor_on_gpu
tensor([1, 2, 3]) cpu
tensor([1, 2, 3], device='cuda:0')
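The notes mention that models can be moved with to(device) as well, but the cell above only moves a tensor. A minimal sketch of the model case follows. The single nn.Linear layer and the input values are stand-ins chosen for illustration, not a model from the original notes.

```python
import torch
from torch import nn

# Device-agnostic setup: use the GPU when present, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical toy model; parameters are created on the CPU by default.
model = nn.Linear(in_features=3, out_features=1)
print(next(model.parameters()).device)

# Move the model's parameters to the target device.
model = model.to(device)

# Inputs must live on the same device as the model, or the forward pass fails.
inputs = torch.tensor([[1.0, 2.0, 3.0]]).to(device)
outputs = model(inputs)
print(outputs.device)
```

For an nn.Module, to(device) moves the parameters in place and returns the same module, so the reassignment is optional. For a tensor, to(device) returns a new tensor, so the result has to be assigned.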
5. Moving Back
-
moving tensor back to CPU
-
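You need to move a tensor back to the CPU when, for example, you want to work with it in NumPy, because NumPy does not use the GPU. Calling torch.Tensor.numpy() directly on a GPU tensor raises a TypeError;
-
Instead, to bring the tensor back to the CPU for use with NumPy, call Tensor.cpu() first. It copies the tensor to CPU memory so that it can be used on the CPU;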
import torch
# create tensor,default on CPU
tensor = torch.tensor([1,2,3])
# tensor not on GPU
print(tensor,tensor.device)
# move tensor to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
tensor_on_gpu = tensor.to(device)
print(tensor_on_gpu)
# if tensor on GPU, can't transform it to NumPy (exception)
# TypeError: can't convert cuda:0 device type tensor to numpy.
# Use Tensor.cpu() to copy the tensor to host memory first.
# tensor_on_gpu.numpy()
# copy tensor back to cpu instead
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
print(tensor_back_on_cpu)
-
This returns a copy of the GPU tensor in CPU memory; the original tensor is still on the GPU;
tensor([1, 2, 3]) cpu
tensor([1, 2, 3], device='cuda:0')
[1 2 3]
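The point that the original tensor stays where it was can be checked directly. The sketch below is an illustration with assumed variable names (tensor_on_device, array) and falls back to the CPU when no GPU is present; in that case Tensor.cpu() returns the same tensor rather than a copy.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create the tensor directly on the target device.
tensor_on_device = torch.tensor([1, 2, 3], device=device)

# .cpu() gives a CPU tensor, which .numpy() can then convert.
array = tensor_on_device.cpu().numpy()

# The original tensor has not moved: its device is unchanged.
print(tensor_on_device.device)
print(type(array), array.tolist())
```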