GPU and Device Management in PyTorch

The Story: Your Tensor’s Magical Moving Day 🚚

Imagine your tensor is a tiny worker bee 🐝 living in a big city. The city has different neighborhoods:

  • CPU Land – A quiet suburb where everyone can live (your regular computer memory)
  • CUDA City – A super-fast downtown with skyscrapers (NVIDIA GPU)
  • MPS Village – A cozy Apple neighborhood (Apple Silicon GPU)
  • Meta Cloud – A magical place where things exist but take no space (for planning)

Your tensor worker bee can live in any of these places, but to work together, bees must be in the same neighborhood!


1. Tensor Device Placement

What is a “Device”?

A device is simply where your tensor lives in your computer’s memory.

Think of it like choosing which room to put your toys in:

  • Some toys go in the living room (CPU)
  • Some toys go in the game room (GPU)

Creating Tensors on Specific Devices

import torch

# Lives in CPU Land (default)
cpu_tensor = torch.tensor([1, 2, 3])

# Lives in CUDA City (GPU)
gpu_tensor = torch.tensor([1, 2, 3],
                          device='cuda')

# Lives in MPS Village (Apple GPU)
mps_tensor = torch.tensor([1, 2, 3],
                          device='mps')

Checking Where Your Tensor Lives

my_tensor = torch.tensor([1, 2, 3])
print(my_tensor.device)
# Output: cpu

gpu_tensor = my_tensor.to('cuda')
print(gpu_tensor.device)
# Output: cuda:0

💡 Simple Rule: The :0 means “first GPU”. If you have two GPUs, they’re cuda:0 and cuda:1.
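A quick sketch to list every visible CUDA device with its index (it assumes at least one NVIDIA GPU is visible):

# Print each CUDA device index with its name
for i in range(torch.cuda.device_count()):
    print(f"cuda:{i} -> {torch.cuda.get_device_name(i)}")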


2. Moving Tensors Between Devices

The Golden Rule ⚠️

Tensors must be on the SAME device to work together!

This is like saying: “Two people can only high-five if they’re in the same room.”

# ❌ THIS BREAKS!
cpu_tensor = torch.tensor([1, 2, 3])
gpu_tensor = torch.tensor([4, 5, 6],
                          device='cuda')
# result = cpu_tensor + gpu_tensor
# ERROR! Different devices!

# ✅ THIS WORKS!
cpu_tensor = cpu_tensor.to('cuda')
result = cpu_tensor + gpu_tensor
# Both in CUDA City now!

Three Ways to Move Your Tensor

my_tensor = torch.tensor([1, 2, 3])

# Method 1: .to() - Most flexible
gpu_tensor = my_tensor.to('cuda')

# Method 2: .cuda() - Quick shortcut
gpu_tensor = my_tensor.cuda()

# Method 3: .cpu() - Back to CPU
back_to_cpu = gpu_tensor.cpu()
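Bonus: .to() can move a tensor and change its dtype in one call. A small sketch (assumes a CUDA device is present):

# Move to GPU and convert to float16 in a single step
half_gpu = my_tensor.to('cuda', torch.float16)
print(half_gpu.device, half_gpu.dtype)
# cuda:0 torch.float16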

The “Copy vs Move” Secret

original = torch.tensor([1, 2, 3])
moved = original.to('cuda')

# original still exists on CPU!
# moved is a NEW copy on GPU!

🎯 Key Insight: .to() creates a COPY. Your original tensor stays where it was.
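One subtlety: if the tensor is already on the requested device (with the same dtype), .to() skips the copy and hands back the very same tensor:

# No-op move: same device, same dtype
same = original.to('cpu')
print(same is original)  # True — no copy was made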


3. CUDA Device Management

Is CUDA Available?

Before moving to GPU, always ask: “Is the GPU home open?”

# Check if CUDA is available
if torch.cuda.is_available():
    print("GPU ready!")
    device = 'cuda'
else:
    print("Using CPU")
    device = 'cpu'

How Many GPUs Do I Have?

gpu_count = torch.cuda.device_count()
print(f"You have {gpu_count} GPUs!")

Which GPU Am I Using?

# Get current GPU index
current = torch.cuda.current_device()
print(f"Using GPU #{current}")

# Get GPU name
name = torch.cuda.get_device_name(0)
print(f"GPU name: {name}")

Memory Management

GPUs have limited memory. Here’s how to check:

# Memory actually used by tensors (bytes)
allocated = torch.cuda.memory_allocated()
# Memory held by PyTorch's caching allocator
reserved = torch.cuda.memory_reserved()

print(f"Allocated: {allocated / 1e9:.2f} GB")
print(f"Reserved: {reserved / 1e9:.2f} GB")

# Release cached, unused memory back to the GPU driver
torch.cuda.empty_cache()
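To see the high-water mark of a workload, PyTorch also tracks peak usage. A minimal sketch:

# Measure peak memory used by a chunk of work
torch.cuda.reset_peak_memory_stats()
x = torch.randn(4096, 4096, device='cuda')
y = x @ x
peak = torch.cuda.max_memory_allocated()
print(f"Peak: {peak / 1e9:.2f} GB")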

4. Multi-GPU Setup Basics

Choosing a Specific GPU

# Method 1: Specify in .to()
tensor_on_gpu1 = my_tensor.to('cuda:1')

# Method 2: Set default device
torch.cuda.set_device(1)
# Now all new CUDA tensors go to GPU 1
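There is also a scoped, context-manager form that switches the default GPU only inside the with block (this sketch assumes you have at least two GPUs):

# Method 3: temporary switch via context manager
with torch.cuda.device(1):
    t = torch.tensor([1, 2, 3], device='cuda')
print(t.device)  # cuda:1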

The Device-Agnostic Pattern ✨

Write code that works everywhere:

# Smart device selection
device = torch.device(
    'cuda' if torch.cuda.is_available()
    else 'cpu'
)

# Create tensor on best available device
my_tensor = torch.tensor([1, 2, 3],
                         device=device)
model = MyModel().to(device)

graph TD
    A[Start] --> B{CUDA Available?}
    B -->|Yes| C[Use cuda]
    B -->|No| D{MPS Available?}
    D -->|Yes| E[Use mps]
    D -->|No| F[Use cpu]
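On PyTorch 2.0+, you can even make the chosen device the default for all new tensors; a sketch, assuming torch.set_default_device is available in your version:

# All tensors created after this call default to `device`
torch.set_default_device(device)
auto_tensor = torch.tensor([1, 2, 3])
print(auto_tensor.device)  # cuda:0 if a GPU was found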

5. MPS for Apple Silicon

What is MPS?

MPS = Metal Performance Shaders

It’s Apple’s way to use the GPU on M1/M2/M3 chips!

Checking MPS Availability

# Is MPS available?
if torch.backends.mps.is_available():
    device = torch.device('mps')
    print("Using Apple GPU!")
else:
    device = torch.device('cpu')
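If is_available() returns False on a Mac, torch.backends.mps.is_built() tells you whether your PyTorch build includes MPS support at all:

if not torch.backends.mps.is_available():
    if not torch.backends.mps.is_built():
        print("This PyTorch build has no MPS support")
    else:
        print("MPS is built, but no compatible macOS/GPU found")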

Using MPS

# Create tensor on Apple GPU
mps_tensor = torch.tensor([1, 2, 3],
                          device='mps')

# Move existing tensor
cpu_tensor = torch.tensor([4, 5, 6])
mps_tensor = cpu_tensor.to('mps')

The Universal Device Selector

def get_device():
    if torch.cuda.is_available():
        return torch.device('cuda')
    elif torch.backends.mps.is_available():
        return torch.device('mps')
    else:
        return torch.device('cpu')

device = get_device()

6. Meta Device

What is the Meta Device?

The meta device is like a blueprint.

Imagine you want to build a huge LEGO castle:

  • Instead of buying all the bricks first…
  • You draw a plan showing how big it will be
  • Meta tensors are just the plan, not the actual bricks!

Why Use Meta?

  • Save Memory: Plan big models without using RAM
  • Check Shapes: See if your model fits before building it

Creating Meta Tensors

# Create a "ghost" tensor - shape only!
meta_tensor = torch.empty(
    1000, 1000,
    device='meta'
)

# It has shape but no memory!
print(meta_tensor.shape)  # torch.Size([1000, 1000])
print(meta_tensor.device) # meta
# But NO actual numbers inside!

Practical Example: Testing Model Size

# Test if a huge model fits
with torch.device('meta'):
    huge_model = BigModel()

# Check parameter count without
# using any real memory!
params = sum(
    p.numel()
    for p in huge_model.parameters()
)
print(f"Model has {params:,} parameters")

Quick Reference Table

| Device | When to Use   | Check Availability                 |
|--------|---------------|------------------------------------|
| cpu    | Always works  | Always available                   |
| cuda   | NVIDIA GPU    | torch.cuda.is_available()          |
| mps    | Apple M-chip  | torch.backends.mps.is_available()  |
| meta   | Planning only | Always available                   |

The Complete Smart Device Pattern

import torch

def smart_device():
    """Pick the best available device."""
    if torch.cuda.is_available():
        return 'cuda'
    if torch.backends.mps.is_available():
        return 'mps'
    return 'cpu'

# Use it everywhere!
device = smart_device()
data = torch.randn(100, device=device)
model = MyModel().to(device)

Summary: Your Tensor’s Home Address 🏠

graph TD
    T[Your Tensor] --> Q{Where to live?}
    Q --> CPU[CPU: Safe, Slow]
    Q --> CUDA[CUDA: Fast, NVIDIA]
    Q --> MPS[MPS: Fast, Apple]
    Q --> META[Meta: Planning Only]
    CPU --> RULE[Same device = Can work together!]
    CUDA --> RULE
    MPS --> RULE

Remember:

  1. ✅ Check device availability before using
  2. ✅ Keep tensors on the same device for operations
  3. ✅ Use .to(device) to move tensors
  4. ✅ Write device-agnostic code for portability
  5. ✅ Use meta device for planning big models

You’re now ready to manage your tensors across any device! 🚀
