# Neural Network Layers: nn.Module Mastery
## The LEGO Factory Analogy

Imagine you're the owner of a magical LEGO factory. Each room in your factory does one special job: some rooms paint bricks, others snap them together, and some check if everything looks right. In PyTorch, nn.Module is like the blueprint for building these rooms. Every piece of your neural network is a room (a module), and together they create amazing things!
## What is nn.Module?
Think of nn.Module as the parent class for everything in your neural network. Just like how all dogs are animals, all neural network pieces are Modules.
```python
import torch.nn as nn

# Every layer inherits from nn.Module
class MyRoom(nn.Module):
    def __init__(self):
        super().__init__()
```
Why does this matter?
- PyTorch can find all your learnable weights automatically
- You get save/load for free
- Training mode switches work everywhere
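
To make these benefits concrete, here is a minimal sketch. The `TinyRoom` module and its single `nn.Linear` layer are illustrative assumptions, not something from the examples above:

```python
import torch
import torch.nn as nn

class TinyRoom(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)  # one learnable layer

room = TinyRoom()

# 1. PyTorch finds the learnable weights automatically
print([name for name, _ in room.named_parameters()])  # ['fc.weight', 'fc.bias']

# 2. Save/load comes for free via the state dict
torch.save(room.state_dict(), 'tiny_room.pth')
room.load_state_dict(torch.load('tiny_room.pth'))

# 3. Mode switches propagate to every submodule
room.train()  # room.training is True, and so is room.fc.training
room.eval()   # both flip to False
```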
## Creating Custom Modules

Let's build our first factory room! A custom module is like designing your own LEGO brick.
```python
import torch
import torch.nn as nn

class PaintRoom(nn.Module):
    def __init__(self, in_colors, out_colors):
        super().__init__()
        # This is our painting machine
        self.painter = nn.Linear(in_colors, out_colors)

    def forward(self, brick):
        # Paint the brick and return it
        return self.painter(brick)
```
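
Here is a quick usage sketch; the sizes 10 and 3 and the batch of 8 are arbitrary assumptions for illustration:

```python
room = PaintRoom(in_colors=10, out_colors=3)
brick = torch.randn(8, 10)   # a batch of 8 "bricks" with 10 features each
painted = room(brick)        # calls forward() under the hood
print(painted.shape)         # torch.Size([8, 3])
```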
The Recipe:

- Inherit from `nn.Module`
- Call `super().__init__()` first
- Define your layers in `__init__`
- Write the `forward` method
## The Forward Method
The forward method is the conveyor belt of your factory room. When a brick comes in, what happens to it?
```python
class MagicRoom(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(10, 20)
        self.layer2 = nn.Linear(20, 5)

    def forward(self, x):
        # Step 1: First machine
        x = self.layer1(x)
        # Step 2: Add some magic (ReLU)
        x = torch.relu(x)
        # Step 3: Second machine
        x = self.layer2(x)
        return x
```
Pro Tip: Never call `forward()` directly! Use `model(input)` instead. PyTorch does extra magic behind the scenes (like running registered hooks).
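
For example, using the MagicRoom defined above (the batch size of 4 is an arbitrary assumption):

```python
model = MagicRoom()
x = torch.randn(4, 10)

out = model(x)            # preferred: __call__ runs hooks, then forward()
# out = model.forward(x)  # works, but silently skips PyTorch's hook machinery
```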
```mermaid
graph TD
    A[Input Brick] --> B[layer1]
    B --> C[ReLU Magic]
    C --> D[layer2]
    D --> E[Output Brick]
```
## Parameters and Buffers
Your factory has two types of important things:
### Parameters (Learnable Weights)
These are the knobs that PyTorch adjusts during training.
```python
class SmartRoom(nn.Module):
    def __init__(self):
        super().__init__()
        # This creates a learnable parameter
        self.weight = nn.Parameter(
            torch.randn(3, 3)
        )
```
### Buffers (Fixed Values)
These are values you want to save but NOT train.
```python
class SmartRoom(nn.Module):
    def __init__(self):
        super().__init__()
        # This is saved but not trained
        self.register_buffer(
            'my_constant',
            torch.tensor([1.0, 2.0, 3.0])
        )
```
Quick Comparison:
| Type | Learnable? | Saved? | Example |
|---|---|---|---|
| Parameter | ✅ Yes | ✅ Yes | Weights |
| Buffer | ❌ No | ✅ Yes | Running mean |
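
To see the difference in code, here is a sketch that combines the two SmartRoom snippets above into one module and inspects it:

```python
class SmartRoom(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(3, 3))  # learnable
        self.register_buffer(
            'my_constant', torch.tensor([1.0, 2.0, 3.0])  # saved, not trained
        )

room = SmartRoom()

# Only the parameter is reported to the optimizer...
print([name for name, _ in room.named_parameters()])  # ['weight']

# ...but both appear in the saved state
print(list(room.state_dict().keys()))  # ['weight', 'my_constant']
```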
## Module State Management
Your factory needs to remember things! State management is like having a save file for your game.
### Saving Your Factory
```python
# Save everything
torch.save(model.state_dict(), 'factory.pth')
```
### Loading Your Factory
```python
# Load it back
model.load_state_dict(
    torch.load('factory.pth')
)
```
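
If the checkpoint was saved on a GPU and you are loading it on a CPU-only machine, you can pass `map_location` to `torch.load`. This is an optional extra, not part of the example above:

```python
# Load a GPU-saved checkpoint onto the CPU
state = torch.load('factory.pth', map_location='cpu')
model.load_state_dict(state)
```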
### Peeking Inside
```python
# See all parameters
for name, param in model.named_parameters():
    print(f"{name}: {param.shape}")

# See all modules
for name, module in model.named_modules():
    print(f"{name}: {type(module)}")
```
## Train and Eval Modes

Your factory has two modes, like a robot with a "learning" switch and a "working" switch.
### Training Mode
```python
model.train()  # Learning mode ON
```
- Dropout is active (randomly drops neurons)
- BatchNorm uses batch statistics
### Evaluation Mode
```python
model.eval()  # Working mode ON
```
- Dropout is disabled (all neurons work)
- BatchNorm uses saved statistics
```python
# Always do this for testing!
model.eval()
with torch.no_grad():
    output = model(test_data)
```
⚠️ Warning: Forgetting `model.eval()` before testing is a common bug that causes weird results!
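
To see the difference concretely, here is a small sketch with a standalone `nn.Dropout` layer (the probability 0.5 and the input of ones are arbitrary choices for illustration):

```python
drop = nn.Dropout(p=0.5)
x = torch.ones(1, 8)

drop.train()
print(drop(x))  # roughly half the entries zeroed, survivors scaled to 2.0

drop.eval()
print(drop(x))  # identity: all ones, nothing is dropped
```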
## Sequential and Containers
What if you want to connect many rooms in a line? Use Sequential!
### nn.Sequential - The Assembly Line
```python
# Quick way to stack layers
model = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 5)
)

# Input flows through in order
x = torch.randn(8, 10)
output = model(x)
```
```mermaid
graph TD
    A[Input] --> B[Linear 10→20]
    B --> C[ReLU]
    C --> D[Linear 20→5]
    D --> E[Output]
```
### Named Sequential
```python
from collections import OrderedDict

# Give names to your layers
# (nn.Sequential needs an OrderedDict to accept named layers)
model = nn.Sequential(OrderedDict([
    ('hidden', nn.Linear(10, 20)),
    ('activation', nn.ReLU()),
    ('output', nn.Linear(20, 5))
]))

# Access by name
print(model.hidden)
```
## ModuleList - The Flexible Stack
When you need a list of layers but want PyTorch to track them:
```python
class FlexibleFactory(nn.Module):
    def __init__(self, num_rooms):
        super().__init__()
        # ModuleList tracks all layers
        self.rooms = nn.ModuleList([
            nn.Linear(10, 10)
            for _ in range(num_rooms)
        ])

    def forward(self, x):
        for room in self.rooms:
            x = room(x)
        return x
```
❌ Don't use regular Python lists! PyTorch won't find those parameters.
```python
# WRONG - Parameters are invisible!
self.layers = [nn.Linear(10, 10)]

# RIGHT - Parameters are tracked!
self.layers = nn.ModuleList([nn.Linear(10, 10)])
```
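
You can verify this yourself. The two toy classes below (`BadFactory` and `GoodFactory` are made-up names for illustration) differ only in how the list is stored:

```python
class BadFactory(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = [nn.Linear(10, 10)]  # plain list: not registered

class GoodFactory(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(10, 10)])  # registered properly

print(len(list(BadFactory().parameters())))   # 0 - an optimizer would see nothing
print(len(list(GoodFactory().parameters())))  # 2 - the weight and bias
```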
## ModuleDict - The Named Collection
When you want to access layers by name instead of position:
```python
class SmartFactory(nn.Module):
    def __init__(self):
        super().__init__()
        self.rooms = nn.ModuleDict({
            'paint': nn.Linear(10, 20),
            'polish': nn.Linear(20, 20),
            'ship': nn.Linear(20, 5)
        })

    def forward(self, x, room_name):
        # Use any room by name!
        return self.rooms[room_name](x)
```
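
A quick usage sketch (the batch size of 2 is an arbitrary assumption; 10 matches the 'paint' layer's input size):

```python
factory = SmartFactory()
x = torch.randn(2, 10)
painted = factory(x, 'paint')   # routes the input through the 'paint' layer
print(painted.shape)            # torch.Size([2, 20])
```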
When to use which?
| Container | Use When... |
|---|---|
| Sequential | Layers flow in order |
| ModuleList | Need index access |
| ModuleDict | Need name access |
## Putting It All Together

Here's a complete factory that uses everything we learned:
```python
class UltimateFactory(nn.Module):
    def __init__(self):
        super().__init__()
        # Sequential for main flow
        self.main_line = nn.Sequential(
            nn.Linear(784, 256),
            nn.ReLU(),
            nn.Linear(256, 128)
        )
        # ModuleList for repeating blocks
        self.extra_rooms = nn.ModuleList([
            nn.Linear(128, 128)
            for _ in range(3)
        ])
        # ModuleDict for named outputs
        self.outputs = nn.ModuleDict({
            'classify': nn.Linear(128, 10),
            'detect': nn.Linear(128, 4)
        })
        # Buffer for tracking how many forward passes we've run
        self.register_buffer(
            'forward_count',
            torch.tensor(0)
        )

    def forward(self, x, task='classify'):
        self.forward_count += 1  # updated, saved, but never trained
        x = self.main_line(x)
        for room in self.extra_rooms:
            x = torch.relu(room(x))
        return self.outputs[task](x)
```
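
A usage sketch (the batch size of 32 is an arbitrary assumption; 784 matches the input size of `main_line`):

```python
factory = UltimateFactory()
batch = torch.randn(32, 784)

logits = factory(batch, task='classify')   # shape: (32, 10)
boxes = factory(batch, task='detect')      # shape: (32, 4)
print(factory.forward_count)               # tensor(2) - two forward passes so far
```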
## Key Takeaways

- `nn.Module` is the foundation of all neural network components
- Always call `super().__init__()` in your custom modules
- The `forward` method defines data flow
- Parameters learn, Buffers don't
- Use `train()` and `eval()` to switch modes
- Sequential = ordered layers, ModuleList = indexed layers, ModuleDict = named layers
## Remember This!

```
Every neural network layer is a Module
├── Custom modules inherit from nn.Module
├── forward() defines what happens to data
├── Parameters are learned, Buffers are saved
├── train() vs eval() changes behavior
└── Containers organize your modules
    ├── Sequential → In order
    ├── ModuleList → By index
    └── ModuleDict → By name
```
You're now ready to build any neural network architecture!