What is vectorization in R?

Vectorization performs operations on entire vectors at once instead of looping element by element in R code. It is often 10-100x faster because the loop over elements runs inside R's compiled C code rather than in the interpreter.

How do you profile R code?

Use system.time() for quick timing, Rprof() for detailed analysis, or profvis for visual flame graphs showing where time is spent.

Why pre-allocate memory in R?

Growing vectors in loops forces R to copy data repeatedly. Pre-allocating with numeric() or vector() avoids this and runs much faster.

R Performance: Speed Up Your Code | R Guide

🚀 Advanced R Programming: Performance

Imagine your R code is a chef in a kitchen. A fast chef knows where every ingredient is, uses the right tools, and never wastes a single move. Today, we’ll teach your R code to cook like a pro!

🍳 The Kitchen Analogy

Think of your computer like a kitchen:

Memory = Your counter space (where you prep food)
Vectorization = Using a food processor instead of chopping by hand
Optimization = Finding the fastest recipe
Profiling = Using a timer to see what takes longest

Let’s make your R code a master chef!

📦 Memory Management

What is Memory?

Memory is like your kitchen counter. You only have so much space!

Simple Example:

If you put too many bowls on the counter, you can’t work
Your computer works the same way with data

Why Does Memory Matter?

# Bad: Making copies everywhere
x <- 1:1000000
y <- x      # Looks innocent...
y[1] <- 0   # R copies the WHOLE thing!

When you change y, R makes a complete copy of x. That’s like photocopying a whole cookbook just to fix one typo!
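Here is a minimal sketch of that copy-on-modify behavior, using a small vector so the result is easy to read. The assignment itself is cheap; the copy only happens when y is modified, which is why x stays untouched.

```r
# Copy-on-modify: assignment shares memory, modification triggers the copy
x <- c(1, 2, 3, 4, 5)
y <- x        # no copy yet: x and y refer to the same data
y[1] <- 0     # now R copies the data, so only y changes

x  # 1 2 3 4 5 (unchanged)
y  # 0 2 3 4 5
```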

Smart Memory Tips

graph TD
    A["Create Data"] --> B{Need to Modify?}
    B -->|Yes| C["Modify In-Place"]
    B -->|No| D["Share Memory"]
    C --> E["Efficient!"]
    D --> E

Tip 1: Remove What You Don’t Need

# Free up space!
rm(big_data)
gc()  # Optional: R cleans up automatically; gc() forces a cleanup and reports memory use
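To see the effect for yourself, here is a small sketch that measures memory in use before and after removing an object. The object big_data is just a throwaway example, and the exact numbers depend on your R session.

```r
# Create a throwaway object of about 8 MB (1 million doubles)
big_data <- numeric(1e6)

# gc() returns a matrix; the "Vcells" row tracks memory used by vector data
before <- gc()["Vcells", "used"]

rm(big_data)                       # drop the name so the memory can be reclaimed
after <- gc()["Vcells", "used"]    # collect garbage and measure again

before - after                     # should be positive: fewer vector cells in use
```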

Tip 2: Check Your Memory Usage

# How big is my data?
object.size(my_data)

# See all objects
ls()
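Combining the two calls tells you which objects take up the most counter space. The helper below is a hypothetical sketch (object_sizes is not a built-in function); it runs on a small demo environment so it does not depend on what is in your workspace.

```r
# Hypothetical helper: size in bytes of every object in an environment, largest first
object_sizes <- function(env = globalenv()) {
  nms <- ls(env)
  sizes <- vapply(
    nms,
    function(nm) as.numeric(object.size(get(nm, envir = env))),
    numeric(1)
  )
  sort(sizes, decreasing = TRUE)
}

# Demo environment with one small and one large object
demo_env <- new.env()
assign("small", 1:10, envir = demo_env)
assign("large", numeric(1e5), envir = demo_env)

object_sizes(demo_env)   # "large" is listed first
```

Calling object_sizes() with no arguments inspects your global workspace instead.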

Tip 3: Pre-allocate Space

# Bad: Growing a vector
result <- c()
for(i in 1:1000) {
  result <- c(result, i)  # Slow!
}

# Good: Pre-allocate
result <- numeric(1000)
for(i in 1:1000) {
  result[i] <- i  # Fast!
}

🎯 Key Insight: Pre-allocation is like setting out all your bowls before cooking. No running to the cabinet mid-recipe!
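If you want to check the difference on your own machine, a rough sketch like the one below wraps both versions in functions and times them. The function names grow and prealloc are made up for this example, and the timings will vary from machine to machine.

```r
n <- 20000

# Grows the vector one element at a time (copies on every step)
grow <- function(n) {
  result <- c()
  for (i in seq_len(n)) result <- c(result, i)
  result
}

# Sets out all the space first, then fills it in
prealloc <- function(n) {
  result <- numeric(n)
  for (i in seq_len(n)) result[i] <- i
  result
}

system.time(grown <- grow(n))
system.time(filled <- prealloc(n))

# Same values either way (grown is integer, filled is double)
all.equal(grown, filled)
```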

⚡ Vectorization Benefits

What is Vectorization?

Imagine you need to peel 100 potatoes:

Loop way: Peel one, put down, pick up next, peel…
Vectorized way: Machine peels all 100 at once!

R is AMAZING at doing things all at once.

The Magic of Vectors

# Slow loop way (peeling one by one)
numbers <- 1:1000000
result <- numeric(1000000)
for(i in 1:1000000) {
  result[i] <- numbers[i] * 2
}

# Fast vectorized way (all at once!)
result <- numbers * 2

The vectorized way is often 10-100x faster! The exact speedup depends on your data, your machine, and your R version.
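Rather than taking the number on faith, you can time both versions and confirm they give the same answer. This is a sketch, with loop_double as a made-up helper name.

```r
numbers <- 1:1000000

# Loop version wrapped in a function so it is easy to time
loop_double <- function(v) {
  out <- numeric(length(v))
  for (i in seq_along(v)) out[i] <- v[i] * 2
  out
}

system.time(by_loop <- loop_double(numbers))
system.time(by_vector <- numbers * 2)

identical(by_loop, by_vector)   # both approaches produce the same values
```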

Why Vectorization is Fast

graph TD
    A["Loop"] --> B["Check i"]
    B --> C["Get value"]
    C --> D["Calculate"]
    D --> E["Store"]
    E --> F["Repeat 1M times"]

    G["Vectorized"] --> H["One call into compiled C code"]
    H --> I["C loop processes every element"]
    I --> J["Done!"]

Common Vectorized Functions

Instead of a Loop That...	Use This
Adds up values one by one	`sum(x)`
Averages values one by one	`mean(x)`
Compares each element	`x > 5`
Does math on each element	`x * 2`

Real Example:

# Find all numbers > 50
numbers <- 1:100

# Loop (slow)
big_ones <- c()
for(n in numbers) {
  if(n > 50) big_ones <- c(big_ones, n)
}

# Vectorized (fast!)
big_ones <- numbers[numbers > 50]
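The same logical vector can do more than filter. A few other loop-free idioms, sketched here with base R functions only:

```r
numbers <- 1:100

# TRUE counts as 1, so summing a comparison counts the matches
how_many <- sum(numbers > 50)

# The mean of a comparison is the proportion of matches
share <- mean(numbers > 50)

# ifelse() picks a value for every element at once
labels <- ifelse(numbers > 50, "big", "small")

# cumsum() replaces a loop that keeps a running total
running_total <- cumsum(numbers)

how_many            # 50
share               # 0.5
running_total[100]  # 5050
```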

🎯 Key Insight: If you’re writing a loop in R, ask yourself: “Can I do this all at once?”

🏎️ Performance Optimization

The Golden Rules

Measure first, optimize second
Don’t optimize code that runs once
Make it work, then make it fast

Common Speed Killers

graph TD
    A["Slow Code"] --> B["Growing Objects"]
    A --> C["Unnecessary Copies"]
    A --> D["Loops over Vectors"]
    A --> E["Reading Files Repeatedly"]

Quick Wins

1. Use Built-in Functions

# Slow
my_sum <- 0
for(x in numbers) my_sum <- my_sum + x

# Fast (built-in!)
my_sum <- sum(numbers)

2. Avoid Growing Objects

# Bad: List grows each time
results <- list()
for(i in 1:1000) {
  results[[i]] <- do_something(i)
}

# Better: Use lapply
results <- lapply(1:1000, do_something)
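If you do need an explicit loop, you can still avoid growing the list by creating all the slots up front with vector("list", n). In this sketch, do_something is a hypothetical stand-in for the placeholder used above.

```r
# Hypothetical stand-in for the do_something() placeholder
do_something <- function(i) i^2

n <- 1000

# Pre-allocated list: n empty (NULL) slots, filled in place
results_loop <- vector("list", n)
for (i in seq_len(n)) {
  results_loop[[i]] <- do_something(i)
}

# lapply builds the same list without any bookkeeping
results_apply <- lapply(seq_len(n), do_something)

identical(results_loop, results_apply)   # TRUE
```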

3. Read Data Once

# Bad: Reading in a loop
for(i in 1:10) {
  data <- read.csv("file.csv")  # Re-reads!
}

# Good: Read once, use many
data <- read.csv("file.csv")
for(i in 1:10) {
  process(data)  # Reuses the data already in memory
}

The apply Family

Function	When to Use
`lapply`	Apply to each list item
`sapply`	Same, but simplify result
`vapply`	Same, but you declare the output type
`mapply`	Multiple inputs

# Square each number
numbers <- list(1, 2, 3, 4, 5)
squares <- lapply(numbers, function(x) x^2)
# Result: list(1, 4, 9, 16, 25)
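To compare the four functions from the table side by side, here is a short sketch that applies the same squaring function with each of them:

```r
numbers <- 1:5

as_list   <- lapply(numbers, function(x) x^2)              # always returns a list
as_vector <- sapply(numbers, function(x) x^2)              # simplifies to a numeric vector
checked   <- vapply(numbers, function(x) x^2, numeric(1))  # errors unless each result is one number
pair_sums <- mapply(function(a, b) a + b, 1:3, 4:6)        # walks two inputs in parallel

as_vector  # 1 4 9 16 25
pair_sums  # 5 7 9
```

vapply is usually the safer choice inside functions, because a surprising result fails loudly instead of silently changing the output type.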

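🎯 Key Insight: Many of R’s built-in functions, such as sum() and mean(), are written in C, so they’re MUCH faster than doing the same work in an R loop. The apply family mostly makes code cleaner and safer; its speed is usually close to a well-written for loop.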

🔍 Profiling Code

What is Profiling?

Profiling = Finding the slow parts of your code

Like using a stopwatch to time each step of cooking!

The Simple Way: system.time()

# How long does this take?
system.time({
  result <- sum(1:10000000)
})
# Example output (your timings will differ):
#  user  system elapsed
#  0.05    0.00    0.05

user: CPU time spent running your R code
system: CPU time the operating system spent on behalf of your code (for example, reading files)
elapsed: Real wall-clock time from start to finish
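The timing result can also be stored and read by name, which is handy when you want to log or compare numbers. A small sketch (note that the stored names are user.self and sys.self, even though they print as user and system):

```r
timing <- system.time({
  total <- sum(sqrt(seq_len(1e6)))
})

timing[["elapsed"]]     # wall-clock seconds
timing[["user.self"]]   # CPU seconds spent in R code
timing[["sys.self"]]    # CPU seconds spent in system calls
```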

The Pro Way: Rprof()

# Start profiling
Rprof("my_profile.out")

# Run your code
my_slow_function()

# Stop profiling
Rprof(NULL)

# See results
summaryRprof("my_profile.out")
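Here is a self-contained version you can run as is. The function busy_work is a hypothetical workload that keeps the CPU busy for about half a second, because Rprof() takes samples at intervals and needs code that runs long enough to be sampled. The profile is written to a temporary file.

```r
# Hypothetical workload: keeps the CPU busy for about half a second
busy_work <- function(seconds = 0.5) {
  start <- proc.time()[["elapsed"]]
  total <- 0
  while (proc.time()[["elapsed"]] - start < seconds) {
    total <- total + sum(sqrt(runif(10000)))
  }
  total
}

profile_file <- tempfile(fileext = ".out")

Rprof(profile_file)       # start sampling the call stack
invisible(busy_work())
Rprof(NULL)               # stop sampling

profile <- summaryRprof(profile_file)
head(profile$by.self)     # functions ranked by time spent in their own code
```

The by.total element of the same result ranks functions by time including everything they call.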

Visual Profiling with profvis

# Install once
install.packages("profvis")

# Profile with pretty pictures!
library(profvis)
profvis({
  # Your code here
  data <- read.csv("big_file.csv")
  result <- process(data)
})

This opens an interactive flame graph showing where time is spent, line by line!

Reading Profile Results

graph TD
    A["Total Time: 10 sec"] --> B["read_data: 6 sec"]
    A --> C["process: 3 sec"]
    A --> D["save: 1 sec"]
    B --> E["Focus here first!"]

The 80/20 Rule: Usually 20% of your code takes 80% of the time. Find that 20%!

Benchmarking with microbenchmark

library(microbenchmark)

# Compare two approaches
microbenchmark(
  loop = {
    s <- 0
    for(i in 1:1000) s <- s + i
  },
  vectorized = sum(1:1000),
  times = 100
)

This runs each version 100 times and reports summary statistics for the timings, such as the minimum, median, mean, and maximum!
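If microbenchmark is not installed, the idea of repeated timing can be sketched with base R alone. The helper time_repeatedly below is hypothetical, and system.time() has much coarser resolution than microbenchmark, so treat this as a rough stand-in rather than a replacement.

```r
# Hypothetical helper: time a function several times using base R only
time_repeatedly <- function(f, times = 20) {
  vapply(
    seq_len(times),
    function(i) system.time(f())[["elapsed"]],
    numeric(1)
  )
}

n <- 1e5

loop_sum <- function() {
  s <- 0
  for (i in seq_len(n)) s <- s + i
  s
}
vector_sum <- function() sum(as.numeric(seq_len(n)))

loop_times   <- time_repeatedly(loop_sum)
vector_times <- time_repeatedly(vector_sum)

summary(loop_times)     # min, median, mean, max of the loop timings
summary(vector_times)   # same statistics for the vectorized version
```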

🎯 Key Insight: Never guess where your code is slow. Measure it!

🎓 Putting It All Together

The Performance Checklist

graph TD
    A["Slow Code?"] --> B["Profile First"]
    B --> C["Find Bottleneck"]
    C --> D{What's Slow?}
    D -->|Memory| E["Check Copies"]
    D -->|Loops| F["Try Vectorization"]
    D -->|Functions| G["Use Built-ins"]
    E --> H["Optimize"]
    F --> H
    G --> H
    H --> I["Profile Again"]
    I --> J{Fast Enough?}
    J -->|No| B
    J -->|Yes| K["Done!"]

Real-World Example

Before (Slow):

# Process 1 million rows
result <- c()
for(i in 1:nrow(data)) {
  if(data$value[i] > 100) {
    result <- c(result, data$value[i] * 2)
  }
}

After (Fast):

# Vectorized approach
result <- data$value[data$value > 100] * 2

Speed Improvement: often 100x faster or more on large data! The gap widens as the data grows, so measure it on your own data.
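To try the before and after yourself, you need some data. The sketch below builds a hypothetical data frame of random values (smaller than 1 million rows so the slow version finishes quickly) and checks that both approaches agree.

```r
set.seed(42)

# Hypothetical sample data: 5000 random values between 0 and 200
data <- data.frame(value = runif(5000, min = 0, max = 200))

# Before: grow the result inside a loop
slow_result <- c()
for (i in seq_len(nrow(data))) {
  if (data$value[i] > 100) {
    slow_result <- c(slow_result, data$value[i] * 2)
  }
}

# After: filter and multiply in one vectorized step
fast_result <- data$value[data$value > 100] * 2

identical(slow_result, fast_result)   # TRUE
```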

🌟 Summary

Topic	Key Takeaway
Memory	Pre-allocate, remove unused, avoid copies
Vectorization	Do things all at once, not one by one
Optimization	Use built-ins, avoid growing objects
Profiling	Measure first, then optimize

Your New Superpowers

✅ You know how to check memory usage
✅ You can write vectorized code
✅ You understand the apply family
✅ You can profile and find slow code

“The best code is code that doesn’t waste a single CPU cycle—just like the best chef doesn’t waste a single ingredient!”

Go forth and write FAST R code! 🚀
