Factors

🏷️ Factors in R: The Magic Labels That Organize Your World

The Story of the Label Maker

Imagine you have a big box of toy animals. You want to sort them into groups: Dogs, Cats, and Birds. You could write the name of each animal on a piece of paper… but that’s slow and messy!

What if you had a magic label maker? 🏷️

This label maker is special:

  • It only prints labels you tell it to make
  • It remembers all possible labels (even if you haven’t used some yet)
  • It can put labels in a special order (like “Small” before “Medium” before “Large”)

In R, this magic label maker is called a FACTOR.


🎯 What Are Factors?

A factor is R’s way of storing categories — things that belong to groups.

Think of sorting your toys:

  • Numbers are for counting: “I have 5 toys”
  • Text is for anything: “My favorite color is blue”
  • Factors are for groups: “This toy is a CAR, that toy is a DOLL”

# Regular text (character)
colors <- c("red", "blue", "red")

# Factor (special categories)
colors_factor <- factor(c("red", "blue", "red"))

Why use factors instead of text?

  • R knows all possible categories
  • R can put them in order
  • Values are stored as small integer codes, which can make grouping and counting cheaper on big data
  • Better for charts and analysis
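
To see what "R knows all possible categories" means in practice, here is a small sketch that peeks inside the colors_factor from above. The variable names (labels, codes, rebuilt, storage) are made up for this illustration.

```r
colors_factor <- factor(c("red", "blue", "red"))

# The labels are stored once, in sorted order
labels <- levels(colors_factor)      # "blue" "red"

# Each value is an integer position into that label list
codes <- as.integer(colors_factor)   # 2 1 2

# Looking the codes up in the labels recovers the original text
rebuilt <- labels[codes]             # "red" "blue" "red"

# The underlying storage type is integer, not character
storage <- typeof(colors_factor)     # "integer"
```

So a factor is really two things working together: a short list of labels, and a vector of integer codes pointing into that list.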

📦 Creating Factors

Method 1: The Basic Way

Use the factor() function. It’s like using your label maker for the first time!

# Create a factor from animal types
pets <- factor(c("dog", "cat", "dog", "bird", "cat"))

print(pets)
# [1] dog  cat  dog  bird cat
# Levels: bird cat dog

See those “Levels”? Those are ALL the possible labels your factor knows about. R found them automatically!


Method 2: Tell R What Labels to Expect

Sometimes you know what labels SHOULD exist, even if you don’t have them all yet.

# Survey about T-shirt sizes
# We only got "small" and "large" responses
sizes <- factor(
  c("small", "large", "small"),
  levels = c("small", "medium", "large", "xlarge")
)

print(sizes)
# [1] small large small
# Levels: small medium large xlarge

Even though nobody picked “medium” or “xlarge”, R remembers they exist!
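
One way to check this is to count the responses. The sketch below reuses the T-shirt example; size_levels, counts, with_typo, and n_missing are names chosen just for this illustration. It also shows the flip side: a value that is NOT in your list of levels gets stored as NA.

```r
size_levels <- c("small", "medium", "large", "xlarge")
sizes <- factor(c("small", "large", "small"), levels = size_levels)

# table() reports every level, including the ones with zero responses
counts <- table(sizes)
print(counts)

# A value that is not listed in `levels` is stored as NA (no error is raised)
with_typo <- factor(c("small", "huge"), levels = size_levels)
print(with_typo)
n_missing <- sum(is.na(with_typo))   # 1
```

This is handy for survey data: empty categories still show up in your counts, and unexpected answers stand out as NA.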


Method 3: Convert Existing Data

Already have text data? Turn it into a factor!

# Start with regular text
weather <- c("sunny", "rainy", "sunny", "cloudy")

# Convert to factor
weather_factor <- as.factor(weather)

print(weather_factor)
# [1] sunny  rainy  sunny  cloudy
# Levels: cloudy rainy sunny
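
Going the other way works too, with one well-known catch. The sketch below is an illustration with made-up data (the readings vector is hypothetical): converting a factor of number-like labels straight to numeric gives you the internal codes, not the numbers you see.

```r
weather_factor <- as.factor(c("sunny", "rainy", "sunny", "cloudy"))

# Factor back to plain text
back_to_text <- as.character(weather_factor)

# Hypothetical factor whose labels look like numbers
readings <- factor(c("10", "20", "10"))

# as.numeric() alone returns the integer codes
codes <- as.numeric(readings)                  # 1 2 1

# Convert to character first to get the actual values
values <- as.numeric(as.character(readings))   # 10 20 10
```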

📊 Factor Levels and Ordering

Understanding Levels

Levels are the complete list of possible categories.

Think of it like a menu at a restaurant:

  • The menu shows ALL items available
  • Your order shows what YOU picked

grades <- factor(c("A", "B", "A", "C"))

# See all levels (the menu)
levels(grades)
# [1] "A" "B" "C"

# Count how many levels
nlevels(grades)
# [1] 3

The Default Order Problem

By default, R puts levels in alphabetical order. But that’s not always what you want!

# T-shirt sizes
sizes <- factor(c("Medium", "Small", "Large"))

levels(sizes)
# [1] "Large" "Medium" "Small"
# 😱 Wrong order! Alphabetical, not size order!

This matters when you make charts — “Large” would appear before “Small”!


🎯 Creating Ordered Factors

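Tell R the correct order using the levels argument. This controls how the categories are listed, sorted, and plotted, although R still treats them as plain unordered categories: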

# Create factor with correct order
sizes <- factor(
  c("Medium", "Small", "Large"),
  levels = c("Small", "Medium", "Large")
)

levels(sizes)
# [1] "Small" "Medium" "Large"
# ✅ Now in the right order!
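
Here is a small sketch of where that level order shows up. It assumes the same sizes factor as above; sorted_sizes and count_order are names invented for this example.

```r
sizes <- factor(
  c("Medium", "Small", "Large"),
  levels = c("Small", "Medium", "Large")
)

# sort() follows the level order, not the alphabet
sorted_sizes <- sort(sizes)
print(sorted_sizes)

# table() lists its counts in level order too
count_order <- names(table(sizes))   # "Small" "Medium" "Large"
```

Plotting functions generally follow the same level order, which is why getting it right pays off in charts.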

Making Factors Truly Ordered (Ordinal)

For data where order means something (like ratings), use ordered = TRUE:

# Customer satisfaction ratings
rating <- factor(
  c("Good", "Bad", "Great", "Good"),
  levels = c("Bad", "OK", "Good", "Great"),
  ordered = TRUE
)

print(rating)
# [1] Good  Bad   Great Good
# Levels: Bad < OK < Good < Great

Now R understands that Great > Good > OK > Bad!

# You can even compare them!
rating[1] > rating[2]  # Is "Good" > "Bad"?
# [1] TRUE
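
A few more things an ordered factor lets you do, sketched with the same rating data. The names is_ord, at_least_good, best, and worst are only for illustration.

```r
rating <- factor(
  c("Good", "Bad", "Great", "Good"),
  levels = c("Bad", "OK", "Good", "Great"),
  ordered = TRUE
)

# Check whether a factor is ordered
is_ord <- is.ordered(rating)         # TRUE

# Compare every value against a level name
at_least_good <- rating >= "Good"    # TRUE FALSE TRUE TRUE

# Find the highest and lowest rating that was actually given
best <- max(rating)                  # Great
worst <- min(rating)                 # Bad
```

With a regular (unordered) factor, comparisons like > are not meaningful: R returns NA and warns you.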

🔧 Factor Manipulation

Changing Level Names

Rename your categories without changing the data:

# Original
status <- factor(c("Y", "N", "Y", "Y"))
levels(status)
# [1] "N" "Y"

# Rename levels
levels(status) <- c("No", "Yes")

print(status)
# [1] Yes No  Yes Yes
# Levels: No Yes

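Important: The renaming follows the ORDER of levels! The first new name replaces the first level, the second replaces the second, and so on, so always check levels() before renaming.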
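
To make the order rule concrete, here is a sketch of what goes wrong when the new names are listed in the wrong order, plus one way to avoid relying on position at all: passing levels and labels together to factor(). The names raw, flipped, and status_safe are invented for this example.

```r
raw <- c("Y", "N", "Y", "Y")

# Pitfall: levels are "N", "Y", so this turns N into "Yes" and Y into "No"
flipped <- factor(raw)
levels(flipped) <- c("Yes", "No")
print(flipped)   # the answers are now reversed!

# Safer: spell out each old value next to its new label
status_safe <- factor(raw, levels = c("N", "Y"), labels = c("No", "Yes"))
print(status_safe)
```

With levels and labels side by side, you can read the mapping directly: "N" becomes "No" and "Y" becomes "Yes".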


Dropping Unused Levels

Sometimes you filter data and end up with empty categories:

# All animal types
animals <- factor(
  c("dog", "cat", "bird"),
  levels = c("dog", "cat", "bird", "fish")
)

# Keep only dogs
dogs <- animals[animals == "dog"]

levels(dogs)
# [1] "dog" "cat" "bird" "fish"
# 🤔 Fish still shows up even though we have none!

# Drop unused levels
dogs <- droplevels(dogs)

levels(dogs)
# [1] "dog"
# ✅ Clean!
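
If you already know you want the clean version, you can drop unused levels in the same step as the filter. This sketch uses the drop = TRUE argument of factor subsetting; dogs_only and present are names made up for the example.

```r
animals <- factor(
  c("dog", "cat", "bird"),
  levels = c("dog", "cat", "bird", "fish")
)

# Filter and drop unused levels in one step
dogs_only <- animals[animals == "dog", drop = TRUE]
levels(dogs_only)   # "dog"

# droplevels() on the full factor removes only the truly unused level
present <- droplevels(animals)
levels(present)     # "dog" "cat" "bird"
```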

Reordering Levels

Change the order of existing levels:

days <- factor(c("Mon", "Wed", "Fri"))
levels(days)
# [1] "Fri" "Mon" "Wed" (alphabetical 😕)

# Reorder properly
days <- factor(days,
  levels = c("Mon", "Wed", "Fri")
)

levels(days)
# [1] "Mon" "Wed" "Fri" ✅
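
A common slip is to reorder with levels(days) <- ..., which RENAMES the levels instead of reordering them. The sketch below shows the difference, and also shows relevel(), a base R helper that moves a single level to the front. The names renamed, reordered, and fri_first are just for illustration.

```r
days <- factor(c("Mon", "Wed", "Fri"))   # levels: "Fri" "Mon" "Wed"

# Pitfall: assigning to levels() renames by position and scrambles the data
renamed <- days
levels(renamed) <- c("Mon", "Wed", "Fri")
as.character(renamed)     # "Wed" "Fri" "Mon"  (the values changed!)

# Correct: rebuild the factor with the levels in the order you want
reordered <- factor(days, levels = c("Mon", "Wed", "Fri"))
as.character(reordered)   # "Mon" "Wed" "Fri"  (the values are untouched)

# relevel() moves one chosen level to the front
fri_first <- relevel(reordered, ref = "Fri")
levels(fri_first)         # "Fri" "Mon" "Wed"
```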

Adding New Levels

Need to add categories that don’t exist yet?

fruits <- factor(c("apple", "banana"))
levels(fruits)
# [1] "apple" "banana"

# Add new levels
levels(fruits) <- c(levels(fruits), "cherry", "mango")

levels(fruits)
# [1] "apple" "banana" "cherry" "mango"
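
Why bother adding levels? Because a factor refuses values it does not know. This sketch (with made-up names strict and fruits_open) compares assigning "cherry" before and after the level exists.

```r
# Without the level: R stores NA and gives a warning
strict <- factor(c("apple", "banana"))
suppressWarnings(strict[2] <- "cherry")   # warning hidden only to keep the output tidy
print(strict)                             # second value is now NA

# With the level added first: the assignment works
fruits_open <- factor(c("apple", "banana"))
levels(fruits_open) <- c(levels(fruits_open), "cherry", "mango")
fruits_open[2] <- "cherry"
print(fruits_open)
```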

Combining Factors

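Merge two factors by converting each one to text first and then building a new factor, so the labels (not the internal integer codes) are what get combined: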

group1 <- factor(c("A", "B"))
group2 <- factor(c("C", "A"))

# Combine them
combined <- factor(c(
  as.character(group1),
  as.character(group2)
))

print(combined)
# [1] A B C A
# Levels: A B C
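
Rebuilding from text puts the combined levels back in alphabetical order. If your factors have a custom level order you want to keep, one option is to pass the union of both level sets explicitly. This is a sketch with hypothetical data; all_levels and combined_keep are names invented here.

```r
# group1 has a custom level order ("B" before "A")
group1 <- factor(c("B", "A"), levels = c("B", "A"))
group2 <- factor(c("C", "A"))

# Keep group1's order, then append any new levels from group2
all_levels <- union(levels(group1), levels(group2))   # "B" "A" "C"

combined_keep <- factor(
  c(as.character(group1), as.character(group2)),
  levels = all_levels
)

print(combined_keep)
```

Newer versions of R (4.1.0 onward) can also combine factors directly with c(group1, group2); the explicit version above works on older versions too.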

🧠 Quick Reference

graph TD
  A[Raw Data] --> B{factor function}
  B --> C[Factor Created]
  C --> D[Levels: All Categories]
  C --> E[Values: Your Data]
  D --> F[Can be Ordered]
  D --> G[Can be Renamed]
  D --> H[Can be Dropped]

Task           Function         Example
Create         factor()         factor(x)
See levels     levels()         levels(x)
Count levels   nlevels()        nlevels(x)
Make ordered   ordered = TRUE   factor(x, ordered = TRUE)
Drop unused    droplevels()     droplevels(x)
Convert        as.factor()      as.factor(x)

🎉 You Did It!

You now understand Factors — R’s special way to handle categories!

Remember:

  • Factors store categories (not just text)
  • Levels are all possible categories
  • You can order levels to mean something
  • You can manipulate levels to fit your needs

Factors might seem tricky at first, but they’re incredibly powerful for data analysis. Every time you see survey responses, product categories, or ratings — think FACTORS! 🏷️
