🧵 R Strings & Patterns: Weaving Words Like Magic!
Imagine you’re a word wizard with a magical loom. Every string is a colorful thread, and R gives you the tools to cut, stretch, twist, and weave these threads into beautiful patterns!
🎯 What You’ll Master
Think of strings as friendship bracelets made of letters. Today, you’ll learn to:
- Tie strings together (concatenation)
- Measure and snip them (length & extraction)
- Dress them up nice (formatting)
- Change their voice (case conversion)
- Cut and swap pieces (splitting & replacement)
- Find hidden patterns (pattern matching)
- Create search spells (regular expressions)
1. 🔗 String Concatenation: Tying Strings Together
What Is It?
Concatenation means gluing words together — like linking train cars!
The Magic Spells
paste() — The Friendly Glue
# Joins with a space by default
paste("Hello", "World")
# Result: "Hello World"
# Use sep to change the glue
paste("2024", "12", "25", sep = "-")
# Result: "2024-12-25"
paste0() — The Super Glue (No Gaps!)
paste0("Super", "Hero")
# Result: "SuperHero"
paste0("R", "ocks!")
# Result: "Rocks!"
cat() — The Printer
cat("I", "love", "R!", "\n")
# Prints: I love R!
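One detail that trips up new wizards: cat() prints text to the screen but does not hand it back as a value, while paste() returns a string you can store. A minimal sketch of the difference (the variable names `greeting` and `printed` are made up for illustration):

```r
# paste() returns a string that can be stored and reused
greeting <- paste("I", "love", "R!")
nchar(greeting)
# 9

# cat() only prints; the value it returns is NULL
printed <- cat("I", "love", "R!\n")
is.null(printed)
# TRUE

# cat() also accepts sep, just like paste()
cat("2024", "12", "25", sep = "-")
cat("\n")
```

So reach for paste() when you need the result later, and cat() when you only want to show it.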
🎨 Real-Life Example
name <- "Luna"
age <- 8
paste(name, "is", age, "years old!")
# "Luna is 8 years old!"
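The glue also works on whole vectors, not just single words. Here is a small sketch of element-by-element joining and of the collapse argument, which squeezes a vector into one string (the `pets` data is invented for the example):

```r
pets <- c("cat", "dog", "owl")

# paste0() works element by element on vectors
labels <- paste0("pet_", 1:3)
# "pet_1" "pet_2" "pet_3"

# collapse squeezes a whole vector into one string
one_line <- paste(pets, collapse = ", ")
# "cat, dog, owl"

# sep and collapse can be combined
pairs <- paste(labels, pets, sep = "=", collapse = "; ")
# "pet_1=cat; pet_2=dog; pet_3=owl"
```

Think of sep as the glue between arguments and collapse as the glue between the elements of the result.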
2. 📏 String Length and Extraction: Measuring & Snipping
Finding Length with nchar()
How many characters are in your word? (Spaces and punctuation count too.)
nchar("butterfly")
# Result: 9
nchar("R")
# Result: 1
Extracting Pieces with substr()
Cut out a piece like cutting a ribbon!
# substr(text, start, stop)
substr("RAINBOW", 1, 3)
# Result: "RAI"
substr("RAINBOW", 5, 7)
# Result: "BOW"
🎨 Real-Life Example
phone <- "555-123-4567"
area_code <- substr(phone, 1, 3)
# area_code = "555"
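Measuring and snipping work well together. The sketch below uses nchar() to count back from the end of a string, and shows that both functions accept whole vectors (the `words` and `masked` variables are just illustrative):

```r
phone <- "555-123-4567"

# Last four characters: count back from the end with nchar()
n <- nchar(phone)
last_four <- substr(phone, n - 3, n)
# "4567"

# nchar() and substr() both work on whole vectors
words <- c("butterfly", "R", "rainbow")
lengths <- nchar(words)
# 9 1 7
starts <- substr(words, 1, 2)
# "bu" "R" "ra"

# substr() can also overwrite part of a copy of the string
masked <- phone
substr(masked, 5, 7) <- "***"
# masked is now "555-***-4567"
```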
3. ✨ String Formatting: Dressing Up Your Words
sprintf() — The Fashion Designer
Make your strings look fancy with placeholders!
| Placeholder | Meaning | Example |
|---|---|---|
| `%s` | String | “hello” |
| `%d` | Integer | 42 |
| `%f` | Decimal (6 decimal places by default) | 3.140000 |
| `%.2f` | 2 decimals | 3.14 |
sprintf("I have %d apples", 5)
# "I have 5 apples"
sprintf("Pi is %.2f", 3.14159)
# "Pi is 3.14"
sprintf("%s scored %d points!", "Max", 100)
# "Max scored 100 points!"
format() — The Organizer
format(12345.6, big.mark = ",")
# "12,345.6"
format(7, width = 3, justify = "right")
# "  7" (padded to width 3; numbers are always right-aligned, justify only affects text)
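Placeholders can also carry a width, which is handy for lining things up. A short sketch of zero-padding, left-alignment, and formatting a whole vector at once (the `players` and `scores` values are invented):

```r
# Pad a number with leading zeros to width 5
padded <- sprintf("%05d", 42L)
# "00042"

# Left-align text in a 6-character column
cell <- sprintf("%-6s|", "hi")
# "hi    |"

# sprintf() is vectorized: one result per element
players <- c("Max", "Luna")
scores <- c(91.256, 78.5)
report <- sprintf("%s: %.1f%%", players, scores)
# "Max: 91.3%" "Luna: 78.5%"
```

Note the doubled `%%`, which is how you print a literal percent sign.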
4. 🔄 String Case Conversion: Changing the Voice
The Volume Controls
Think of these as whisper and SHOUT buttons!
# SHOUT! (all uppercase)
toupper("hello world")
# "HELLO WORLD"
# whisper (all lowercase)
tolower("HELLO WORLD")
# "hello world"
Using tools::toTitleCase()
Perfect for Book Titles
tools::toTitleCase("the little prince")
# "The Little Prince"
🎨 Real-Life Example
user_input <- "JoHn SmItH"
clean_name <- tools::toTitleCase(
tolower(user_input)
)
# "John Smith"
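Case conversion is also the usual trick for comparing text without caring how it was typed. The sketch below shows that, plus a tiny helper that capitalizes only the first letter; `cap_first` is a hypothetical name, not a base R function:

```r
# Compare two strings while ignoring case
same <- tolower("R Rocks") == tolower("r rocks")
# TRUE

# Capitalize only the first letter (hypothetical helper, not part of base R)
cap_first <- function(x) {
  paste0(toupper(substr(x, 1, 1)), substr(x, 2, nchar(x)))
}
capped <- cap_first(c("hello world", "r"))
# "Hello world" "R"
```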
5. ✂️ String Splitting & Replacement: Cut and Swap
strsplit() — The Scissors
Chop a string into pieces!
strsplit("apple,banana,cherry", ",")
# Returns list: "apple" "banana" "cherry"
strsplit("Hello World", " ")
# Returns list: "Hello" "World"
gsub() — Find and Replace All
Swap every match!
gsub("cat", "dog", "I love my cat cat")
# "I love my dog dog"
sub() — Replace First Match Only
sub("cat", "dog", "cat cat cat")
# "dog cat cat"
🎨 Real-Life Example
messy <- "hello...world...R"
clean <- gsub("\\.\\.\\.", " ", messy)
# "hello world R"
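Two things are easy to miss here: strsplit() hands back a list, and the pattern is treated as a regular expression unless you say otherwise. A minimal sketch of both points, using fixed = TRUE to match plain text (the `dates` example is made up):

```r
# strsplit() returns a list, so use [[1]] to get the character vector
parts <- strsplit("apple,banana,cherry", ",")[[1]]
parts[2]
# "banana"

# The pattern is a regular expression by default, so "." would match
# every character. fixed = TRUE treats the pattern as plain text.
letters_abc <- strsplit("a.b.c", ".", fixed = TRUE)[[1]]
# "a" "b" "c"

clean <- gsub("...", " ", "hello...world...R", fixed = TRUE)
# "hello world R"

# Splitting a vector gives one list element per input string
dates <- strsplit(c("2024-12-25", "2025-01-01"), "-")
dates[[2]][1]
# "2025"
```

With fixed = TRUE there is no need to escape the dots, which keeps the pattern readable.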
6. 🔍 Pattern Matching Functions: The Detective Tools
grep() — Find Which Items Match
Returns positions of matches
fruits <- c("apple", "banana", "apricot", "cherry")
grep("ap", fruits)
# Result: 1 3 (positions of apple, apricot)
grepl() — True or False Detective
Returns TRUE/FALSE for each item
grepl("ap", fruits)
# TRUE FALSE TRUE FALSE
regexpr() — Where’s the Match?
Tells you exactly where the pattern starts
regexpr("an", "banana")
# Returns 2 (match starts at position 2)
🔎 Comparison Chart
graph TD
  A[Pattern Matching] --> B[grep]
  A --> C[grepl]
  A --> D[regexpr]
  B --> E[Returns: Positions]
  C --> F[Returns: TRUE/FALSE]
  D --> G[Returns: Match Location]
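The three detectives have a few extra tricks worth trying. This sketch shows grep() returning the matching strings themselves, grepl() used as a filter, and what regexpr() and its sibling gregexpr() report about match positions:

```r
fruits <- c("apple", "banana", "apricot", "cherry")

# value = TRUE returns the matching strings instead of their positions
ap_fruits <- grep("ap", fruits, value = TRUE)
# "apple" "apricot"

# grepl() results work as a logical filter
others <- fruits[!grepl("ap", fruits)]
# "banana" "cherry"

# regexpr() also records how long the match is
m <- regexpr("an", "banana")
attr(m, "match.length")
# 2

# gregexpr() finds every match, not just the first
all_hits <- as.integer(gregexpr("an", "banana")[[1]])
# 2 4

# regexpr() gives -1 when nothing matches
no_hit <- as.integer(regexpr("xyz", "banana"))
# -1
```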
7. 🧙‍♂️ Regular Expression Basics: The Magic Spells
What Are Regular Expressions?
Regex = Super-powered search patterns!
Think of them as search wildcards on steroids.
The Essential Magic Symbols
| Symbol | Meaning | Example |
|---|---|---|
| `.` | Any single character | `c.t` matches “cat”, “cot” |
| `*` | Zero or more | `ab*c` matches “ac”, “abc”, “abbc” |
| `+` | One or more | `ab+c` matches “abc”, “abbc” |
| `?` | Zero or one | `colou?r` matches “color”, “colour” |
| `^` | Start of string | `^Hello` matches “Hello world” |
| `$` | End of string | `world$` matches “Hello world” |
| `[abc]` | Any char in set | `[aeiou]` matches vowels |
| `[0-9]` | Digit range | Matches any digit |
| `\\d` | Any digit | Same as `[0-9]` |
| `\\w` | Word character | Letters, digits, underscore |
| `\\s` | Whitespace | Space, tab, newline |
Examples in Action
# Find strings starting with "A"
grep("^A", c("Apple", "Banana", "Avocado"))
# Result: 1 3
# Find strings ending with numbers
grepl("[0-9]$", c("Room101", "Kitchen", "Floor2"))
# TRUE FALSE TRUE
# Match email-like patterns
email <- "test@email.com"
grepl("\\w+@\\w+\\.\\w+", email)
# TRUE
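To see the quantifier symbols from the table at work, here is a small sketch that tries each one against a few test words (all inputs are invented for illustration):

```r
# ? makes the "u" optional
q_mark <- grepl("^colou?r$", c("color", "colour", "colouur"))
# TRUE TRUE FALSE

# * allows zero or more "b"; + needs at least one
star <- grepl("ab*c", c("ac", "abc", "adc"))
# TRUE TRUE FALSE
plus <- grepl("ab+c", c("ac", "abc", "abbc"))
# FALSE TRUE TRUE

# Without anchors a pattern can match anywhere in the string
vowel <- grepl("[aeiou]", c("sky", "sea"))
# FALSE TRUE

# \\s matches whitespace, so this collapses runs of spaces
tidy <- gsub("\\s+", " ", "too    many   spaces")
# "too many spaces"
```

Adding `^` and `$` around a pattern, as in the first example, forces the whole string to match rather than just a piece of it.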
🎨 Real-Life Example
# Find which texts contain a phone number
texts <- c("Call 555-1234", "No number", "Dial 123-5678")
grep("\\d{3}-\\d{4}", texts)
# Result: 1 3
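grep() only tells you where the numbers are. To pull out the matched text itself, one option in base R is to pair regexpr() or gregexpr() with regmatches(). A minimal sketch (the `both` string is made up):

```r
texts <- c("Call 555-1234", "No number", "Dial 123-5678")
pattern <- "\\d{3}-\\d{4}"

# regexpr() locates the first match in each string;
# regmatches() pulls out the matched text and skips non-matches
m <- regexpr(pattern, texts)
numbers <- regmatches(texts, m)
# "555-1234" "123-5678"

# gregexpr() handles strings with more than one number
both <- "Home 555-1234, work 555-9876"
found <- regmatches(both, gregexpr(pattern, both))[[1]]
# "555-1234" "555-9876"
```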
🗺️ The Complete String Journey
graph TD
  A[Raw String] --> B[Concatenate]
  B --> C[Format/Style]
  C --> D[Change Case]
  D --> E[Split if Needed]
  E --> F[Search Patterns]
  F --> G[Replace/Extract]
  G --> H[Clean Result!]
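To make the journey concrete, here is one possible walk through the stages on a small made-up string. The order loosely follows the chart rather than matching it step for step, and the input format ("name:count" pairs) is purely an assumption for the example:

```r
raw <- "apple:3,BANANA:12,cherry:7"   # made-up input for illustration

# Split into one item per fruit, then normalize the case
items <- tolower(strsplit(raw, ",", fixed = TRUE)[[1]])

# Replace: strip everything after (or before) the colon
fruit <- sub(":.*$", "", items)
count <- as.integer(sub("^.*:", "", items))

# Search: which fruit names contain "an"?
has_an <- grepl("an", fruit)
# FALSE TRUE FALSE

# Format and concatenate into a clean result
rows <- sprintf("%-8s %3d", fruit, count)
report <- paste(rows, collapse = "\n")
cat(report, "\n")
```

Each line uses only functions covered above, so you can swap in your own data and patterns.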
🎁 Quick Reference Summary
| Task | Function | Example |
|---|---|---|
| Join strings | `paste()` | `paste("Hi", "there")` |
| Join without spaces | `paste0()` | `paste0("Super", "man")` |
| Count characters | `nchar()` | `nchar("hello")` → 5 |
| Extract piece | `substr()` | `substr("hello", 1, 2)` → “he” |
| Format numbers | `sprintf()` | `sprintf("%d", 42)` |
| UPPERCASE | `toupper()` | `toupper("hi")` → “HI” |
| lowercase | `tolower()` | `tolower("HI")` → “hi” |
| Split string | `strsplit()` | `strsplit("a,b", ",")` |
| Replace all | `gsub()` | `gsub("a", "b", "aaa")` → “bbb” |
| Find matches | `grep()` | Returns positions |
| Test pattern | `grepl()` | Returns TRUE/FALSE |
🚀 You Did It!
You’ve just learned to:
- ✅ Glue strings together with `paste()` and `paste0()`
- ✅ Measure and cut with `nchar()` and `substr()`
- ✅ Format beautifully with `sprintf()`
- ✅ Change case with `toupper()` and `tolower()`
- ✅ Split and replace with `strsplit()`, `gsub()`, `sub()`
- ✅ Hunt patterns with `grep()`, `grepl()`, `regexpr()`
- ✅ Cast regex spells with special symbols!
You’re now a String Wizard! 🧙‍♂️✨