What are quantiles in R?

Quantiles are signposts showing where values stand in ordered data. The 25th percentile, for example, is the value with about 25% of the data at or below it.

What does correlation measure in R?

Correlation measures whether two variables move together in a straight-line pattern. Values range from -1 (opposite directions) to +1 (same direction), with 0 meaning no linear relationship.

What's the difference between correlation and covariance?

Correlation is always between -1 and +1, making it easy to interpret. Covariance is expressed in the product of the two variables' original units, so its size depends on those units and is harder to compare.

Distribution Analysis in R | Data Analysis Guide

📊 Distribution Analysis in R: Your Data Detective Toolkit

Imagine you’re a detective. Your job? To understand the story hidden inside numbers. Today, we’ll learn three super-powers that help us crack the case: Quantiles, Correlation, and Scaling!

🎯 The Big Picture

Think of your data like a classroom of kids standing in a line from shortest to tallest.

Quantiles help you find who’s in the middle, who’s really tall, and who’s really short.
Correlation tells you if tall kids also tend to have big feet (do two things go together?).
Scaling is like converting everyone’s height to the same measuring tape so we can compare fairly.

📏 Part 1: Quantile Functions — Finding the Landmarks

What Are Quantiles?

Imagine 100 kids standing in a line from shortest to tallest. Quantiles are like signposts telling you where you are in the line.

25th percentile (Q1): 25 kids are shorter than you
50th percentile (Median): You’re right in the middle!
75th percentile (Q3): 75 kids are shorter than you
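
To check the signpost idea in R, here is a minimal sketch that lines up 100 made-up heights and counts how many fall at or below each quartile. The `kid_heights` vector is an assumption for illustration, not data from this guide.

```r
# 100 hypothetical heights in cm, one per kid, already sorted
kid_heights <- seq(from = 101, to = 200, by = 1)

# The three quartile signposts (quantile()'s default method)
q <- quantile(kid_heights, probs = c(0.25, 0.50, 0.75))

# Share of kids at or below each signpost
share_below <- sapply(q, function(cutoff) mean(kid_heights <= cutoff))
print(share_below)
```

With this lineup the shares come out as 0.25, 0.50 and 0.75, matching the percentile names.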

🧪 R Example: Finding Quantiles

# Heights of 10 students (in cm)
heights <- c(120, 125, 130, 135, 140,
             145, 150, 155, 160, 165)

# Find the median (50th percentile)
median(heights)
# Result: 142.5

# Find Q1, Median, Q3
quantile(heights, c(0.25, 0.50, 0.75))
#    25%    50%    75%
# 131.25 142.50 153.75

📦 The quantile() Function

quantile(x, probs)

x = your data
probs = which percentiles you want (0 to 1)
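
Beyond x and probs, two optional arguments of quantile() come up often: na.rm drops missing values, and type picks one of nine interpolation rules (the default is 7). A short sketch, reusing the ten heights from the example above plus one hypothetical missing value:

```r
heights <- c(120, 125, 130, 135, 140,
             145, 150, 155, 160, 165)

# Default rule (type = 7)
q_default <- quantile(heights, probs = c(0.25, 0.75))

# type = 6 is a rule used by some other statistics packages
q_type6 <- quantile(heights, probs = c(0.25, 0.75), type = 6)

# With an NA present, quantile() stops with an error unless na.rm = TRUE
heights_na <- c(heights, NA)
q_no_na <- quantile(heights_na, probs = c(0.25, 0.75), na.rm = TRUE)

print(q_default)  # 131.25 and 153.75
print(q_type6)    # 128.75 and 156.25
```

Different rules give slightly different quartiles on small samples, which is why two tools can disagree on Q1 for the same data.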

🎁 Special Quantiles You’ll Use Often

Name	Percentile	R Code
Minimum	0%	`quantile(x, 0)`
Q1	25%	`quantile(x, 0.25)`
Median	50%	`quantile(x, 0.50)`
Q3	75%	`quantile(x, 0.75)`
Maximum	100%	`quantile(x, 1)`

🔍 Quick Summary with fivenum()

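fivenum(heights)
# [1] 120.0 130.0 142.5 155.0 165.0

This gives you Min, lower hinge, Median, upper hinge, and Max, all at once! The hinges are Tukey's version of Q1 and Q3, and on small samples they can differ slightly from what quantile() returns.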
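
fivenum() is not the only one-line summary. As a small sketch on the same ten heights, summary() adds the mean and reports quantile()'s quartiles, and IQR() gives the distance between Q1 and Q3:

```r
heights <- c(120, 125, 130, 135, 140,
             145, 150, 155, 160, 165)

five   <- fivenum(heights)  # min, lower hinge, median, upper hinge, max
summ   <- summary(heights)  # min, Q1, median, mean, Q3, max
spread <- IQR(heights)      # Q3 - Q1 using quantile()'s default rule

print(five)
print(summ)
print(spread)
```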

🔗 Part 2: Correlation and Covariance — Do Things Move Together?

The Story

Imagine two friends: Height and Shoe Size. When one friend gets bigger, does the other friend also get bigger? That’s what correlation measures!

Three Types of Relationships

graph TD
    A["Two Variables"] --> B["Positive Correlation"]
    A --> C["Negative Correlation"]
    A --> D["No Correlation"]
    B --> E["📈 Both go up together"]
    C --> F["📉 One up, other down"]
    D --> G["🎲 No pattern"]

🧪 R Example: Correlation

# Heights and shoe sizes of 5 people
height <- c(150, 160, 170, 180, 190)
shoe_size <- c(6, 7, 8, 9, 10)

# Calculate correlation
cor(height, shoe_size)
# Result: 1 (perfect positive!)

Understanding Correlation Values

Value	Meaning	Example
+1	Perfect positive	Temperature in °C ↔ Temperature in °F
0	No linear relationship	Shoe size ↔ Last digit of phone number
-1	Perfect negative	Money spent ↔ Money left from a fixed budget
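
A correlation of 0 only rules out a straight-line relationship. The sketch below uses made-up numbers (x, y_curve and y_exp are illustrative names) to show a perfect U-shaped pattern that cor() scores as 0, along with the method = "spearman" option, which scores agreement in rank order rather than straight-line fit.

```r
x <- c(-2, -1, 0, 1, 2)

# y depends on x exactly, but in a U shape rather than a line
y_curve <- x^2
r_curve <- cor(x, y_curve)

# y always rises with x, but not in a straight line
y_exp <- exp(x)
r_pearson  <- cor(x, y_exp)                       # below 1
r_spearman <- cor(x, y_exp, method = "spearman")  # 1: the ranks agree

print(c(r_curve, r_pearson, r_spearman))
```

So a value near 0 is a reason to plot the data, not proof that the two variables are unrelated.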

🎯 Covariance: Correlation’s Cousin

Covariance also measures whether things move together, but its value is in the product of the two variables' units (here, cm × shoe sizes), which makes it harder to interpret.

# Calculate covariance
cov(height, shoe_size)
# Result: 25

# Correlation is easier to understand!
# It's always between -1 and +1
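
To see why covariance is harder to read, here is a small sketch that reuses the five heights and shoe sizes and simply changes the height unit from centimetres to metres. The variable names are chosen for illustration.

```r
height_cm <- c(150, 160, 170, 180, 190)
shoe_size <- c(6, 7, 8, 9, 10)

# Same heights expressed in metres
height_m <- height_cm / 100

cov_cm <- cov(height_cm, shoe_size)  # 25
cov_m  <- cov(height_m, shoe_size)   # 0.25: shrinks with the unit

# Correlation is covariance divided by both standard deviations,
# so the units cancel out
r_manual <- cov_cm / (sd(height_cm) * sd(shoe_size))
r_m      <- cor(height_m, shoe_size)

print(c(cov_cm, cov_m, r_manual, r_m))
```

The covariance changes by a factor of 100 while the correlation stays at 1 in both units.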

When to Use Which?

Correlation (cor()): When you want to know how strong the relationship is, on a fixed scale from -1 to +1
Covariance (cov()): When you need the result in the data's own units, usually as a building block for other calculations (for example, the variance of a sum)

📊 Correlation Matrix: Many Variables at Once

# Three variables
age <- c(25, 30, 35, 40, 45)
income <- c(30, 45, 60, 75, 90)
savings <- c(5, 12, 20, 30, 42)

# Create a data frame
data <- data.frame(age, income, savings)

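# See all correlations at once (rounded to 2 decimals)
round(cor(data), 2)
#          age income savings
# age     1.00   1.00    0.99
# income  1.00   1.00    0.99
# savings 0.99   0.99    1.00
# income rises in a straight line with age; savings curves upward slightly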
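
Real tables often have gaps. As a hedged sketch, the example below copies the three variables into a data frame called survey (a name assumed here) with one income value set to NA, then uses cor()'s use argument to skip incomplete rows and picks out a single pair by name.

```r
age     <- c(25, 30, 35, 40, 45)
income  <- c(30, 45, 60, 75, NA)   # one hypothetical missing income
savings <- c(5, 12, 20, 30, 42)
survey  <- data.frame(age, income, savings)

# With the default settings, any pair involving income comes back NA
m_default <- cor(survey)

# Keep only rows where every column is present
m_complete <- cor(survey, use = "complete.obs")

# Pull out a single pair by row and column name
r_age_savings <- m_complete["age", "savings"]

print(round(m_complete, 2))
```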

⚖️ Part 3: Scaling Data — Making Fair Comparisons

The Problem

Imagine comparing:

A test score: 85 out of 100
A race time: 12 seconds
A weight: 50 kg

How do you compare these? They’re all in different units! Scaling puts everything on the same measuring stick.

Two Popular Scaling Methods

graph TD
    A["Scaling Methods"] --> B["Z-Score / Standardization"]
    A --> C["Min-Max Normalization"]
    B --> D["Mean = 0, SD = 1"]
    C --> E["Range 0 to 1"]

🧪 Z-Score Scaling with scale()

The Z-score tells you: “How many steps away from average are you?” Each step is one standard deviation.

# Test scores
scores <- c(70, 80, 90, 100, 110)

# Scale the data
scaled_scores <- scale(scores)
print(scaled_scores)
#            [,1]
# [1,] -1.2649111
# [2,] -0.6324555
# [3,]  0.0000000
# [4,]  0.6324555
# [5,]  1.2649111
# attr(,"scaled:center")
# [1] 90
# attr(,"scaled:scale")
# [1] 15.81139
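
The same numbers can be reproduced by hand, which shows what scale() is doing. A minimal sketch on the same five scores (z_manual and z_scale are names chosen for illustration):

```r
scores <- c(70, 80, 90, 100, 110)

# Z-score by hand: subtract the mean, divide by the standard deviation
z_manual <- (scores - mean(scores)) / sd(scores)

# scale() returns a one-column matrix; as.numeric() flattens it to a vector
z_scale <- as.numeric(scale(scores))

print(z_manual)
print(mean(z_manual))  # 0, up to floating-point rounding
print(sd(z_manual))    # 1, up to floating-point rounding
```

Flattening with as.numeric() is handy when you want to store the result as an ordinary column rather than a matrix.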

Understanding Z-Scores

Z-Score	What It Means
0	Exactly average
+1	One step above average
-1	One step below average
+2	Two steps above average (rare in bell-shaped data)
-2	Two steps below average (rare in bell-shaped data)

🎯 Min-Max Scaling: 0 to 1

This rescales all values to the range 0 to 1: the smallest value becomes 0 and the largest becomes 1.

# Custom min-max function
min_max <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Apply it
scores <- c(70, 80, 90, 100, 110)
min_max(scores)
# [1] 0.00 0.25 0.50 0.75 1.00
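
The simple formula divides by zero when every value is the same, and it returns all NA when any value is missing. Below is one possible guarded variant; min_max_safe() is a hypothetical helper, and returning 0 for constant input is a design choice rather than a standard.

```r
# Hypothetical variant of min_max() that tolerates NA and constant input
min_max_safe <- function(x) {
  rng  <- range(x, na.rm = TRUE)
  span <- rng[2] - rng[1]
  if (span == 0) {
    # All values equal: return 0 for every non-missing value
    return(ifelse(is.na(x), NA_real_, 0))
  }
  (x - rng[1]) / span
}

a <- min_max_safe(c(70, 80, 90, 100, 110))  # 0.00 0.25 0.50 0.75 1.00
b <- min_max_safe(c(5, 5, 5))               # 0 0 0
d <- min_max_safe(c(10, NA, 20))            # 0 NA 1
print(a)
```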

When to Use Each?

Method	Best For	Example
Z-Score	Comparing how unusual values are	“How far from average?”
Min-Max	When you need 0-1 range	Machine learning inputs
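
scale() also works on a whole data frame, standardizing each column separately. The sketch below uses invented results for four students (the results data frame and its values are assumptions) to put a test score, a race time and a weight on the same scale.

```r
# Hypothetical results for four students, each column in its own unit
results <- data.frame(
  test_score = c(85, 70, 95, 60),  # points out of 100
  race_time  = c(12, 15, 11, 14),  # seconds
  weight     = c(50, 62, 55, 48)   # kg
)

# scale() standardizes each column separately
z <- scale(results)

# Every column now has mean 0 and standard deviation 1
col_means <- colMeans(z)
col_sds   <- apply(z, 2, sd)

print(round(z, 2))
```

After scaling, a value of +1 means "one standard deviation above that column's average" in every column, whatever the original unit was.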

🎯 Quick Reference: All Functions

# QUANTILES
quantile(x, 0.5)    # Get median
quantile(x, c(0.25, 0.75))  # Get Q1 and Q3
fivenum(x)          # Min, lower hinge, Median, upper hinge, Max

# CORRELATION & COVARIANCE
cor(x, y)           # Correlation (-1 to +1)
cov(x, y)           # Covariance (original units)
cor(data_frame)     # Correlation matrix

# SCALING
scale(x)            # Z-score standardization
# Custom min-max: (x - min) / (max - min)

🏆 You Did It!

You just learned three powerful detective tools:

Quantiles: Find where values stand in the lineup
Correlation: Discover if things move together
Scaling: Put everything on the same measuring stick

Now you can analyze any dataset like a pro! 🎉

🧠 Remember This Story

A teacher wanted to understand her class better. She used quantiles to find who scored in the top 25%. She checked correlation to see if study hours predicted test scores. Finally, she used scaling to fairly compare math and art scores. The end!
