Statistical Plots: Distribution Plots
Seeing How Your Data Spreads Out, Like Watching Water Flow!
The Big Picture: What Are Distribution Plots?
Imagine you have a jar full of colorful marbles. Some colors appear more often than others. Distribution plots are like magical X-ray glasses that show you:
- Which values appear most often (the popular kids!)
- Which values are rare (the shy ones hiding in corners)
- How your data spreads out from low to high
Think of it like watching water flow down a hill: distribution plots show you where the water pools (common values) and where it barely trickles (rare values).
Density Plot: The Smooth Mountain of Data
What Is a Density Plot?
A density plot is like drawing a smooth hill over your data. Instead of showing choppy bars like a histogram, it creates a beautiful flowing curve.
Simple Analogy:
- Histogram = Stacking blocks (chunky, blocky)
- Density Plot = Drawing a smooth blanket over those blocks (flowy, pretty!)
Why Use It?
- Shows the shape of your data clearly
- Great for comparing multiple groups
- No need to pick "bin sizes" like histograms (a smoothing bandwidth is used instead, and gaussian_kde picks a default for you)
How to Create One
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
# Your data (test scores)
scores = [72, 85, 78, 92, 88, 75, 95, 82, 79, 91]
# Create the smooth curve
density = stats.gaussian_kde(scores)
x_range = np.linspace(60, 100, 200)
plt.fill_between(x_range, density(x_range), alpha=0.5)
plt.plot(x_range, density(x_range), linewidth=2)
plt.xlabel('Test Scores')
plt.ylabel('Density')
plt.title('Score Distribution')
plt.show()
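How smooth the curve looks depends on the bandwidth that gaussian_kde uses. As a rough sketch (the 0.2 value and the variable names here are illustrative choices, not part of the example above), comparing the default bandwidth with a narrower one might look like this:

```python
import numpy as np
from scipy import stats

# Same test scores as in the example above
scores = [72, 85, 78, 92, 88, 75, 95, 82, 79, 91]

# Default bandwidth: SciPy applies Scott's rule when bw_method is not given
kde_default = stats.gaussian_kde(scores)

# Narrower bandwidth (0.2 is an arbitrary illustrative value): a bumpier curve
kde_narrow = stats.gaussian_kde(scores, bw_method=0.2)

# Evaluate both curves on the same grid so they can be drawn together
x_range = np.linspace(60, 100, 200)
curve_default = kde_default(x_range)
curve_narrow = kde_narrow(x_range)

# The area under each curve stays at (almost exactly) 1, whatever the bandwidth
area_default = kde_default.integrate_box_1d(0, 200)
area_narrow = kde_narrow.integrate_box_1d(0, 200)

print(f"bandwidth factors: {kde_default.factor:.3f} vs {kde_narrow.factor:.3f}")
print(f"areas under the curves: {area_default:.3f} and {area_narrow:.3f}")
```

Passing curve_default and curve_narrow to plt.plot(x_range, ...) draws both curves on one chart, which makes the difference in smoothness easy to see.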
Key Points
| Feature | What It Means |
|---|---|
| Peak | Most common value |
| Width | How spread out data is |
| Tail | Rare extreme values |
Pro Tip
Higher peaks = more data points clustered there. Think of it as "more people standing on that spot!"
ECDF Plot: The Climbing Staircase
What Is ECDF?
ECDF stands for Empirical Cumulative Distribution Function. Don't let the fancy name scare you!
Simple Explanation: It's like counting how many people are shorter than you as you walk through a line sorted by height.
The Staircase Analogy
Imagine everyone in your class of 10 stands in a line from shortest to tallest. As you walk past each person:
- 1 person passed = 10% done
- 5 people passed = 50% done
- Everyone passed = 100% done
The ECDF shows this as a climbing staircase!
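The counting in this analogy fits in a few lines. Here is a minimal sketch, assuming a small helper named ecdf (a name made up for illustration) and the ten student heights used in the plotting example:

```python
import numpy as np

def ecdf(data):
    """Return the sorted values and the proportion of data at or below each one."""
    x = np.sort(np.asarray(data))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y

# Heights of 10 students (in cm), as in the plotting example
heights = [150, 155, 162, 165, 168, 170, 172, 178, 180, 185]
x, y = ecdf(heights)

# Walk down the line: each person passed adds one more step of 1/10
for height, proportion in zip(x, y):
    print(f"passed the {height} cm student -> {proportion:.0%} done")
```

Each printed line is one step of the staircase: the height is where the step sits on the x-axis, and the percentage is how high the staircase has climbed.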
How to Create One
import matplotlib.pyplot as plt
import numpy as np
# Heights of 10 students (in cm)
heights = [150, 155, 162, 165, 168, 170, 172, 178, 180, 185]
# Sort the data
sorted_heights = np.sort(heights)
# Create y-values (proportions)
y = np.arange(1, len(heights) + 1) / len(heights)
plt.step(sorted_heights, y, where='post', linewidth=2)
plt.xlabel('Height (cm)')
plt.ylabel('Proportion of Students')
plt.title('ECDF of Student Heights')
plt.grid(True, alpha=0.3)
plt.show()
Reading the ECDF
| Y-Value | Meaning |
|---|---|
| 0.5 | 50% of the data is at or below this X value |
| 0.9 | 90% of the data is at or below this X value |
| 1.0 | You've seen ALL the data |
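You can also read the ECDF in the other direction: pick an x value and ask what proportion of the data sits at or below it. A small sketch, assuming a hypothetical helper called proportion_at_or_below and the same ten heights:

```python
import numpy as np

# Heights of 10 students (in cm), as in the plotting example
heights = [150, 155, 162, 165, 168, 170, 172, 178, 180, 185]
sorted_heights = np.sort(heights)

def proportion_at_or_below(value):
    """Height of the ECDF staircase at the given x value."""
    # side='right' counts every data point that is <= value
    count = np.searchsorted(sorted_heights, value, side='right')
    return count / len(sorted_heights)

for value in (149, 170, 185):
    print(f"{value} cm -> {proportion_at_or_below(value):.0%} of students")
```

This is the same number you would get by finding the x value on the chart and tracing up to the staircase.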
Magic Trick
To find the median (middle value), look where the staircase first reaches y = 0.5 and trace down to the x-axis! With an even number of data points, this lands on the lower of the two middle values, while the usual median averages the two.
Start at y = 0.5 → Draw a line across → Find where it hits the ECDF → Drop down to the x-axis → That's your median!
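The same trick can be done with numbers instead of a ruler. This is only a sketch, reusing the ten heights from the example above; the variable names are illustrative:

```python
import numpy as np

# Heights of 10 students (in cm), as in the plotting example
heights = [150, 155, 162, 165, 168, 170, 172, 178, 180, 185]
sorted_heights = np.sort(heights)
y = np.arange(1, len(sorted_heights) + 1) / len(sorted_heights)

# First position where the staircase reaches y = 0.5
index = np.searchsorted(y, 0.5, side='left')
ecdf_median = sorted_heights[index]

print("Median read from the ECDF:", ecdf_median)
print("Median from np.median:", np.median(heights))
```

With ten values the two answers differ slightly: the staircase reaches 0.5 at the fifth height, while np.median averages the fifth and sixth. With an odd number of points both land on the same value.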
Pie Chart Basics: Slicing the Pizza!
What Is a Pie Chart?
A pie chart is exactly what it sounds like: a circle divided into slices, just like a pizza!
Each slice shows what fraction of the whole each category represents.
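To make "fraction of the whole" concrete, here is a small sketch that works out each slice by hand, using the fruit votes from the example in this guide. The pie function does this arithmetic for you, so this is just to show what happens behind the scenes:

```python
# Favorite fruits in class (same numbers as the pie chart example)
fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]

total = sum(votes)

# Each slice's share of the whole, and the angle it takes up in the circle
fractions = [count / total for count in votes]
angles = [fraction * 360 for fraction in fractions]

for fruit, fraction, angle in zip(fruits, fractions, angles):
    print(f"{fruit}: {fraction:.0%} of the pie, about {angle:.0f} degrees")
```

The fractions always add up to 1 (100%), and the angles always add up to a full circle of 360 degrees.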
When to Use Pie Charts
Good for:
- Showing parts of a whole
- When you have 2-5 categories
- Percentages that add up to 100%
Bad for:
- Many categories (too many tiny slices!)
- Comparing precise values
- Data that doesn't represent "parts of a whole"
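If you are stuck with too many categories, one common workaround is to keep the biggest slices and merge the rest into a single "Other" slice. The helper below is a hypothetical sketch (the function name, the max_slices default, and the survey numbers are all made up for illustration):

```python
def group_small_slices(labels, values, max_slices=5):
    """Keep the largest slices and merge the remainder into one 'Other' slice."""
    # Sort categories from biggest to smallest
    pairs = sorted(zip(labels, values), key=lambda pair: pair[1], reverse=True)
    if len(pairs) <= max_slices:
        return [label for label, _ in pairs], [value for _, value in pairs]

    # Leave room for the 'Other' slice within max_slices
    keep = pairs[:max_slices - 1]
    rest = pairs[max_slices - 1:]
    new_labels = [label for label, _ in keep] + ['Other']
    new_values = [value for _, value in keep] + [sum(value for _, value in rest)]
    return new_labels, new_values

# Made-up survey data with seven categories
labels = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
values = [30, 25, 15, 10, 8, 7, 5]

grouped_labels, grouped_values = group_small_slices(labels, values)
print(grouped_labels)
print(grouped_values)
```

The grouped lists can be passed straight to plt.pie(grouped_values, labels=grouped_labels). The total stays the same, so the slices still add up to 100%.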
How to Create One
import matplotlib.pyplot as plt
# Favorite fruits in class
fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]
plt.pie(votes, labels=fruits)
plt.title('Favorite Fruits in Class')
plt.show()
The Basic Recipe
| Ingredient | What It Does |
|---|---|
| Data values | Size of each slice |
| Labels | Name of each slice |
| Colors | Auto-assigned (or custom) |
Pie Chart Customization: Make It Beautiful!
Now let's turn that plain pie into a masterpiece!
Adding Percentages
Show exactly how big each slice is:
import matplotlib.pyplot as plt
fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]
plt.pie(votes, labels=fruits, autopct='%1.1f%%')
plt.title('Favorite Fruits')
plt.show()
The autopct='%1.1f%%' format string labels each slice with its percentage to 1 decimal place. The doubled %% at the end prints a literal % sign.
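Besides a format string, autopct also accepts a function that receives each slice's percentage and returns the label text. The sketch below shows what the format string produces, plus a hypothetical helper (percent_and_count is a made-up name) that adds the raw vote count to each label:

```python
fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]

# What the '%1.1f%%' format string does to a percentage of 40
print('%1.1f%%' % 40.0)

def percent_and_count(values):
    """Hypothetical helper: build an autopct function showing percent and count."""
    total = sum(values)

    def formatter(pct):
        # Convert the percentage back into the original count
        count = int(round(pct * total / 100))
        return f"{pct:.1f}% ({count})"

    return formatter

label_for = percent_and_count(votes)
print(label_for(40.0))
```

To use it in a chart, pass the function instead of the string: plt.pie(votes, labels=fruits, autopct=percent_and_count(votes)).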
Exploding a Slice
Want to highlight the winner? Make it "pop out":
import matplotlib.pyplot as plt
fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]
explode = [0.1, 0, 0, 0] # Push Apples out
plt.pie(votes, labels=fruits, explode=explode,
        autopct='%1.1f%%')
plt.title('Apples Win!')
plt.show()
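Typing the explode list by hand works for four slices, but you can also build it from the data so the winner always pops out. A minimal sketch, assuming the same 0.1 offset as above:

```python
fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]

# Push out whichever slice has the most votes (ties would all pop out)
biggest = max(votes)
explode = [0.1 if count == biggest else 0 for count in votes]

print(explode)
```

The resulting list can be passed to plt.pie exactly like the hand-written one.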
Custom Colors
Pick your own color palette:
import matplotlib.pyplot as plt
fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]
colors = ['#ff6b6b', '#ffd93d', '#ff9f43', '#6c5ce7']
plt.pie(votes, labels=fruits, colors=colors,
        autopct='%1.1f%%', shadow=True)
plt.title('Colorful Fruit Chart')
plt.show()
Adding a Shadow
The shadow=True parameter adds a subtle 3D effect!
Customization Cheat Table
| Parameter | What It Does | Example |
|---|---|---|
| autopct | Show percentages | '%1.1f%%' |
| explode | Pop out slices | [0.1, 0, 0, 0] |
| colors | Custom colors | ['red', 'blue'] |
| shadow | 3D shadow effect | True |
| startangle | Rotate the pie | 90 |
Start Angle Magic
Change where the first slice begins:
# Start from the top (12 o'clock position)
plt.pie(votes, labels=fruits, startangle=90)
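By default the first slice starts at the 3 o'clock position and the slices run counterclockwise. The sketch below puts the default next to a rotated version. Adding counterclock=False (a real pie parameter, but not used elsewhere in this guide) makes the slices run clockwise from the top, like a clock face:

```python
import matplotlib.pyplot as plt

fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]

fig, (ax_default, ax_rotated) = plt.subplots(1, 2, figsize=(8, 4))

# Default: first slice starts at 0 degrees (3 o'clock), drawn counterclockwise
wedges_default, _ = ax_default.pie(votes, labels=fruits)
ax_default.set_title('Default start')

# Start at 90 degrees (12 o'clock) and go clockwise
wedges_rotated, _ = ax_rotated.pie(votes, labels=fruits,
                                   startangle=90, counterclock=False)
ax_rotated.set_title('startangle=90, clockwise')

# Each wedge stores its start and end angle in degrees
first = wedges_default[0]
print("Default Apples slice spans", first.theta1, "to", first.theta2)
```

Call plt.show() afterwards to display the two pies side by side.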
Basic Pie → Add Labels → Add Percentages → Add Colors → Add Explode → Add Shadow → Beautiful Chart!
Quick Summary
| Plot Type | Best For | Think Of It As |
|---|---|---|
| Density Plot | Smooth data shape | A blanket over your data |
| ECDF Plot | Cumulative view | Climbing stairs |
| Pie Chart | Parts of whole | Pizza slices |
| Customized Pie | Highlighted insights | Fancy decorated pizza |
You Did It!
You now know four powerful ways to visualize how data is distributed:
- Density plots show smooth curves of where data clusters
- ECDF plots show cumulative stepping stones through your data
- Pie charts slice your data like pizza
- Customization makes your charts pop and communicate clearly
Remember: The best visualization is the one that tells your story clearly. Now go make some beautiful charts!
