Distribution Plots

🎨 Statistical Plots: Distribution Plots

Seeing How Your Data Spreads Out, Like Watching Water Flow!


🌊 The Big Picture: What Are Distribution Plots?

Imagine you have a jar full of colorful marbles. Some colors appear more often than others. Distribution plots are like magical X-ray glasses that show you:

  • Which values appear most often (the popular kids!)
  • Which values are rare (the shy ones hiding in corners)
  • How your data spreads out from low to high

Think of it like watching water flow down a hill: distribution plots show you where the water pools (common values) and where it barely trickles (rare values).


🌈 Density Plot: The Smooth Mountain of Data

What Is a Density Plot?

A density plot is like drawing a smooth hill over your data. Instead of showing choppy bars like a histogram, it creates a beautiful flowing curve.

Simple Analogy:

  • Histogram = Stacking blocks (chunky, blocky)
  • Density Plot = Drawing a smooth blanket over those blocks (flowy, pretty!)

Why Use It?

  • Shows the shape of your data clearly
  • Great for comparing multiple groups
  • No bin sizes to pick like a histogram (the curve's smoothness depends on a bandwidth instead, which gaussian_kde chooses automatically)

🐍 How to Create One

import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

# Your data (test scores)
scores = [72, 85, 78, 92, 88, 75, 95, 82, 79, 91]

# Create the smooth curve
density = stats.gaussian_kde(scores)
x_range = np.linspace(60, 100, 200)

plt.fill_between(x_range, density(x_range), alpha=0.5)
plt.plot(x_range, density(x_range), linewidth=2)
plt.xlabel('Test Scores')
plt.ylabel('Density')
plt.title('Score Distribution')
plt.show()
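One knob the example above leaves on automatic is the bandwidth, which controls how smooth the curve is. Here is a minimal sketch, assuming the same scores and using gaussian_kde's bw_method argument. The value 0.2 is an arbitrary choice for illustration, not a recommendation.

```python
import numpy as np
from scipy import stats

scores = [72, 85, 78, 92, 88, 75, 95, 82, 79, 91]

# Default bandwidth (Scott's rule) versus a deliberately narrow one.
# The 0.2 factor is an illustrative assumption.
kde_default = stats.gaussian_kde(scores)
kde_narrow = stats.gaussian_kde(scores, bw_method=0.2)

x_range = np.linspace(60, 100, 200)
default_curve = kde_default(x_range)
narrow_curve = kde_narrow(x_range)

# The total area under a density curve is 1, whatever the bandwidth.
total_area = kde_default.integrate_box_1d(-np.inf, np.inf)
print(round(total_area, 3))

# Compare the tallest point of each curve.
print(default_curve.max(), narrow_curve.max())
```

With this data the narrow bandwidth gives a bumpier curve with a taller maximum, because each score's little hill is squeezed into less space. Either curve can be drawn with plt.plot(x_range, ...) as in the example above.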

🎯 Key Points

| Feature | What It Means |
|---------|---------------|
| Peak | Most common value |
| Width | How spread out data is |
| Tail | Rare extreme values |

🧙‍♂️ Pro Tip

Higher peaks = more data points clustered there. Think of it as "more people standing on that spot!"
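To see the "higher peak = tighter cluster" idea when comparing groups, here is a rough sketch that overlays two density curves. Class A reuses the scores from the example above. Class B is made-up data, invented purely for illustration, with scores bunched close together.

```python
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats

class_a = [72, 85, 78, 92, 88, 75, 95, 82, 79, 91]
# Hypothetical second class with tightly clustered scores.
class_b = [80, 81, 83, 79, 82, 84, 80, 78, 81, 82]

x_range = np.linspace(60, 100, 200)
fig, ax = plt.subplots()
peak_heights = {}

for name, data in [('Class A', class_a), ('Class B', class_b)]:
    curve = stats.gaussian_kde(data)(x_range)
    ax.plot(x_range, curve, linewidth=2, label=name)
    ax.fill_between(x_range, curve, alpha=0.3)
    peak_heights[name] = curve.max()  # tallest point of this curve

ax.set_xlabel('Test Scores')
ax.set_ylabel('Density')
ax.set_title('Score Distribution by Class')
ax.legend()
plt.show()
```

Class B's curve comes out tall and narrow while Class A's is low and wide. Both curves cover the same total area, so a tighter cluster has to pile up higher.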


📈 ECDF Plot: The Climbing Staircase

What Is ECDF?

ECDF stands for Empirical Cumulative Distribution Function. Don't let the fancy name scare you!

Simple Explanation: It's like counting how many people are your height or shorter as you walk through a line sorted by height.

The Staircase Analogy 🪜

Imagine the 10 students in your class stand in a line from shortest to tallest. As you walk past each person:

  • 1 person passed = 10% done
  • 5 people passed = 50% done
  • Everyone passed = 100% done

The ECDF shows this as a climbing staircase!

🐍 How to Create One

import matplotlib.pyplot as plt
import numpy as np

# Heights of 10 students (in cm)
heights = [150, 155, 162, 165, 168, 170, 172, 178, 180, 185]

# Sort the data
sorted_heights = np.sort(heights)

# Create y-values (proportions)
y = np.arange(1, len(heights) + 1) / len(heights)

plt.step(sorted_heights, y, where='post', linewidth=2)
plt.xlabel('Height (cm)')
plt.ylabel('Proportion of Students')
plt.title('ECDF of Student Heights')
plt.grid(True, alpha=0.3)
plt.show()

🎯 Reading the ECDF

| Y-Value | Meaning |
|---------|---------|
| 0.5 | 50% of the data is at or below this X value |
| 0.9 | 90% of the data is at or below this X value |
| 1.0 | You've seen ALL the data |
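Reading a value off the ECDF can also be done in code. The sketch below assumes the same ten heights as the plot above. The helper name ecdf_at is made up for this example and is not part of NumPy.

```python
import numpy as np

heights = [150, 155, 162, 165, 168, 170, 172, 178, 180, 185]
sorted_heights = np.sort(heights)


def ecdf_at(x):
    """Proportion of students whose height is at or below x."""
    # side='right' counts how many sorted values are <= x.
    count = np.searchsorted(sorted_heights, x, side='right')
    return count / len(sorted_heights)


print(ecdf_at(170))  # 0.6 (six of the ten students are 170 cm or shorter)
print(ecdf_at(149))  # 0.0 (nobody is that short)
print(ecdf_at(185))  # 1.0 (everyone has been counted)
```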

🪄 Magic Trick

To find the median (middle value), look where the staircase reaches y = 0.5 and trace down to the x-axis! With an even number of points this lands on the lower of the two middle values, so it can differ slightly from a median that averages them.

  1. Start at y = 0.5
  2. Draw a line across
  3. Find where it hits the ECDF
  4. Drop down to the x-axis
  5. That's your median!
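The trace-down recipe can be sketched in a few lines, assuming the same ten heights used for the ECDF plot. It also prints np.median for comparison, since NumPy averages the two middle values when the count is even.

```python
import numpy as np

heights = [150, 155, 162, 165, 168, 170, 172, 178, 180, 185]
sorted_heights = np.sort(heights)
y = np.arange(1, len(sorted_heights) + 1) / len(sorted_heights)

# First height at which the staircase reaches y = 0.5.
ecdf_median = sorted_heights[np.argmax(y >= 0.5)]

print(ecdf_median)         # 168
print(np.median(heights))  # 169.0 (average of 168 and 170)
```

The two answers sit one centimeter apart here. With larger datasets the gap usually shrinks, so reading the median off the plot is a handy quick estimate.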

🥧 Pie Chart Basics: Slicing the Pizza!

What Is a Pie Chart?

A pie chart is exactly what it sounds like: a circle divided into slices, just like a pizza! 🍕

Each slice shows what fraction of the whole each category represents.
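To make "fraction of the whole" concrete, here is a small worked sketch of the arithmetic behind a pie chart, using the fruit votes from the example that follows. The variable names are just for illustration.

```python
fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]
total = sum(votes)  # 20 votes in all

# Each slice's share of the whole, and the angle it takes out of 360 degrees.
fractions = [count / total for count in votes]
slice_angles = [count * 360 / total for count in votes]

for fruit, fraction, angle in zip(fruits, fractions, slice_angles):
    print(f'{fruit}: {fraction:.0%} of the pie, {angle:.0f} degrees')
```

Apples get 8 of 20 votes, so their slice covers 40% of the circle, which is 144 of the 360 degrees.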

When to Use Pie Charts

✅ Good for:

  • Showing parts of a whole
  • When you have 2-5 categories
  • Percentages that add up to 100%

❌ Bad for:

  • Many categories (too many tiny slices!)
  • Comparing precise values
  • Data that doesn't represent "parts of a whole"

🐍 How to Create One

import matplotlib.pyplot as plt

# Favorite fruits in class
fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]

plt.pie(votes, labels=fruits)
plt.title('Favorite Fruits in Class')
plt.show()

🎯 The Basic Recipe

| Ingredient | What It Does |
|------------|--------------|
| Data values | Size of each slice |
| Labels | Name of each slice |
| Colors | Auto-assigned (or custom) |

🎨 Pie Chart Customization: Make It Beautiful!

Now let's turn that plain pie into a masterpiece!

Adding Percentages

Show exactly how big each slice is:

import matplotlib.pyplot as plt

fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]

plt.pie(votes, labels=fruits, autopct='%1.1f%%')
plt.title('Favorite Fruits')
plt.show()

The autopct='%1.1f%%' magic adds percentages with 1 decimal place!
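To see what that format string does on its own, here is a small sketch using plain Python string formatting on the same votes. It also shows that autopct can be a function instead of a string. The helper name pct_and_count is made up for this example.

```python
votes = [8, 5, 4, 3]
total = sum(votes)

# '%1.1f' formats a number with 1 decimal place; '%%' prints a literal % sign.
percent_labels = ['%1.1f%%' % (100 * count / total) for count in votes]
print(percent_labels)  # ['40.0%', '25.0%', '20.0%', '15.0%']


def pct_and_count(pct):
    """Label a slice with its percentage and the vote count behind it."""
    count = round(pct * total / 100)
    return f'{pct:.1f}% ({count})'


print(pct_and_count(40.0))  # 40.0% (8)
```

Passing autopct=pct_and_count to plt.pie would label each slice with both numbers, since matplotlib calls the function with each slice's percentage.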

Exploding a Slice 💥

Want to highlight the winner? Make it "pop out":

import matplotlib.pyplot as plt

fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]
explode = [0.1, 0, 0, 0]  # Push Apples out

plt.pie(votes, labels=fruits, explode=explode,
        autopct='%1.1f%%')
plt.title('Apples Win!')
plt.show()
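Writing the explode list by hand works for four slices, but it can also be built from the data. This is one possible sketch, assuming you want to pop out whichever slice has the most votes. The 0.1 offset matches the example above.

```python
votes = [8, 5, 4, 3]

# Position of the largest slice (the first one, if there is a tie).
winner = votes.index(max(votes))

# Offset the winner by 0.1 and leave every other slice in place.
explode = [0.1 if i == winner else 0 for i in range(len(votes))]
print(explode)  # [0.1, 0, 0, 0]
```

The result can be passed straight to plt.pie(..., explode=explode), and it keeps working if the votes change.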

Custom Colors 🌈

Pick your own color palette:

import matplotlib.pyplot as plt

fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]
colors = ['#ff6b6b', '#ffd93d', '#ff9f43', '#6c5ce7']

plt.pie(votes, labels=fruits, colors=colors,
        autopct='%1.1f%%', shadow=True)
plt.title('Colorful Fruit Chart')
plt.show()

Adding a Shadow

The shadow=True parameter, used in the example above, draws a drop shadow beneath the pie for a subtle 3D effect!

🎯 Customization Cheat Table

| Parameter | What It Does | Example |
|-----------|--------------|---------|
| autopct | Show percentages | '%1.1f%%' |
| explode | Pop out slices | [0.1, 0, 0, 0] |
| colors | Custom colors | ['red', 'blue'] |
| shadow | 3D shadow effect | True |
| startangle | Rotate the pie | 90 |

🎨 Start Angle Magic

Change where the first slice begins:

# Reuses fruits and votes from the examples above
# Start from the top (12 o'clock position)
plt.pie(votes, labels=fruits, startangle=90)
plt.show()

Basic Pie → Add Labels → Add Percentages → Add Colors → Add Explode → Add Shadow → Beautiful Chart! 🎉
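Putting the whole recipe together might look like the sketch below. It reuses the fruit data and colors from earlier. The counterclock=False setting is an extra assumption added here so the slices run clockwise from the top, and the last lines just collect the text drawn on the chart so you can check the labels.

```python
import matplotlib.pyplot as plt

fruits = ['Apples', 'Bananas', 'Oranges', 'Grapes']
votes = [8, 5, 4, 3]
colors = ['#ff6b6b', '#ffd93d', '#ff9f43', '#6c5ce7']
explode = [0.1, 0, 0, 0]  # Push Apples out

fig, ax = plt.subplots()
ax.pie(
    votes,
    labels=fruits,        # name each slice
    autopct='%1.1f%%',    # percentage with 1 decimal place
    colors=colors,        # custom palette
    explode=explode,      # pop out the first slice
    shadow=True,          # drop shadow beneath the pie
    startangle=90,        # first slice starts at 12 o'clock
    counterclock=False,   # lay the slices out clockwise
)
ax.set_title('Favorite Fruits in Class')

# Slice labels and percentage labels that were drawn on the chart.
chart_texts = [t.get_text() for t in ax.texts]
print(chart_texts)
plt.show()
```

Each keyword adds one step from the flow above, so you can delete any line you don't need and the chart still works.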

🌟 Quick Summary

| Plot Type | Best For | Think Of It As |
|-----------|----------|----------------|
| Density Plot | Smooth data shape | A blanket over your data |
| ECDF Plot | Cumulative view | Climbing stairs |
| Pie Chart | Parts of whole | Pizza slices |
| Customized Pie | Highlighted insights | Fancy decorated pizza |

🚀 You Did It!

You now know four powerful ways to visualize how data is distributed:

  1. Density plots show smooth curves of where data clusters
  2. ECDF plots show cumulative stepping stones through your data
  3. Pie charts slice your data like pizza
  4. Customization makes your charts pop and communicate clearly

Remember: The best visualization is the one that tells your story clearly. Now go make some beautiful charts! 📊✨
