🐼 Pandas DataFrames: Your Data’s Dream Home
The Big Picture
Imagine you have a giant filing cabinet. Each drawer is a column (like “Name” or “Age”). Each folder inside is a row (like one person’s info). That’s what a DataFrame is: a super-organized table that holds all your data in neat rows and columns!
🏠 What is a DataFrame?
Think of a DataFrame like a spreadsheet you see in Google Sheets or Excel. It has:
- Rows going sideways (like lines in a notebook)
- Columns going up and down (like labeled boxes)
Simple Example:
| Name | Age | City |
|---|---|---|
| Alice | 10 | New York |
| Bob | 8 | London |
| Charlie | 12 | Tokyo |
That’s a DataFrame! Three rows (Alice, Bob, Charlie) and three columns (Name, Age, City).
Real Life:
- A list of your classmates with their names and scores = DataFrame
- A shopping list with items and prices = DataFrame
- Your video game high scores = DataFrame
🎨 Creating DataFrames
There are many ways to build your data home. Let’s learn the most common ones!
Method 1: From a Dictionary 📚
A dictionary is like a labeled box. Each label (key) has items (values) inside.
import pandas as pd

data = {
    'Name': ['Alice', 'Bob'],
    'Age': [10, 8]
}
df = pd.DataFrame(data)
print(df)
Output:
    Name  Age
0  Alice   10
1    Bob    8
What happened?
- We made a dictionary with ‘Name’ and ‘Age’ as keys
- Each key became a column
- Each list of values filled its column, one value per row
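Because each list fills one column from top to bottom, every list in the dictionary needs the same number of items. Here is a small sketch of what happens when they match and when they don't. The data and the names `good` and `error_raised` are made up for illustration:

```python
import pandas as pd

# Both lists have two items, so pandas can line them up into two rows.
good = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [10, 8]})
print(good.shape)

# Here 'Age' has only one item, so pandas refuses to build the table.
error_raised = False
try:
    pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [10]})
except ValueError as err:
    error_raised = True
    print("pandas complained:", err)
```

If you see a `ValueError` while creating a DataFrame from a dictionary, counting the items in each list is a good first check.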
Method 2: From a List of Lists 📝
import pandas as pd

data = [
    ['Alice', 10],
    ['Bob', 8]
]
df = pd.DataFrame(data,
                  columns=['Name', 'Age'])
print(df)
Output:
    Name  Age
0  Alice   10
1    Bob    8
What happened?
- Each inner list became a row
- We told pandas what to name the columns
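To see why naming the columns matters, it can help to compare the same list of lists with and without the `columns` argument. This is just a sketch; `df_unnamed` and `df_named` are illustrative names:

```python
import pandas as pd

data = [
    ['Alice', 10],
    ['Bob', 8]
]

# Without columns=, pandas falls back to numbering the columns 0, 1, ...
df_unnamed = pd.DataFrame(data)
print(df_unnamed.columns.tolist())

# With columns=, each column gets a readable name.
df_named = pd.DataFrame(data, columns=['Name', 'Age'])
print(df_named.columns.tolist())
```

Numbered columns still work, but names like 'Name' and 'Age' make your code much easier to read later.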
Method 3: From a List of Dictionaries 🗂️
import pandas as pd

data = [
    {'Name': 'Alice', 'Age': 10},
    {'Name': 'Bob', 'Age': 8}
]
df = pd.DataFrame(data)
print(df)
Same result! Each dictionary is one row.
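One handy thing about this method is that the dictionaries don't all need the same keys. The sketch below uses a made-up gap (Bob has no 'Age') to show how pandas handles it; `df_gap` is an illustrative name:

```python
import pandas as pd

data = [
    {'Name': 'Alice', 'Age': 10},
    {'Name': 'Bob'}  # no 'Age' here, on purpose
]
df_gap = pd.DataFrame(data)
print(df_gap)

# The missing value is stored as NaN, pandas' marker for "no data here".
print(df_gap['Age'].isna().tolist())
```

The table still gets an 'Age' column; Bob's cell is simply marked as missing.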
📊 DataFrame Columns
Columns are the vertical strips of your table. Think of them as categories or labels.
Getting Column Names
print(df.columns)
Output:
Index(['Name', 'Age'], dtype='object')
This tells you: “Hey, you have two columns called Name and Age!”
Selecting One Column
Want just the names? Easy!
names = df['Name']
print(names)
Output:
0    Alice
1      Bob
Name: Name, dtype: object
It’s like pulling out one drawer from your filing cabinet!
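The single column you pull out is a pandas object called a Series, and you can ask it questions directly. A minimal sketch, rebuilding the same small table so it runs on its own:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [10, 8]})

ages = df['Age']

# One column on its own is a Series.
print(type(ages).__name__)

# A Series can be turned into a plain list or summarized.
print(ages.tolist())
print(ages.max())
```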
Selecting Multiple Columns
subset = df[['Name', 'Age']]
print(subset)
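Use double brackets [[]] to grab more than one column: the outer pair selects from the DataFrame, and the inner pair is a list of the column names you want.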
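The bracket style also changes what you get back. As a rough sketch (the 'City' column and the variable names are added here just for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [10, 8],
    'City': ['New York', 'London']
})

single = df['Name']            # single brackets: one column as a Series
still_table = df[['Name']]     # double brackets: a one-column DataFrame
two_cols = df[['Name', 'City']]

print(type(single).__name__)
print(type(still_table).__name__)
print(two_cols.shape)
```

Double brackets always give you a table back, even when the list holds only one column name.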
🔢 DataFrame Index
The index is like the address for each row. By default, pandas gives rows numbers starting from 0.
    Name  Age
0  Alice   10   ← Row at index 0
1    Bob    8   ← Row at index 1
Why Start at 0?
Python, like many programming languages, counts positions starting from 0, not 1. It’s like how a building’s ground floor is sometimes called “Floor 0”!
Custom Index
You can name your rows too:
import pandas as pd

data = {'Score': [95, 87, 92]}
df = pd.DataFrame(data,
                  index=['Alice', 'Bob', 'Charlie'])
print(df)
Output:
         Score
Alice       95
Bob         87
Charlie     92
Now instead of 0, 1, 2, we have names as addresses!
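Named addresses are useful because you can look a row up by its label. The sketch below uses `.loc`, pandas' label-based lookup, which this lesson doesn't cover in detail, so treat it as a preview:

```python
import pandas as pd

data = {'Score': [95, 87, 92]}
df = pd.DataFrame(data,
                  index=['Alice', 'Bob', 'Charlie'])

# Look up one cell by row label and column name.
bob_score = df.loc['Bob', 'Score']
print(bob_score)

# Check whether a label exists in the index.
print('Charlie' in df.index)
```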
Getting the Index
print(df.index)
Output:
Index(['Alice', 'Bob', 'Charlie'], dtype='object')
📐 DataFrame Shape
Ever wonder how big your data is? The shape tells you!
import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [10, 8, 12],
    'City': ['NY', 'LA', 'Tokyo']
}
df = pd.DataFrame(data)
print(df.shape)
Output:
(3, 3)
This means: 3 rows and 3 columns!
Understanding Shape
(rows, columns)
(3, 3) = 3 rows × 3 columns = 9 cells total!
It’s like measuring a rectangle:
- First number = height (rows)
- Second number = width (columns)
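The height-and-width idea can be checked in code. Here is a short sketch that unpacks the shape into two variables and compares it with `len(df)` and `df.size`, two related pandas features not covered above:

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [10, 8, 12],
    'City': ['NY', 'LA', 'Tokyo']
})

# shape is a pair, so it can be unpacked into two variables.
rows, cols = df.shape
print(rows, cols)

# len(df) gives the row count; df.size gives rows x columns.
print(len(df))
print(df.size)
```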
Quick Reference
| What You Type | What It Tells You |
|---|---|
| df.shape | (rows, columns) |
| df.shape[0] | Just the row count |
| df.shape[1] | Just the column count |
🎯 Putting It All Together
Here’s a complete example showing everything:
import pandas as pd

# Create DataFrame
data = {
    'Fruit': ['Apple', 'Banana', 'Cherry'],
    'Price': [1.0, 0.5, 2.0],
    'Quantity': [10, 20, 15]
}
df = pd.DataFrame(data)

# Explore it!
print("DataFrame:")
print(df)
print("\nColumns:", df.columns.tolist())
print("Index:", df.index.tolist())
print("Shape:", df.shape)
Output:
DataFrame:
    Fruit  Price  Quantity
0   Apple    1.0        10
1  Banana    0.5        20
2  Cherry    2.0        15

Columns: ['Fruit', 'Price', 'Quantity']
Index: [0, 1, 2]
Shape: (3, 3)
🧠 Quick Memory Tricks
graph TD
    A[DataFrame] --> B[Columns]
    A --> C[Index]
    A --> D[Shape]
    B --> E["df.columns<br/>Get column names"]
    C --> F["df.index<br/>Get row labels"]
    D --> G["df.shape<br/>rows, columns"]
Remember:
- 📊 DataFrame = Organized table with rows and columns
- 📝 Create it = From a dictionary, a list of lists, or a list of dictionaries
- 🏷️ Columns = Vertical labels (like drawer names)
- 🔢 Index = Row addresses (starts at 0 by default, or use your own labels)
- 📐 Shape = Size as (rows, columns)
🚀 You Did It!
Now you know how to:
- ✅ Understand what a DataFrame is
- ✅ Create DataFrames in different ways
- ✅ Access and understand columns
- ✅ Work with the index
- ✅ Check the shape of your data
You’re ready to wrangle data like a pro! 🐼✨