NumPy Essentials


NumPy Essentials: Your Data Superpower 🚀

Imagine you have a magical toolbox. Instead of hammers and screwdrivers, it holds powerful tools that can crunch millions of numbers in the blink of an eye. That toolbox is called NumPy—and today, you’re going to learn how to use it!


🎯 What is NumPy?

Think of NumPy like a super-organized shelf for your data.

Regular Python lists are like throwing toys into a messy box—you can store stuff, but finding and working with things is slow.

NumPy arrays are like a perfectly organized bookshelf—every item has its exact spot, and you can grab or change things lightning fast.

Why does this matter?

  • Data scientists use NumPy to analyze millions of data points
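  • For numeric work on large arrays, it’s often many times faster than regular Python lists (the exact speedup depends on the operation and the array size)
  • Many data tools (Pandas, scikit-learn, etc.) are built on NumPy, and others such as TensorFlow work closely with NumPy arrays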
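To get a feel for the speed difference on your own machine, here is a minimal sketch that times the same calculation (summing the squares of one million integers) with a plain list and with a NumPy array. The array size and the function names are illustrative choices, and timings vary from machine to machine, so no specific speedup is claimed here.

```python
import timeit

import numpy as np

n = 1_000_000
py_list = list(range(n))
# Explicit int64 avoids overflow on platforms where the default integer is 32-bit
np_array = np.arange(n, dtype=np.int64)

def list_sum_of_squares():
    # Pure Python: one multiplication per loop step
    return sum(x * x for x in py_list)

def numpy_sum_of_squares():
    # NumPy: the multiplication and the sum both run in compiled code
    return int(np.sum(np_array * np_array))

# Both approaches must agree on the answer
assert list_sum_of_squares() == numpy_sum_of_squares()

list_time = timeit.timeit(list_sum_of_squares, number=3)
numpy_time = timeit.timeit(numpy_sum_of_squares, number=3)
print(f"list:  {list_time:.4f} s")
print(f"numpy: {numpy_time:.4f} s")
```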

📦 NumPy Arrays: Your Data Containers

What is an Array?

An array is like a row of lockers. Each locker holds one item, and each has a number (starting from 0).

import numpy as np

# Creating your first array
my_locker = np.array([10, 20, 30, 40, 50])
print(my_locker)
# Output: [10 20 30 40 50]

Creating Arrays Different Ways

1. From a regular list:

grades = np.array([85, 92, 78, 95, 88])

2. Array of zeros (empty lockers):

empty = np.zeros(5)
# Output: [0. 0. 0. 0. 0.]

3. Array of ones:

ones = np.ones(4)
# Output: [1. 1. 1. 1.]

4. Range of numbers:

counting = np.arange(1, 6)
# Output: [1 2 3 4 5]

5. Evenly spaced numbers:

even_split = np.linspace(0, 10, 5)
# Output: [0. 2.5 5. 7.5 10.]
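np.arange and np.linspace look similar but answer different questions. The short sketch below puts them side by side: with arange you pick the step and the stop value is left out, while with linspace you pick the number of points and the stop value is included. The variable names are just for illustration.

```python
import numpy as np

# arange: you choose the step; the stop value is excluded
by_step = np.arange(0, 10, 2.5)
print(by_step)        # 0, 2.5, 5, 7.5 (4 values, 10 is left out)

# linspace: you choose how many points; the stop value is included
by_count = np.linspace(0, 10, 5)
print(by_count)       # 0, 2.5, 5, 7.5, 10 (5 values)
```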

2D Arrays: The Grid

Imagine a spreadsheet or a tic-tac-toe board. That’s a 2D array!

grid = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

Diagram: the 2D array splits into Row 0 (1, 2, 3), Row 1 (4, 5, 6), and Row 2 (7, 8, 9).

Array Properties

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.shape)   # (2, 3) - 2 rows, 3 columns
print(arr.size)    # 6 total elements
print(arr.ndim)    # 2 dimensions
print(arr.dtype)   # int64 on most platforms (data type)
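Every element in an array shares one data type, and you can choose it yourself. As a small sketch (the variable names are illustrative), the dtype argument sets the type at creation time and astype converts an existing array:

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([1, 2, 3], dtype=np.float64)   # ask for floats up front
converted = ints.astype(np.float64)              # or convert an existing array (makes a copy)

print(floats.dtype)                   # float64
print(converted)                      # [1. 2. 3.]
print(np.zeros(3, dtype=np.int64))    # [0 0 0] instead of the default [0. 0. 0.]
```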

🔢 Array Operations and Indexing

Getting Items: Indexing

Just like finding a book on a shelf, you use the position number.

Remember: Python counts from 0!

fruits = np.array(['apple', 'banana', 'cherry', 'date'])
#                    0         1         2        3

print(fruits[0])   # 'apple' (first item)
print(fruits[2])   # 'cherry' (third item)
print(fruits[-1])  # 'date' (last item)

Grabbing Multiple Items: Slicing

Think of slicing like cutting a piece of cake—you decide where to start and stop.

numbers = np.array([10, 20, 30, 40, 50, 60])

print(numbers[1:4])    # [20 30 40]
print(numbers[:3])     # [10 20 30] (first 3)
print(numbers[3:])     # [40 50 60] (from index 3)
print(numbers[::2])    # [10 30 50] (every 2nd)
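One thing worth knowing about that slice of cake: a basic NumPy slice is a view onto the original array, not a separate copy. The sketch below (with illustrative variable names) shows that changing the slice changes the original, and that .copy() gives you an independent piece.

```python
import numpy as np

numbers = np.array([10, 20, 30, 40, 50, 60])

window = numbers[1:4]         # a view onto the same data
window[0] = 99
print(numbers)                # [10 99 30 40 50 60]  (the original changed too)

safe = numbers[1:4].copy()    # an independent copy
safe[0] = -1
print(numbers)                # [10 99 30 40 50 60]  (unchanged this time)
```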

2D Indexing: Row and Column

grid = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

print(grid[0, 0])      # 1 (row 0, column 0)
print(grid[1, 2])      # 6 (row 1, column 2)
print(grid[0, :])      # [1 2 3] (entire row 0)
print(grid[:, 1])      # [2 5 8] (entire column 1)

Diagram: grid[1, 2] → Row 1 → Column 2 → Result: 6
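Row and column slices can be combined to cut out a smaller block of the grid. Here is a brief sketch reusing the same 3x3 grid; top_right is just an illustrative name.

```python
import numpy as np

grid = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

top_right = grid[0:2, 1:3]    # rows 0-1, columns 1-2
print(top_right)
# [[2 3]
#  [5 6]]

print(grid[-1, :])            # [7 8 9] (last row)
print(grid[:, ::2])           # every 2nd column (columns 0 and 2)
# [[1 3]
#  [4 6]
#  [7 9]]
```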

Boolean Indexing: Smart Filtering

This is like having a magical filter that only shows items matching your rules.

scores = np.array([45, 82, 67, 91, 55, 78])

# Find all scores above 70
high_scores = scores[scores > 70]
print(high_scores)   # [82 91 78]

# Find all passing scores (>= 60)
passing = scores[scores >= 60]
print(passing)       # [82 67 91 78]
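Filters can be combined, too. The sketch below uses & for "and" and | for "or"; each condition needs its own parentheses, and Python's plain and/or keywords do not work on arrays. The variable names are illustrative.

```python
import numpy as np

scores = np.array([45, 82, 67, 91, 55, 78])

# Scores from 60 up to (but not including) 80
mid_range = scores[(scores >= 60) & (scores < 80)]
print(mid_range)              # [67 78]

# Scores that are very low OR very high
extremes = scores[(scores < 50) | (scores > 90)]
print(extremes)               # [45 91]

# The filter itself is just an array of True/False values
mask = scores > 70
print(mask)                   # [False  True False  True False  True]
print(np.sum(mask))           # 3 (each True counts as 1)
```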

Array Shape Operations

arr = np.array([1, 2, 3, 4, 5, 6])

# Reshape to 2x3 grid
reshaped = arr.reshape(2, 3)
print(reshaped)
# [[1 2 3]
#  [4 5 6]]

# Flatten back to 1D
flat = reshaped.flatten()
print(flat)   # [1 2 3 4 5 6]
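A few reshape details are easy to trip over, so here is a hedged sketch of three of them: -1 lets NumPy work out one dimension for you, the total element count must stay the same, and flatten() hands back a copy. The names three_rows and flat are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# -1 asks NumPy to work out that dimension from the total size
three_rows = arr.reshape(3, -1)
print(three_rows.shape)       # (3, 2)

# The total number of elements must stay the same
try:
    arr.reshape(4, 2)         # 6 elements cannot fill 4 x 2 = 8 slots
except ValueError as err:
    print("Cannot reshape:", err)

# flatten() returns a copy, so editing it leaves the original alone
flat = three_rows.flatten()
flat[0] = 100
print(three_rows[0, 0])       # 1
```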

➕ NumPy Mathematical Operations

Element-wise Operations

When you do math with arrays, it happens to every element automatically. It’s like having tiny calculators in each locker!

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

print(a + b)     # [11 22 33 44]
print(a - b)     # [-9 -18 -27 -36]
print(a * b)     # [10 40 90 160]
print(a / b)     # [0.1 0.1 0.1 0.1]
print(a ** 2)    # [1 4 9 16] (squared)
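To see why this matters, compare it with what plain Python lists do with the same operators. This is a small side-by-side sketch; list_a and list_b are illustrative names.

```python
import numpy as np

list_a = [1, 2, 3, 4]
list_b = [10, 20, 30, 40]
print(list_a + list_b)     # [1, 2, 3, 4, 10, 20, 30, 40]  (joined end to end)
print(list_a * 2)          # [1, 2, 3, 4, 1, 2, 3, 4]      (repeated)

a = np.array(list_a)
b = np.array(list_b)
print(a + b)               # [11 22 33 44]  (added element by element)
print(a * 2)               # [2 4 6 8]      (each element doubled)

# The list equivalent of a + b needs an explicit loop
paired = [x + y for x, y in zip(list_a, list_b)]
print(paired)              # [11, 22, 33, 44]
```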

Broadcasting: The Magic Stretch

NumPy can stretch smaller arrays to match bigger ones. Like magic!

arr = np.array([1, 2, 3, 4, 5])

# Add 10 to every element
print(arr + 10)   # [11 12 13 14 15]

# Multiply all by 3
print(arr * 3)    # [3 6 9 12 15]
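Broadcasting also works between arrays of different shapes, not just between an array and a single number. The sketch below assumes a small 2x3 grid and two made-up "bonus" arrays: one is stretched down the rows, the other across the columns. Shapes that cannot be lined up raise an error.

```python
import numpy as np

grid = np.array([
    [1, 2, 3],
    [4, 5, 6]
])                                     # shape (2, 3)

row_bonus = np.array([10, 20, 30])     # shape (3,): stretched down across both rows
print(grid + row_bonus)
# [[11 22 33]
#  [14 25 36]]

col_bonus = np.array([[100], [200]])   # shape (2, 1): stretched across all columns
print(grid + col_bonus)
# [[101 102 103]
#  [204 205 206]]

# Shapes that cannot be stretched to match raise an error
try:
    grid + np.array([1, 2])            # shape (2,) does not line up with 3 columns
except ValueError as err:
    print("Broadcast failed:", err)
```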

Built-in Math Functions

NumPy has special math functions that work on entire arrays:

numbers = np.array([1, 4, 9, 16, 25])

print(np.sqrt(numbers))   # [1. 2. 3. 4. 5.]
print(np.square(numbers)) # [1 16 81 256 625]
print(np.log(numbers))    # Natural log
print(np.exp(numbers))    # e^x

Trigonometry

angles = np.array([0, np.pi/2, np.pi])

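print(np.sin(angles))  # approx. [0. 1. 0.] (sin(pi) comes out as about 1.2e-16, not exactly 0)
print(np.cos(angles))  # approx. [1. 0. -1.] (cos(pi/2) comes out as about 6.1e-17)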
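Because pi cannot be stored exactly, results that should be 0 often come out as tiny nonzero numbers. A minimal sketch of how to compare such values safely, using np.isclose and np.allclose (the variable name sines is illustrative):

```python
import numpy as np

angles = np.array([0, np.pi / 2, np.pi])
sines = np.sin(angles)

print(sines[2] == 0)                    # False: sin(pi) is a tiny nonzero number
print(np.isclose(sines[2], 0))          # True: equal within a small tolerance
print(np.allclose(sines, [0, 1, 0]))    # True: whole-array comparison
print(np.round(sines, 10))              # rounding hides the tiny error for display
```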

Rounding Functions

decimals = np.array([1.2, 2.7, 3.5, 4.9])

print(np.round(decimals))  # [1. 3. 4. 5.]
print(np.floor(decimals))  # [1. 2. 3. 4.]
print(np.ceil(decimals))   # [2. 3. 4. 5.]
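One surprise with np.round: values exactly halfway between two integers go to the nearest even number, so 2.5 becomes 2, not 3. The sketch below shows this, along with a common workaround for non-negative values (the halves array is made up for illustration).

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])

print(np.round(halves))           # [0. 2. 2. 4.]  (ties go to the nearest even number)

# A common "round half up" workaround for non-negative values
print(np.floor(halves + 0.5))     # [1. 2. 3. 4.]

# np.round also accepts a number of decimal places
print(np.round(3.14159, 2))       # 3.14
```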

Aggregate Functions

These functions crunch all numbers into one result:

data = np.array([5, 10, 15, 20, 25])

print(np.sum(data))    # 75 (add all)
print(np.prod(data))   # 375000 (multiply all)
print(np.min(data))    # 5
print(np.max(data))    # 25

📊 NumPy Statistical Functions

Basic Statistics

scores = np.array([72, 85, 90, 78, 95, 88, 76])

print(np.mean(scores))    # 83.43 (average)
print(np.median(scores))  # 85.0 (middle value)
print(np.std(scores))     # 7.71 (spread)
print(np.var(scores))     # 59.39 (variance)

Summary for the data 72, 85, 90, 78, 95, 88, 76: Mean 83.43, Median 85, Std Dev 7.71, Range 23
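By default, np.std and np.var divide by the number of values N, which treats the data as the whole population. When the data is only a sample, it is common to divide by N - 1 instead, which NumPy supports through the ddof argument. A minimal sketch with the same scores (the variable names are illustrative):

```python
import numpy as np

scores = np.array([72, 85, 90, 78, 95, 88, 76])

# Default (ddof=0): divide by N, treating the data as the whole population
population_std = np.std(scores)
# ddof=1: divide by N - 1, the usual choice when the data is a sample
sample_std = np.std(scores, ddof=1)

print(round(population_std, 2))   # 7.71
print(round(sample_std, 2))       # 8.32

# The same population value, written out step by step
mean = scores.sum() / scores.size
variance = ((scores - mean) ** 2).sum() / scores.size
print(np.isclose(np.sqrt(variance), population_std))   # True
```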

Understanding Mean, Median, Mode

Mean = Add everything, divide by count (the “average”)

Median = The middle number when sorted (not fooled by outliers!)

Example:

salaries = np.array([30, 35, 40, 45, 1000])

print(np.mean(salaries))    # 230 (skewed by 1000!)
print(np.median(salaries))  # 40 (true middle)
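Mode = The value that appears most often. NumPy has no dedicated mode function, so the sketch below is one possible way to find it, using np.unique with return_counts=True. The shoe_sizes data is made up for illustration.

```python
import numpy as np

shoe_sizes = np.array([7, 8, 8, 9, 8, 10, 7])   # hypothetical data

# unique values (sorted) and how many times each one appears
values, counts = np.unique(shoe_sizes, return_counts=True)
mode = values[np.argmax(counts)]   # if several values tie, this picks the smallest

print(values)    # [ 7  8  9 10]
print(counts)    # [2 3 1 1]
print(mode)      # 8
```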

Percentiles and Quartiles

data = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

print(np.percentile(data, 25))   # 32.5 (25th percentile)
print(np.percentile(data, 50))   # 55.0 (median)
print(np.percentile(data, 75))   # 77.5 (75th percentile)
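Percentiles often fall between two data points. By default, np.percentile handles this with linear interpolation: it finds a fractional position in the sorted data and blends the two neighbors. The sketch below works through the 25th percentile by hand; it is a simplified illustration that assumes q is below 100, and the variable names are made up.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])

q = 25
position = (q / 100) * (data.size - 1)   # 2.25: a quarter of the way from index 2 to index 3
lower = int(np.floor(position))          # 2
fraction = position - lower              # 0.25

sorted_data = np.sort(data)
by_hand = sorted_data[lower] + fraction * (sorted_data[lower + 1] - sorted_data[lower])

print(by_hand)                   # 32.5
print(np.percentile(data, q))    # 32.5
```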

Min, Max, and Range

temps = np.array([68, 72, 75, 71, 69, 80, 77])

print(np.min(temps))              # 68
print(np.max(temps))              # 80
print(np.ptp(temps))              # 12 (peak-to-peak range)
print(np.argmin(temps))           # 0 (index of min)
print(np.argmax(temps))           # 5 (index of max)

Statistical Operations on 2D Arrays

grades = np.array([
    [85, 90, 78],    # Student 1
    [92, 88, 95],    # Student 2
    [76, 82, 80]     # Student 3
])

# Mean of all grades
print(np.mean(grades))           # 85.11

# Mean per student (across columns)
print(np.mean(grades, axis=1))   # [84.33, 91.67, 79.33]

# Mean per subject (across rows)
print(np.mean(grades, axis=0))   # [84.33, 86.67, 84.33]
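A handy way to remember the axis argument: the axis you name is the one that gets collapsed. The sketch below repeats the idea with max, then uses keepdims=True so each student's mean lines up with that student's row. The names student_means and above_own_average are illustrative.

```python
import numpy as np

grades = np.array([
    [85, 90, 78],    # Student 1
    [92, 88, 95],    # Student 2
    [76, 82, 80]     # Student 3
])

# axis=0 collapses the rows, leaving one value per column (subject)
print(grades.max(axis=0))              # [92 90 95]
# axis=1 collapses the columns, leaving one value per row (student)
print(grades.max(axis=1))              # [90 95 82]

# keepdims=True keeps a (3, 1) shape so the means broadcast against each row
student_means = grades.mean(axis=1, keepdims=True)
above_own_average = grades > student_means
print(above_own_average.sum(axis=1))   # [2 2 2] (grades above each student's own mean)
```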

Correlation

See how two things relate to each other:

study_hours = np.array([2, 3, 4, 5, 6, 7, 8])
test_scores = np.array([60, 65, 70, 75, 85, 90, 95])

correlation = np.corrcoef(study_hours, test_scores)
print(correlation[0, 1])   # 0.99 (strong positive!)

Correlation Values:

  • Close to 1 = Strong positive (both go up together)
  • Close to -1 = Strong negative (one up, other down)
  • Close to 0 = No linear relationship
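To show where that number comes from, here is a hedged sketch that computes the Pearson correlation by hand and then gives an example where the correlation is 0 even though the two variables are clearly related. The x and y arrays in the second half are made up for illustration.

```python
import numpy as np

study_hours = np.array([2, 3, 4, 5, 6, 7, 8])
test_scores = np.array([60, 65, 70, 75, 85, 90, 95])

# Pearson correlation written out: how the deviations move together,
# divided by the size of the deviations
x_dev = study_hours - study_hours.mean()
y_dev = test_scores - test_scores.mean()
r_by_hand = (x_dev * y_dev).sum() / np.sqrt((x_dev ** 2).sum() * (y_dev ** 2).sum())
print(round(r_by_hand, 2))                           # 0.99

# corrcoef returns a 2x2 matrix; the diagonal is each variable with itself
print(np.corrcoef(study_hours, test_scores).shape)   # (2, 2)

# A perfect but non-linear relationship can still give a correlation of 0
x = np.array([-2, -1, 0, 1, 2])
y = x ** 2
print(np.isclose(np.corrcoef(x, y)[0, 1], 0))        # True
```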

🎉 You Did It!

You’ve just learned the essentials of NumPy:

✅ Arrays - Your organized data containers
✅ Indexing & Slicing - Finding and grabbing data
✅ Math Operations - Lightning-fast calculations
✅ Statistical Functions - Understanding your data

Next Steps:

  • Practice with real datasets
  • Explore Pandas (built on NumPy!)
  • Dive into data visualization

Remember: Every data scientist started exactly where you are now. Keep practicing, and these tools will become second nature! 🌟


NumPy is your superpower for data. Now go crunch some numbers! 💪
