SQL Essentials

Back

Loading concept...

SQL Essentials: Your Data Adventure Begins! πŸ—„οΈ

Imagine you have a GIANT toy box with thousands of toys. How do you find your favorite red car quickly? That’s exactly what SQL helps computers do with data!


🏠 What Are Database Types?

The Library Analogy

Think of a database like a library. But not all libraries are organized the same way!

Simple Example:

  • Your toy box at home = Small database
  • A school library = Medium database
  • The biggest library in the world = Huge database

There are two main types of libraries (databases):


πŸ“Š Relational Databases (SQL)

What is it? Like a library with neat shelves, labeled sections, and a card catalog.

Everything is organized in tables (like spreadsheets with rows and columns).

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ toy_id  β”‚ toy_name β”‚ color   β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ 1       β”‚ Car      β”‚ Red     β”‚
β”‚ 2       β”‚ Doll     β”‚ Pink    β”‚
β”‚ 3       β”‚ Ball     β”‚ Blue    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Popular Relational Databases:

  • MySQL - Free, used by Facebook
  • PostgreSQL - Super powerful, free
  • SQLite - Tiny, lives in one file
  • SQL Server - Microsoft’s database
  • Oracle - Used by big companies

Best for: Banks, stores, schools - anywhere data has clear structure!


πŸ“¦ Non-Relational Databases (NoSQL)

What is it? Like a messy (but searchable!) storage room. Flexible, no strict rules.

{
  "toy": "Car",
  "color": "Red",
  "wheels": 4,
  "batteries": false
}

Popular NoSQL Databases:

  • MongoDB - Stores data like documents
  • Redis - Super fast, like a notepad
  • Firebase - Great for phone apps

Best for: Social media, games, apps that change a lot!


graph TD A["DATABASE TYPES"] --> B["Relational SQL"] A --> C["Non-Relational NoSQL"] B --> D["Tables & Rows"] B --> E["Strict Rules"] C --> F["Flexible Documents"] C --> G["No Fixed Structure"]

πŸ” SQL for Data Science

The Magic Wand

SQL is like a magic wand that lets you ask questions to your data!

Instead of: β€œHey computer, can you please look through all million customers and find everyone who bought ice cream last summer?”

You write:

SELECT customer_name
FROM purchases
WHERE item = 'ice cream'
  AND purchase_date
  BETWEEN '2024-06-01'
  AND '2024-08-31';

The computer does the hard work in seconds!


πŸ§ͺ Why Data Scientists LOVE SQL

Task Without SQL With SQL
Find patterns Hours of scrolling 2 seconds
Clean data Copy-paste nightmare One command
Combine info Multiple files chaos Simple JOIN
Count things Manual counting COUNT(*)

πŸ“ˆ Essential SQL Commands for Data Science

1. SELECT - Pick what you want

SELECT name, age
FROM students;

Like saying: β€œShow me names and ages from the students list”

2. WHERE - Filter your results

SELECT name
FROM students
WHERE age > 10;

Like saying: β€œOnly show kids older than 10”

3. GROUP BY - Organize into groups

SELECT color, COUNT(*)
FROM toys
GROUP BY color;

Result: β€œRed: 5, Blue: 3, Green: 7”

4. ORDER BY - Sort your results

SELECT name, score
FROM students
ORDER BY score DESC;

Shows highest scores first!

5. Aggregate Functions - Math helpers

SELECT
  AVG(price) as average,
  MAX(price) as highest,
  MIN(price) as lowest,
  SUM(price) as total
FROM products;

πŸ”— SQL JOIN Types

The Party Invitation Problem

Imagine you have TWO lists:

  • List A: Kids in your class
  • List B: Kids invited to your party

How do you combine them?


🀝 INNER JOIN

What it does: Shows only kids who are in BOTH lists.

Class List:        Party List:        INNER JOIN Result:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”         β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Alice  β”‚        β”‚ Alice  β”‚         β”‚ Alice  β”‚
β”‚ Bob    │───────►│ Charlieβ”‚ ═══════►│ Charlieβ”‚
β”‚ Charlieβ”‚        β”‚ Emma   β”‚         β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚ David  β”‚        β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜
SELECT students.name
FROM students
INNER JOIN party_list
  ON students.name = party_list.name;

Result: Alice, Charlie (in both lists!)


⬅️ LEFT JOIN

What it does: Shows ALL kids from the left list, plus matches from right.

SELECT s.name, p.invited
FROM students s
LEFT JOIN party_list p
  ON s.name = p.name;

Result:

name invited
Alice Yes
Bob NULL
Charlie Yes
David NULL

Everyone from class, party status shown if invited


➑️ RIGHT JOIN

What it does: Shows ALL from the right list, plus matches from left.

SELECT s.name, p.invited
FROM students s
RIGHT JOIN party_list p
  ON s.name = p.name;

Everyone invited to party, class info if available


πŸ”„ FULL OUTER JOIN

What it does: Shows EVERYONE from BOTH lists!

SELECT s.name, p.name
FROM students s
FULL OUTER JOIN party_list p
  ON s.name = p.name;

Nobody left behind - everyone appears!


graph TD A["JOIN TYPES"] --> B["INNER JOIN"] A --> C["LEFT JOIN"] A --> D["RIGHT JOIN"] A --> E["FULL OUTER JOIN"] B --> F["Only matching rows"] C --> G["All left + matches"] D --> H["All right + matches"] E --> I["Everything from both"]

πŸ› οΈ SQL Query Techniques

Building Powerful Questions

Now let’s learn some ninja moves to make your SQL even better!


🎯 Technique 1: Subqueries

A query inside a query! Like asking a question to answer another question.

Example: Find students who scored above average

SELECT name, score
FROM students
WHERE score > (
  SELECT AVG(score)
  FROM students
);

First it calculates average, THEN finds who beat it!


πŸ“Š Technique 2: Window Functions

Do math WITHOUT shrinking your results!

SELECT
  name,
  score,
  RANK() OVER (ORDER BY score DESC)
    as position
FROM students;

Result:

name score position
Emma 95 1
Bob 88 2
Alice 85 3

Everyone keeps their row, but gets a rank!


🧹 Technique 3: CASE Statements

Add IF-THEN logic to your data!

SELECT
  name,
  score,
  CASE
    WHEN score >= 90 THEN 'A'
    WHEN score >= 80 THEN 'B'
    WHEN score >= 70 THEN 'C'
    ELSE 'Needs Help'
  END as grade
FROM students;

Automatically assigns grades!


πŸ”’ Technique 4: CTEs (Common Table Expressions)

Break complex queries into simple steps!

WITH high_scorers AS (
  SELECT name, score
  FROM students
  WHERE score > 80
)
SELECT *
FROM high_scorers
ORDER BY score DESC;

Like creating a temporary mini-table!


πŸ“‹ Technique 5: NULL Handling

NULLs are tricky! They mean β€œunknown” not β€œzero”.

-- Find missing data
SELECT name
FROM students
WHERE email IS NULL;

-- Replace NULLs
SELECT
  name,
  COALESCE(phone, 'No phone')
    as contact
FROM students;

COALESCE picks the first non-NULL value!


⚑ Technique 6: DISTINCT

Remove duplicates!

SELECT DISTINCT city
FROM customers;

Only shows each city ONCE, no matter how many customers!


πŸŽ“ Putting It All Together

Here’s a real-world example combining everything:

-- Find top 3 selling products
-- by category last month

WITH monthly_sales AS (
  SELECT
    p.category,
    p.product_name,
    SUM(s.quantity) as total_sold
  FROM products p
  INNER JOIN sales s
    ON p.id = s.product_id
  WHERE s.sale_date >= '2024-11-01'
  GROUP BY p.category, p.product_name
)
SELECT
  category,
  product_name,
  total_sold,
  RANK() OVER (
    PARTITION BY category
    ORDER BY total_sold DESC
  ) as rank_in_category
FROM monthly_sales
WHERE total_sold > 0
ORDER BY category, rank_in_category;

🌟 Key Takeaways

  1. Databases are organized storage - SQL for structured, NoSQL for flexible
  2. SQL lets you ask questions to millions of rows instantly
  3. JOINs combine tables like puzzle pieces
  4. Query techniques make you a data ninja!

Remember: Every data scientist started by writing their first SELECT * FROM table; - and look where it took them! Now it’s YOUR turn! πŸš€

Loading story...

Story - Premium Content

Please sign in to view this story and start learning.

Upgrade to Premium to unlock full access to all stories.

Stay Tuned!

Story is coming soon.

Story Preview

Story - Premium Content

Please sign in to view this concept and start learning.

Upgrade to Premium to unlock full access to all content.