Word Representations


🧠 NLP - Word Representations: Teaching Computers to Understand Words


The Big Picture: How Do We Teach Computers to Read?

Imagine you’re trying to explain the word “dog” to a robot. The robot has never seen a dog. It doesn’t know dogs bark, wag tails, or love belly rubs. To a computer, “dog” is just three letters: D-O-G.

But here’s the magic question: How do we help computers understand that “dog” and “puppy” are similar? That “king” is to “queen” as “man” is to “woman”?

This is the story of Word Representations — the art of turning words into numbers that capture meaning.


🎯 What Are Word Embeddings?

The Dictionary Problem

Think about your favorite dictionary. It lists words alphabetically. But “cat” and “dog” are far apart (C vs D), even though they’re both pets!

Word Embeddings fix this.

The Magical Number List

A word embedding is like giving every word a secret address — a list of numbers that describes where it lives in “meaning space.”

Simple Example:

"cat"  → [0.2, 0.8, 0.1, 0.9]
"dog"  → [0.3, 0.7, 0.2, 0.8]
"car"  → [0.9, 0.1, 0.8, 0.2]

Notice: Cat and dog have similar numbers. Car is very different!
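
How does a computer check that two number lists are “similar”? A common yardstick is cosine similarity. Here’s a quick sketch using the toy vectors above (assuming NumPy is installed):

import numpy as np

def cosine_similarity(a, b):
    """Close to 1.0 = pointing the same way (similar meaning); near 0 = unrelated."""
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

cat = [0.2, 0.8, 0.1, 0.9]
dog = [0.3, 0.7, 0.2, 0.8]
car = [0.9, 0.1, 0.8, 0.2]

print(cosine_similarity(cat, dog))  # ~0.99: very similar
print(cosine_similarity(cat, car))  # ~0.35: not so much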

Why This Matters

Imagine you’re organizing a giant birthday party. You group:

  • Pets in one corner (cat, dog, hamster)
  • Vehicles in another (car, bus, train)
  • Foods near the kitchen (pizza, burger, cake)

Word embeddings do the same thing — they group similar words close together in number-space!

graph TD A["Words as Text"] --> B["Word Embeddings"] B --> C["Similar words = Similar numbers"] C --> D["Computer understands meaning!"]

Real Life Examples

Word  | Nearby Words (Similar Embeddings)
happy | joyful, cheerful, glad
sad   | unhappy, gloomy, upset
king  | queen, prince, monarch

🎮 Word2Vec: The Prediction Game

Meet the Inventor

In 2013, a team at Google created Word2Vec. It’s like a video game for words!

The Two Games

Word2Vec plays one of two games:

Game 1: CBOW (Continuous Bag of Words)

Challenge: Guess the missing word!

"The ___ barks loudly"

Your brain says: “dog”! 🐕

CBOW sees the surrounding words (“The”, “barks”, “loudly”) and predicts the center word.

Game 2: Skip-gram

Challenge: Given one word, guess its neighbors!

Given: "dog"
Predict: "The", "barks", "loudly"

This is like playing 20 questions backwards!
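
In practice, the two games are one switch apart. Here’s a minimal sketch using the gensim library (assumed installed); the tiny corpus and the settings are made up purely for illustration:

from gensim.models import Word2Vec

# A toy corpus: each sentence is a list of lowercase tokens
sentences = [
    ["the", "dog", "barks", "loudly"],
    ["the", "cat", "meows", "softly"],
    ["i", "love", "my", "dog"],
    ["i", "love", "my", "cat"],
]

# sg=0 plays Game 1 (CBOW): predict the center word from its neighbors
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0, epochs=100)

# sg=1 plays Game 2 (Skip-gram): predict the neighbors from the center word
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=100)

# Words that show up in the same slots drift toward similar vectors
print(skipgram.wv.similarity("cat", "dog"))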

How It Learns

graph TD A["Read millions of sentences"] --> B["Play prediction game"] B --> C["Make mistakes"] C --> D["Adjust word numbers"] D --> B D --> E["Words with similar context get similar numbers!"]

The Magic Result

After reading billions of words, Word2Vec discovers amazing patterns:

king - man + woman = queen
paris - france + italy = rome

It learned these relationships without anyone teaching them to it directly!
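
You can try these analogies yourself. The rough sketch below assumes the gensim library is installed and uses its downloader to pull a small set of pre-trained vectors on first use; any good pre-trained vectors show the same pattern:

import gensim.downloader as api

# Small pre-trained word vectors; downloaded automatically the first time
vectors = api.load("glove-wiki-gigaword-50")

# king - man + woman ≈ ?
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

# paris - france + italy ≈ ?
print(vectors.most_similar(positive=["paris", "italy"], negative=["france"], topn=1))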

Simple Example

Imagine Word2Vec reads:

  • “I love my cat”
  • “I love my dog”
  • “I love my hamster”

It notices “cat”, “dog”, and “hamster” appear in the same position. They must be similar!


🌍 GloVe: The Big Picture Approach

A Different Strategy

GloVe (Global Vectors) was created at Stanford in 2014. It takes a different approach than Word2Vec.

The Co-occurrence Matrix

GloVe first builds a giant table. It counts how often words appear together across ALL texts.

Example Matrix:

Word   | the | ice | steam | water
solid  |  1  |  8  |   0   |   2
gas    |  0  |  0  |   7   |   1
liquid |  1  |  0  |   1   |   9

Notice: “ice” appears with “solid” a lot. “steam” appears with “gas” a lot.
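
Here’s a rough sketch of that counting step in plain Python. The two-sentence corpus and the window size are made up purely to illustrate:

from collections import Counter

def cooccurrence_counts(sentences, window=5):
    """Count how often each pair of words appears within `window` words of each other."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i, center in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(center, words[j])] += 1
    return counts

corpus = [
    "ice is a cold solid form of water",
    "steam is a hot gas made from water",
]
counts = cooccurrence_counts(corpus)
print(counts[("ice", "solid")], counts[("steam", "gas")], counts[("ice", "gas")])  # 1 1 0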

The Ratio Trick

GloVe looks at ratios. If you want to understand “ice” vs “steam”:

P(solid | ice) / P(solid | steam) = HIGH
P(gas | ice) / P(gas | steam) = LOW

This ratio tells us: ice is solid, steam is gas!
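
Using a few counts read off the example matrix above, the ratio check looks roughly like this (the tiny eps constant is only there to avoid dividing by zero):

# Counts from the matrix: how often each word appears near "ice" and near "steam"
near_ice   = {"solid": 8, "gas": 0, "liquid": 0}
near_steam = {"solid": 0, "gas": 7, "liquid": 1}

def prob(word, counts):
    """Estimate P(word | context) from the co-occurrence counts."""
    return counts[word] / sum(counts.values())

eps = 1e-3  # avoids division by exactly zero
ratio_solid = (prob("solid", near_ice) + eps) / (prob("solid", near_steam) + eps)
ratio_gas   = (prob("gas", near_ice) + eps) / (prob("gas", near_steam) + eps)

print(ratio_solid)  # large ratio -> "solid" goes with ice
print(ratio_gas)    # tiny ratio  -> "gas" goes with steam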

Why GloVe Works

graph TD A["Count all word pairs in text"] --> B["Build co-occurrence matrix"] B --> C["Find patterns in ratios"] C --> D["Create word vectors"] D --> E["Similar meaning = Similar vectors"]

Word2Vec vs GloVe

Feature     | Word2Vec                     | GloVe
Learns from | Local context (nearby words) | Global statistics (all text)
Method      | Prediction game              | Matrix math
Cost        | Faster to train              | Needs more memory (for the matrix)
Result      | Both create great word vectors!

Simple Example

Imagine you’re reading 1000 books about cooking.

GloVe notices:

  • “chef” appears near “kitchen” 500 times
  • “chef” appears near “restaurant” 450 times
  • “chef” appears near “airplane” only 2 times

So GloVe places “chef” close to “kitchen” and “restaurant” in meaning space!


🔌 Embedding Layers: The Neural Network Secret

From Pre-trained to Custom

Word2Vec and GloVe give us pre-made word vectors. But what if we want our own?

Embedding Layers are special layers in neural networks that learn word representations during training!

How It Works

Think of an Embedding Layer as a giant lookup table:

Word → Number (ID) → Vector
"cat" → 42 → [0.2, 0.8, 0.1, ...]
"dog" → 17 → [0.3, 0.7, 0.2, ...]

The Learning Process

graph TD A["Start: Random vectors"] --> B["Train on your task"] B --> C["Adjust vectors based on errors"] C --> D["Vectors improve!"] D --> B D --> E["Final: Meaningful vectors"]

Why Use Embedding Layers?

  1. Custom Fit: They learn the best vectors for YOUR specific task
  2. End-to-End: They train alongside your whole model
  3. Flexible: You control the vector size

Simple Example

Building a movie review classifier:

Step 1: Assign each word a random vector

"amazing" → [0.1, 0.5, 0.2] (random)
"terrible" → [0.4, 0.3, 0.8] (random)

Step 2: Train on reviews

  • “This movie was amazing!” → Positive ✓
  • “Terrible waste of time” → Negative ✓

Step 3: Vectors update!

"amazing" → [0.9, 0.8, 0.1] (learned: positive!)
"terrible" → [0.1, 0.2, 0.9] (learned: negative!)

Pre-trained vs Custom Embeddings

Approach                      | When to Use
Pre-trained (Word2Vec, GloVe) | Limited data, general language
Custom Embedding Layer        | Lots of data, specific domain
Both (Transfer Learning)      | Start pre-trained, fine-tune!
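
The last row of that table is a one-liner in PyTorch: load a pre-trained matrix into the embedding layer and choose whether it keeps learning. A hedged sketch (the random matrix below is just a stand-in for real GloVe or Word2Vec vectors aligned with your vocabulary):

import torch
import torch.nn as nn

# Stand-in for a (vocab_size, dim) matrix of pre-trained GloVe/Word2Vec vectors
pretrained_matrix = torch.randn(10_000, 300)

# freeze=False -> fine-tune the vectors on your task (transfer learning)
# freeze=True  -> keep them fixed and treat them as general-purpose features
embedding = nn.Embedding.from_pretrained(pretrained_matrix, freeze=False)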

🎯 Putting It All Together

The Journey of a Word

graph TD A["Raw Word: dog"] --> B{Choose Method} B --> C["Word2Vec: Prediction Game"] B --> D["GloVe: Count Patterns"] B --> E["Embedding Layer: Learn During Training"] C --> F["Vector: 0.3, 0.7, 0.2, 0.8"] D --> F E --> F F --> G["Computer Understands Meaning!"]

Quick Summary

Concept          | One-Line Explanation
Word Embeddings  | Numbers that capture word meaning
Word2Vec         | Learns by predicting words from context
GloVe            | Learns from global word co-occurrence patterns
Embedding Layers | Learn custom word vectors during training

🚀 Why This Changes Everything

Before word embeddings:

  • Computers saw “dog” and “puppy” as completely unrelated
  • Search engines matched exact words only
  • Translations were robotic and wrong

After word embeddings:

  • Google understands synonyms
  • Alexa knows what you mean (not just what you say)
  • Chatbots hold real conversations

You’ve just learned how computers began to truly understand language!


🧪 Try It Yourself (Thought Experiments)

  1. The Analogy Game: If king - man + woman = queen, what might doctor - man + woman equal?

  2. The Similarity Test: Which words should have similar embeddings: “run”, “jog”, “sprint”, “book”?

  3. The Context Game: In these sentences, predict the missing word:

    • “I poured hot ___ into my cup” (coffee? tea? water?)
    • “The ___ flew through the sky” (bird? plane? ball?)

These are exactly the games Word2Vec plays millions of times!


🎉 Congratulations! You now understand how machines learn to read meaning, not just letters. Welcome to the foundation of modern NLP!
