NoSQL Replication: Read & Write Configuration
The Library Story
Imagine you run a magical library with multiple branches across town. You have ONE main library (the Primary) where all new books arrive first. Then you have several branch libraries (the Replicas) that get copies of those books.
But here's the tricky part:
- Who can check out books? (Read)
- Who can add new books? (Write)
- How do we make sure everyone has the same books? (Consistency)
This is exactly what Read & Write Configuration solves in NoSQL databases!
Read Replicas
What Are They?
Read Replicas are copies of your database that can only answer questions. They can't make changes.
Think of them as photocopy machines at each branch library. They have copies of all the books, so people can read them. But if someone wants to donate a new book? They must go to the main library.
Why Use Them?
┌─────────────────┐
│  MAIN DATABASE  │  ← All writes go here
│    (Primary)    │
└────────┬────────┘
         │ copies data to...
    ┌────┴────┬─────────┐
    ▼         ▼         ▼
┌───────┐ ┌───────┐ ┌───────┐
│Replica│ │Replica│ │Replica│
│   1   │ │   2   │ │   3   │
└───────┘ └───────┘ └───────┘
    ▲         ▲         ▲
    └────┬────┴─────────┘
         │
   Many users can
   READ from here!
Simple Example:
- Your app has 10,000 users reading data
- Only 100 users writing data
- Without replicas: Main database handles everything = SLOW!
- With 3 replicas: Reads spread across 4 servers = FAST!
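The arithmetic behind that example can be sketched in a few lines. This is only an illustration: `readsPerServer` is a made-up helper, and it assumes reads divide evenly across the primary and its replicas, which real traffic rarely does. Writes are left out because they all still go to the primary.

```javascript
// Hypothetical helper: evenly spreads read traffic across the primary
// plus its replicas. Real deployments rarely balance this perfectly.
function readsPerServer(totalReads, replicaCount) {
  const serverCount = 1 + replicaCount; // primary + replicas
  return Math.ceil(totalReads / serverCount);
}

console.log(readsPerServer(10000, 0)); // 10000: the primary serves every read
console.log(readsPerServer(10000, 3)); // 2500: each of 4 servers takes a share
```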
Read Preferences
What Is It?
Read Preference tells the database: "Where should I look for my answer?"
It's like asking: "Which library branch should I visit to find my book?"
The Five Choices
| Preference | Where to Read | Real-World Analogy |
|---|---|---|
| primary | Only main library | "I need the LATEST edition!" |
| primaryPreferred | Main first, branch if main is unavailable | "Latest if available" |
| secondary | Only branch libraries | "Any copy is fine" |
| secondaryPreferred | Branch first, main if no branch is available | "Branch is closer to me" |
| nearest | Closest library (lowest network latency), main or branch | "I'm in a hurry!" |
Quick Example
// Always read from primary (freshest data)
db.users.find().readPref("primary")
// Read from nearest server (fastest)
db.users.find().readPref("nearest")
When to Use What?
- Showing bank balance? → primary (must be accurate!)
- Showing product reviews? → secondary (slightly old is OK)
- Global users? → nearest (speed matters most)
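To see how the five modes differ, here is a small simulation in plain JavaScript. It is a simplified model, not how any driver is actually implemented: `selectServer`, the server names, and the latencies are all invented for illustration. Real drivers also consider tag sets, staleness limits, and a latency window.

```javascript
// Simplified model of read-preference routing. Real drivers also apply
// tag sets, staleness limits, and a latency window; this sketch does not.
function selectServer(servers, mode) {
  const up = servers.filter((s) => s.up);
  const primary = up.filter((s) => s.role === "primary");
  const secondaries = up.filter((s) => s.role === "secondary");
  // Assumption: among eligible servers, pick the one with the lowest latency.
  const fastest = (list) =>
    list.length === 0
      ? null
      : list.reduce((a, b) => (b.latencyMs < a.latencyMs ? b : a));

  switch (mode) {
    case "primary":
      return fastest(primary);
    case "primaryPreferred":
      return fastest(primary) ?? fastest(secondaries);
    case "secondary":
      return fastest(secondaries);
    case "secondaryPreferred":
      return fastest(secondaries) ?? fastest(primary);
    case "nearest":
      return fastest(up);
    default:
      throw new Error(`Unknown read preference: ${mode}`);
  }
}

// Made-up topology: one main library and two branches.
const servers = [
  { name: "main", role: "primary", latencyMs: 40, up: true },
  { name: "branch-a", role: "secondary", latencyMs: 5, up: true },
  { name: "branch-b", role: "secondary", latencyMs: 90, up: true },
];

console.log(selectServer(servers, "primary").name);   // main
console.log(selectServer(servers, "secondary").name); // branch-a
console.log(selectServer(servers, "nearest").name);   // branch-a
```

If the primary goes down in this model, "primary" finds no server at all, while "primaryPreferred" falls back to a branch.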
Write Concerns
What Is It?
Write Concern answers: "How many copies must confirm they saved my data before I can relax?"
Imagine mailing an important letter. Do you:
- Just drop it in a mailbox? (No confirmation)
- Wait for "delivered" notification? (Some confirmation)
- Wait for the person to call you back? (Full confirmation)
The Levels
┌─────────────────────────────────────────┐
│          WRITE CONCERN LEVELS           │
├─────────────────────────────────────────┤
│                                         │
│  w: 0 → "Fire and forget"               │
│         Don't wait for anything         │
│         Fastest | Risky                 │
│                                         │
│  w: 1 → "Primary confirmed"             │
│         Main server saved it            │
│         Balanced                        │
│                                         │
│  w: 2 → "Primary + 1 replica"           │
│         Two copies exist                │
│         Safer                           │
│                                         │
│  w: "majority" → "Most servers agree"   │
│         Can survive failures            │
│         Safest | Slower                 │
│                                         │
└─────────────────────────────────────────┘
Code Example
// Fast but risky (logging, analytics)
db.logs.insertOne(
{ event: "click" },
{ writeConcern: { w: 0 } }
)
// Safe (important user data)
db.users.insertOne(
{ name: "Alice", balance: 100 },
{ writeConcern: { w: "majority" } }
)
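One way to picture what each `w` value asks for is to count acknowledgements. The sketch below is a toy model in plain JavaScript. `requiredAcks` and `isAcknowledged` are hypothetical helpers, and the model assumes every member counts toward the majority, which glosses over details such as arbiters.

```javascript
// Hypothetical helper: how many members must acknowledge a write
// for a given `w` value in a replica set with `votingMembers` members.
function requiredAcks(w, votingMembers) {
  if (w === "majority") return Math.floor(votingMembers / 2) + 1;
  if (Number.isInteger(w) && w >= 0) return w;
  throw new Error(`Unsupported write concern: ${w}`);
}

// From the client's point of view, a write is "done" once enough acks arrive.
function isAcknowledged(w, acksReceived, votingMembers) {
  return acksReceived >= requiredAcks(w, votingMembers);
}

console.log(requiredAcks(0, 3));               // 0: fire and forget
console.log(requiredAcks(1, 3));               // 1: primary only
console.log(requiredAcks("majority", 5));      // 3
console.log(isAcknowledged("majority", 2, 5)); // false: still waiting
```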
Read Concerns
What Is It?
Read Concern answers: "How fresh must my data be?"
Back to our library: When you ask for a book list, do you want:
- Whatever the librarian remembers? (Fast but maybe outdated)
- The official catalog? (Accurate but slower)
The Levels
| Level | Meaning | Use When |
|---|---|---|
| local | "Give me what you have" | Speed > Accuracy |
| available | "Whatever is fastest" | Very fast reads |
| majority | "Most servers agree" | Need reliable data |
| linearizable | "The absolute truth" | Critical operations |
Simple Analogy
"local" → Ask the nearest employee
"majority" → Check with most employees
"linearizable" → Check the official record
Code Example
// Quick check (may be slightly stale)
db.products.find().readConcern("local")
// Reliable read (confirmed by most servers)
db.orders.find().readConcern("majority")
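The difference between "local" and "majority" is easier to see with numbers. In this toy model each member is represented by the latest operation it has applied. The function names and values are invented for illustration, and the model ignores everything else a real server tracks.

```javascript
// Each number is the latest operation a member has applied (higher = newer).
// "local": return whatever the member we ask has right now.
function localView(appliedOps, memberIndex) {
  return appliedOps[memberIndex];
}

// "majority": return only data that a majority of members have applied,
// and never anything newer than what the member we ask has itself.
function majorityView(appliedOps, memberIndex) {
  const needed = Math.floor(appliedOps.length / 2) + 1;
  const newestFirst = [...appliedOps].sort((a, b) => b - a);
  const commitPoint = newestFirst[needed - 1];
  return Math.min(commitPoint, appliedOps[memberIndex]);
}

const appliedOps = [10, 9, 7, 7, 5]; // hypothetical 5-member set

console.log(localView(appliedOps, 0));    // 10: newest, but only one member has it
console.log(majorityView(appliedOps, 0)); // 7: at least 3 of 5 members have it
console.log(majorityView(appliedOps, 4)); // 5: a lagging member can still be behind
```

The last line is worth noticing: a "majority" read gives you durable data, not necessarily the newest data.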
Quorum
What Is It?
Quorum means "the minimum votes needed to make a decision."
Think of it like a class vote:
- 5 students in class
- Need more than half to agree = at least 3 votes
- That's your quorum!
The Magic Formula
┌─────────────────────────────────────────┐
│                                         │
│  Quorum = floor(Total Servers / 2) + 1  │
│                                         │
│  3 servers → Quorum = 2                 │
│  5 servers → Quorum = 3                 │
│  7 servers → Quorum = 4                 │
│                                         │
└─────────────────────────────────────────┘
Why It Matters
Scenario: 5 servers, 2 go offline
Without Quorum:
- Remaining 3 might disagree
- Data becomes inconsistent
With Quorum (needs 3):
- 3 remaining can still agree
- System keeps working safely
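The quorum rule and the 5-server scenario fit in a few lines of JavaScript. The helper names `quorum` and `hasQuorum` are made up for this sketch. Note the rounding down, which is what makes 3 servers need 2 votes rather than 2.5.

```javascript
// Minimum number of votes that is more than half of the servers.
function quorum(totalServers) {
  return Math.floor(totalServers / 2) + 1;
}

// Can the servers that are still online reach a decision?
function hasQuorum(totalServers, onlineServers) {
  return onlineServers >= quorum(totalServers);
}

console.log(quorum(3), quorum(5), quorum(7)); // 2 3 4
console.log(hasQuorum(5, 3)); // true: 2 offline, 3 can still agree
console.log(hasQuorum(5, 2)); // false: 3 offline, no safe decisions
```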
Real Example
// Write must be confirmed by majority
db.payments.insertOne(
{ amount: 500 },
{ writeConcern: { w: "majority" } }
)
// With 5 servers: waits for 3 to confirm
Replication Lag
What Is It?
Replication Lag is the time delay between:
- Data written to the primary
- Data appearing on replicas
It's like when breaking news happens:
- News anchor reports it FIRST
- Then it spreads to social media
- Some people hear it 5 minutes later
That delay? That's lag.
Why It Happens
┌─────────────────────────────────────────┐
│                                         │
│  PRIMARY                                │
│  ┌─────┐                                │
│  │ NEW │ ← Data written at 10:00:00     │
│  │DATA │                                │
│  └──┬──┘                                │
│     │                                   │
│     │ Network travel time...            │
│     │                                   │
│  ┌──┴──┐                                │
│  │COPY │ ← Arrives at 10:00:02          │
│  │     │                                │
│  └─────┘                                │
│  REPLICA                                │
│                                         │
│  LAG = 2 seconds                        │
│                                         │
└─────────────────────────────────────────┘
Common Causes
- Network distance - Replica is far away
- Heavy load - Primary is very busy
- Large writes - Big data takes time to copy
- Slow replica - Replica hardware is weak
The Problem It Creates
User writes: "Balance = $100"
↓
PRIMARY has $100
User reads (from REPLICA):
↓
Still shows $50!
(hasn't received update yet)
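The stale read can be reproduced with a tiny timeline simulation. Everything here is hypothetical: `readBalance`, the timestamps, and the fixed 2-second lag are stand-ins chosen to mirror the story above, and real lag changes from moment to moment.

```javascript
// Hypothetical model: a server sees a write only after `lagMs` has passed.
// Assumes `writes` is sorted from oldest to newest.
function readBalance(writes, nowMs, lagMs) {
  // Keep only the writes that have had time to reach this server.
  const visible = writes.filter((w) => w.atMs + lagMs <= nowMs);
  return visible.length ? visible[visible.length - 1].balance : undefined;
}

const writes = [
  { atMs: 0, balance: 50 },
  { atMs: 10000, balance: 100 }, // the user's new write
];

const LAG_MS = 2000; // a 2-second replication lag

console.log(readBalance(writes, 10500, 0));      // 100: primary (no lag)
console.log(readBalance(writes, 10500, LAG_MS)); // 50: replica is stale
console.log(readBalance(writes, 12000, LAG_MS)); // 100: replica caught up
```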
How to Handle It
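// For critical reads: force primary
db.balance.find({ user: "alice" })
  .readPref("primary")
// Or read only majority-committed data. This avoids data that could be
// rolled back, but a lagging replica can still return an older value.
db.balance.find({ user: "alice" })
  .readConcern("majority")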
Putting It All Together
The Decision Flow
graph TD
  A["Need to READ data?"] --> B{How fresh?}
  B -->|Latest only| C["readPref: primary"]
  B -->|Recent enough| D["readPref: secondary"]
  B -->|Fastest| E["readPref: nearest"]
  F["Need to WRITE data?"] --> G{How safe?}
  G -->|Critical| H["w: majority"]
  G -->|Important| I["w: 2"]
  G -->|Fast OK| J["w: 1"]
Cheat Sheet Summary
| Situation | Read Preference | Write Concern | Read Concern |
|---|---|---|---|
| Bank balance | primary | majority | majority |
| Social feed | nearest | 1 | local |
| Order placement | primary | majority | linearizable |
| Analytics | secondary | 0 | local |
| Chat messages | nearest | 1 | local |
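If you want the cheat sheet in code, one possible shape is a plain lookup object. The values come straight from the table; the object layout, the key names, and the `settingsFor` helper are assumptions made for this sketch, not part of any driver API.

```javascript
// Values copied from the cheat sheet; the object shape is an assumption.
const CHEAT_SHEET = {
  "Bank balance": { readPreference: "primary", w: "majority", readConcern: "majority" },
  "Social feed": { readPreference: "nearest", w: 1, readConcern: "local" },
  "Order placement": { readPreference: "primary", w: "majority", readConcern: "linearizable" },
  "Analytics": { readPreference: "secondary", w: 0, readConcern: "local" },
  "Chat messages": { readPreference: "nearest", w: 1, readConcern: "local" },
};

// Hypothetical helper: look up the suggested settings for a situation.
function settingsFor(situation) {
  const settings = CHEAT_SHEET[situation];
  if (!settings) throw new Error(`No preset for: ${situation}`);
  return settings;
}

console.log(settingsFor("Bank balance"));
console.log(settingsFor("Analytics"));
```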
Key Takeaways
- Read Replicas = Extra copies for reading (reduces load)
- Read Preferences = Where to read from
- Write Concerns = How many must confirm a write
- Read Concerns = How reliable must the read be
- Quorum = Minimum votes for decisions
- Replication Lag = Delay in copying data
Remember our library analogy:
- Main library = Primary database
- Branch libraries = Replicas
- Choosing which branch = Read Preference
- Confirming book was cataloged = Write Concern
- Checking official catalog = Read Concern
- Voting on decisions = Quorum
- Time for new books to reach branches = Replication Lag
You've got this!
