Databases — NoSQL Overview: CAP Theorem, Eventual Consistency, and When to Use What
Explore NoSQL database types (key-value, document, column-family, graph), the CAP theorem, consistency models, and practical trade-offs vs SQL.
Why NoSQL?
Relational databases have powered applications for decades. But at scale, they hit real walls:
- Schema rigidity: Schema changes go through ALTER TABLE, which on very large tables can lock or slow production traffic for hours. Companies like Meta (Facebook) learned this the hard way when running MySQL at billions of rows.
- Horizontal scaling is hard: SQL databases are designed to scale vertically (bigger machine). Sharding a relational database requires significant application-level logic.
- One-size-fits-all data model: Not all data fits neatly into rows and columns. Social graphs, event streams, and hierarchical configs need different abstractions.
NoSQL emerged as a response to these challenges. Companies like Netflix, Uber, and Discord adopted NoSQL databases to handle massive scale, flexible schemas, and specific access patterns.
Definition: "NoSQL" originally meant "non-SQL" or "not only SQL." These databases relax some ACID guarantees in exchange for flexibility, scalability, or performance.
Four Types of NoSQL Databases
NoSQL isn't one thing. It's a category of databases that trade off the relational model for different strengths.
1. Key-Value Stores
The simplest NoSQL model. Every item is stored as a key and its associated value — like a giant hash map.
| Database | Key Example | Value Example |
|---|---|---|
| Redis | "session:abc123" | {"user_id": 42, "role": "admin"} |
| DynamoDB | "USER#42" | JSON document with user attributes |
| Riak | "blog:post:99" | HTML or JSON content |
When to use:
- Caching (Redis is the gold standard — see Discord's caching strategy)
- Session storage
- Rate limiting counters
- Leaderboards
Trade-offs:
- Blazingly fast O(1) reads/writes
- No complex queries — you can only look up by key
- No relationships between items
Real-world example: Twitter (now X) uses Redis for timelines and caching. Discord uses Redis for presence and rate limiting.
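The rate-limiting counter use case listed above can be sketched with nothing more than key lookups and per-key expiry. This is a minimal illustration, not a real Redis client: `KeyValueStore`, `allow_request`, the `ratelimit:` key prefix, and the default limit and window are all hypothetical names and values chosen for the example.

```python
import time


class KeyValueStore:
    """Toy in-memory key-value store with per-key expiry (not a real Redis client)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at)
        self._clock = clock  # injectable so the example can be driven deterministically

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired keys on read
            return None
        return value

    def incr(self, key, ttl_seconds):
        """Increment a counter; the expiry is set only when the key is first created."""
        current = self.get(key)
        if current is None:
            self._data[key] = (1, self._clock() + ttl_seconds)
            return 1
        _, expires_at = self._data[key]
        self._data[key] = (current + 1, expires_at)
        return current + 1


def allow_request(store, user_id, limit=3, window_seconds=60):
    """Fixed-window rate limit: at most `limit` requests per window per user."""
    return store.incr(f"ratelimit:{user_id}", window_seconds) <= limit


store = KeyValueStore()
print([allow_request(store, 42) for _ in range(4)])  # [True, True, True, False]
```

Every operation touches exactly one key, which is why this access pattern suits a key-value store. Anything that needs to look across keys (for example "which users are currently throttled?") would have to be tracked separately.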
2. Document Stores
Documents are stored as JSON, BSON, or similar formats. Unlike key-value stores, the database understands the structure inside each document and can query by fields within it.
| Database | Format | Notable Users |
|---|---|---|
| MongoDB | BSON | Forbes, Toyota, Intuit |
| Couchbase | JSON | Walmart, Cisco, United Airlines |
| RethinkDB | JSON | - |
| Firebase/Firestore | JSON | Countless mobile apps |
When to use:
- Content management systems
- User profiles with varying attributes
- Rapid prototyping (schema-less by default)
- When your application naturally works with JSON
Trade-offs:
- Flexible schema — each document can have different fields
- Rich query support (unlike key-value stores)
- Can be harder to maintain data consistency across documents
Deep dive: MongoDB's schema design guide is an excellent resource. The key insight: embed related data when you read it together, reference it when it's accessed independently.
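The embed-versus-reference guideline can be illustrated with plain Python dictionaries standing in for documents. This is a sketch under assumptions: the `users` and `orders` collections, their fields, and the `find` helper (with `__` as a path separator) are invented for the example and are not MongoDB's API.

```python
# Embedded: the address is always read together with the user, so it lives inside the document.
users = [
    {"_id": 1, "name": "Alice", "role": "admin",
     "address": {"city": "Berlin", "country": "DE"}},
    {"_id": 2, "name": "Bob", "role": "member", "nickname": "bobby"},  # different fields are fine
]

# Referenced: orders are accessed independently, so they live in their own
# collection and point back to the user by id.
orders = [
    {"_id": 100, "user_id": 1, "total": 25.0},
    {"_id": 101, "user_id": 1, "total": 40.0},
    {"_id": 102, "user_id": 2, "total": 5.0},
]


def find(collection, **criteria):
    """Return documents whose fields equal the criteria; nested fields use '__'."""
    def field(doc, path):
        for part in path.split("__"):
            if not isinstance(doc, dict) or part not in doc:
                return None
            doc = doc[part]
        return doc
    return [d for d in collection if all(field(d, k) == v for k, v in criteria.items())]


print(find(users, address__city="Berlin"))  # one read returns Alice together with her address
print(find(orders, user_id=1))              # a second lookup follows the reference to her orders
```

The embedded address costs nothing extra to read, while the referenced orders need a second lookup but can grow and change without rewriting the user document.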
3. Column-Family / Wide-Column Stores
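Wide-column stores still identify each record by a row key, but a row is not bound to a fixed set of columns. Each row can have a different (and potentially very large) set of columns, and columns are grouped into "column families." Despite the name, this is not the same thing as a column-oriented analytics database.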
Row Key: "user:42"
┌─────────────────────┬──────────────────────────┐
│ Column Family │ Columns │
├─────────────────────┼──────────────────────────┤
│ profile │ name=Alice │
│ │ email=alice@example.com │
│ │ age=30 │
├─────────────────────┼──────────────────────────┤
│ activity │ login:ts=1700000000 │
│ │ login:ip=192.168.1.1 │
│ │ purchase:ts=1700000100 │
│ │ purchase:item=book │
└─────────────────────┴──────────────────────────┘
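The layout in the diagram can be modeled as nested maps: row key, then column family, then column name. The sketch below is a toy in-memory model for illustration only. `WideColumnTable` and its methods are hypothetical names, not the API of Cassandra, HBase, or Bigtable.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy model: row key -> column family -> {column name: value}."""

    def __init__(self, families):
        self.families = set(families)  # families are declared up front
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[row_key][family][column] = value  # columns are created on first write

    def get_family(self, row_key, family):
        """Read all columns of one family for one row (a single-key lookup)."""
        if row_key not in self.rows:
            return {}
        return dict(self.rows[row_key].get(family, {}))


table = WideColumnTable(families=["profile", "activity"])
table.put("user:42", "profile", "name", "Alice")
table.put("user:42", "profile", "email", "alice@example.com")
table.put("user:42", "activity", "login:ts", 1700000000)
table.put("user:43", "profile", "name", "Bob")  # a sparser row: no email, no activity

print(table.get_family("user:42", "profile"))   # {'name': 'Alice', 'email': 'alice@example.com'}
print(table.get_family("user:43", "activity"))  # {}
```

Rows `user:42` and `user:43` hold different columns without any schema change, and every read starts from a row key.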
| Database | Notable Users |
|---|---|
| Apache Cassandra | Netflix, Apple, Instagram, Uber |
| Apache HBase | Facebook (Messages), Salesforce |
| Google Bigtable | Google Analytics, Gmail |
| Amazon Keyspaces | AWS-native Cassandra-compatible |
When to use:
- Massive datasets (petabytes)
- Write-heavy workloads (Cassandra handles millions of writes/sec)
- Time-series data (IoT, monitoring, logging)
- When you need to scale across multiple data centers
Case study: Netflix's use of Cassandra is legendary — they run one of the largest Cassandra deployments in the world, handling trillions of API calls per day. The Apple migration story is equally impressive.
Trade-offs:
- Excellent write performance (log-structured merge trees)
- Naturally distributed across nodes
- Queries are limited — you typically query by partition key
- Eventual consistency by default (more on this below)
4. Graph Databases
Graph databases store data as nodes (entities), edges (relationships), and properties (attributes on both). They're optimized for traversing relationships.
| Database | Query Language | Notable Users |
|---|---|---|
| Neo4j | Cypher | Walmart, eBay, UBS |
| Amazon Neptune | Gremlin, SPARQL | AWS customers |
| JanusGraph | Gremlin | Google, Grakn |
When to use:
- Social networks (friend recommendations)
- Fraud detection (finding circular transactions)
- Recommendation engines ("people who bought X also bought Y")
- Knowledge graphs and semantic data
Case study: eBay's fraud detection with Neo4j finds fraudulent seller networks by traversing relationships that would require dozens of JOINs in SQL. LinkedIn's Graph API is another canonical example.
Trade-offs:
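- Following a relationship from a node takes roughly constant time per hop, because nodes point directly at their neighbors. Deep traversals therefore cost in proportion to the part of the graph visited, which is much faster than the equivalent chain of SQL JOINs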
- Not great for bulk analytics (aggregate over millions of rows)
- Specialized — use when relationships are the primary query pattern
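The friend-recommendation use case comes down to a bounded traversal. A minimal sketch, assuming a small in-memory adjacency list (the `friends` data and both function names are made up for illustration; a real graph database would express this in a query language such as Cypher or Gremlin):

```python
from collections import deque

# Adjacency list: each node maps directly to its neighbors.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}


def within_hops(graph, start, max_hops):
    """Breadth-first search returning {node: hop distance} for nodes up to max_hops away."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


def friend_suggestions(graph, person):
    """Friends-of-friends who are not already direct friends."""
    reach = within_hops(graph, person, 2)
    return sorted(name for name, hops in reach.items() if hops == 2)


print(friend_suggestions(friends, "alice"))  # ['dave', 'erin']
```

The work done depends on how many nodes lie within two hops of the start, not on the total number of people stored.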
The CAP Theorem
The CAP Theorem is the most important concept to understand when choosing a distributed database. It states that a distributed data store can guarantee at most two of the following three properties at the same time.
| Property | Definition |
|---|---|
| Consistency (C) | Every read receives the most recent write or an error |
| Availability (A) | Every request receives a (non-error) response, without guarantee it's the most recent |
| Partition Tolerance (P) | The system continues to operate despite network partitions (messages lost between nodes) |
Critical insight: In a distributed system, partitions will happen (network outages, node failures). So P is not a choice — it's a given. The real decision is: when a partition occurs, do you sacrifice consistency (AP) or availability (CP)?
CAP in Practice
CP Systems (Consistency + Partition Tolerance):
- When a partition occurs, nodes that cannot reach a majority of the cluster stop accepting writes
- Clients talking to that side of the partition get errors or timeouts instead of possibly stale or conflicting data
- Example: MongoDB with replica sets — if the primary goes down, writes pause until a new primary is elected
AP Systems (Availability + Partition Tolerance):
- When a partition occurs, the system accepts writes on all nodes
- Different nodes may have different data temporarily
- Example: Cassandra, where a write at a low consistency level succeeds as long as one replica is reachable, and the remaining replicas catch up in the background
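The difference between the two behaviors can be shown with a toy replica that is told how many peers it can still reach. This is a simplified sketch, not how MongoDB or Cassandra are implemented: the `Replica` class, its `mode` flag, and the `reachable_peers` counter are assumptions made for illustration.

```python
class Replica:
    """Toy replica that either rejects (CP) or accepts (AP) writes during a partition."""

    def __init__(self, name, cluster_size, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.name = name
        self.cluster_size = cluster_size
        self.mode = mode
        self.reachable_peers = cluster_size - 1  # healthy network: all peers reachable
        self.data = {}

    def has_majority(self):
        # Count this node plus the peers it can still reach.
        return (1 + self.reachable_peers) > self.cluster_size // 2

    def write(self, key, value):
        if self.mode == "CP" and not self.has_majority():
            raise RuntimeError(f"{self.name}: no majority, write rejected")
        self.data[key] = value  # AP: accept locally and reconcile after the partition heals
        return "ok"


cp = Replica("cp-node", cluster_size=3, mode="CP")
ap = Replica("ap-node", cluster_size=3, mode="AP")
cp.reachable_peers = ap.reachable_peers = 0  # both nodes are cut off from their peers

print(ap.write("cart", ["book"]))  # ok (available, but other replicas do not see it yet)
try:
    cp.write("cart", ["book"])
except RuntimeError as err:
    print(err)                     # cp-node: no majority, write rejected
```

The AP node stays writable at the price of temporary divergence. The CP node refuses the write, so no two sides of the partition can ever disagree.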
Further reading: Eric Brewer's original CAP presentation (2000) and his 12-years-later reflection on InfoQ are essential reads. The takeaway: the trade-off is not a binary switch but a spectrum.
Consistency Models
"Eventual consistency" is a phrase you'll hear often. But it's just one point on a spectrum of consistency models.
| Model | Guarantee | Trade-off |
|---|---|---|
| Linearizable | Every read returns the most recent write | Highest latency, lowest availability |
| Sequential | Writes appear in the same order for all nodes | Still requires coordination |
| Causal | Causally related writes are seen in order | Concurrent writes may appear in different orders |
| Eventual | All reads will eventually return the same value | Reads may return stale data |
Strong Consistency
- Every read sees the latest write
- Example: PostgreSQL, MySQL — a single-node RDBMS is strongly consistent by default
- Distributed example: Google Spanner uses TrueTime (atomic clocks + GPS) to achieve strong consistency globally. Read the Spanner paper for the full story.
Eventual Consistency
- Writes propagate asynchronously
- Once writes to an item stop arriving, all replicas converge on the same value, typically within milliseconds to seconds
- Example: DynamoDB, Cassandra, Riak
- How Discord uses it: Discord's architecture relies on Cassandra's eventual consistency to handle billions of messages with high write throughput.
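Convergence needs a rule for reconciling replicas that accepted different writes. One common rule is last-write-wins, sketched below under the assumption that every value carries a timestamp. The `lww_merge` function and the sample data are hypothetical, and real systems differ in how they assign timestamps and break ties.

```python
def lww_merge(a, b):
    """Merge two replica states {key: (timestamp, value)}; the newer timestamp wins.

    Ties on the timestamp are broken by comparing repr(value), so the result
    does not depend on the order in which replicas are merged.
    """
    merged = dict(a)
    for key, (ts, value) in b.items():
        if key not in merged:
            merged[key] = (ts, value)
            continue
        current_ts, current_value = merged[key]
        if (ts, repr(value)) > (current_ts, repr(current_value)):
            merged[key] = (ts, value)
    return merged


replica_1 = {"status": (10, "online")}
replica_2 = {"status": (12, "away"), "nick": (5, "al")}

# Each replica accepted different writes; exchanging state in either order converges.
assert lww_merge(replica_1, replica_2) == lww_merge(replica_2, replica_1)
print(lww_merge(replica_1, replica_2))  # {'status': (12, 'away'), 'nick': (5, 'al')}
```

Last-write-wins is simple, but it silently discards the older of two concurrent writes. That is one reason eventual consistency suits feeds and presence better than account balances.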
Tunable Consistency
Some databases let you choose the consistency level per query:
Cassandra example:
- ONE: Only one replica must acknowledge (fast, least consistent)
- QUORUM: A majority of replicas must acknowledge (balanced)
- ALL: All replicas must acknowledge (slow, most consistent)
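The effect of these levels follows from simple arithmetic. With N replicas, a write acknowledged by W of them and a read that consults R of them are guaranteed to overlap on at least one replica when R + W > N. The helper names below are illustrative and not part of any driver.

```python
def quorum(replication_factor):
    """Smallest majority of replicas, e.g. 2 of 3 or 3 of 5."""
    return replication_factor // 2 + 1


def reads_see_latest_write(n, w, r):
    """True when every read set overlaps every write set (R + W > N)."""
    return r + w > n


n = 3
print(quorum(n))                                         # 2
print(reads_see_latest_write(n, w=1, r=1))               # False: ONE / ONE may read stale data
print(reads_see_latest_write(n, w=quorum(n), r=quorum(n)))  # True: QUORUM / QUORUM overlap
print(reads_see_latest_write(n, w=n, r=1))               # True: ALL writes, ONE reads
```

Raising W or R buys stronger reads at the cost of latency, and of availability whenever too few replicas are reachable.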
Interview tip: When asked "what consistency model would you use?", the answer is almost never "strong everywhere." Think about which data needs strong consistency (financial transactions, user authentication) and which can be eventually consistent (social media feeds, product recommendations).
When to Use NoSQL vs SQL
Summary Comparison
| Dimension | SQL | NoSQL |
|---|---|---|
| Schema | Fixed, predefined | Flexible, dynamic |
| Scaling | Vertical (mostly) | Horizontal (native) |
| Transactions | ACID, multi-row | Limited (single-document in most) |
| Query language | Standard SQL | Varies (API, Cypher, CQL) |
| Best for | Structured data, complex queries, ACID | Unstructured data, massive scale, flexible schema |
| Examples | PostgreSQL, MySQL, Oracle | MongoDB, Cassandra, Redis, Neo4j |
Real-World Patterns
Most production systems use polyglot persistence — multiple database types for different workloads:
- Uber: PostgreSQL for core data + Cassandra for trip history + Redis for real-time matching
- Netflix: PostgreSQL for billing/accounts + Cassandra for viewing history and recommendations + DynamoDB for metadata
- Discord: PostgreSQL for core data + Cassandra for messages + Redis for presence and caching
The golden rule: Use SQL unless you have a specific reason not to. NoSQL solves real problems, but "I might need to scale" is not a good reason to start with NoSQL. Start with PostgreSQL, and add specialized databases when your access patterns demand it.
What to Remember for Interviews
- Four NoSQL types — Key-value (Redis), Document (MongoDB), Wide-column (Cassandra), Graph (Neo4j)
- CAP Theorem — In distributed systems, P is given; choose CP or AP
- Consistency spectrum — Strong > Sequential > Causal > Eventual
- Tunable consistency — Cassandra and DynamoDB let you pick per-query
- Polyglot persistence — Real systems use multiple databases for different needs
- Default choice — SQL unless you have a specific reason for NoSQL