Databases — NoSQL Overview: CAP Theorem, Eventual Consistency, and When to Use What

Explore NoSQL database types (key-value, document, column-family, graph), the CAP theorem, consistency models, and practical trade-offs vs SQL.

14 min read · NoSQL, CAP theorem, eventual consistency, MongoDB, Cassandra, Redis, Neo4j

Why NoSQL?

Relational databases have powered applications for decades. But at scale, they hit real walls:

  • Schema rigidity: Schema changes go through ALTER TABLE, which on very large tables can lock or slow down production for hours unless you use online schema-change tooling. Companies like Meta (Facebook) learned this the hard way when running MySQL at billions of rows.
  • Horizontal scaling is hard: Traditional SQL databases are designed to scale vertically (a bigger machine). Sharding a relational database usually requires significant application-level logic.
  • One-size-fits-all data model: Not all data fits neatly into rows and columns. Social graphs, event streams, and hierarchical configs need different abstractions.

NoSQL emerged as a response to these challenges. Companies like Netflix, Uber, and Discord adopted NoSQL databases to handle massive scale, flexible schemas, and specific access patterns.

💡

Definition: "NoSQL" originally meant "non-SQL" or "not only SQL." These databases relax some ACID guarantees in exchange for flexibility, scalability, or performance.


Four Types of NoSQL Databases

NoSQL isn't one thing. It's a category of databases that trade off the relational model for different strengths.

1. Key-Value Stores

The simplest NoSQL model. Every item is stored as a key and its associated value — like a giant hash map.

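| Database | Key Example      | Value Example                      |
|----------|------------------|------------------------------------|
| Redis    | "session:abc123" | {"user_id": 42, "role": "admin"}   |
| DynamoDB | "USER#42"        | JSON document with user attributes |
| Riak     | "blog:post:99"   | HTML or JSON content               |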

When to use:

  • Session storage (as in the Redis row above)
  • Caching
  • Presence tracking and rate limiting

Trade-offs:

  • Very fast reads and writes: a lookup by key is O(1) on average
  • No complex queries — you can only look up by key
  • No relationships between items

Real-world example: Twitter (now X) uses Redis for timelines and caching. Discord uses Redis for presence and rate limiting.
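To make the "giant hash map" idea concrete, here is a minimal in-memory sketch. The class name `TinyKeyValueStore` and its `put`/`get` methods are hypothetical and are not the API of Redis, DynamoDB, or Riak; the optional expiry only loosely mimics how a session cache might behave.

```python
import time
from typing import Any, Optional


class TinyKeyValueStore:
    """Toy in-memory key-value store: the key is the only access path."""

    def __init__(self) -> None:
        # key -> (value, expiry timestamp or None)
        self._data: dict[str, tuple[Any, Optional[float]]] = {}

    def put(self, key: str, value: Any, ttl_seconds: Optional[float] = None) -> None:
        expires_at = None
        if ttl_seconds is not None:
            expires_at = time.monotonic() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # lazily evict expired entries on read
            return None
        return value


store = TinyKeyValueStore()
store.put("session:abc123", {"user_id": 42, "role": "admin"}, ttl_seconds=1800)

session = store.get("session:abc123")  # average O(1) hash lookup by key
missing = store.get("session:nope")    # unknown key returns None
```

Note what is absent: there is no way to ask for "all sessions where role is admin" without scanning every key, which is exactly the trade-off listed above.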

2. Document Stores

Documents are stored as JSON, BSON, or similar formats. Unlike key-value stores, the database understands the structure inside each document and can query by fields within it.

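| Database           | Format | Notable Users                   |
|--------------------|--------|---------------------------------|
| MongoDB            | BSON   | Forbes, Toyota, Intuit          |
| Couchbase          | JSON   | Walmart, Cisco, United Airlines |
| RethinkDB          | JSON   | -                               |
| Firebase/Firestore | JSON   | Countless mobile apps           |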

When to use:

  • Content management systems
  • User profiles with varying attributes
  • Rapid prototyping (schema-less by default)
  • When your application naturally works with JSON

Trade-offs:

  • Flexible schema — each document can have different fields
  • Rich query support (unlike key-value stores)
  • Can be harder to maintain data consistency across documents
💡

Deep dive: MongoDB's schema design guide is an excellent resource. The key insight: embed related data when you read it together, reference it when it's accessed independently.
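The embed-versus-reference rule can be sketched with plain Python dictionaries standing in for documents. The collection names, field names, and the `find` helper below are hypothetical illustrations, not MongoDB's API: line items are embedded because they are read together with the order, while the customer is referenced because it is accessed independently.

```python
# Embedded: the order carries its line items, so one read returns everything.
orders = [
    {
        "_id": "order:1001",
        "customer_id": "user:42",  # reference to a separate customer document
        "items": [
            {"sku": "book-123", "qty": 1, "price_cents": 1250},
            {"sku": "pen-9", "qty": 3, "price_cents": 120},
        ],
    }
]

# Referenced: customers live in their own collection and change independently.
customers = {
    "user:42": {"_id": "user:42", "name": "Alice", "email": "alice@example.com"}
}


def find(collection: list, **criteria) -> list:
    """Field-level query: return documents whose fields match all criteria."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


def order_total_cents(order: dict) -> int:
    """Sum the embedded line items; no second lookup is needed."""
    return sum(item["qty"] * item["price_cents"] for item in order["items"])


alice_orders = find(orders, customer_id="user:42")     # query by a field inside the document
total = order_total_cents(alice_orders[0])              # uses embedded data only
customer = customers[alice_orders[0]["customer_id"]]    # following a reference costs a second read
```

The second lookup for `customer` is the cost of referencing; the benefit is that a change to Alice's email touches one document instead of every order.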

3. Column-Family / Wide-Column Stores

Wide-column stores organize data into rows identified by a row key, but unlike a relational table, each row can have a different set of columns, and columns are grouped into "column families." A useful mental model is a two-level map: row key first, then column name.

Row Key: "user:42"
┌─────────────────────┬──────────────────────────┐
│ Column Family       │ Columns                  │
├─────────────────────┼──────────────────────────┤
│ profile             │ name=Alice               │
│                     │ email=alice@example.com  │
│                     │ age=30                   │
├─────────────────────┼──────────────────────────┤
│ activity            │ login:ts=1700000000      │
│                     │ login:ip=192.168.1.1     │
│                     │ purchase:ts=1700000100   │
│                     │ purchase:item=book       │
└─────────────────────┴──────────────────────────┘

| Database         | Notable Users                             |
|------------------|-------------------------------------------|
| Apache Cassandra | Netflix, Apple, Instagram, Uber           |
| Apache HBase     | Facebook (Messages), Salesforce           |
| Google Bigtable  | Google Analytics, Gmail                   |
| Amazon Keyspaces | (AWS-native, Cassandra-compatible service) |

When to use:

  • Massive datasets (petabytes)
  • Write-heavy workloads (Cassandra handles millions of writes/sec)
  • Time-series data (IoT, monitoring, logging)
  • When you need to scale across multiple data centers

Case study: Netflix's use of Cassandra is legendary. They run one of the largest Cassandra deployments in the world, reportedly handling trillions of requests per day. Apple's adoption of Cassandra at very large scale is an equally impressive story.

Trade-offs:

  • Excellent write performance (log-structured merge trees)
  • Naturally distributed across nodes
  • Queries are limited — you typically query by partition key
  • Eventual consistency by default (more on this below)
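The "query by partition key" limitation is easier to see in a toy model. The sketch below is an assumption-laden simplification: `TinyWideColumnTable` and `node_for` are hypothetical names, the storage is an in-memory dictionary rather than a log-structured merge tree, and node placement uses simple modulo hashing where real systems such as Cassandra use a token ring.

```python
import hashlib
from collections import defaultdict
from typing import Any, Optional


class TinyWideColumnTable:
    """Toy model: partition key -> {clustering key -> columns}."""

    def __init__(self) -> None:
        self._partitions: dict = defaultdict(dict)

    def insert(self, partition_key: str, clustering_key: int, **columns: Any) -> None:
        # Rows in the same partition may carry different sets of columns.
        self._partitions[partition_key][clustering_key] = columns

    def select(self, partition_key: str,
               start: Optional[int] = None, end: Optional[int] = None) -> list:
        """Rows of ONE partition, ordered by clustering key, within [start, end)."""
        rows = self._partitions.get(partition_key, {})
        return [
            (ck, rows[ck])
            for ck in sorted(rows)
            if (start is None or ck >= start) and (end is None or ck < end)
        ]


def node_for(partition_key: str, node_count: int) -> int:
    """Pick the node that owns a partition (simplified: hash modulo node count)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


readings = TinyWideColumnTable()
readings.insert("sensor:7", 1700000000, temperature=21.5)
readings.insert("sensor:7", 1700000100, temperature=21.7, humidity=40)
readings.insert("sensor:9", 1700000050, temperature=19.0)

recent = readings.select("sensor:7", start=1700000050)  # time-range scan inside one partition
owner = node_for("sensor:7", node_count=3)               # every row of sensor:7 lands on this node
```

Because the partition key alone decides which node holds the data, a query that does not supply it would have to ask every node, which is why such queries are discouraged or disallowed.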

4. Graph Databases

Graph databases store data as nodes (entities), edges (relationships), and properties (attributes on both). They're optimized for traversing relationships.

| Database       | Query Language  | Notable Users     |
|----------------|-----------------|-------------------|
| Neo4j          | Cypher          | Walmart, eBay, UBS |
| Amazon Neptune | Gremlin, SPARQL | AWS customers     |
| JanusGraph     | Gremlin         | Google, Grakn     |

When to use:

  • Social networks (friend recommendations)
  • Fraud detection (finding circular transactions)
  • Recommendation engines ("people who bought X also bought Y")
  • Knowledge graphs and semantic data
💡

Case study: eBay's fraud detection with Neo4j finds fraudulent seller networks by traversing relationships that would require dozens of JOINs in SQL. LinkedIn's connection graph is another canonical example.

Trade-offs:

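  • Following a single relationship is roughly constant-time in native graph databases, so a deep traversal costs in proportion to the part of the graph it visits rather than the total data size; this is much faster than chained SQL JOINs for deep queries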
  • Not great for bulk analytics (aggregate over millions of rows)
  • Specialized — use when relationships are the primary query pattern
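A friend recommendation is a two-hop traversal, which the following sketch expresses over an in-memory adjacency map. The sample graph and the `recommend_friends` function are made up for illustration; a real graph database would express the same idea in a query language such as Cypher or Gremlin.

```python
from collections import Counter

# Adjacency map: each person points directly at their friends.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}


def recommend_friends(graph: dict, person: str) -> list:
    """Rank friends-of-friends by how many mutual friends they share with person."""
    direct = graph[person]
    mutual_counts = Counter()
    for friend in direct:                # hop 1: person -> friend
        for candidate in graph[friend]:  # hop 2: friend -> friend of friend
            if candidate != person and candidate not in direct:
                mutual_counts[candidate] += 1
    # Most mutual friends first; ties broken alphabetically for stable output.
    return sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))


suggestions = recommend_friends(friends, "alice")
# dave is reachable through both bob and carol, erin only through carol
```

The work done depends on how many neighbours Alice and her friends have, not on how many people exist in the whole graph. In SQL the same query would be a self-join on a friendships table for each additional hop.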

The CAP Theorem

The CAP Theorem is the most important concept to understand when choosing a distributed database. It states that in a distributed system, you can only guarantee two of the three properties simultaneously.

| Property                | Definition                                                                               |
|-------------------------|------------------------------------------------------------------------------------------|
| Consistency (C)         | Every read receives the most recent write or an error                                    |
| Availability (A)        | Every request receives a (non-error) response, without guarantee it's the most recent    |
| Partition Tolerance (P) | The system continues to operate despite network partitions (messages lost between nodes) |
⚠️

Critical insight: In a distributed system, partitions will happen (network outages, node failures). So P is not a choice — it's a given. The real decision is: when a partition occurs, do you sacrifice consistency (AP) or availability (CP)?

CAP in Practice

CP Systems (Consistency + Partition Tolerance):

  • When a partition occurs, the system refuses operations it cannot complete safely, typically on the side of the partition that cannot reach a majority of nodes
  • Clients talking to that side get errors or timeouts instead of possibly stale or conflicting data
  • Example: MongoDB with replica sets. If the primary goes down, writes pause until a new primary is elected

AP Systems (Availability + Partition Tolerance):

  • When a partition occurs, the system accepts writes on all nodes
  • Different nodes may have different data temporarily
  • Example: Cassandra. With a low consistency level such as ONE, a write succeeds as long as one replica acknowledges it, and the data replicates to the other replicas in the background
💡

Further reading: Eric Brewer's original CAP presentation (2000) and his 12-years-later reflection on InfoQ are essential reads. The takeaway: the trade-off is not a binary switch but a spectrum.
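The difference between the CP and AP behaviours described in this section can be simulated in a few lines. This is a deliberately simplified sketch: the `Cluster` and `Replica` classes are hypothetical, a "partition" is modelled as a set of replica indexes the coordinator cannot reach, and no real database implements its failure handling this way.

```python
class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.data: dict = {}


class Cluster:
    """Toy cluster that either rejects writes without a majority (CP) or accepts them (AP)."""

    def __init__(self, mode: str, replica_count: int = 3) -> None:
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [Replica(f"node{i}") for i in range(replica_count)]
        self.reachable = set(range(replica_count))

    def partition(self, unreachable: set) -> None:
        """Simulate a network partition that cuts off the given replica indexes."""
        self.reachable = set(range(len(self.replicas))) - set(unreachable)

    def write(self, key: str, value: str) -> int:
        majority = len(self.replicas) // 2 + 1
        if self.mode == "CP" and len(self.reachable) < majority:
            # CP: give up availability rather than let replicas diverge.
            raise RuntimeError("no majority reachable: write rejected")
        for index in self.reachable:
            self.replicas[index].data[key] = value
        return len(self.reachable)  # number of replicas that acknowledged


cp = Cluster("CP")
cp.partition(unreachable={1, 2})
try:
    cp.write("cart", "book")
    cp_rejected = False
except RuntimeError:
    cp_rejected = True  # the client sees an error, but no replica holds divergent data

ap = Cluster("AP")
ap.partition(unreachable={1, 2})
ap_acks = ap.write("cart", "book")  # accepted by the one reachable replica
```

After the AP write, `node0` has the new value while `node1` and `node2` do not, so the system must reconcile the replicas once the partition heals.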


Consistency Models

"Eventual consistency" is a phrase you'll hear often. But it's just one point on a spectrum of consistency models.

| Model        | Guarantee                                       | Trade-off                                       |
|--------------|-------------------------------------------------|-------------------------------------------------|
| Linearizable | Every read returns the most recent write        | Highest latency, lowest availability            |
| Sequential   | Writes appear in the same order for all nodes   | Still requires coordination                     |
| Causal       | Causally related writes are seen in order       | Concurrent writes may appear in different orders |
| Eventual     | All reads will eventually return the same value | Reads may return stale data                     |

Strong Consistency

  • Every read sees the latest write
  • Example: PostgreSQL, MySQL — a single-node RDBMS is strongly consistent by default
  • Distributed example: Google Spanner uses TrueTime (atomic clocks + GPS) to achieve strong consistency globally. Read the Spanner paper for the full story.

Eventual Consistency

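  • Writes propagate asynchronously
  • Once writes stop, all replicas converge on the same value, typically within milliseconds to seconds
  • Examples: DynamoDB (with its default read setting), Cassandra, Riak
  • How Discord uses it: Discord built its message store on Cassandra, relying on eventual consistency to handle billions of messages with high write throughput. The team has since moved that workload to ScyllaDB, a Cassandra-compatible database.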
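Convergence can be illustrated with two replicas that exchange state in the background. The sketch assumes last-write-wins by timestamp as the conflict rule, which is one common strategy but not the only one; `merge_lww` and the sample data are hypothetical.

```python
def merge_lww(local: dict, remote: dict) -> dict:
    """Merge replica states; per key, keep the (timestamp, value) pair with the newer timestamp."""
    merged = dict(local)
    for key, (timestamp, value) in remote.items():
        if key not in merged or timestamp > merged[key][0]:
            merged[key] = (timestamp, value)
    return merged


# Both replicas start in agreement: key -> (timestamp, value)
replica_a = {"status": (1, "online")}
replica_b = {"status": (1, "online")}

# A write lands on replica A only; replication has not happened yet.
replica_a["status"] = (2, "away")
stale_read = replica_b["status"][1]  # a read from B still sees the old value

# Background synchronisation: each replica merges the other's state.
replica_a = merge_lww(replica_a, replica_b)
replica_b = merge_lww(replica_b, replica_a)
fresh_read = replica_b["status"][1]  # both replicas now agree
```

The window between `stale_read` and `fresh_read` is the "eventual" part: nothing is lost, but a reader that hits the wrong replica at the wrong moment sees old data.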

Tunable Consistency

Some databases let you choose the consistency level per query:

Cassandra example:

  • ONE: Only one replica must acknowledge (fast, least consistent)
  • QUORUM: A majority of replicas must acknowledge (balanced)
  • ALL: All replicas must acknowledge (slow, most consistent)
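A standard rule of thumb, not spelled out in the text above, is that a read is guaranteed to overlap the latest acknowledged write when R + W > N, where N is the replication factor and R and W are the number of replicas that must respond to reads and writes. The helper names below are hypothetical; only the level names ONE, QUORUM, and ALL come from Cassandra.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """How many replicas must acknowledge at a given consistency level."""
    required = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def read_overlaps_latest_write(write_level: str, read_level: str,
                               replication_factor: int) -> bool:
    """True when every read set must share a replica with every write set (R + W > N)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


quorum_both = read_overlaps_latest_write("QUORUM", "QUORUM", 3)  # 2 + 2 > 3
one_both = read_overlaps_latest_write("ONE", "ONE", 3)           # 1 + 1 > 3 is false
write_all = read_overlaps_latest_write("ALL", "ONE", 3)          # 3 + 1 > 3
```

This is why QUORUM reads paired with QUORUM writes are a popular middle ground: they tolerate one failed replica out of three while still overlapping, whereas ONE with ONE is fastest but can return stale data.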

Interview tip: When asked "what consistency model would you use?", the answer is almost never "strong everywhere." Think about which data needs strong consistency (financial transactions, user authentication) and which can be eventually consistent (social media feeds, product recommendations).


When to Use NoSQL vs SQL

Decision Framework

Summary Comparison

| Dimension      | SQL                                    | NoSQL                                             |
|----------------|----------------------------------------|---------------------------------------------------|
| Schema         | Fixed, predefined                      | Flexible, dynamic                                 |
| Scaling        | Vertical (mostly)                      | Horizontal (native)                               |
| Transactions   | ACID, multi-row                        | Limited (single-document in most)                 |
| Query language | Standard SQL                           | Varies (API, Cypher, CQL)                         |
| Best for       | Structured data, complex queries, ACID | Unstructured data, massive scale, flexible schema |
| Examples       | PostgreSQL, MySQL, Oracle              | MongoDB, Cassandra, Redis, Neo4j                  |

Real-World Patterns

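Most production systems use polyglot persistence: multiple database types, each serving a different workload. A typical combination is a relational database such as PostgreSQL for transactional data alongside Redis for caching and sessions.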

⚠️

The golden rule: Use SQL unless you have a specific reason not to. NoSQL solves real problems, but "I might need to scale" is not a good reason to start with NoSQL. Start with PostgreSQL, and add specialized databases when your access patterns demand it.


What to Remember for Interviews

  1. Four NoSQL types: Key-value (Redis), Document (MongoDB), Wide-column (Cassandra), Graph (Neo4j)
  2. CAP Theorem: In distributed systems, P is given; choose CP or AP
  3. Consistency spectrum: Linearizable (strong) > Sequential > Causal > Eventual
  4. Tunable consistency: Cassandra and DynamoDB let you pick per-query
  5. Polyglot persistence: Real systems use multiple databases for different needs
  6. Default choice: SQL unless you have a specific reason for NoSQL