DDSA Solutions
Fundamentals6 min read·

CAP Theorem and Consistency Models for Interviews

CP vs AP trade-offs, strong vs eventual consistency, linearizability, and when to pick each in system design interviews.

Every distributed system faces partitions, replication lag, and consistency choices. Interviewers ask "SQL or NoSQL?" and really want to know if you understand what you give up. This article complements SQL vs NoSQL with the theory behind those picks — say it in plain language, not textbook proofs.

CAP in one paragraph

CAP: in a network partition (P), you cannot have both perfect Consistency (C) and Availability (A) for every request. You choose CP (refuse writes or reads until quorum agrees) or AP (serve stale data but stay up). In practice partitions are rare but inevitable — design for them anyway. PACELC extends this: else (E), choose Latency (L) vs Consistency (C).

Interview sentence

"Under partition, our cart service stays available with eventual consistency; payment ledger is CP — we fail closed rather than double-charge."

Consistency levels (name them)

ModelMeaningExample use
Strong / linearizableRead sees latest write globallyBank balance, inventory deduct
SequentialAll nodes agree on operation orderDistributed logs (Kafka partition)
CausalRelated ops seen in cause orderSocial comments thread
EventualReplicas converge if no new writesLike counts, view counters
Read-your-writesUser sees own updatesProfile edit then reload

CP systems

Traditional RDBMS with synchronous replication: write waits for replica ack — higher latency, consistent reads from primary. etcd/ZooKeeper: quorum writes for leader election and locks. During partition, minority side stops accepting writes (unavailable) to avoid split-brain. Use when correctness beats uptime: payments, seat booking, unique ID allocation.

AP systems

Cassandra, Dynamo-style stores: write to W nodes, read from R nodes; tune quorum (R + W > N) for desired consistency. Under partition, both sides may accept writes — resolve later (last-write-wins, vector clocks, or application merge). Great for high write throughput: metrics, activity feeds, caching layers.

Real interview mappings

URL shortener redirect

AP is fine — stale redirect for seconds after create is acceptable; cache TTL hides replication lag.

Chat message order

Per-conversation sequential consistency via Kafka partition key; global order not required.

Ticket inventory

CP — optimistic locking or DB row lock; never sell the same seat twice. Sacrifice availability during DB failover rather than oversell.

Quorum reads and writes

N replicas, write to W, read from R. If R + W > N, reads see overlapping writes (tunable strong-ish reads). W=1, R=1 is fast and loose. W=N, R=1 gives strong writes. Mention this when interviewer asks "how does Cassandra consistency work?"

Common mistakes

MistakeBetter answer
"We need strong consistency everywhere"Pick per component; feeds can be eventual
"NoSQL is always AP"MongoDB/Cockroach offer tunable consistency
Ignoring read-your-writesRoute user reads to primary or sticky session after write
No conflict resolution storyLWW, version vectors, or business merge rules

Latency vs consistency (PACELC)

Even without partition, stronger consistency costs latency (cross-region quorum). Multi-region news feed often uses eventual replication; fraud check uses strong read on payment DB in one region.

Worked example: checkout vs feed

Advertisement
ComponentConsistencyWhy
Inventory countStrong / linearizablePrevent overselling last item
Shopping cartSession sticky + RYWUser sees own adds immediately
Product reviews listEventualNew review visible within seconds OK
Search indexEventualAsync indexer from queue

Read repair and anti-entropy

AP replicas may diverge briefly. Read repair: on read, client reads R replicas and returns latest version, writing back to stale nodes. Anti-entropy: background Merkle-tree compare between replicas. Mention as v2 ops detail — shows you know AP systems converge.

What to say in the last five minutes

Map each box in your design to C, A, or a tunable middle. "Postgres primary for orders, Redis cache eventual for catalog, Kafka for analytics." One sentence per component beats abstract CAP lecture.

Sample opening (first three minutes)

Interviewer: "How do you handle consistency?" You: "I split by use case. Money and inventory need linearizable writes — CP, single leader or quorum. Social counters and analytics can be eventual — AP with async replication. I always state what the user can tolerate: stale likes for 5 seconds vs never stale balance."

Partition behavior walkthrough

Network split isolates US-East and US-West. CP ledger: one side stops accepting writes — payments pause briefly. AP catalog cache: both sides serve reads; prices may disagree until heal. Interview gold: name which side of the split loses availability for CP and what user-visible symptom follows ("payment unavailable" vs "stale product image").

Session guarantees users feel

  • Read-your-writes: route to primary after user mutation.
  • Monotonic reads: never go backward in time on refresh.
  • Consistent prefix: see A before B if A caused B (comment threads).

Tie-in to other articles

News feed timelines are eventual. Rate limiter counters are often AP with TTL. File storage metadata wants strong consistency for ACL; CDN blob cache is eventual. Build a consistency map on the whiteboard — interviewers love it.

FAQ-style follow-ups

  • "Is Kafka CP or AP?" — AP for availability; ordering per partition is sequential.
  • "Does cache break CAP?" — Cache is separate layer; define consistency per read path.
  • "Two-phase commit?" — Strong cross-shard; slow; avoid except critical financial batches.
  • "What does MongoDB default give?" — Often read-your-writes on primary; eventual on secondaries.

Design exercise: ticket booking

  1. Seat map row in DB with version column — optimistic lock on book.
  2. On conflict, user retries different seat — strong consistency on inventory row.
  3. Waitlist counter can be eventual — off by one acceptable.
  4. Confirmation email via async queue — at-least-once OK with idempotent send.
  5. Payment row on same shard as order_id — single-shard transaction.

Design exercise: social like button

Increment like counter in Redis (AP). Periodic flush to Postgres. User sees own like immediately (local cache). Global count may lag 1–2 seconds — acceptable. If interviewer asks "can likes go negative on unlike?" — use clamp at zero and idempotent unlike key.

Linearizability vs serializability

Linearizability: every operation appears instantaneous between invocation and response — strongest single-object guarantee. Serializable transactions: DB schedules transactions as if serial — enough for many ledgers. You do not need both names in every interview; use when comparing Postgres serializable vs Cassandra tunable reads.

Mock interview checklist

  1. Explained CAP without claiming you can have all three.
  2. Named consistency level per component in a design.
  3. Gave CP example (payments) and AP example (metrics).
  4. Mentioned quorum or partition behavior briefly.
  5. Connected to SQL vs NoSQL choice.

Closing summary

CAP is not a label you slap on a database — it is a per-operation choice. Strong where invariants matter, eventual where humans tolerate delay, and always say what happens when the network splits.

More in this series