DDSA Solutions

Design a Payment System (Stripe / PayPal)

System design for payments: idempotency, ledger, authorization vs capture, PCI scope, and exactly-once money movement for interviews.

Payment systems move money — mistakes are irreversible and regulated. Interviewers test idempotency, ledger accounting, and async provider integration without storing raw card numbers. Pair with API design, unique IDs, and CAP/consistency — payments are the canonical CP component.

Requirements

Functional

  • Merchant charges customer card or wallet.
  • Support authorize (hold) then capture (settle) or one-step charge.
  • Refunds full or partial.
  • Webhooks to merchant on success/failure.
  • Idempotent API — retry safe.

Non-functional

  • Exactly-once money effect — no double charge on retry.
  • PCI DSS: minimize card data in your systems (tokenization).
  • Audit trail for every state change.
  • Correctness takes priority over raw availability, even with a 99.99% availability target: fail closed rather than risk a wrong charge.

PCI scope

Never log full PAN/CVV. Use payment provider iframe or token — your servers touch only tokens. Say this early; interviewers notice.

High-level architecture

| Component | Role |
| --- | --- |
| Payment API | POST /charges with Idempotency-Key header |
| Payment orchestrator | State machine; calls acquirer/processor |
| Ledger service | Double-entry bookkeeping; source of truth |
| Idempotency store (Redis + DB) | Key → payment_id, 24h TTL |
| Card vault / tokenizer | Provider tokens only in your DB |
| Webhook dispatcher | Reliable delivery to merchants (queues) |
| Reconciliation worker | Match provider settlement files to ledger |

Payment state machine

  1. CREATED — idempotency record inserted.
  2. AUTHORIZED — funds held at issuer.
  3. CAPTURED — money moved to merchant settlement.
  4. FAILED — decline with reason code.
  5. REFUNDED — partial or full credit.

Transitions are append-only rows in a ledger_events table. The current state is either the latest event or a materialized status column updated in the same DB transaction as the ledger post.
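A minimal in-memory sketch of such a state machine follows. The edges in ALLOWED are an assumption (the list above names states, not transitions), and the Payment class is hypothetical; a real system would persist each event in the same transaction as the ledger post.

```python
from enum import Enum


class State(Enum):
    CREATED = "CREATED"
    AUTHORIZED = "AUTHORIZED"
    CAPTURED = "CAPTURED"
    FAILED = "FAILED"
    REFUNDED = "REFUNDED"


# Assumed edges: CREATED -> CAPTURED models a one-step charge,
# REFUNDED -> REFUNDED models a further partial refund.
ALLOWED = {
    State.CREATED: {State.AUTHORIZED, State.CAPTURED, State.FAILED},
    State.AUTHORIZED: {State.CAPTURED, State.FAILED},
    State.CAPTURED: {State.REFUNDED},
    State.FAILED: set(),
    State.REFUNDED: {State.REFUNDED},
}


class Payment:
    def __init__(self, payment_id):
        self.payment_id = payment_id
        self.events = [State.CREATED]  # append-only history, never rewritten

    @property
    def state(self):
        # Current state is simply the latest event.
        return self.events[-1]

    def transition(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.events.append(new_state)
```

Rejecting illegal edges in one place keeps bugs such as capturing a FAILED payment from ever reaching the processor.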

Idempotency flow

  1. Client sends Idempotency-Key: uuid per logical operation.
  2. API begins transaction: insert idempotency_key if absent.
  3. If duplicate key: return stored payment_id and status — no second charge.
  4. Call processor once; store processor_ref on success.
  5. On timeout: client retries same key — server returns in-flight or final state.
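The flow above can be sketched with SQLite standing in for the idempotency store. The table name, column names, the PROCESSING placeholder status, and the processor_calls list (a stand-in for the external processor) are all assumptions for illustration.

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.execute(
    """CREATE TABLE idempotency (
           idem_key   TEXT PRIMARY KEY,
           payment_id TEXT NOT NULL,
           status     TEXT NOT NULL)"""
)

processor_calls = []  # hypothetical stand-in for the external processor


def charge(idem_key, amount_cents):
    payment_id = f"pay_{uuid.uuid4().hex[:12]}"
    try:
        with db:  # one transaction; the primary key enforces insert-if-absent
            db.execute(
                "INSERT INTO idempotency VALUES (?, ?, 'PROCESSING')",
                (idem_key, payment_id),
            )
    except sqlite3.IntegrityError:
        # Duplicate key: return the stored result, never call the processor again.
        row = db.execute(
            "SELECT payment_id, status FROM idempotency WHERE idem_key = ?",
            (idem_key,),
        ).fetchone()
        return {"payment_id": row[0], "status": row[1]}

    processor_calls.append((payment_id, amount_cents))  # reached once per key
    with db:
        db.execute(
            "UPDATE idempotency SET status = 'AUTHORIZED' WHERE idem_key = ?",
            (idem_key,),
        )
    return {"payment_id": payment_id, "status": "AUTHORIZED"}
```

Calling charge("abc", 5000) twice returns the same payment_id both times and records a single processor call. If the process dies between the insert and the update, a retry sees the in-flight PROCESSING row rather than charging again.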

Double-entry ledger

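Every charge posts a balanced entry: debit the customer liability account for the full amount, credit merchant payable for the amount minus the fee, and credit fee revenue for the fee. Capture records the money as owed to the merchant; the payout to the merchant's bank happens later, at settlement. Ledger rows are never deleted; refunds are reversing entries. Balances are computed from the sum of entries, or cached and periodically reconciled. The ledger DB needs strong consistency: a single primary, or distributed SQL with serializable transactions.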
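As a rough sketch of the bookkeeping, the in-memory ledger below rejects any posting whose debits and credits differ and records a refund as a reversing entry. The account names (customer_funds, merchant_payable, fee_revenue) are hypothetical, amounts are integer cents, and charging the whole refund to merchant payable is a simplifying assumption.

```python
class Ledger:
    def __init__(self):
        self.entries = []  # append-only: rows are never updated or deleted

    def post(self, ref, lines):
        """Post one entry. lines is a list of (account, debit_cents, credit_cents)."""
        debits = sum(d for _, d, _ in lines)
        credits = sum(c for _, _, c in lines)
        if debits != credits:
            raise ValueError(f"unbalanced posting for {ref}: {debits} != {credits}")
        self.entries.extend((ref, acct, d, c) for acct, d, c in lines)

    def balance(self, account):
        # Credits minus debits, computed from the full entry history.
        return sum(c - d for _, acct, d, c in self.entries if acct == account)


ledger = Ledger()

# $50.00 charge with a $1.50 fee.
ledger.post("charge_1", [
    ("customer_funds", 5000, 0),
    ("merchant_payable", 0, 4850),
    ("fee_revenue", 0, 150),
])

# $10.00 partial refund as a reversing entry; the original rows stay untouched.
ledger.post("refund_1", [
    ("merchant_payable", 1000, 0),
    ("customer_funds", 0, 1000),
])
```

After both postings, merchant_payable nets to 3850 cents and the sum over all accounts is zero, which is the invariant reconciliation relies on.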

Authorize vs capture

E-commerce checkout: authorize at order, capture at shipment. Ride hailing: authorize the estimate at trip start, capture the actual fare at the end. Hotel: a larger auth hold to cover incidentals. The orchestrator schedules the capture job and does a partial capture if the final amount is lower. An uncaptured auth expires per card network rules (often around 7 days, but the window varies by network and merchant category).

Webhooks to merchants

payment.captured → POST merchant URL with HMAC signature. Retry with backoff; DLQ after N failures. Merchant must verify signature and be idempotent on their side too. Store webhook delivery attempts for support.
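A minimal sketch of the signing and retry pieces, using Python's standard hmac module. The choice of HMAC-SHA256 with a hex digest, the secret value, and the backoff parameters are assumptions; production schemes commonly also sign a timestamp to limit replay.

```python
import hashlib
import hmac
import json


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(sign(secret, body), signature)


def backoff_schedule(max_attempts, base_seconds=1, cap_seconds=3600):
    """Delay before each retry: exponential growth, capped."""
    return [min(cap_seconds, base_seconds * 2 ** n) for n in range(max_attempts)]


# Dispatcher side: sign the exact bytes that go on the wire.
secret = b"whsec_example"  # hypothetical per-merchant signing secret
body = json.dumps({"type": "payment.captured", "payment_id": "pay_1"}).encode()
signature = sign(secret, body)

# Merchant side: recompute over the received bytes and compare.
is_valid = verify(secret, body, signature)
```

The merchant must verify against the raw bytes it received; re-serializing the parsed JSON can change whitespace or key order and break the signature.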

Failure modes

| Failure | Mitigation |
| --- | --- |
| Processor timeout | Mark PROCESSING; reconcile via webhook or polling |
| Duplicate client retry | Idempotency key returns same result |
| Double capture bug | Unique constraint on (payment_id, operation) |
| Ledger/DB split | Transactional outbox for events |
| Chargeback | Separate dispute workflow; debit merchant payable |
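Two of these mitigations, the unique constraint and the transactional outbox, can be sketched together with SQLite. The operations and outbox table layouts are assumptions; the point is that the capture row and its event commit in one transaction, and a second capture is rejected by the database rather than by application logic.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE operations (
        payment_id   TEXT NOT NULL,
        operation    TEXT NOT NULL,
        amount_cents INTEGER NOT NULL,
        UNIQUE (payment_id, operation)
    );
    CREATE TABLE outbox (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        event_type TEXT NOT NULL,
        payload    TEXT NOT NULL,
        published  INTEGER NOT NULL DEFAULT 0
    );
    """
)


def capture(payment_id, amount_cents):
    """Record a capture and its outgoing event atomically; False if already captured."""
    payload = json.dumps({"payment_id": payment_id, "amount_cents": amount_cents})
    try:
        with db:  # both rows commit together or neither does
            db.execute(
                "INSERT INTO operations VALUES (?, 'capture', ?)",
                (payment_id, amount_cents),
            )
            db.execute(
                "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
                ("payment.captured", payload),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the unique constraint caught a second capture
```

A separate relay process would read unpublished outbox rows, deliver them to the queue, and mark them published, so an event is never sent for a capture that did not commit.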

Capacity estimation

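10K payments/sec peak × 500 bytes of metadata ≈ 5 MB/sec to the ledger DB, so plan for a sharded ledger or a table partitioned by merchant_id. Idempotency Redis: 10K keys/sec × 1 KB × 86,400 s ≈ 864 GB if the peak rate held for the whole 24h TTL; even at an illustrative average of one-tenth of peak that is still roughly 86 GB, so shard Redis or keep only recent keys in memory and fall back to the DB. The processor API often limits TPS: queue bursts and smooth them with a rate limiter on outbound calls.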
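The arithmetic is worth working through explicitly. The sketch below treats the peak rate as sustained for the full TTL window, which gives an upper bound on idempotency-store memory; decimal units (1 KB = 1,000 bytes) are an assumption.

```python
PEAK_TPS = 10_000            # payments per second at peak
METADATA_BYTES = 500         # ledger metadata per payment
IDEM_RECORD_BYTES = 1_000    # 1 KB per idempotency record (decimal units)
TTL_SECONDS = 24 * 60 * 60   # 24h TTL on idempotency keys

# Ledger write throughput at peak.
ledger_write_mb_per_sec = PEAK_TPS * METADATA_BYTES / 1e6

# Live idempotency keys if the peak rate held for the whole TTL window.
idem_keys_at_peak = PEAK_TPS * TTL_SECONDS
idem_memory_gb = idem_keys_at_peak * IDEM_RECORD_BYTES / 1e9

print(f"ledger writes: {ledger_write_mb_per_sec:.1f} MB/s")
print(f"idempotency keys live: {idem_keys_at_peak:,}")
print(f"idempotency memory upper bound: {idem_memory_gb:.0f} GB")
```

Average traffic is usually well below peak, so the real footprint is smaller, but the order of magnitude is what drives the decision to shard the store or shorten the in-memory TTL.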

API sketch

| Endpoint | Notes |
| --- | --- |
| POST /v1/charges | Idempotency-Key required; returns payment_id |
| POST /v1/charges/{id}/capture | Idempotent capture of authorized amount |
| POST /v1/refunds | Links to charge; idempotent |
| GET /v1/charges/{id} | Status for polling after timeout |

Worked charge flow

  1. Merchant POST /charges $50, Idempotency-Key: abc.
  2. Insert idempotency row; create payment in CREATED state.
  3. Call processor → AUTHORIZED.
  4. Ledger: debit customer liability $50; credit merchant payable $48.50 and fee revenue $1.50.
  5. Webhook payment.authorized to merchant.
  6. Later capture job: CAPTURED at processor; ledger settlement entry.

Reconciliation

Nightly batch: processor settlement file vs ledger sums. Mismatch triggers ops queue. Float and FX handled in separate accounts. Never delete rows — adjusting entries only. Required for audits and SOC2 conversations in senior loops.
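The core comparison can be sketched as a keyed diff. This assumes both sides have already been aggregated into a mapping from processor_ref to amount in cents; the function name and the mismatch record shape are hypothetical.

```python
def reconcile(ledger_totals, settlement_totals):
    """Compare two {processor_ref: amount_cents} mappings.

    Returns one record per reference that is missing on either side or
    whose amounts differ. An empty list means the books agree.
    """
    mismatches = []
    for ref in sorted(set(ledger_totals) | set(settlement_totals)):
        ours = ledger_totals.get(ref)        # None if the ledger has no such ref
        theirs = settlement_totals.get(ref)  # None if the processor file lacks it
        if ours != theirs:
            mismatches.append(
                {"processor_ref": ref, "ledger": ours, "processor": theirs}
            )
    return mismatches
```

Each mismatch goes to the ops queue; the fix is always a new adjusting entry, never an edit to existing rows.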

Fraud and risk (brief)

Rules engine before the processor call: velocity limits, geo mismatch, BIN blocklist. An async ML score may flag the payment after auth; in that case, void the hold instead of capturing. Separate rate limiter per merchant API key. Out of scope for junior interviews unless prompted.
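A velocity limit is the simplest of these rules to illustrate. The sketch below is a sliding-window counter keyed by card token; the class name, the thresholds, and passing the clock in as an argument are assumptions made to keep the example deterministic.

```python
from collections import defaultdict, deque


class VelocityRule:
    """Allow at most max_attempts per key within a sliding time window."""

    def __init__(self, max_attempts, window_seconds):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.attempts = defaultdict(deque)  # key -> timestamps of allowed attempts

    def allow(self, card_token, now):
        recent = self.attempts[card_token]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_attempts:
            return False  # too many attempts: decline before calling the processor
        recent.append(now)
        return True
```

With VelocityRule(3, 60), the fourth attempt on the same token inside a minute is declined, while other tokens are unaffected. In a multi-node deployment the counters would live in a shared store rather than process memory.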

What to say in the last five minutes

Idempotency keys, ledger double-entry, authorize/capture, webhooks with retry, PCI via tokenization. Strong consistency on money — fail closed on partition.

Sample opening (first three minutes)

Interviewer: "Design a payment API." You: "Money movement must be exactly-once from the merchant perspective — idempotency keys on every mutating call, append-only ledger, authorize/capture split. Card data stays at the processor; we store tokens only. I will draw the state machine and explain reconciliation with the acquirer."

Latency and availability trade-off

| Step | Consistency | Latency |
| --- | --- | --- |
| Idempotency check (Redis) | Strong per key | < 5 ms |
| Ledger write (Postgres primary) | Strong | < 20 ms |
| Processor API call | External CP | 200 ms–2 s |
| Webhook to merchant | At-least-once | Async, seconds |

Disputes and chargebacks

A chargeback reverses the ledger via new entries: debit merchant payable, credit customer. Build the evidence bundle from trip or order metadata; in a ride-hailing system, the trip polyline serves as proof of service. Run disputes as a separate workflow that does not block the capture path.

Multi-currency and payouts

Ledger accounts per currency. A settlement batch transfers merchant payable to the merchant's bank via ACH or wire transfer, driven by an async queue. FX conversion uses a separate rate service with a quote locked at charge time. Senior interview extension only.

Testing payments

  • Processor sandbox with test cards for decline paths.
  • Idempotency replay tests — same key 10× → one charge.
  • Chaos: kill processor mid-request; verify PROCESSING + reconcile.

Merchant integration

Merchants receive API keys (publishable vs secret). Server-side charges only with secret key. Client SDK tokenizes card → returns payment_method_id. Your API never sees PAN. Webhook signing secret per merchant for HMAC verify. Same patterns as REST API design: versioned endpoints, idempotent POST, clear error codes (card_declined, insufficient_funds).

Partial capture and refunds

Hotel checkout: authorize $200, then capture $180 as the final bill, minibar included. A later partial refund of $50 posts ledger reversing entries linked to the original charge_id. Each operation gets its own idempotency key. The state machine prevents total refunds from exceeding the captured amount.
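The amount guards can be sketched as follows. The Charge class is hypothetical, amounts are integer cents, and allowing only a single capture per authorization is a simplifying assumption.

```python
class Charge:
    def __init__(self, authorized_cents):
        self.authorized_cents = authorized_cents
        self.captured_cents = 0
        self.refunded_cents = 0

    def capture(self, amount_cents):
        if self.captured_cents:
            raise ValueError("already captured")
        if not 0 < amount_cents <= self.authorized_cents:
            raise ValueError("capture must be positive and within the authorization")
        self.captured_cents = amount_cents  # may be less than authorized (partial)

    def refund(self, amount_cents):
        refundable = self.captured_cents - self.refunded_cents
        if not 0 < amount_cents <= refundable:
            raise ValueError("refund must be positive and within the captured balance")
        self.refunded_cents += amount_cents


# Hotel example: $200 authorized, $180 captured, $50 refunded later.
stay = Charge(authorized_cents=20_000)
stay.capture(18_000)
stay.refund(5_000)
```

After these calls, 13,000 cents remain refundable; any refund larger than that raises instead of posting a ledger entry.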

Settlement vs authorization

Authorization holds funds on the issuer side; capture moves money in your ledger; settlement is when the acquirer batches funds to your bank account (typically around T+2 days). These three are easy to confuse, so define each when drawing the timeline. Reconciliation matches all three layers.

Mock interview checklist

  1. Stated idempotency and PCI/tokenization.
  2. Drew payment states authorize → capture → refund.
  3. Explained double-entry ledger as source of truth.
  4. Mentioned webhooks and reconciliation.
  5. Chose strong consistency for ledger writes.

Closing summary

Payments are a state machine plus a ledger — idempotency everywhere, never double-charge, audit everything. Get those three right and the rest is integration plumbing.
