Case Study · 7 min read · June 18, 2026

Design a File Storage System (Dropbox / Google Drive)

System design for cloud file storage: upload chunking, metadata vs blob storage, sync, conflict resolution, and CDN delivery for interviews.

File storage interviews test whether you separate metadata from bytes, handle large uploads, and reason about sync across devices. The problem combines API design, object storage (S3), and caching. Clarify scope using the interview framework: personal files vs shared folders, max file size, and real-time sync vs eventual consistency.

Requirements

Functional

Upload, download, delete files and folders.
Sync across multiple devices for the same user.
Share files with other users (read/write permissions).
Support large files (multi-GB) with resumable upload.
Deduplication optional - same content stored once (content-addressable).

Non-functional

Metadata operations under 100ms; blob throughput limited by client bandwidth.
99.999999999% (11 nines) durability for blobs, the design target S3 advertises through replication.
Upload resume after network drop without re-sending completed chunks.
ACL enforced on every metadata and download path.

High-level architecture

Component	Role
Upload/Download API	REST + pre-signed URLs for blob transfer
Metadata DB (PostgreSQL)	file_id, user_id, path, version, blob_id, updated_at
Blob store (S3)	Actual file bytes, keyed by content hash or blob_id
Sync service	Long polling or WebSocket for change notifications
Block/chunk service	Split large files into fixed-size chunks
CDN	Serve popular downloads at edge

Upload flow (large file)

Client POST /v1/files/init { name, size, parent_folder_id } → file_id, upload_id.
Client splits file into 4MB chunks; compute hash per chunk.
For each chunk: POST /v1/files/{file_id}/chunks { index, hash } → pre-signed S3 PUT URL if chunk new.
Skip upload if server already has chunk hash (dedup).
POST /v1/files/{file_id}/complete { upload_id, chunk_list } → metadata commit.
Metadata row points to ordered list of blob chunk IDs.
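
The client-side part of this flow can be sketched as follows. This is a minimal illustration, not a reference client: SHA-256 as the chunk hash and the helper names `split_and_hash` and `chunks_to_upload` are assumptions, and the `known_hashes` set stands in for the server's answer to the per-chunk request.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4MB, the chunk size used in the flow above


def split_and_hash(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[tuple[int, str, bytes]]:
    """Return (index, sha256 hex digest, chunk bytes) for each fixed-size chunk."""
    chunks = []
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        chunk = data[offset:offset + chunk_size]  # last chunk may be shorter
        chunks.append((index, hashlib.sha256(chunk).hexdigest(), chunk))
    return chunks


def chunks_to_upload(chunks: list[tuple[int, str, bytes]], known_hashes: set[str]) -> list[int]:
    """Indexes of chunks the server has not seen; the rest are skipped (dedup)."""
    return [index for index, digest, _ in chunks if digest not in known_hashes]
```

Only the indexes returned by `chunks_to_upload` need a pre-signed PUT URL; every hash, uploaded or skipped, still goes into the ordered chunk_list sent to the complete endpoint.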

Why direct-to-S3

Bytes never stream through your API servers - only metadata and signed URLs. This is how you scale uploads without melting the app tier. Same pattern as news feed media upload.

Download flow

GET /v1/files/{file_id}/download → check ACL.
Resolve chunk list from metadata; generate pre-signed GET URLs (or single URL if small file).
Client downloads chunks in parallel; reassemble locally.
Hot public files: serve via CDN with cache key = content hash.
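
Because chunks arrive in parallel and in any order, the client has to reassemble them by index, and content hashes make it cheap to verify each one. A minimal sketch, assuming SHA-256 chunk hashes and a hypothetical `fetched` mapping of chunk index to downloaded bytes:

```python
import hashlib


def reassemble(chunk_hashes: list[str], fetched: dict[int, bytes]) -> bytes:
    """Join chunks in metadata order, verifying each against its expected hash."""
    parts = []
    for index, expected in enumerate(chunk_hashes):
        chunk = fetched[index]  # KeyError here means a chunk is still missing
        if hashlib.sha256(chunk).hexdigest() != expected:
            raise ValueError(f"chunk {index} failed hash verification")
        parts.append(chunk)
    return b"".join(parts)
```

A failed verification only requires re-fetching that one chunk, not the whole file.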

Sync across devices

Each file has monotonic version or updated_at. Client stores last_sync_cursor. On app open: GET /v1/sync?since=cursor → list of changed files (metadata only). Client pulls new blobs as needed. For near-real-time: WebSocket notifies "file X changed" - lighter than full chat but same push idea.
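
A rough sketch of the cursor idea, assuming the cursor is a per-user monotonic sequence number (the text leaves the cursor format open) and using an in-memory `ChangeLog` as a hypothetical stand-in for the metadata store:

```python
from dataclasses import dataclass


@dataclass
class Change:
    seq: int               # monotonic sequence number; doubles as the cursor
    file_id: str
    deleted: bool = False  # tombstone so offline devices learn about deletes


class ChangeLog:
    def __init__(self) -> None:
        self._changes: list[Change] = []

    def record(self, file_id: str, deleted: bool = False) -> int:
        seq = len(self._changes) + 1
        self._changes.append(Change(seq, file_id, deleted))
        return seq

    def since(self, cursor: int) -> tuple[list[Change], int]:
        """Return the latest change per file after `cursor`, plus the new cursor."""
        latest: dict[str, Change] = {}
        for change in self._changes[cursor:]:  # seq = index + 1, so this is seq > cursor
            latest[change.file_id] = change
        delta = sorted(latest.values(), key=lambda c: c.seq)
        return delta, len(self._changes)
```

The client persists the returned cursor only after it has applied the delta, so a crash mid-sync replays the same changes instead of losing them.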

Multi-device edge cases

Device offline for days: cursor may expire; fall back to full metadata snapshot for that user.
Same file edited on two laptops offline: conflict copies or last-write-wins (LWW) - state your choice clearly in the interview.
Partial upload on phone: resume with same upload_id until TTL; other devices see file only after complete.
Delete on web while mobile is offline: tombstone in sync delta; mobile removes local copy on next sync.

Conflict resolution

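Two devices edit the same file offline: last-write-wins (LWW) on the metadata timestamp is simplest, but it silently discards one edit. Better UX: keep both versions, saving the losing edit as a separate file (a conflict copy). Interview answer: "I would start with LWW and mention conflict copies as v2." Use unique IDs for file versions so the two edits never collide.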
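
The two strategies can be compared in a few lines. This is an illustrative sketch only: the `Edit` record, the tie-breaking rule, and the conflict-copy naming format are all assumptions, not something the design above prescribes.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    device: str
    timestamp: float  # client-reported edit time; clock skew is a known LWW weakness
    name: str


def resolve_lww(a: Edit, b: Edit) -> Edit:
    """Last-write-wins: keep the edit with the later timestamp (ties go to `a`)."""
    return a if a.timestamp >= b.timestamp else b


def resolve_conflict_copy(a: Edit, b: Edit) -> list[str]:
    """Keep both: the winner keeps the name, the loser is saved under a new name."""
    winner = resolve_lww(a, b)
    loser = b if winner is a else a
    stem, dot, ext = loser.name.rpartition(".")
    if dot and stem:
        copy_name = f"{stem} ({loser.device}'s conflicted copy).{ext}"
    else:
        copy_name = f"{loser.name} ({loser.device}'s conflicted copy)"
    return [winner.name, copy_name]
```

With LWW the losing edit disappears; with conflict copies the user ends up with two files and decides which to keep.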

Sharing and ACL

shares table: file_id, grantee_user_id, permission (read/write).
Check permission on every metadata and download request.
Shared folder = tree of file_ids with inherited ACL (cache expanded ACL in Redis).
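
One way to picture inherited ACLs is a walk up the parent chain. The sketch below assumes a policy where the owner always has write access and the nearest explicit grant wins; the dictionaries are hypothetical stand-ins for the files and shares tables.

```python
from typing import Optional

PERMISSION_RANK = {"read": 1, "write": 2}


def effective_permission(file_id: str, user_id: str,
                         parents: dict, owners: dict, shares: dict) -> Optional[str]:
    """Walk from the file up to the root; return the first permission found."""
    current = file_id
    while current is not None:
        if owners.get(current) == user_id:
            return "write"  # assumed policy: owners can always write
        grant = shares.get((current, user_id))
        if grant is not None:
            return grant    # assumed policy: nearest grant wins
        current = parents.get(current)
    return None


def can_access(file_id: str, user_id: str, needed: str,
               parents: dict, owners: dict, shares: dict) -> bool:
    granted = effective_permission(file_id, user_id, parents, owners, shares)
    return granted is not None and PERMISSION_RANK[granted] >= PERMISSION_RANK[needed]
```

The walk costs one lookup per folder level, which is why caching the expanded ACL is worth mentioning for deep trees.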

Data model

files: file_id, owner_id, parent_id, name, is_folder, latest_version
file_versions: version_id, file_id, chunk_ids[], size, created_at
chunks: chunk_hash, s3_key, size (dedup table)
shares: file_id, grantee_user_id, permission
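
As a concrete sketch, the model can be written out against SQLite from the Python standard library. Several details here are assumptions for illustration: the column types, the deleted_at column (used later for soft-delete), and the `version_chunks` join table, which replaces the chunk_ids[] array because SQLite has no array type.

```python
import sqlite3

SCHEMA = """
CREATE TABLE files (
    file_id        TEXT PRIMARY KEY,
    owner_id       TEXT NOT NULL,
    parent_id      TEXT REFERENCES files(file_id),
    name           TEXT NOT NULL,
    is_folder      INTEGER NOT NULL DEFAULT 0,
    latest_version TEXT,
    deleted_at     TEXT
);
CREATE INDEX idx_files_parent_name ON files(parent_id, name);
CREATE TABLE chunks (
    chunk_hash TEXT PRIMARY KEY,
    s3_key     TEXT NOT NULL,
    size       INTEGER NOT NULL
);
CREATE TABLE file_versions (
    version_id TEXT PRIMARY KEY,
    file_id    TEXT NOT NULL REFERENCES files(file_id),
    size       INTEGER NOT NULL,
    created_at TEXT NOT NULL
);
-- chunk_ids[] expressed as an ordered join table
CREATE TABLE version_chunks (
    version_id TEXT NOT NULL REFERENCES file_versions(version_id),
    position   INTEGER NOT NULL,
    chunk_hash TEXT NOT NULL REFERENCES chunks(chunk_hash),
    PRIMARY KEY (version_id, position)
);
CREATE TABLE shares (
    file_id         TEXT NOT NULL REFERENCES files(file_id),
    grantee_user_id TEXT NOT NULL,
    permission      TEXT NOT NULL CHECK (permission IN ('read', 'write')),
    PRIMARY KEY (file_id, grantee_user_id)
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the schema applied."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def chunk_hashes_for(conn: sqlite3.Connection, version_id: str) -> list[str]:
    """Ordered chunk hashes for one file version."""
    rows = conn.execute(
        "SELECT chunk_hash FROM version_chunks WHERE version_id = ? ORDER BY position",
        (version_id,),
    )
    return [row[0] for row in rows]
```

In PostgreSQL the chunk list could stay an array column as written above; the join table simply makes ordering and per-chunk references explicit.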

Capacity estimation

50M users, 5GB average stored → 250PB logical; with 30% dedup by chunk hash → ~175PB in S3. Metadata: 500 files/user × 200 bytes ≈ 5TB relational - tiny vs blobs. API: 10M DAU × 20 metadata ops/day ≈ 2,300 RPS average; upload init spikes higher. Blob egress dominates cost - CDN for shared public links, infrequent-access tier for cold archives.
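
The arithmetic above is easy to reproduce, which helps when an interviewer changes one input. The sketch uses decimal units (1PB = 10^15 bytes); the constant names are illustrative.

```python
USERS = 50_000_000
AVG_STORED_GB = 5
DEDUP_SAVINGS = 0.30
FILES_PER_USER = 500
METADATA_BYTES_PER_FILE = 200
DAU = 10_000_000
METADATA_OPS_PER_DAY = 20

GB, TB, PB = 10**9, 10**12, 10**15
SECONDS_PER_DAY = 86_400

logical_pb = USERS * AVG_STORED_GB * GB / PB
physical_pb = logical_pb * (1 - DEDUP_SAVINGS)
metadata_tb = USERS * FILES_PER_USER * METADATA_BYTES_PER_FILE / TB
avg_rps = DAU * METADATA_OPS_PER_DAY / SECONDS_PER_DAY

print(f"logical blobs:  {logical_pb:.0f} PB")
print(f"after dedup:    {physical_pb:.0f} PB")
print(f"metadata:       {metadata_tb:.0f} TB")
print(f"avg metadata:   {avg_rps:.0f} RPS")
```

The average RPS hides peaks; a common rule of thumb is to provision for a multiple of the average, though the right multiple depends on the traffic pattern.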

Worked example: 2GB video upload

Client calls init → receives file_id and upload_id.
Splits into 512 × 4MB chunks; hashes each locally.
For chunk 0: server returns pre-signed PUT URL; client uploads directly to S3.
Chunk 47 already exists (same hash as another user's file) → server skips PUT, records chunk_hash in upload session.
Complete commits file_versions row with ordered chunk list; sync pushes metadata delta to other devices.
Other laptop sees new file in sync delta; downloads only missing chunks in parallel.

Capacity and cost

Storage cost dominates - S3 plus infrequent-access tiers. Metadata is tiny vs blobs. At a larger scale than the estimate above, 100M users × 10GB average = 1EB of storage - mention sharding metadata by user_id and geographic S3 buckets. API tier scales with load balancing; blob tier scales with the object store.

Failure modes

Failure	Mitigation
Chunk upload incomplete	upload_id expires; garbage-collect orphan chunks
Duplicate complete request	Idempotency on complete endpoint
S3 outage	Retry; multi-region replication for enterprise tier
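
Idempotency on the complete endpoint can be as simple as keying the commit on upload_id. A minimal in-memory sketch; the `UploadCommitter` class and its version ID format are hypothetical, and a real service would enforce the same rule with a unique constraint in the metadata DB.

```python
class UploadCommitter:
    """Commits each upload_id at most once; replays return the original result."""

    def __init__(self) -> None:
        self._committed: dict[str, tuple[str, list[str]]] = {}

    def complete(self, upload_id: str, chunk_list: list[str]) -> tuple[str, bool]:
        """Return (version_id, created). A retried request never creates a second version."""
        if upload_id in self._committed:
            version_id, _ = self._committed[upload_id]
            return version_id, False
        version_id = f"v{len(self._committed) + 1}"
        self._committed[upload_id] = (version_id, list(chunk_list))
        return version_id, True
```

A client that times out on complete can safely retry with the same upload_id and receive the same version back.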

Small file fast path

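Files under 5MB: single pre-signed PUT, no chunk orchestration. The metadata row commits in one database transaction once the PUT succeeds; S3 and the metadata DB cannot share a transaction, so the blob is written first. This reduces API round-trips for photos and documents - most user files are small.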

Trash and versioning

Soft-delete: set deleted_at on metadata; garbage-collect blobs after 30 days if no version references chunk hash. Version history: new row in file_versions on each save; current pointer on files table. Users restore previous version by pointing latest_version backward.

Latency budget

Operation	Target
List folder metadata	< 100ms
Init upload (API only)	< 50ms
Chunk PUT (direct S3)	Limited by client bandwidth
Sync delta (metadata only)	< 200ms

Security

Pre-signed URLs expire in 15 minutes.
Encrypt blobs at rest (S3 SSE).
Virus scan optional hook on complete upload.
ACL check on every metadata and download path.
Rate limit upload init per user.

Public share links

Optional: share_token (random UUID) maps to file_id with read-only ACL. GET /s/{token} redirects to CDN signed URL - similar to URL shortener opaque links. Revoke by deleting share row.
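
A small sketch of the token lifecycle, using a random UUID as described above. The `ShareLinks` class is a hypothetical in-memory stand-in for the share table; the redirect to a signed CDN URL is left out.

```python
import uuid
from typing import Optional


class ShareLinks:
    """Maps opaque share tokens to file IDs (read-only access is implied)."""

    def __init__(self) -> None:
        self._tokens: dict[str, str] = {}

    def create(self, file_id: str) -> str:
        token = uuid.uuid4().hex  # random UUID, unguessable in practice
        self._tokens[token] = file_id
        return token

    def resolve(self, token: str) -> Optional[str]:
        """Return the file_id for a live token, or None if unknown or revoked."""
        return self._tokens.get(token)

    def revoke(self, token: str) -> None:
        self._tokens.pop(token, None)  # deleting the row revokes the link
```

Because the token carries no information about the file, revoking it is enough to cut off access once any cached signed URL has expired.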

API summary

Endpoint	Purpose
POST /v1/files/init	Start upload; return file_id + upload_id
POST /v1/files/{id}/chunks	Get pre-signed URL per chunk
POST /v1/files/{id}/complete	Commit metadata after all chunks
GET /v1/files/{id}/download	Pre-signed GET URLs
GET /v1/folders/{id}/children	List folder metadata
GET /v1/sync?since=cursor	Delta sync for client

Folder hierarchy

Folders are rows with is_folder=true. Path display is computed from parent chain or materialized path (/user/docs/2024). List children: SELECT * FROM files WHERE parent_id = ? AND deleted_at IS NULL - index on (parent_id, name) for fast folder browsing. Rename = update one metadata row; move = change parent_id with cycle check.
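
The cycle check on move is the part candidates most often skip. A minimal sketch, assuming a hypothetical `parents` mapping of file_id to parent_id in place of the files table:

```python
from typing import Optional


def would_create_cycle(parents: dict, node_id: str, new_parent_id: Optional[str]) -> bool:
    """True if new_parent_id is node_id itself or one of its descendants."""
    current = new_parent_id
    while current is not None:
        if current == node_id:
            return True
        current = parents.get(current)  # walk up toward the root
    return False


def move(parents: dict, node_id: str, new_parent_id: Optional[str]) -> None:
    """Re-parent a node, refusing moves that would put a folder inside itself."""
    if would_create_cycle(parents, node_id, new_parent_id):
        raise ValueError(f"cannot move {node_id} under its own descendant")
    parents[node_id] = new_parent_id
```

The walk is bounded by folder depth, so the check stays cheap even for large trees.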

Metadata sharding

Shard PostgreSQL by user_id hash when metadata QPS grows. Each user's tree lives on one shard - no cross-shard folder moves in v1. Blobs stay in global S3; only metadata is sharded. Cross-user shares reference file_id, which must be globally unique across shards (for example a UUID or a Snowflake-style ID).
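
Routing by user_id hash can be sketched as below. The function name and the use of SHA-256 are assumptions; the key property is a stable hash, since Python's built-in `hash()` for strings varies between processes.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user to a shard index using a stable hash of user_id."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo routing remaps most users when num_shards changes, so it is worth mentioning consistent hashing or a lookup table as the follow-up if the interviewer asks about resharding.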

Garbage collection

Orphan chunks: uploaded but no file_version references after upload_id TTL.
Deleted files: soft-delete metadata; after 30 days remove chunk refs.
Reference-count chunks table; delete S3 object when refcount hits zero.
Run GC as nightly async job - never on request path.
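
The reference-counting step might look like this. It is a sketch under one assumption: each occurrence of a chunk in a version's chunk list holds one reference, so the counts were incremented the same way at commit time.

```python
def release_version(refcounts: dict[str, int], chunk_hashes: list[str]) -> list[str]:
    """Drop one reference per chunk; return hashes whose S3 objects can be deleted."""
    deletable = []
    for chunk_hash in chunk_hashes:
        refcounts[chunk_hash] -= 1
        if refcounts[chunk_hash] == 0:
            del refcounts[chunk_hash]
            deletable.append(chunk_hash)
    return deletable
```

Chunks shared with another file survive because their count stays above zero; only the returned hashes are handed to the async job that deletes S3 objects.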

Metadata vs blob responsibilities

Concern	Metadata DB	Blob store
Name, path, ACL	Yes	No
File bytes	No	Yes
Dedup by hash	Chunk registry	Content-addressed keys
CDN cache	No	Yes (GET URLs)
Transactional rename	Yes	Unchanged blobs

Sample opening (first three minutes)

Interviewer: "Design Dropbox." You: "I will separate file metadata in PostgreSQL from blobs in S3. Uploads use pre-signed URLs so bytes never hit our API servers. Large files are chunked with content-hash dedup. Clients sync via metadata cursor and pull only changed blobs. Sharing uses ACL checks on every download."

What to say in the last five minutes

Close with: "Metadata in Postgres, blobs in S3 via pre-signed URLs, chunked upload with content-hash dedup, sync via cursor + optional WebSocket, ACL on every access." That is a complete Dropbox-level answer.

Mock interview checklist

Separated metadata DB from blob object store.
Walked chunked upload with content-hash dedup.
Explained pre-signed S3 URLs - bytes bypass API tier.
Described sync cursor and conflict strategy.
Mentioned ACL on every access path.

Closing summary

Never route file bytes through your API at scale. Metadata path and blob path are separate designs - nail both in the interview.

More in this series

Design a URL Shortener - Complete Interview Walkthrough
Design a Distributed Rate Limiter
Design a News Feed (Twitter / Instagram Home Timeline)
Design a Chat / Messaging System (WhatsApp / Slack DM)