The 4-Step System Design Framework
System design interviews test your ability to think at scale, make trade-offs, and communicate complex architectures clearly. Unlike coding interviews, there's no single "right" answer — interviewers evaluate your thought process, communication skills, and ability to handle ambiguity. This 4-step framework works for any system design question.
Requirements Clarification
Never start designing immediately. Ask clarifying questions to nail down the scope.
Questions to ask:
- Functional: What are the core features? What can users do?
- Non-functional: What's the expected scale? Latency requirements? Availability targets?
- Constraints: Budget limitations? Existing infrastructure? Geographic distribution?
Back-of-the-Envelope Estimation
Estimate the scale to guide your design decisions. This shows the interviewer you think about real-world constraints.
Key estimates:
- DAU/MAU: How many daily and monthly active users?
- QPS: Queries per second (read vs. write ratio)
- Storage: How much data over 5 years?
- Bandwidth: Network throughput requirements
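To make the estimates above concrete, here is a minimal sketch of the arithmetic. The monthly write volume and read-to-write ratio reuse the URL shortener figures from later in this guide; the record size, the 30-day month, and the 2x peak factor are assumptions chosen only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30  # rounding assumption


def per_second(events_per_month: float) -> float:
    """Convert a monthly event count into an average per-second rate."""
    return events_per_month / (DAYS_PER_MONTH * SECONDS_PER_DAY)


# Illustrative inputs (assumptions, not measurements).
new_records_per_month = 500_000_000  # writes per month
read_write_ratio = 100               # 100 reads for every write
bytes_per_record = 500               # assumed average record size
retention_years = 5
peak_factor = 2                      # assumed peak-to-average ratio

write_qps = per_second(new_records_per_month)
read_qps = write_qps * read_write_ratio
peak_read_qps = read_qps * peak_factor

total_records = new_records_per_month * 12 * retention_years
storage_tb = total_records * bytes_per_record / 1e12
read_bandwidth_mb_s = read_qps * bytes_per_record / 1e6

print(f"write QPS      ~ {write_qps:,.0f}")
print(f"read QPS       ~ {read_qps:,.0f} (peak ~ {peak_read_qps:,.0f})")
print(f"storage (5 yr) ~ {storage_tb:,.0f} TB")
print(f"read bandwidth ~ {read_bandwidth_mb_s:,.1f} MB/s")
```

In an interview you would round aggressively (roughly 200 writes/s, 20K reads/s, 15 TB). The point is the order of magnitude, not the decimals.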
High-Level Design
Draw the main components and data flow. Keep it simple — boxes and arrows — and talk through each component's purpose.
Components to consider:
- API Gateway / Load Balancer: Entry point for all requests
- Application Servers: Business logic tier
- Database(s): SQL vs. NoSQL, read replicas, sharding
- Cache Layer: Redis/Memcached for hot data
- Message Queue: Kafka/RabbitMQ for async processing
- CDN: Static content delivery
Deep Dive & Trade-offs
This is where you differentiate yourself. Pick 2–3 components and dive deep into the design challenges.
Topics to explore:
- Scaling: Horizontal vs. vertical, database sharding strategies
- Consistency vs. Availability: CAP theorem trade-offs
- Fault Tolerance: What happens when a component fails?
- Monitoring: How do you detect and debug issues?
Top 6 System Design Questions (With Key Concepts)
Design a URL Shortener (like bit.ly)
Tests your understanding of hashing, database design, and read-heavy systems.
Key Concepts:
- Base62 encoding for short URLs
- Key-value store (DynamoDB/Redis)
- Read-heavy: 100:1 read-to-write ratio
- Cache layer for popular URLs
Scale to discuss:
- ~500M new URLs per month
- ~50B redirects per month
- URL expiration and analytics
- Custom alias handling
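Base62 encoding is easy to sketch. The version below assumes each URL already has a unique integer ID (for example from a database sequence or an ID generator, which is not shown) and turns it into a short code. The alphabet order and function names are illustrative choices, not a standard.

```python
import string

# 0-9, a-z, A-Z: 62 URL-safe characters. The ordering is an arbitrary choice.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)  # 62


def base62_encode(n: int) -> str:
    """Encode a non-negative integer ID as a Base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def base62_decode(code: str) -> int:
    """Decode a Base62 string back into the integer ID."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(base62_encode(125))              # 125 = 2*62 + 1
print(base62_decode(base62_encode(123456789)))
```

A 7-character code covers 62^7 (about 3.5 trillion) IDs, which leaves plenty of headroom over the roughly 30 billion URLs implied by 500M new URLs per month for 5 years.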
Design a Chat Application (like WhatsApp)
Tests real-time communication, message delivery guarantees, and state management.
Key Concepts:
- WebSocket connections for real-time
- Message queue (Kafka) for reliability
- Delivery receipts (sent/delivered/read)
- Group chat fan-out strategies
Challenges:
- Handling offline users
- Message ordering guarantees
- End-to-end encryption
- Media storage and CDN
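Two of these challenges, offline users and message ordering, can be illustrated together. The sketch below is a hypothetical in-memory stand-in for the chat backend: the server assigns a per-conversation sequence number to each message, holds messages for offline recipients in an inbox, and flushes them in order on reconnect. The class and field names are assumptions; a real system would persist the inbox and push over WebSockets.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import count


@dataclass
class Message:
    conversation_id: str
    seq: int               # per-conversation sequence number set by the server
    sender: str
    body: str
    status: str = "sent"   # sent -> delivered -> read


class ChatServer:
    """Hypothetical in-memory chat backend (illustration only)."""

    def __init__(self):
        self._next_seq = defaultdict(lambda: count(1))
        self.online = set()
        self.inbox = defaultdict(list)  # user -> messages awaiting delivery

    def send(self, conversation_id, sender, recipient, body):
        seq = next(self._next_seq[conversation_id])
        msg = Message(conversation_id, seq, sender, body)
        if recipient in self.online:
            msg.status = "delivered"          # would be pushed over the socket
        else:
            self.inbox[recipient].append(msg)  # hold until the user reconnects
        return msg

    def connect(self, user):
        """Mark the user online and return held messages in sequence order."""
        self.online.add(user)
        pending = sorted(self.inbox.pop(user, []),
                         key=lambda m: (m.conversation_id, m.seq))
        for m in pending:
            m.status = "delivered"
        return pending
```

Clients can use `seq` to detect gaps and reorder messages that arrive out of order, which is usually cheaper than trying to guarantee global ordering across conversations.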
Design a Social Media Feed (like Twitter/X)
Tests fan-out strategies, ranking algorithms, and handling celebrity accounts.
Key Concepts:
- Fan-out on write vs. fan-out on read
- Timeline precomputation
- Feed ranking / ML scoring
- Pub/Sub for real-time updates
Challenges:
- Celebrity problem (millions of followers)
- Hot partition handling
- Spam and content moderation
- Trending topics computation
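A common way to discuss the celebrity problem is a hybrid of the two fan-out strategies: push posts into followers' precomputed timelines for ordinary accounts, and pull celebrity posts at read time. The sketch below is one possible shape under stated assumptions: the threshold is tiny for illustration, everything lives in memory, and the feed is ordered by timestamp only, with no ranking.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # assumed cutoff; real systems use far larger values


class FeedService:
    """Hypothetical hybrid fan-out feed (illustration only)."""

    def __init__(self):
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        self.timelines = defaultdict(list)  # user -> precomputed feed items
        self.posts = defaultdict(list)      # author -> their own posts

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def post(self, author, ts, text):
        item = (ts, author, text)
        self.posts[author].append(item)
        if not self.is_celebrity(author):
            # Fan-out on write: copy the post into each follower's timeline.
            for follower in self.followers[author]:
                self.timelines[follower].append(item)

    def feed(self, user, limit=10):
        # A set removes duplicates if an author crossed the threshold
        # after some posts had already been pushed.
        items = set(self.timelines[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                # Fan-out on read: merge celebrity posts at query time.
                items.update(self.posts[author])
        return sorted(items, reverse=True)[:limit]
```

The trade-off to narrate: writes stay cheap for accounts with millions of followers, at the cost of extra work on every read for users who follow them.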
Design a Video Streaming Service (like Netflix)
Tests CDN architecture, content encoding, and recommendation systems.
Key Concepts:
- Adaptive bitrate streaming (HLS/DASH)
- CDN with edge caching
- Video transcoding pipeline
- Recommendation engine
Scale:
- 200M+ daily active viewers
- Multiple video resolutions (360p–4K)
- Cold start recommendation problem
- Content licensing by region
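The core idea behind adaptive bitrate streaming fits in a few lines: the player measures throughput and requests the highest rendition that fits. The sketch below is a simplified throughput-only heuristic. The bitrate ladder values and the 0.8 safety margin are assumptions, and real players typically also consider buffer level.

```python
# Hypothetical bitrate ladder: (label, required kbps), sorted ascending.
# Real ladders depend on the codec and the content.
LADDER = [("360p", 800), ("720p", 3000), ("1080p", 6000), ("4K", 16000)]


def pick_rendition(measured_kbps: float, safety: float = 0.8) -> str:
    """Return the highest rendition whose bitrate fits the bandwidth budget.

    Falls back to the lowest rendition when nothing fits.
    """
    budget = measured_kbps * safety  # leave headroom for throughput swings
    best = LADDER[0][0]
    for label, kbps in LADDER:
        if kbps <= budget:
            best = label
    return best


for bandwidth in (500, 4000, 7000, 25000):
    print(bandwidth, "kbps ->", pick_rendition(bandwidth))
```

Because HLS and DASH split video into short segments, the player can re-run this decision for every segment and switch quality mid-stream.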
Design a Ride-Sharing Service (like Uber)
Tests location-based services, real-time matching, and surge pricing algorithms.
Key Concepts:
- Geospatial indexing (quadtree/geohash)
- Real-time driver location tracking
- Matching algorithm (nearest driver)
- ETA calculation with traffic data
Challenges:
- Surge pricing during high demand
- Payment processing at scale
- Handling network partitions
- Trip state machine
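To show why geospatial indexing matters for matching, the sketch below buckets drivers into fixed-size latitude/longitude cells and searches only the rider's cell and its eight neighbors. This fixed grid is a deliberately simplified stand-in for a geohash or quadtree, not an implementation of either. The cell size and class names are assumptions.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # assumed cell size; roughly 1.1 km of latitude


def cell_of(lat, lon):
    """Map a coordinate to its integer grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


class DriverIndex:
    """Hypothetical in-memory grid index of driver locations."""

    def __init__(self):
        self.cells = defaultdict(dict)  # cell -> {driver_id: (lat, lon)}
        self.where = {}                 # driver_id -> current cell

    def update(self, driver_id, lat, lon):
        old = self.where.get(driver_id)
        if old is not None:
            self.cells[old].pop(driver_id, None)
        cell = cell_of(lat, lon)
        self.cells[cell][driver_id] = (lat, lon)
        self.where[driver_id] = cell

    def nearest(self, lat, lon):
        """Return the closest driver in the 3x3 cell neighborhood, or None."""
        cx, cy = cell_of(lat, lon)
        candidates = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for d, (dlat, dlon) in self.cells.get((cx + dx, cy + dy), {}).items():
                    candidates.append((haversine_km(lat, lon, dlat, dlon), d))
        return min(candidates)[1] if candidates else None
```

A production design would widen the search when the neighborhood is empty and would match on ETA rather than straight-line distance, but the indexing idea is the same: avoid scanning every driver for every request.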
Design a Distributed Cache (like Redis)
Tests distributed systems fundamentals, consistency models, and eviction policies.
Key Concepts:
- Consistent hashing for distribution
- Eviction and expiration policies (LRU, LFU, TTL)
- Replication for high availability
- Cache-aside vs. write-through
Deep Dive:
- Cache stampede prevention
- Hot key handling
- Data serialization formats
- Cluster management and resharding
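Of the eviction policies listed, LRU is the one you are most likely to be asked to sketch. A minimal single-node version, assuming Python's `collections.OrderedDict` as the underlying structure, looks like this; distribution, TTLs, and thread safety are left out.

```python
from collections import OrderedDict


class LRUCache:
    """Least-recently-used cache with a fixed capacity."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._data = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used
cache.put("c", 3)      # evicts "b"
print(cache.get("b"))  # None
```

Both `get` and `put` run in O(1), which is the property interviewers usually want you to state.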
Essential Concepts to Master
Regardless of the question, these core concepts come up in almost every system design interview. Make sure you can explain each one clearly and know when to apply them.
CAP Theorem
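CAP is often summarized as "pick 2 of 3" among Consistency, Availability, and Partition tolerance. In practice, network partitions cannot be ruled out, so the real choice is what to give up when one occurs: consistency (AP) or availability (CP). Know when to choose CP vs. AP.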
Load Balancing
Round-robin, least connections, IP hash, and consistent hashing. Know when each strategy is appropriate.
Database Sharding
Horizontal partitioning strategies: range-based, hash-based, and geographic. Understand resharding challenges, such as how many keys must move when a shard is added.
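The resharding challenge is easiest to see by comparing naive modulo hashing with consistent hashing when a fifth shard is added. The sketch below is illustrative: the shard names, key format, MD5 as the hash, and 100 virtual nodes per shard are all assumptions.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Stable hash (MD5 is used only for distribution, not security)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted positions on the ring
        self._owner = {}   # position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            p = h(f"{node}#{i}")
            bisect.insort(self._points, p)
            self._owner[p] = node

    def lookup(self, key):
        # First ring position clockwise from the key's hash, wrapping around.
        i = bisect.bisect_right(self._points, h(key)) % len(self._points)
        return self._owner[self._points[i]]


keys = [f"user:{i}" for i in range(10_000)]

# Modulo sharding: going from 4 to 5 shards remaps most keys.
moved_mod = sum(1 for k in keys if h(k) % 4 != h(k) % 5)

# Consistent hashing: only keys claimed by the new shard move.
ring = HashRing(["s0", "s1", "s2", "s3"])
before = {k: ring.lookup(k) for k in keys}
ring.add("s4")
after = {k: ring.lookup(k) for k in keys}
moved_ring = [k for k in keys if before[k] != after[k]]

print(f"modulo:     {moved_mod / len(keys):.0%} of keys moved")
print(f"consistent: {len(moved_ring) / len(keys):.0%} of keys moved")
```

With modulo hashing you should expect around 80% of keys to move, versus roughly 1/5 with the ring, and every key that moves on the ring lands on the new shard. The same ring structure is what "consistent hashing" refers to in the load balancing and distributed cache discussions.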
Caching Strategies
Cache-aside (lazy loading), write-through, and write-behind. Cache invalidation remains one of the hardest problems.
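The difference between cache-aside and write-through comes down to who populates the cache and when. The sketch below uses plain dictionaries as stand-ins for the cache and the database. The `Database` class and its read counter are hypothetical helpers added only to make cache hits visible.

```python
class Database:
    """Hypothetical backing store that counts reads."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


class CacheAside:
    """The application loads the cache lazily on a miss."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def get(self, key):
        if key in self.cache:
            return self.cache[key]      # hit
        value = self.db.read(key)       # miss: go to the database
        if value is not None:
            self.cache[key] = value     # populate for next time
        return value

    def put(self, key, value):
        self.db.write(key, value)
        self.cache.pop(key, None)       # invalidate; next read reloads


class WriteThrough(CacheAside):
    """Every write updates the database and the cache together."""

    def put(self, key, value):
        self.db.write(key, value)
        self.cache[key] = value
```

Cache-aside keeps only data that is actually read but pays a miss after every write; write-through keeps reads warm at the cost of caching data that may never be read. Write-behind, not shown, acknowledges the write after updating the cache and flushes to the database asynchronously.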
Message Queues
Decouple components with Kafka, RabbitMQ, or SQS. Essential for async processing and event-driven architectures.
Rate Limiting
Token bucket, sliding window, and fixed window algorithms. Protect APIs from abuse and ensure fair usage.
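Token bucket is the algorithm most worth being able to write from memory. In the sketch below, tokens refill continuously at `rate` per second up to `capacity`, and each request spends one. The injectable `clock` parameter is an assumption added so the behavior can be demonstrated without sleeping.

```python
import time


class TokenBucket:
    """Single-process token bucket rate limiter."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self._clock = clock
        self._tokens = capacity   # start full
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        elapsed = now - self._last
        # Refill based on elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


# Demo with a fake clock: 1 token/second, bursts of up to 2.
fake_now = [0.0]
bucket = TokenBucket(rate=1, capacity=2, clock=lambda: fake_now[0])
print([bucket.allow() for _ in range(3)])  # burst of 2, then rejected
fake_now[0] = 1.0
print(bucket.allow())                      # one token has refilled
```

In a distributed setting the bucket state would live in a shared store such as Redis, which raises the atomicity questions interviewers like to probe.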
Tips for Acing System Design Interviews
- Think out loud. The interviewer wants to see your reasoning process, not just the final answer. Narrate your trade-offs as you make them.
- Start simple, then scale. Begin with a design that works for 100 users, then progressively add components to handle millions. This shows maturity.
- Discuss trade-offs explicitly. "I chose SQL here because we need strong consistency for financial data, but if we needed more write throughput, I'd consider Cassandra." Showing you understand alternatives is as important as the choice itself.
- Use real numbers. "We'll need about 500 QPS for reads" sounds much more credible than "it'll handle a lot of traffic." Back-of-envelope math builds interviewer confidence.
- Address failure modes. What happens when the database goes down? When a server crashes? Senior engineers think about resilience from the start.