Why I’m writing about this
A few years ago, I worked on a product that scaled into 30+ countries.
I was a PM. And somehow I ended up co-designing the app with the engineers, because we were a team of three.
I learned fast: what components a solution needs, where to start, what questions to ask to make sure we were solving the right problem.
Fast forward to a few months ago.
New product. Legacy architecture. No documentation. No diagrams. Set up by an external vendor, and we basically prayed it worked every day.
So I did what I always do: I started diagramming. Asking questions. Getting to the bottom of how it worked, its components, its dependencies.
And surprise surprise, I found a few things that we are now changing to make the app more reliable.
After you’ve done this once, systems stop being scary.
That’s what I want to give you today.
The PM’s role in system design
Let me be clear: your official job is not to design the system yourself.
That’s your architect’s job. Or your tech lead’s.
However, in internal product management we often face other constraints, capacity among them. So things fall to us. (That happened to me a lot.)
And even when they don’t, you still need to influence what’s being built to make sure it actually solves the problem.
Your job is to:
- Discover the business problem and its impact
- Run user research
- Bring the requirements to the table
- Ask questions that surface assumptions
- Understand the trade-offs being made (and their business impact)
- Make sure the thing being designed actually solves the right problem
But if you also get good at system design, you stop skipping design sessions because you can’t follow them, only to ask later: “Why didn’t they make it more flexible?” Instead, you help the team understand in the session what “flexible” means for the product, and you work through the trade-offs together.
The design flow (5 steps)
System design happens before development starts. It’s the bridge between “here’s what we need to build” and “here’s how we’ll build it.”
Whether you’re sitting with your architect or just trying to understand your product better, use this flow:
Step 1: Clarify requirements
Split into two types:
- Functional: What does the system do? (User can place an order, track delivery, receive notification)
- Non-functional: How does it need to perform? (Available 99.9% of the time, handles 10,000 orders per hour)
Non-functional requirements are where most PMs go blank. Don’t skip them. They drive every major design decision.
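To make a number like 99.9% concrete, it helps to translate it into a downtime budget. Here is a minimal sketch in Python; the 30-day month is my own simplifying assumption, and the function name is made up for illustration.

```python
# Translate an availability target into a downtime budget.
# Assumption: a 30-day month, chosen only to keep the arithmetic simple.

def downtime_minutes(availability_pct: float, period_hours: float) -> float:
    """Minutes of allowed downtime in a period at a given availability target."""
    return period_hours * 60 * (1 - availability_pct / 100)


for target in (99.0, 99.9, 99.99):
    per_month = downtime_minutes(target, period_hours=30 * 24)
    print(f"{target}% available -> about {per_month:.1f} minutes down per 30-day month")
```

At 99.9%, that is roughly 43 minutes of downtime per month. Each extra nine cuts the budget by a factor of ten, which is why this one number is worth discussing with your architect early.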
Step 2: Estimate scale
Rough numbers only. No need to be exact.
- How many users per day?
- How many requests per second at peak?
- How much data is being stored and read?
Scale shapes whether you need one database or ten. Whether you cache aggressively or not at all.
Step 3: High-level design
Draw the big boxes. What are the main components, and how do they talk to each other?
Don’t start with details. Start with: client → backend → database. Then add what’s in between.
Step 4: Dive deeper
Pick the hardest or most critical part of the system. Go one level deeper there.
You don’t need to deep-dive everything. Focus on what’s most likely to break or scale badly.
Step 5: Identify trade-offs
Every design decision gives up something.
Fast writes → maybe slower reads. More availability → maybe less consistency. Simpler architecture → harder to scale later.
Your job as PM is to understand what’s being traded and whether that’s the right call for the business.
Let’s design Wolt together
Wolt is a food delivery app that operates here in Germany. I’m a regular user, which is why I picked it.
You open it, browse restaurants near you, place an order, track the driver in real-time, and pay in-app.
Simple on the surface. Interesting underneath.
Step 1: Requirements
Functional:
- Browse restaurants and menus
- Place and manage orders
- Real-time delivery tracking
- Payment processing
- Notifications (order confirmed, driver on the way, delivered)
Non-functional:
- High availability. If Wolt goes down at 7pm on a Friday, that’s a disaster
- Low latency for tracking. Location updates need to feel live
- Handle massive peak load (dinner time in every city at once)
Step 2: Scale estimate
Let’s say Wolt has 5 million active users across all markets. At dinner peak, maybe 200,000 orders per hour. That’s roughly 56 orders per second.
Each order triggers a chain: restaurant notification, driver assignment, payment, and continuous location updates every few seconds. That changes how we think about the design.
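Here is that back-of-the-envelope math as a small Python sketch. Only the 200,000 orders per hour comes from the estimate above. The 20 minutes a driver is tracked per order and the 4-second ping interval are my own assumptions, picked to show how the location updates end up dominating the load.

```python
# Back-of-the-envelope scale estimate. Rough numbers only.
ORDERS_PER_HOUR_AT_PEAK = 200_000  # from the estimate in the text
TRACKED_MINUTES_PER_ORDER = 20     # assumption: how long a driver is tracked
PING_INTERVAL_SECONDS = 4          # assumption: "every few seconds"


def per_second(per_hour: float) -> float:
    """Convert an hourly rate into a per-second rate."""
    return per_hour / 3600


def concurrent(arrivals_per_second: float, seconds_in_system: float) -> float:
    """Steady-state count of items in flight: arrival rate times time in system."""
    return arrivals_per_second * seconds_in_system


orders_per_second = per_second(ORDERS_PER_HOUR_AT_PEAK)
tracked_deliveries = concurrent(orders_per_second, TRACKED_MINUTES_PER_ORDER * 60)
pings_per_second = tracked_deliveries / PING_INTERVAL_SECONDS

print(f"Orders per second:        {orders_per_second:.1f}")
print(f"Deliveries being tracked: {tracked_deliveries:,.0f}")
print(f"Location pings per second: {pings_per_second:,.0f}")
```

Under these assumptions, the order traffic is tiny compared to the location traffic. Change the assumptions and the numbers move, but the shape of the conclusion usually stays the same.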
Step 3: High-level design
Here are the main pieces:
- Client (mobile app) sends requests
- API Gateway routes requests to the right service
- Order Service creates and manages orders
- Restaurant Service handles menus and restaurant availability
- Delivery Service assigns drivers and tracks location
- Payment Service processes transactions
- Notification Service sends push notifications and emails
- Databases store all of the above (often separate per service)
At this level, we’re not saying how any of this works. We’re saying what the system needs to do and who’s responsible for each part.
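If it helps to see what "routes requests to the right service" could mean, here is a toy sketch in Python. The URL paths are hypothetical and invented for illustration; a real gateway is a dedicated piece of infrastructure, not a dictionary.

```python
# Toy routing table for an API gateway.
# Assumption: every path prefix below is made up for this example.
ROUTES = {
    "/orders": "Order Service",
    "/restaurants": "Restaurant Service",
    "/deliveries": "Delivery Service",
    "/payments": "Payment Service",
    "/notifications": "Notification Service",
}


def route(path: str) -> str:
    """Return the name of the service responsible for a request path."""
    for prefix, service in ROUTES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return service
    raise LookupError(f"no service registered for {path}")


print(route("/orders/123"))             # Order Service
print(route("/deliveries/7/location"))  # Delivery Service
```

The point is the ownership, not the code: every request has exactly one service that is responsible for it.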
Step 4: Go deeper, real-time tracking
If you were a PM on Wolt’s team, you would probably have read through hundreds of user complaints: “The delivery guy isn’t moving on the map, even though he clearly is.”
So how do we design for that?
The driver’s app sends a GPS ping every 3–5 seconds to the Delivery Service. That’s a lot of writes happening constantly, across thousands of active deliveries.
Two things have to work:
- The location data has to get stored fast (write-heavy load)
- Your app has to receive updates fast (real-time read)
Most systems use something like WebSockets here. It’s a persistent connection between your app and the server, so updates get pushed to you instantly rather than your app constantly asking “where’s the driver now?”
This is the kind of detail that helps you understand why “just add a tracking feature” isn’t a two-day ticket.
Step 5: Trade-offs
As a product manager, you want real-time location and you want the app to always work.
But sometimes you can’t have both. That’s what trade-offs are about.
One key trade-off in a system like this: consistency vs. availability.
- Consistency means you always see the exact, correct driver location
- Availability means the app always works, even under pressure
In a food delivery context, Wolt will likely prioritize availability. If the driver’s location on your map is 10 seconds behind, that’s fine. If the app crashes at peak dinner time, that’s not.
That’s a product decision as much as a technical one. And it’s exactly the kind of conversation PMs should be part of.
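Here is one way "prioritize availability" could look in code, as a hedged Python sketch. Every name in it is hypothetical, and I’m not claiming this is how Wolt does it. The idea: if the location store can’t answer, show the last known position along with its age, rather than failing the whole screen.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Location:
    lat: float
    lon: float
    recorded_at: float  # timestamp in seconds


class StoreUnavailable(Exception):
    """Raised when the (hypothetical) location store cannot answer."""


def read_location(
    fetch_latest: Callable[[], Location],
    last_known: Location,
    now: float,
) -> tuple:
    """Prefer the latest location; fall back to the last known one.

    Returns (location, age_in_seconds) so the app can show how old it is.
    """
    try:
        location = fetch_latest()
    except StoreUnavailable:
        location = last_known  # slightly stale, but the app keeps working
    return location, now - location.recorded_at


def overloaded_store() -> Location:
    # Simulates the store failing under peak load.
    raise StoreUnavailable("location store timed out")


last_known = Location(lat=52.5200, lon=13.4050, recorded_at=100.0)
location, age = read_location(overloaded_store, last_known, now=110.0)
print(f"Showing driver at ({location.lat}, {location.lon}), updated {age:.0f}s ago")
```

How stale is too stale is the product question here. That threshold is something a PM should have an opinion on.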
The tool I use for diagramming
I use draw.io. It’s clean, shareable, and integrates well with the tools most enterprise teams already use.
Our enterprise version doesn’t have AI in it yet. But sometimes it’s good to use your own brain :)
If you want AI-assisted diagramming, there are two worth exploring:
- Eraser.io: describe a system in text and it generates a diagram. Great for a first draft.
- ChatGPT or Claude: ask either one to generate a Mermaid diagram from your description, then paste it into any Mermaid-compatible viewer (like Notion). Free and surprisingly good.
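If you’re curious what Mermaid text looks like, here is a small Python sketch that prints a flowchart for the Wolt high-level design from earlier. The node IDs and the helper function are my own invention, and the diagram only includes the gateway-to-service connections described in Step 3.

```python
# Generate Mermaid flowchart text for the high-level design.
SERVICES = [
    "Order Service",
    "Restaurant Service",
    "Delivery Service",
    "Payment Service",
    "Notification Service",
]


def to_mermaid(services: list) -> str:
    """Build a left-to-right Mermaid flowchart: client -> gateway -> services."""
    lines = ["flowchart LR", "    Client[Client] --> ApiGateway[API Gateway]"]
    for name in services:
        node_id = name.replace(" ", "")  # Mermaid node IDs cannot contain spaces
        lines.append(f"    ApiGateway --> {node_id}[{name}]")
    return "\n".join(lines)


print(to_mermaid(SERVICES))
```

Paste the printed text into a Mermaid-compatible viewer and you get the boxes and arrows without dragging anything around.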
Start with whatever gets you drawing.