Photo by Souvik Banerjee on Unsplash
Systems Design 1 - Twitter
How would you design a social media app?
Let's take a look at a common systems design interview question as posted on https://igotanoffer.com/blogs/tech/system-design-interviews#file
For this question you'll typically be asked to design a specific app, such as Twitter, Instagram, etc. For this example, we’ll assume the interviewer asked you to design Twitter.
While it may seem obvious, the very first step is to ask enough questions in order to define the functional and non-functional requirements.
And please don't remind me that it's actually called X, I don't care. Only Elon does.
Most likely, after asking some questions, we would arrive at a list of features such as
Requirement Gathering
Requirements
Feed of posts/tweets
Post a post/tweet (while defining its visibility)
Comment on a tweet, vote on a tweet
Follow an account
Search for tweets
Moderation through moderators
As most of these interview questions will focus on scaling as well, let's define the non-functional requirements
Non-functional requirements
100M=10^8 users, @ 1 tweet per day per user on average => 10^8 tweets a day
Fast search across the whole dataset
Service globally available
Each user reads hundreds of tweets per day => on the order of 10^10 to 10^11 tweets read each day
So while the write load is high, the read load is even higher.
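To make these numbers more tangible, here is a rough back-of-the-envelope sketch in Python that converts the daily volumes into average per-second rates. The figure of 1,000 reads per user per day is an assumption chosen to match the upper estimate of 10^11 reads a day, and real traffic would have peaks well above these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(events_per_day: float) -> float:
    """Average events per second for a given daily volume."""
    return events_per_day / SECONDS_PER_DAY


users = 10**8
tweets_per_user_per_day = 1
reads_per_user_per_day = 1_000  # assumption: upper end of "hundreds"

writes_per_day = users * tweets_per_user_per_day
reads_per_day = users * reads_per_user_per_day

avg_writes_per_sec = per_second(writes_per_day)
avg_reads_per_sec = per_second(reads_per_day)
read_write_ratio = reads_per_day / writes_per_day

print(f"writes/s ~ {avg_writes_per_sec:,.0f}")
print(f"reads/s  ~ {avg_reads_per_sec:,.0f}")
print(f"read:write ratio ~ {read_write_ratio:,.0f}:1")
```

Under these assumptions the system sees roughly a thousand writes per second but over a million reads per second, which is why the read path deserves most of the optimization effort.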
High Level Design
While we could ask even more questions about further features, this is sufficient to get started and design the high level system.
In this whole series I will be using the C4 Model approach to software design (https://c4model.com/), which I will quickly summarize here. The C4 model uses 4 levels of software design instead of putting everything in a single diagram:
System Context diagram: The scope is a single software system and this diagram shows how people interact with the software system in scope. A software system is the highest level of abstraction of software.
Container diagram: A container (not necessarily a Docker container) is a single executable unit of the software system. The Container diagram shows the containers involved in the system context in scope.
Component diagram: Within each container we have components that we show in this diagram. Components are the next level of abstraction and are not executable on their own.
Code diagram: Here we define how different elements of code (classes, objects, functions, methods, DB tables) interact.
For our example of Twitter, we will focus on some of the more important people interacting with the Twitter Software System.
System Context Diagram
So far we have not had to make any decisions about architectural styles, but in the next step we will need to.
The main architectural styles that are commonly used are
Micro-Services, Service-Oriented
Event-Driven
Layered
Data-Centric
Component-Based
Domain-Driven
I intentionally grouped the micro-services style and SOA together, as they are similar enough and mainly differ in where they sit on a spectrum of service size.
Let's go through which styles make sense in this particular case.
A traditional layered approach will not easily fit the scalability requirements, as each layer will be hard to scale. If you look more closely within parts of the app, some layering design elements will appear, however. So a No.
A data-centric approach can work but would solely rely on the scalability of the database used. For a system with such a simple data model there are also no benefits to this approach. So a No.
A component-based approach improves how the code is organized, but traditionally it is still run as a single executable. There have been arguments that code should be written without defining how it is later executed, but that's a whole other topic, and right now most people would understand a component-based architecture to have a single executable and a single database. As such, it is not suitable at this scale. So a No.
A domain-driven approach can work at any scale, as by itself it does not offer guidance on the specific implementation but rather addresses the complexity inherent in sophisticated software domains. Fundamentally, Twitter is not such a domain. It's rather simple, so we don't have a strong reason to apply this technique here. So a No.
Let's talk about the Event-Driven and Micro-Services (SOA included) approaches. In a traditional Micro-Services design, the part of the system responsible for the posts (tweets) would need to push the data to the part of the system responsible for generating the Twitter feed. That coupling is unnecessary and can be reduced through an Event-Driven approach.
In parts of the system where integration is more important than separation, we will not use the Event-Driven approach.
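To illustrate the decoupling described above, here is a minimal in-process sketch in Python. The names EventBus, TweetPosted, PostService and FeedService are hypothetical and only serve this example. A real deployment would most likely use a message broker between containers rather than a synchronous in-memory bus.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class TweetPosted:
    """Event emitted after a tweet has been stored (hypothetical shape)."""
    tweet_id: int
    author: str
    text: str


class EventBus:
    """Tiny synchronous publish/subscribe bus, for illustration only."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[type, List[Callable]] = defaultdict(list)

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event: object) -> None:
        for handler in self._subscribers[type(event)]:
            handler(event)


class PostService:
    """Knows only about the bus, not about who consumes its events."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self._next_id = 1

    def post(self, author: str, text: str) -> TweetPosted:
        event = TweetPosted(self._next_id, author, text)
        self._next_id += 1
        self._bus.publish(event)
        return event


class FeedService:
    """Reacts to TweetPosted events without being called by PostService."""

    def __init__(self, bus: EventBus) -> None:
        self.received: List[TweetPosted] = []
        bus.subscribe(TweetPosted, self.received.append)


bus = EventBus()
feeds = FeedService(bus)
posts = PostService(bus)
posts.post("alice", "hello world")
print(feeds.received)
```

The point of the design is that PostService never references FeedService. Adding another consumer, such as search indexing or moderation, would only require another subscription.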
Container Diagram
By separating how Commands, Events and Views (concepts borrowed from DDD) fit into the containers to fulfill the requirements, we have effectively arrived at an Event-Driven approach featuring CQRS (Command Query Responsibility Segregation). You can read more about CQRS at https://martinfowler.com/bliki/CQRS.html.
By separating the Post & Comments aspect of our system from the Feeds aspect, the read and write sides are segregated and the two containers can be optimized separately. Post & Comments can now employ a database that is optimized for heavy writes, while Feeds can be continuously updated through events and optimized for heavy reads.
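The split between the write side and the read side can be sketched roughly as follows. This is a simplified in-memory illustration, not the design of any real system: the class names, the follower map and the fan-out-on-write strategy are all assumptions made for the example.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, List


@dataclass(frozen=True)
class PostTweet:
    """Command: a request to change state."""
    author: str
    text: str


@dataclass(frozen=True)
class TweetPosted:
    """Event: a fact describing what happened."""
    tweet_id: int
    author: str
    text: str


class PostsWriteModel:
    """Write side: an append-only log, cheap to write to."""

    def __init__(self) -> None:
        self._log: List[TweetPosted] = []

    def handle(self, cmd: PostTweet) -> TweetPosted:
        event = TweetPosted(len(self._log) + 1, cmd.author, cmd.text)
        self._log.append(event)
        return event


class FeedReadModel:
    """Read side: one precomputed feed per user, updated from events."""

    def __init__(self, followers: Dict[str, List[str]], max_len: int = 100) -> None:
        self._followers = followers  # author -> list of followers (assumed input)
        self._feeds: Dict[str, Deque[TweetPosted]] = defaultdict(
            lambda: deque(maxlen=max_len)
        )

    def apply(self, event: TweetPosted) -> None:
        # Fan out on write: push the tweet into every follower's feed.
        for follower in self._followers.get(event.author, ()):
            self._feeds[follower].appendleft(event)

    def feed(self, user: str) -> List[TweetPosted]:
        # Query: a plain lookup, no joins or aggregation at read time.
        return list(self._feeds.get(user, ()))


followers = {"alice": ["bob", "carol"]}
write_side = PostsWriteModel()
read_side = FeedReadModel(followers)

event = write_side.handle(PostTweet("alice", "hello"))
read_side.apply(event)
print(read_side.feed("bob"))
```

Reading a feed becomes a single lookup, at the price of doing more work per write. For accounts with very many followers, that trade-off would need to be revisited.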
One of the techniques that we will need to employ at the scales in question is partitioning. Partitioning the Feeds will be easy, as each user has a feed personalized for them, so everything is neatly separated by user. AI can be used to optimize each feed for the longest time spent in the app by each user.
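Because each feed belongs to exactly one user, the user ID is a natural partition key. A minimal sketch of hash-based partitioning could look like this. The function name shard_for and the shard count of 16 are made up for the example.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index using a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a cryptographic digest is used here to get a stable value.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 16  # assumption for the example
for user in ("alice", "bob", "carol"):
    print(user, "->", shard_for(user, NUM_SHARDS))
```

Simple modulo hashing like this moves most keys when the number of shards changes. Consistent hashing is a common alternative when shards need to be added or removed regularly.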
Partitioning Post & Comments will be much harder, but high throughput can generally be achieved by putting load balancers and queues in front of the command handling of this container.
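The role of the queue can be sketched with Python's standard library. In this illustration an in-process queue.Queue stands in for a real message queue, a single thread stands in for the command handlers, and a plain list stands in for the database. All of these are simplifying assumptions.

```python
import queue
import threading
from typing import List, Optional


def run_worker(commands: "queue.Queue[Optional[dict]]", store: List[dict]) -> None:
    """Drain commands from the queue and persist them one at a time."""
    while True:
        cmd = commands.get()
        try:
            if cmd is None:  # sentinel: stop the worker
                return
            store.append(cmd)  # stand-in for a database write
        finally:
            commands.task_done()


# A bounded queue applies backpressure: put() blocks when it is full.
commands: "queue.Queue[Optional[dict]]" = queue.Queue(maxsize=1000)
store: List[dict] = []

worker = threading.Thread(target=run_worker, args=(commands, store), daemon=True)
worker.start()

# The API side only enqueues and returns; it never waits for the write.
for i in range(5):
    commands.put({"author": "alice", "text": f"tweet {i}"})

commands.put(None)  # ask the worker to stop
commands.join()     # wait until everything has been processed
print(len(store), "commands stored")
```

The queue absorbs bursts of writes so that the store can be written to at a steady rate, and more workers can be added behind the same queue if one is not enough.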
The Post & Comments container is optimized for writes through an API gateway (with load balancers) and a queue.
The Feeds container is optimized for reads and can be further optimized with caching.
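As one possible way to cache feed reads, here is a small time-to-live (TTL) cache sketch. The class FeedCache, the 30 second TTL and the load_feed callback are hypothetical. In practice a shared cache service would more likely take this role than a per-process dictionary.

```python
import time
from typing import Callable, Dict, List, Tuple


class FeedCache:
    """Read-through cache that serves a feed from memory until it expires."""

    def __init__(
        self,
        load_feed: Callable[[str], List[str]],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_feed = load_feed  # slow path, e.g. a database query
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, user: str) -> List[str]:
        now = self._clock()
        entry = self._entries.get(user)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # cache hit: no call to the slow path
        feed = self._load_feed(user)  # cache miss or expired entry
        self._entries[user] = (now, feed)
        return feed


def slow_load(user: str) -> List[str]:
    return [f"latest tweets for {user}"]


cache = FeedCache(slow_load)
print(cache.get("bob"))  # loads from the slow path
print(cache.get("bob"))  # served from the cache
```

A TTL keeps the cache simple at the cost of feeds being slightly stale. Invalidating entries when a TweetPosted event arrives would be a stricter alternative.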
That's it for now, but more features can be added quite easily to this fundamental design.
Thanks for reading.