tigerabrodi / system-design-notes

Reading different stuff. Sharing links and notes. Having loads of fun.

CAP Theorem with Tom the Prankster

Link: https://newsletter.systemdesigncodex.com/p/cap-theorem

  1. Essential Elements: Consistency, Availability, Partition Tolerance; a distributed system can guarantee at most two of the three at once.
  2. Partition Tolerance: Must-have due to inherent unreliability in communication networks.
  3. Choice between Consistency and Availability:
    • Consistency: All nodes see the same data at the same time. Requires a trade-off with availability.
    • Availability: Ensures that the system is always operational, but might compromise on having the latest data across all nodes.

The Inevitable Law Governing Software Design

Link: https://newsletter.systemdesigncodex.com/p/the-inevitable-law-governing-software-design

  • Basic Principle: The structure of a software system reflects the communication structure of its creating organization.
  • Example: In a company with inventory, invoicing, and shipping departments, the software will likely have separate systems for each, mirroring these divisions.
  • Implications:
    • Software integration quality depends on how well these departments communicate.
    • Better communication leads to more effective and integrated software modules.
  • Strategies for Addressing Conway’s Law:
    1. Acknowledge It: Recognize its impact on software design.
    2. Structure Teams Effectively: Place teams working on similar systems close to each other for better communication.
    3. Avoid Dividing by Technology: Instead of splitting teams by tech layers (front end, back end), focus on business features for smoother collaboration.
    4. Use Architectural Insights: Align team structures with desired software architecture, understanding that organizational decisions influence software design.

The Ingredients to Delicious Software

Link: https://newsletter.systemdesigncodex.com/p/the-ingredients-to-delicious-software

  1. Scalability:

    • Ability to handle increased workload efficiently.
    • Important to identify the point where scaling becomes cost-ineffective.
  2. Latency & Throughput:

    • Latency: Time taken to respond to a request (e.g., time to serve a cheese sandwich).
    • Throughput: Number of requests handled in a given time (e.g., serving multiple customers).
  3. Availability and Consistency:

    • Availability: Ability to operate despite issues (e.g., with one cook absent).
    • Measured in 'nines' (e.g., 99.9% availability).
    • Consistency: Synchronization of information across different parts of the system (e.g., order copies being in sync).

What happens when you type a URL into your browser?

Link: https://blog.bytebytego.com/p/what-happens-when-you-type-a-url

When you type a URL into your browser (a socket-level sketch of steps 2-5 follows the list):

  1. URL Parsing: The browser identifies the HTTP protocol, domain, path, and resource.
  2. DNS Lookup: It searches for the IP address of the domain, checking various caches.
  3. TCP Connection: Establishes a connection with the server.
  4. HTTP Request: Sends a request for the specific resource.
  5. Server Response: The server sends back the requested content.
  6. Rendering: The browser displays the webpage.
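
A rough reproduction of steps 2-5 using Python's standard library; example.com is a stand-in host, and real browsers add caching, TLS, keep-alive, and much more.

```python
import socket

host = "example.com"  # stand-in domain

ip = socket.gethostbyname(host)                    # 2. DNS lookup
with socket.create_connection((ip, 80)) as conn:   # 3. TCP connection
    request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    conn.sendall(request.encode())                 # 4. HTTP request
    response = b""
    while chunk := conn.recv(4096):                # 5. server response
        response += chunk

print(response.split(b"\r\n")[0].decode())         # e.g. "HTTP/1.1 200 OK"
```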

How does CDN work?

Link: https://blog.bytebytego.com/p/how-does-cdn-work

  1. Domain Name Lookup:

    • Bob enters www.myshop.com in his browser.
    • The browser checks the local DNS cache for the domain.
  2. DNS Resolver:

    • If not in the local cache, the browser contacts the DNS resolver (usually via the ISP).
  3. Recursive Domain Resolution:

    • The DNS resolver performs recursive resolution for www.myshop.com.
  4. CDN Integration:

    • Instead of pointing directly to the London server, the authoritative name server redirects to a CDN domain (www.myshop.cdn.com).
  5. Load Balancer Query:

    • The DNS resolver queries the CDN load balancer domain (www.myshop.lb.com).
  6. Optimal Server Selection:

    • The CDN load balancer selects the best CDN edge server based on factors like the user’s location and server load.
  7. Content Delivery:

    • The browser connects to the chosen CDN edge server to load content.
    • The content includes static (e.g., images, videos) and dynamic elements.
    • If content is not on the edge server, it's fetched from higher-level CDN servers or the origin server in London.
  8. CDN Network:

    • This process is part of a geographically distributed CDN network for efficient content delivery.

Time complexity

Link: https://newsletter.francofernando.com/p/time-complexity

  1. Purpose: Time complexity evaluates how an algorithm's performance scales with the size of the input data.

  2. Types of Complexity:

    • Worst-Case Complexity: Maximum number of steps for any input of size n. Most commonly used as it provides guarantees about the algorithm's upper limit.
    • Best-Case Complexity: Minimum number of steps for any input of size n.
    • Average-Case Complexity: Average number of steps over all possible instances of input size n.
  3. Big Oh Notation: Simplifies the expression of an algorithm's worst-case complexity by focusing on growth rates rather than precise step counts.

  4. Common Complexity Classes:

    • Constant - O(1): Time is independent of input size (e.g., adding two numbers).
    • Logarithmic - O(log n): Each step cuts the problem size in half (e.g., binary search; sketched after this list).
    • Linear - O(n): Time grows linearly with input size (e.g., finding max in an array).
    • Superlinear - O(n log n): Combines linear and logarithmic growth (e.g., Mergesort, and Quicksort on average).
    • Quadratic - O(n^2): Time grows with the square of input size (e.g., insertion sort).
    • Cubic - O(n^3): Involves triple nested loops (e.g., certain dynamic programming algorithms).
    • Exponential - O(c^n): Time multiplies by a constant c for each unit added to the input, doubling when c = 2 (e.g., enumerating subsets).
    • Factorial - O(n!): Time grows with the factorial of input size (e.g., generating permutations).
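
A minimal example of the logarithmic class: iterative binary search halves the interval on every step, so a sorted array of n elements needs at most about log2(n) + 1 comparisons.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent. O(log n)."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1          # discard the lower half
        else:
            hi = mid - 1          # discard the upper half
    return -1

print(binary_search([2, 3, 5, 7, 11, 13], 11))  # 4
```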

8 Reasons Why WhatsApp Was Able to Support 50 Billion Messages a Day With Only 32 Engineers

Link: https://newsletter.systemdesign.one/p/whatsapp-engineering

  1. Single Responsibility Principle:

    • Focus on core feature: Messaging.
    • Avoided feature creep and unnecessary functionalities.
    • Prioritized reliability above all.
  2. Technology Stack:

    • Chose Erlang for server functionalities due to its scalability and support for hot-loading.
    • Erlang's efficient threading and context-switching mechanisms contributed to performance.
  3. Utilizing Existing Solutions:

    • Leveraged open-source solutions like Ejabberd, an Erlang-based messaging server.
    • Customized existing solutions to fit specific needs.
    • Integrated third-party services for functionalities like push notifications.
  4. Cross-Cutting Concerns:

    • Emphasized aspects like monitoring and alerting for service health.
    • Implemented Continuous Integration and Continuous Delivery for software development.
  5. Scalability Strategies:

    • Adopted diagonal scaling, combining horizontal and vertical scaling methods.
    • Ran servers on FreeBSD, optimized for handling millions of connections.
    • Overprovisioned servers for handling traffic spikes and potential failures.
  6. Continuous Improvement (Flywheel Effect):

    • Regularly measured performance metrics to identify and eliminate bottlenecks.
    • Maintained a cycle of continuous feedback and improvement.
  7. Focus on Quality:

    • Conducted load testing to identify and address single points of failure.
    • Used simulated production traffic for realistic testing.
  8. Small Team Size:

    • Kept the engineering team small (32 engineers) to maintain efficiency and reduce communication overhead.

This Is How Quora Shards MySQL to Handle 13+ Terabytes

Link: https://newsletter.systemdesign.one/p/mysql-sharding

  1. Vertical Sharding:

    • Implementation: Separating tables into different servers (leader-follower model).
    • Purpose: Enhances write scalability.
    • Challenges: Replication lag, transactional limitations, and potential performance issues for large tables.
  2. Horizontal Sharding:

    • Reasons for Adoption: Addressing challenges with large tables such as schema changes and error risks.
    • Approach: Splitting a logical table into multiple physical tables.
  3. Key Decisions in Horizontal Sharding:

    • Build vs. Buy: Opted to build their own sharding solution, reusing vertical sharding logic.
    • Shard Level: Focused on sharding at the table level due to extensive use of secondary indexes.
    • Sharding Method: Chose range-based partitioning, favoring common range queries (a lookup sketch follows this list).
    • Metadata Management: Stored shard metadata in Apache Zookeeper.
    • Database API: Modified to handle sharding columns and keys, enhancing security against SQL injections.
    • Sharding Column Selection: Based on latency sensitivity and query per second (QPS) considerations.
    • Cross-Shard Indexes: Used to optimize non-sharding column queries, though with potential performance and consistency trade-offs.
    • Number of Shards: Maintained a lower count to reduce latency in non-sharding column queries.
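
A minimal sketch of range-based partitioning in general, not Quora's actual code: bisect over shard boundary keys finds the owning shard. Boundaries and shard names here are hypothetical.

```python
import bisect

# Hypothetical upper-bound boundary keys; the last shard is open-ended.
BOUNDARIES = [1_000_000, 2_000_000, 3_000_000]
SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(sharding_key: int) -> str:
    """Range-based lookup: pick the first shard whose boundary exceeds the key."""
    return SHARDS[bisect.bisect_right(BOUNDARIES, sharding_key)]

print(shard_for(150))        # shard-0
print(shard_for(2_500_000))  # shard-2
```

Because neighboring keys share a shard, a range query touches only contiguous shards, which is exactly the property range-based partitioning favors.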

Making Your Database Highly Available

Link: https://newsletter.systemdesigncodex.com/p/making-your-database-highly-available

Redundancy

  • Purpose: Ensure continuous database operation even if one server fails.
  • Not Backup: Unlike backups, redundancy involves running multiple active database instances.
  • Cost of Outage: Can be significantly high, averaging $7,900 per minute.
  • Redundancy Patterns:
    • Active-Passive: One active server handles requests while others stand by.
    • Active-Active: Multiple servers handle requests simultaneously.
    • Multi-Active: An extension of Active-Active with more complex setups.

Isolation

  • Goal: Minimize disaster impact by physically separating database components.
  • Degrees of Separation:
    • Server: Different servers in the same data center.
    • Rack: Separate racks within a data center.
    • Data-Center: Multiple data centers.
    • Availability Zone: Distinct zones within a cloud provider's network.
    • Region: Geographically dispersed locations.

How Rate Limiting Works

Link: https://newsletter.systemdesigncodex.com/p/how-rate-limiting-works

  1. Concept: Limits the number of requests sent to a server.
  2. Implementation: A rate limiter is used to control traffic to servers or APIs.

Key Concepts

  1. Limit: Maximum number of requests allowed in a set time frame (e.g., 600 requests per day).
  2. Window: The duration for the limit, varying from seconds to days.
  3. Identifier: A unique attribute (like User ID or IP address) to identify request senders.

Designing a Rate Limiter

  • Process:
    1. Count Requests: Track the number of requests from a user or IP.
    2. Limit Exceeded: If the count exceeds the limit, block or restrict further requests (a minimal sketch follows this list).
  • Considerations:
    • Storage of request counters.
    • Rate limiting rules.
    • Response strategy for blocked requests.
    • Rule change implementation.
    • Maintaining application performance.
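
A minimal fixed-window counter sketch of this process, assuming a single-process, in-memory store; a production limiter would keep counters in a shared cache such as Redis, and the limit and window values here are illustrative.

```python
import time
from collections import defaultdict

class FixedWindowRateLimiter:
    """Count requests per identifier within a fixed time window."""

    def __init__(self, limit=600, window_seconds=86400):
        self.limit = limit                # max requests per window
        self.window = window_seconds      # window length in seconds
        self.counters = defaultdict(int)  # (identifier, window) -> count

    def allow(self, identifier: str) -> bool:
        current_window = int(time.time()) // self.window
        key = (identifier, current_window)
        if self.counters[key] >= self.limit:
            return False                  # over limit: caller can answer HTTP 429
        self.counters[key] += 1
        return True

limiter = FixedWindowRateLimiter(limit=3, window_seconds=60)
print([limiter.allow("user-42") for _ in range(5)])  # [True, True, True, False, False]
```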

System Components

  • Rate Limiter Component: Checks incoming requests against the rules and stored data (number of requests made).
  • Rules Engine: Defines the rate limiting rules.
  • Cache: Stores rate-limiting data for high throughput and low latency.
  • Response Handling:
    • Allow request if within limit.
    • Block request if over limit, typically with HTTP status code 429.

Improvements

  • Silent Drop: Fool attackers by silently dropping excess requests.
  • Cached Rules: Enhance performance with a cache for the rules engine and background updates for rule changes.

Caching

Link: https://newsletter.francofernando.com/p/caching

Concept of Caching

  • Purpose: Speeds up data access by storing data temporarily in a fast-access hardware or software layer.
  • Cache Hit: Data is found in the cache.
  • Cache Miss: Data is not in the cache and must be fetched from its original location.

Caching in Distributed Systems

  • Levels: Hardware, OS, front-end, web apps, databases, etc.
  • Roles:
    • Reducing latency.
    • Saving network requests.
    • Storing results of resource-intensive operations.
    • Avoiding repetitive operations.

Types of Caching

  • Application Caching: Integrated into app code, checks cache before database access. Examples: Memcached, Redis.
  • Database Caching: Built into databases, requires no code changes, optimizes data retrieval.

Considerations and Challenges

  • Cache Miss Rate: High miss rates add latency rather than saving it, since each miss pays for the cache lookup plus the original fetch.
  • Stale Data: Ensuring cache data is up-to-date and relevant.

Caching Strategies

  1. Cache Aside (Lazy Loading):

    • Direct read from cache; on a miss, read from the DB and update the cache (sketched after this list).
    • Advantages: Good for read-heavy workloads. Cache only stores necessary data.
    • Disadvantages: Can serve stale data. Initial cache misses.
  2. Read Through:

    • Interact only with cache. Cache manages data fetching from DB.
    • Simplifies app code but complicates cache implementation.
  3. Write Through:

    • Writes data to cache and DB simultaneously.
    • Ensures data consistency. Higher write latency.
  4. Write Back (Asynchronous Writing):

    • Writes data to cache, then asynchronously to DB.
    • Lower write latency. Good for write-heavy workloads.
  5. Write Around:

    • Writes directly to DB, cache only stores read data.
    • Good for infrequently read data. Higher read latency for new data.
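
A minimal cache-aside sketch, using a plain dict to stand in for Memcached/Redis and a hypothetical fetch_from_db stub:

```python
cache = {}  # stands in for Memcached/Redis

def fetch_from_db(user_id):
    """Hypothetical database read, shown as a stub."""
    return {"id": user_id, "name": "example"}

def get_user(user_id):
    """Cache-aside read: check the cache first, fall back to the DB on a miss."""
    user = cache.get(user_id)
    if user is None:                  # cache miss
        user = fetch_from_db(user_id)
        cache[user_id] = user         # populate for later reads
    return user

get_user(42)  # miss: hits the DB and fills the cache
get_user(42)  # hit: served from the cache
```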

Choosing a Cache Strategy

  • Depends on data access patterns.
  • Cache-Aside: Good for general-purpose, read-intensive applications.
  • Write-Heavy Workloads: Write-back approaches are beneficial.
  • Infrequent Reads: Write-around strategy.

Eviction Policies

  • Manage Limited Cache Space:
    • FIFO: First in, first out.
    • LIFO: Last in, first out.
    • LRU: Least recently used (sketched below).
    • MRU: Most recently used.
    • LFU: Least frequently used.
    • RR: Random replacement.
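
As one concrete policy, a minimal LRU cache built on Python's OrderedDict, which tracks access order for us:

```python
from collections import OrderedDict

class LRUCache:
    """Evict the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)         # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used entry
```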

Database Replication Under the Hood

Link: https://newsletter.systemdesigncodex.com/p/database-replication-under-the-hood

Statement-based Replication

  • How It Works: The leader logs every SQL write statement (INSERT, UPDATE, DELETE) and forwards these statements to follower nodes.
  • Advantages:
    • Efficient in network bandwidth, only SQL statements are transferred.
    • Portable across different database versions.
    • Simpler to implement.
  • Limitations:
    • Non-deterministic functions (e.g., NOW(), UUID()) yield different values on replicas.
    • Transactions involving auto-incrementing columns must be executed in the same order.
    • Potential unforeseen effects due to triggers or stored procedures.

Shipping the Write-Ahead Log (WAL)

  • Concept: The WAL, an append-only sequence of all writes, is shared with follower nodes.
  • Usage: Common in databases like PostgreSQL.
  • Advantage: Creates an exact replica of the leader’s data structures.
  • Disadvantage: Tightly coupled to the storage engine, making it less flexible with database version changes and hindering zero-downtime upgrades.

Row-Based Replication

  • Functionality: Uses a logical log showing writes in a row format.
  • Operation Details:
    • Inserts log new values for all columns.
    • Deletes log identifiers for deleted rows.
    • Updates log identifiers and new values for modified columns.
  • Advantage: Decouples from the storage engine, allowing backward compatibility and version flexibility between leader and follower databases.

Choosing Replication Methods

  • The choice depends on the specific requirements of the system, such as:
    • Network efficiency.
    • Consistency requirements.
    • Database version compatibility.
  • Statement-based Replication: Best for simple, less concurrent environments.
  • WAL Shipping: Suitable for systems where exact replica and data integrity are critical.
  • Row-Based Replication: Ideal for environments requiring flexibility and compatibility across different database versions.

Consistent Hashing

Link: https://newsletter.francofernando.com/p/consistent-hashing

Caching Servers

  • Use Case: Store frequently accessed data in fast, in-memory caches.
  • Hashing Role: Ensures identical requests are sent to the same server by hashing request attributes (IP, username, etc.).
  • Challenge: Maintaining effective caching when servers are added or removed.

Data Partitioning

  • Purpose: Distribute data across multiple database servers.
  • Hashing Function: Data keys are hashed to determine the server where data will be stored.
  • Limitation: Similar to caching, adding or removing servers complicates data distribution.

The Hashing Problem

  • Goal: Map keys (data identifiers or workload requests) to servers efficiently.
  • Desired Properties:
    • Balancing: Equal distribution of keys among servers.
    • Scalability: Easily adding or removing servers with minimal reconfiguration.
    • Lookup Speed: Quickly finding the server for a given key.

Naïve Hashing Approach

  • Method: Number servers, use hash(key) % N to assign keys to servers.
  • Drawback: Not scalable. Changing the server count N remaps nearly all keys (see the demo below).
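
A small demo of that drawback, using Python's built-in hash purely for illustration: going from four servers to five moves roughly four out of five keys.

```python
keys = [f"user-{i}" for i in range(1000)]

before = {k: hash(k) % 4 for k in keys}  # 4 servers
after = {k: hash(k) % 5 for k in keys}   # a 5th server is added

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys changed servers")  # typically ~80%
```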

Consistent Hashing

  • Concept: Treat hash values as a circular space. Map keys and servers onto this circle.
  • Operation: Assign each key to the nearest server on the circle in a clockwise direction.
  • Advantages:
    • Only a fraction of keys need remapping when adding/removing servers.
    • Better scalability.
  • Issue: Does not guarantee even key distribution (balancing).

Virtual Nodes Solution

  • Strategy: Introduce replicas or virtual nodes for each server on the hash circle.
  • Benefits:
    • Better balancing due to smaller ranges and more uniform key distribution.
    • Faster rebalancing when servers are added or removed.
    • Support for server fault tolerance and heterogeneity.
  • Implementation: Assign more virtual nodes to more powerful servers for load balancing (a ring sketch follows).
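
A minimal consistent-hash ring with virtual nodes, assuming MD5 for placement and bisect for the clockwise lookup; server names and the vnode count are illustrative.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map keys to servers on a hash circle, walking clockwise to the next node."""

    def __init__(self, servers, vnodes=100):
        self._positions = []  # sorted ring positions
        self._owner = {}      # position -> physical server
        for server in servers:
            for i in range(vnodes):             # virtual nodes per server
                pos = self._hash(f"{server}#{i}")
                bisect.insort(self._positions, pos)
                self._owner[pos] = server

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def server_for(self, key: str) -> str:
        idx = bisect.bisect(self._positions, self._hash(key))
        if idx == len(self._positions):         # wrap around the circle
            idx = 0
        return self._owner[self._positions[idx]]

ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
print(ring.server_for("user-42"))
```

Removing a server deletes only its positions, so only the keys in those arcs move, which is the scalability win over hash(key) % N.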

Why Replication Lag Occurs in Databases

Link: https://newsletter.systemdesigncodex.com/p/why-replication-lag-occurs-in-databases

  • Concept: Replication Lag is the delay between a write operation on the leader node and its replication on follower nodes in a database system.

  • Leader-based Replication Setup:

    • Writes are processed by a single node (leader).
    • Read queries can be served by any replica (follower).
    • Common in systems with more reads than writes.
  • Asynchronous vs. Synchronous Replication:

    • Synchronous: All replicas must confirm write operations, causing potential unavailability if a replica is down.
    • Asynchronous: Allows distribution of reads across followers, but can lead to outdated reads if a follower lags.
  • How Replication Lag Occurs:

    1. User A updates data on the leader node.
    2. Leader sends replication data to followers.
    3. User B reads from a follower (replica 2) before it's updated, receiving outdated information.
    4. Replica 2 eventually gets updated.
  • Implications:

    • Lag duration varies from fractions of a second to minutes.
    • Causes temporary data inconsistencies (eventual consistency).
    • Large lags can significantly impact application performance.
  • Challenge: Managing replication lag to minimize data inconsistencies and ensure efficient operation.

Problems Caused by Database Replication

Link: https://newsletter.systemdesigncodex.com/p/problems-caused-by-db-replication

  1. Vanishing Updates

    • Scenario: User updates data on the leader node, but a subsequent read request to a lagging replica shows outdated data.
    • Problem: User experiences frustration as their updates appear to vanish.
    • Solution: Implement read-after-write consistency. Methods include:
      • Reading user-modified data from the leader.
      • Tracking recent writes with timestamps.
      • Monitoring and limiting queries on lagging replicas.
  2. Going Backward in Time

    • Issue: User sees an update (e.g., a new comment) and then it disappears upon refreshing, due to a lagging replica.
    • User Experience: Confusion and inconsistency.
    • Solution: Ensure Monotonic Reads.
      • Users always read from the same replica.
      • Use hashing based on User ID for replica selection (sketched after this list).
  3. Violation of Causality

    • Problem: In sharded databases, replication lag causes sequence disorder in communication (e.g., a reply appears before the original message).
    • Result: Appears as if cause and effect are reversed.
    • Solution: Provide consistent prefix reads.
      • Ensures writes are read in the order they were made.
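
A minimal illustration of the monotonic-reads fix from item 2: hash the User ID once so every read from that user lands on the same replica. Replica names are hypothetical.

```python
import hashlib

REPLICAS = ["replica-1", "replica-2", "replica-3"]  # hypothetical followers

def replica_for(user_id: str) -> str:
    """Pin all reads from the same user to the same replica."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return REPLICAS[int(digest, 16) % len(REPLICAS)]

print(replica_for("user-42"))  # always the same replica for this user
```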

How Request Coalescing Works

Link: https://newsletter.systemdesigncodex.com/p/how-request-coalescing-works

Concept: Request Coalescing is a technique for optimizing database queries by reducing redundant requests for the same data.

Application: Successfully used by Discord to manage trillions of messages efficiently.

Functionality:

  1. Setup: Involves intermediary data services between the API layer and the database.
  2. Process:
    • When the first request is made, a worker task is initiated in the data service.
    • Subsequent requests for the same data subscribe to this existing task.
    • The worker task queries the database once and returns the result to all subscribers simultaneously (a minimal sketch follows this list).
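
A minimal asyncio sketch of this idea, not Discord's actual implementation: the first caller for a key starts the query, and later callers await the same in-flight future.

```python
import asyncio

in_flight: dict[str, asyncio.Future] = {}  # key -> pending result

async def query_db(key: str) -> str:
    await asyncio.sleep(0.1)               # stand-in for the real database call
    return f"row for {key}"

async def coalesced_get(key: str) -> str:
    if key in in_flight:                   # subscribe to the existing worker
        return await in_flight[key]
    future = asyncio.get_running_loop().create_future()
    in_flight[key] = future
    try:
        result = await query_db(key)       # only the first request hits the DB
        future.set_result(result)
        return result
    except Exception as exc:
        future.set_exception(exc)          # wake subscribers on failure too
        raise
    finally:
        del in_flight[key]                 # later requests trigger a fresh query

async def main():
    results = await asyncio.gather(*(coalesced_get("msg:1") for _ in range(5)))
    print(results)  # five identical results from a single DB query

asyncio.run(main())
```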

Differences from Caching:

  1. Request Initiation: In request coalescing, only the first request triggers a database query. Subsequent ones wait for its result. In caching, all requests would hit the cache.
  2. Use with Caching: Request coalescing can complement caching by reducing the number of hits to the cache.

Internal Working (Based on Discord's Implementation):

  • Each worker task maintains a local state with requests and a list of requesters.
  • Responses are propagated to all waiting requesters upon arrival.

Applicability:

  • Request Coalescing is particularly useful for systems with high concurrency and redundant requests.
  • The necessity of this technique depends on the scale and specific challenges of the system.

How to Migrate a MySQL Database

Link: https://newsletter.systemdesign.one/p/how-to-migrate-a-mysql-database

Context: Tumblr's MySQL database, spanning 21 terabytes and 60+ billion rows across 200+ servers, necessitated a migration strategy that minimizes user impact.

Challenges:

  • Maintaining high availability and scalability.
  • Minimizing downtime and user impact during migration.

Strategies Used:

  1. CQRS Pattern (Command and Query Responsibility Segregation):

    • Separated read and write operations for the database.
    • Ensured continuous read availability during migration.
  2. Leader-Follower Replication:

    • Leader in a remote data center handled read-write operations.
    • Local data center had followers for handling read requests.
    • Used persistent connections to reduce latency issues.
  3. Database Proxy (ProxySQL):

    • Positioned in the local data center.
    • Maintained persistent connections to the remote leader.
    • Enabled connection pooling, improving performance and reducing disconnections.

Migration Process:

  1. Preparation:
    • Stored metadata of leaders, followers, and proxies in each data center.
  2. Migration Execution:
    • Shifted the database leader from Data Center A to B.
    • Automated tools redirected followers and proxies to the new leader.
  3. Outcome:
    • Followers continued serving read requests.
    • Write requests were briefly halted or buffered, resulting in minimal user impact.

Consideration for Further Improvement:

  • Leader-Leader Replication: Could enhance write availability but poses a risk of data conflicts.
  • Reason for Non-Use: Potential conflicts might be why Tumblr opted against this approach.

Durability

Link: https://newsletter.francofernando.com/p/durability

Core Objective: Durability, ensuring data is not lost despite failures like power outages, system crashes, or hardware issues.

Single-Node Database Persistence

  1. Durability Method: Data is written to nonvolatile storage (hard drive, SSD).
  2. Transaction Processing:
    • Log Writing: Data first written to a log file before making actual data updates.
    • Update Execution: After log entry, the database updates the actual data.
    • Role of Log: Enables reprocessing of transactions to restore consistent state post-failure.
    • Efficiency: Log writing is fast due to its append-only nature, minimizing seek time (see the sketch below).
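
A minimal append-only log sketch: the record is appended and fsynced before the actual data update is applied, so recovery can replay the log. The file name and record format are illustrative.

```python
import json
import os

LOG_PATH = "transactions.log"  # illustrative file name

def log_write(record: dict) -> None:
    """Append a transaction record and force it to disk before applying it."""
    with open(LOG_PATH, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
        log.flush()
        os.fsync(log.fileno())  # durable once fsync returns

def apply_update(state: dict, record: dict) -> None:
    state[record["key"]] = record["value"]

state = {}
record = {"txn": 1, "key": "balance:42", "value": 100}
log_write(record)             # 1. write to the log first
apply_update(state, record)   # 2. only then update the actual data
```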

Distributed Database Persistence

  1. Complexity: Higher due to the need for coordination across multiple servers.
  2. Two-Phase Commit Protocol:
    • Coordinator Role: A designated server coordinates the commit process.
    • Process:
      • Coordinator sends commit instruction to all participant servers.
      • Waits for acknowledgments from all participants.
      • Finalizes the transaction with a commit or rollback based on responses (sketched below).
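
A minimal in-process simulation of the two-phase flow, with participants as plain objects; real systems exchange these messages over the network and persist every decision.

```python
class Participant:
    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy

    def prepare(self) -> bool:
        return self.healthy      # vote yes only if able to commit

    def commit(self) -> None:
        print(f"{self.name}: committed")

    def rollback(self) -> None:
        print(f"{self.name}: rolled back")

def two_phase_commit(participants) -> bool:
    # Phase 1: the coordinator asks everyone to prepare and collects votes.
    if all(p.prepare() for p in participants):
        # Phase 2a: unanimous yes -> commit everywhere.
        for p in participants:
            p.commit()
        return True
    # Phase 2b: any no vote -> roll back everywhere.
    for p in participants:
        p.rollback()
    return False

two_phase_commit([Participant("db-1"), Participant("db-2", healthy=False)])
```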

Redis

Link: https://newsletter.francofernando.com/p/redis

Redis Overview:

  • Redis stands for REmote DIctionary Server.
  • It's an open-source, key-value database store.
  • Functions as a data structure server, supporting various data structures like Strings, Lists, Sets, Hashes, Sorted Sets, and HyperLogLogs.

History:

  • Created by Salvatore Sanfilippo in the late 2000s.
  • Developed to address scaling issues with MySQL in real-time analytics.
  • Gained popularity and wide adoption due to its efficiency and flexibility.

Operations and Data Types:

  • Basic operations include GET and SET (example below).
  • Supports diverse data structures, each with specific use cases and operations.
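
A quick example of the basic operations with the redis-py client, assuming a Redis server on localhost:6379; keys and values are illustrative.

```python
import redis  # pip install redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

r.set("user:42:name", "Tiger")       # basic SET
print(r.get("user:42:name"))         # basic GET -> "Tiger"

r.lpush("recent:logins", "user:42")  # Lists
r.sadd("online", "user:42")          # Sets
r.hset("user:42", mapping={"name": "Tiger", "plan": "pro"})  # Hashes
```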

Redis Architectures:

  1. Single Instance: Simplest form, running on the same or a separate server.
  2. Replicated Instances: Primary instance replicated across secondary instances for parallel read requests and backup.
  3. Sentinel: Manages high availability, monitoring, and failure handling.
  4. Cluster: Distributes data across multiple machines using sharding.

Data Persistency:

  • Offers two methods:
    • RDB (Redis Database Backup): Snapshot-based backups.
    • AOF (Append Only File): Logs every change for more recent backups.
  • Choice between RDB and AOF depends on the need for speed vs. data recency.

Single-thread Model:

  • Utilizes a single-threaded model for operations, avoiding multi-threading overhead.
  • Performance typically limited by memory and network, not CPU.

Use Cases:

  • Database: As a primary key-value store.
  • Cache: For storing frequent queries or caching API requests.
  • Pub/Sub: For scalable and fast messaging systems.

Salt and Pepper

Link: https://newsletter.francofernando.com/p/salt-and-pepper

1. Hashing

  • Method: Converts plain text passwords into a random string of characters.
  • Process: User's password is hashed and compared with the stored hash during login.
  • Common Algorithms: MD5, SHA family. However, these are vulnerable to rainbow table attacks.

2. Salting

  • Purpose: Enhances hashing by defending against pre-computation attacks like rainbow tables.
  • Implementation:
    • Generate a unique salt for each password.
    • Combine salt with the password and hash the result.
    • Store the salt in plain text and the hashed password in the database.
  • Validation Process:
    1. Retrieve the salt from the database.
    2. Combine entered password with salt and hash.
    3. Compare with stored hash for validation.
  • Uniqueness: Ensures each stored hash is unique, even for identical passwords (a sketch follows this list).
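
A minimal salted-hash sketch using only the standard library; PBKDF2 is chosen here as an assumption (the article names no algorithm) because, unlike plain MD5/SHA, it is deliberately slow.

```python
import hashlib
import secrets

def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = secrets.token_bytes(16)  # unique salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest             # store both; the salt may be plain text

def verify_password(password: str, salt: bytes, stored: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return secrets.compare_digest(candidate, stored)  # constant-time compare

salt, digest = hash_password("hunter2")
print(verify_password("hunter2", salt, digest))  # True
print(verify_password("wrong", salt, digest))    # False
```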

3. Peppering

  • Function: Adds an extra layer of security to salting.
  • Mechanism:
    • Add a pepper value to the password before hashing.
    • The pepper is not stored in the database.
  • Login Process:
    • Attempt combinations of password and pepper until a match is found.
  • Benefit: Significantly increases the effort required for brute force attacks.

Key Takeaways:

  • Combining Techniques: Using both salting and peppering provides robust protection.
  • Importance of Uniqueness: Unique salts and peppers make each hash distinct.
  • Updating Practices: Continuously update and improve password storage methods to counteract new hacking techniques.

Moving from Monolithic to Microservices

Link: https://newsletter.systemdesigncodex.com/p/from-monolithic-to-microservices

1. Modular Monolith Approach

  • Concept: Incorporates modular design within a monolithic architecture.
  • Characteristics:
    • Loosely-coupled modules.
    • Well-defined boundaries.
    • Explicit dependencies.
  • Structure: Application divided into independent modules.
  • Deployment: Still maintains single application deployment.
  • Advantages:
    • Streamlines development and maintenance.
    • Offers microservices-like benefits without associated complexities.

2. Evolution to Vertical Slice Architecture

  • Design Shift: From horizontal layers to vertical slices of business functionality.
  • Benefits:
    • Scoped changes to specific business areas.
    • Easier feature addition and modification.
  • Microservices Potential: Vertical modules can gradually evolve into independent microservices.
  • Learning Opportunity: Provides insights into domain and functional splits.

Key Takeaway

  • Balance: No inherent superiority of microservices over monoliths or vice versa.
  • Evolutionary Approach: Adapt the architecture to evolving application needs.
  • Pragmatism: Choose the architecture that best suits the project's requirements and context.

The Secret Trick to High-Availability

Link: https://newsletter.systemdesigncodex.com/p/the-secret-trick-to-high-availability

Strategies for Static Stability

  1. Active-Active High Availability:

    • Implementation: Distribute traffic across instances in multiple Availability Zones (AZs).
    • Example: If two instances are needed, create three (50% over-provisioning).
    • Benefit: Maintains full capacity even if an entire AZ fails.
  2. Active-Passive High Availability:

    • Use Case: For stateful services like databases.
    • Setup: Primary instance in one AZ and a standby in another.
    • Function: Standby becomes primary if the original primary AZ goes down.

Criticism and Justification

  • Criticism: Viewed as resource wasteful due to over-provisioning.
  • Justification:
    • Essential for mission-critical applications where downtime is unacceptable.
    • Used by major cloud services like AWS (EC2, S3, RDS) to prevent outages.

Key Takeaway

  • Outages as a Norm: Disruptions are inevitable; planning for them is crucial.
  • Risk Management: Over-provisioning is a strategic choice to mitigate downtime risks.
  • Context-Dependent: The level of static stability required varies based on the system's criticality.

Static stability, while resource-intensive, is a fundamental approach for ensuring continuous operation in high-stake environments where reliability and uptime are non-negotiable.

4 Types of NoSQL Databases

Link: https://newsletter.systemdesigncodex.com/p/4-types-of-nosql-databases

1. Document Databases

  • Examples: MongoDB, Couchbase, RavenDB.
  • Data Storage: In the form of JSON, BSON, or XML documents.
  • Advantages: Align closely with domain-level data objects in applications.
  • Use Case: Ideal for projects requiring a structure close to application data.

2. Key-Value Store

  • Examples: Redis, etcd, DynamoDB.
  • Structure: Data stored as key-value pairs.
  • Simplicity: Resembles a two-column table (key and value).
  • Use Cases: Caching, shopping carts, user profiles.

3. Column-Oriented Database

  • Examples: Apache Cassandra, Apache HBase.
  • Storage Method: Data stored in columns rather than rows.
  • Advantages: Efficient for analytics and aggregations on specific columns.
  • Considerations: Not strongly consistent; write operations can be complex.

4. Graph Databases

  • Examples: Neo4j, Amazon Neptune.
  • Concept: Focuses on relationships between data elements (nodes and links).
  • Strengths: Eliminates the need for multiple table joins as in SQL databases.
  • Use Cases: Knowledge graphs, social networks, map-like applications.

Decision Guide:

  • Document DBs: Versatile, suitable for most applications traditionally using SQL.
  • Key-Value Stores: For applications requiring fast read/write access to data items.
  • Column-Oriented: Analytics and operations on large datasets.
  • Graph Databases: Applications where relationships are central to the data model.
