muthu-cs / system-design

System Design Interview Preparation

System Design Basics

  • Key Characteristics and Fundamentals of Distributed Systems
  • Monolithic VS Microservice (Service Discovery, Resiliency)
  • Vertical vs horizontal scaling Watch1
  • Load Balancing / Application Delivery Controller (ADC) Read1 Read2 Watch1
  • Consistent Hashing Watch1 Read1 Read2 Read3
  • Throughput, Latency
  • CAP theorem
  • ACID vs BASE
  • Redundancy and Replication
  • Partitioning/Sharding
  • Optimistic vs pessimistic locking
  • Strong vs eventual consistency
  • SQL vs NoSQL
  • Types of NoSQL (Key value, Wide column, Document-based, Graph-based)
  • Caching
  • Data center/racks/hosts
  • CPU/memory/Hard drives/Network bandwidth
  • Random vs sequential read/writes to disk
  • DNS lookup
  • HTTP, HTTPS, HTTP/2
    • HTTP
    • HTTPS Read1
    • HTTP & SSL/TLS
    • Public key infrastructure and certificate authority (CA)
    • Symmetric vs asymmetric encryption
  • WebSockets
  • Long-Polling vs WebSockets vs Server-Sent Events
  • TCP/IP model
  • IPv4 vs IPv6
  • TCP vs UDP
  • CDNs & Edges
  • Data Partitioning
  • Indexes
  • Master-Slave, Master-Master
  • Active-Passive, Active-Active
  • Leader election
  • Design patterns and Object-oriented design
  • Virtual machines and containers
  • Pub-sub architecture
  • REST, GraphQL
  • MapReduce
  • Bloom filters and Count-Min sketch
  • Paxos
  • Multithreading, locks, synchronization, CAS (compare-and-swap)
  • Proxies
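
Consistent hashing, listed above, is easier to remember with a small example. The following is a minimal sketch, not a production implementation: the class name `HashRing`, the node names, and the choice of 100 virtual nodes per server are all assumptions made for illustration. It shows the property interviewers usually ask about: adding a server moves only the keys that the new server takes over.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (SHA-256 is used only for even spread)."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes      # virtual nodes per physical node (assumed value)
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place `vnodes` points for this node on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue          # extremely unlikely collision: keep the first owner
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        """Remove every point owned by this node."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def get(self, key):
        """Return the node owning the first point clockwise from the key's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get(k) for k in keys}

ring.add("cache-d")
after = {k: ring.get(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding cache-d")
```

Every key that moves lands on the new node; with plain `hash(key) % N`, most keys would change servers when N changes.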

Building Blocks of Any Frequently Asked System Design Question

  • Authentication
    • JWT
    • OAuth 2.0
  • File / Media Upload
    • S3, Multiple Quality Files
  • WIP...

Tools and Technologies

  • Databases Comparison
  • Cassandra
  • MongoDB/Couchbase
  • RabbitMQ / Kafka / Pub-Sub comparison Comparison
  • MySQL / PostgreSQL
    • Scalability in Postgres
  • Redis / Memcached
  • InfluxDB [Suitable for TimeSeries, IoT data]
  • Zookeeper
  • NGINX
  • HAProxy
  • Solr, Elasticsearch
  • Amazon, EC2, S3
  • Docker, Kubernetes
  • Hadoop/Spark and HDFS
  • Eureka, Hystrix
  • Heroku / Azure DevOps
  • Jenkins CI/CD

System Design Problems (HLD + LLD)

  • TinyURL
  • Instagram | Photo hosting platform
  • Timeline | Newsfeed | Twitter
  • Dropbox | Google Drive
  • WhatsApp | Facebook Messenger NL GS Ref
  • MakeMyTrip | BookMyShow
  • Amazon | Flipkart
  • YouTube | Netflix NL
  • Uber | IRCTC
  • Swiggy | Zomato
  • Yelp | Nearby
  • Twitter Search
  • Google Search
  • Splitwise
  • Zerodha
  • API Rate Limiter
  • Web Crawler
  • Rate limiting system
  • Distributed cache
  • Typeahead Suggestion | Auto-complete system
  • Recommendation System
  • Design a tagging system like tags used in LinkedIn

Low Level Design Problems (Machine Coding Round) Reference

Engineering Blogs Ref

Other Useful Resources:

System Design Interview Approach Template

THINGS TO CONSIDER [5 min]

    (1) Features
    (2) API
    (3) Availability
    (4) Latency
    (5) Scalability
    (6) Durability
    (7) Class Diagram
    (8) Security and Privacy
    (9) Cost-effectiveness

FEATURE EXPECTATIONS [5 min]

    (1) Use cases
    (2) Scenarios that will not be covered
    (3) Who will use
    (4) How many will use
    (5) Usage patterns

ESTIMATIONS [5 min]

    (1) Throughput (QPS for read and write queries)
    (2) Latency expected from the system (for read and write queries)
    (3) Read/Write ratio
    (4) Traffic estimates
            - Write (QPS, Volume of data)
            - Read  (QPS, Volume of data)
    (5) Storage estimates
    (6) Memory estimates
            - If we are using a cache, what is the kind of data we want to store in cache
            - How much RAM and how many machines do we need to achieve this?
            - Amount of data you want to store on disk/SSD
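
The estimation steps above reduce to a few lines of arithmetic. The sketch below uses purely hypothetical inputs (1 million writes per day, a 100:1 read/write ratio, 500-byte records, 5 years of retention, caching 20% of daily read volume, 4 GB of usable cache RAM per machine); substitute the numbers agreed with the interviewer. The function name `estimate` and its parameters are illustrative, not part of any library.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60
GB = 10 ** 9  # decimal gigabytes keep the mental math simple


def estimate(daily_writes, read_write_ratio, bytes_per_record,
             retention_years, cache_fraction=0.2, ram_per_machine_gb=4):
    """Back-of-the-envelope traffic, storage, and memory estimates."""
    write_qps = daily_writes / SECONDS_PER_DAY
    read_qps = write_qps * read_write_ratio

    # Storage: everything written over the retention period.
    storage_bytes = daily_writes * bytes_per_record * 365 * retention_years

    # Memory: cache a fraction of the data read in one day.
    daily_read_bytes = daily_writes * read_write_ratio * bytes_per_record
    cache_bytes = daily_read_bytes * cache_fraction
    cache_machines = math.ceil(cache_bytes / (ram_per_machine_gb * GB))

    return {
        "write_qps": write_qps,
        "read_qps": read_qps,
        "storage_gb": storage_bytes / GB,
        "cache_gb": cache_bytes / GB,
        "cache_machines": cache_machines,
    }


result = estimate(daily_writes=1_000_000, read_write_ratio=100,
                  bytes_per_record=500, retention_years=5)
for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

Rounding aggressively is fine in an interview; the point is to show the order of magnitude that drives the design (single machine vs. sharded storage, one cache node vs. a cache cluster).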

DESIGN GOALS [5 min]

    (1) Latency and Throughput requirements
    (2) Consistency vs Availability  [Weak/strong/eventual => consistency | Failover/replication => availability]

HIGH LEVEL DESIGN [5-10 min]

    (1) APIs for Read/Write scenarios for crucial components
    (2) Database schema
    (3) Basic algorithm
    (4) High level design for Read/Write scenario

DEEP DIVE [15-20 min]

    (1) Scaling the algorithm
    (2) Scaling individual components: 
            -> Availability, Consistency and Scale story for each component
            -> Consistency and availability patterns
    #### Think about the following components, how they would fit in and how it would help
            a) DNS
            b) CDN [Push vs Pull]
            c) Load Balancers [Active-Passive, Active-Active, Layer 4, Layer 7]
            d) Reverse Proxy
            e) Application layer scaling [Microservices, Service Discovery]
            f) DB [RDBMS, NoSQL]
                    > RDBMS 
                        >> Master-slave, Master-master, Federation, Sharding, Denormalization, SQL Tuning
                    > NoSQL
                        >> Key-Value, Wide-Column, Graph, Document
                            Fast-lookups:
                            -------------
                                >>> RAM  [Bounded size] => Redis, Memcached
                                >>> AP [Unbounded size] => Cassandra, Riak, Voldemort
                                >>> CP [Unbounded size] => HBase, MongoDB, Couchbase, DynamoDB
            g) Caches
                    > Client caching, CDN caching, Webserver caching, Database caching, Application caching, Cache @Query level, Cache @Object level
                    > Update strategies:
                            >> Cache aside
                            >> Write through
                            >> Write behind
                            >> Refresh ahead
            h) Asynchronism
                    > Message queues
                    > Task queues
                    > Back pressure
            i) Communication
                    > TCP
                    > UDP
                    > REST
                    > RPC
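
Two of the cache patterns listed above, cache aside and write through, differ mainly in who populates the cache and when. The sketch below is a simplified illustration: `SlowStore` is a hypothetical stand-in for a database, and plain dictionaries stand in for a real cache such as Redis or Memcached. It omits TTLs, size limits, and concurrency control.

```python
class SlowStore:
    """Hypothetical backing store; counts reads so cache hits are visible."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)

    def put(self, key, value):
        self.rows[key] = value


class CacheAside:
    """The application checks the cache first and loads from the store on a miss."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def read(self, key):
        if key in self.cache:
            return self.cache[key]           # cache hit
        value = self.store.get(key)          # cache miss: go to the store
        if value is not None:
            self.cache[key] = value          # populate lazily
        return value

    def write(self, key, value):
        self.store.put(key, value)
        self.cache.pop(key, None)            # invalidate; next read reloads


class WriteThrough:
    """Every write updates the store and the cache together."""

    def __init__(self, store):
        self.store = store
        self.cache = {}

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        return self.store.get(key)

    def write(self, key, value):
        self.store.put(key, value)           # store first, then cache
        self.cache[key] = value


db = SlowStore()
db.put("user:1", "alice")
aside = CacheAside(db)
first = aside.read("user:1")    # miss, reads the store
second = aside.read("user:1")   # hit, store is not touched
reads_after_two = db.reads

through = WriteThrough(SlowStore())
through.write("user:2", "bob")
cached = through.read("user:2")  # served from cache without a store read
```

Cache aside keeps only data that was actually requested but pays a miss on first access; write through keeps reads fast after a write at the cost of slower writes and caching data that may never be read.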

JUSTIFY [5 min]

    (1) Throughput of each layer
    (2) Latency caused between each layer
    (3) Overall latency justification

More Resources:

Credit:
