Summary of the contents lectured in 'Cloud and Service Oriented Computing', a course from the Master in Software Engineering @FEUP.
- Introduction
- Designing an application with a microservice architecture - Part I
- Designing an application with a microservice architecture - Part II
- Inter-Service Communication - Part I
- Inter-Service Communication - Part II
- Software Architecture Patterns
- Integrating Services - Part I
- Integrating Services - Part II
- Computing and storage as a service
- Computing and storage resources providing an application platform as a service
- Utility Pricing
- Elastic Resource capability
- Virtualized Resources
- Management Automation
- Self-service provisioning
- ...
- Different types of services can be offered, namely:
- Infrastructure as a Service (IaaS)
- Offer computing infrastructures (e.g. virtual machines)
- High-level APIs used to abstract away various low-level details of the underlying network infrastructure
- Hypervisor (Virtual Machine Monitor) is responsible for loading virtual machines
- Other examples of IaaS resources: disk-image library, firewalls, load balancers, VLANs, software bundles
- Platform as a Service (PaaS)
- Platform allowing customers to develop, run, and manage applications
- Discards complexity of building and maintaining the infrastructure
- Software deployment controlled with minimal configuration options
- Provider provides the networks, servers, storage, operating system (OS), middleware (e.g. Java runtime, .NET runtime), database and other services to host the consumer's application.
- Software as a Service (SaaS) aka "on-demand software"
- Access to application software and databases over the Internet
- Providers manage the infrastructure and platforms that run the applications
- Usually priced on a pay-per-use basis or using a subscription fee
- Function as a Service (FaaS)
- Platform allowing customers to develop, run, and manage application functionalities
- Complete abstraction of servers away from the developer
- FaaS vs PaaS
- PaaS: deploy an entire application
- FaaS: deploy what is essentially a single function, or part of an application
- FaaS: designed to potentially be a serverless architecture
- Serverless Computing
- Provider dynamically manages the allocation of machine resources
- Pricing based on the actual amount of resources consumed by an application
- Server management and capacity planning decisions are completely hidden from the developer
- Can be used in conjunction with code deployed in traditional styles, such as micro-services
- Types of cloud
- Cloud Features
- Elasticity
- Self-managing system. Users only input the desired policies
- Provides agility and adaptability to environment changes
- Implies horizontal and vertical scalabilities
- Horizontal: adding more machines/ resources
- Vertical: adding more power (e.g.: CPU, RAM) to existing machines
- Reliability and Availability
- Ensures constant operation through redundant resource usage (e.g.: fault tolerance)
- Load balancing -> Ability to deal with increasing concurrent access
- Quality of Service
- Services meet users' requirements (e.g.: response time)
- Pay per use
- Services sold as Utility Computing
- Costs according to actual resource consumption
- Going Green
- Reduce energy consumption -> Reduce costs & carbon footprint
- Virtualization is essential in the Cloud
- Provides all the cloud features (e.g.: ease of use, flexibility and adaptability, location independence, etc.)
- Serverless (or Functions as a Service (FaaS)) is the culmination of several iterations
- The evolution began with physical metal in the data center and progressed through Infrastructure as a Service (IaaS) and Platform as a Service (PaaS).
- Before the cloud, to deploy one had to answer:
- What hardware should be installed?
- How is the physical access to the machine secured?
- Where are storage backups sent?
- ...
1. IaaS
- Still requires heavy overhead because staff are still responsible for various tasks
- Patching and backing up servers;
- Installing packages;
- Keeping the operating system up-to-date;
- Monitoring the application.
2. PaaS
- reduces the overhead
- cloud provider handles operating systems, security patches, and even the required packages to support a specific platform
- Instead of building VMs, developers now use "platform targets"
- Questions are reduced to:
- What size services are needed?
- How do the services scale horizontally?
- And vertically?
3. Serverless
- Abstracts servers by focusing on event-driven code.
- Developers focus on a microservice that does one thing (instead of platform)
- Questions are:
- What triggers the code?
- What does the code do?
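The two serverless questions can be made concrete with a minimal sketch. The `handler` name, its signature, and the event shape (a dictionary with a JSON `body`) are assumptions for illustration and do not match any specific provider's API.

```python
import json


def handler(event: dict) -> dict:
    """Hypothetical function entry point, invoked once per triggering event."""
    # What triggers the code? Here we assume an HTTP-like event with a JSON body.
    body = json.loads(event.get("body") or "{}")
    # What does the code do? One small thing: build a greeting.
    name = body.get("name", "world")
    return {"statusCode": 200, "body": json.dumps({"message": f"Hello, {name}!"})}
```

There is no server, port, or process lifecycle in the code: only the reaction to an event.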
- Billing
- IaaS and PaaS
- Pay to host the endpoints even when they aren't being accessed
- Serverless
- micro-billing
- scale each endpoint independently
- pay for usage
- no costs are incurred when the APIs aren't being called
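As a rough illustration of the two billing models, the sketch below compares an always-on endpoint with a pay-per-use one. The function names and all prices are hypothetical and chosen only to make the arithmetic visible.

```python
def always_on_cost(hourly_rate: float, hours: float) -> float:
    """IaaS/PaaS style: the endpoint is billed for every hour it is hosted."""
    return hourly_rate * hours


def pay_per_use_cost(price_per_million: float, invocations: int) -> float:
    """Serverless style: billed only for the invocations that actually happen."""
    return price_per_million * invocations / 1_000_000


# Hypothetical prices: a month (720 h) of hosting vs. a month with no traffic.
idle_month_hosted = always_on_cost(hourly_rate=0.05, hours=720)
idle_month_serverless = pay_per_use_cost(price_per_million=0.20, invocations=0)
```

With zero calls the hosted endpoint still costs money, while the pay-per-use endpoint costs nothing.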
- Key idea
- Application as a set of services instead of one large application
- A service is a standalone, independently deployable software component that implements some useful functionality
- Each service is deployed separately and they communicate through well-defined network-based interfaces
- Hexagonal architecture style
- Alternative to the layered architectural style (UI Logic -> Business Logic -> Data Access Layer)
- Puts the business logic at the center
- Instead of the UI layer, the application has one or more inbound adapters that handle requests from outside and invoke the business logic (center of the hexagon)
- Business logic independent of the adapters
- Decouples business logic from UI and data access logic in the adapters
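A minimal sketch of the hexagonal style, assuming a hypothetical order-placing application: the business logic depends only on a port (an interface), while the inbound and outbound adapters sit at the edges. All class and function names are illustrative.

```python
from typing import List, Protocol


class OrderRepository(Protocol):
    """Outbound port: what the business logic needs from persistence."""

    def save(self, order_id: str, total: float) -> None: ...


class InMemoryOrderRepository:
    """Outbound adapter: one possible implementation of the port."""

    def __init__(self) -> None:
        self.rows = {}

    def save(self, order_id: str, total: float) -> None:
        self.rows[order_id] = total


class OrderService:
    """Business logic at the center; it knows the port, not the adapter."""

    def __init__(self, repository: OrderRepository) -> None:
        self._repository = repository

    def place_order(self, order_id: str, prices: List[float]) -> float:
        if not prices:
            raise ValueError("an order needs at least one item")
        total = sum(prices)
        self._repository.save(order_id, total)
        return total


def http_inbound_adapter(service: OrderService, request: dict) -> dict:
    """Inbound adapter: translates an external request into a business call."""
    total = service.place_order(request["id"], request["prices"])
    return {"status": 201, "total": total}
```

Swapping the in-memory adapter for a database-backed one would not require touching `OrderService`.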
- Designing with a microservice architecture
- Identify system operations -> Identify services -> Define APIs and collaborations
- Identify system operations
- Identify the application's requirements (aka User Stories and associated user scenarios) -> macro-architecture
- A requirement / external request will map to a system operation
- A system operation is an abstraction of a request that the application must handle
- Can be a command -> update data (create, update, delete)
- specified by the parameters, return value and behaviour
- Behaviour:
- Specifies the preconditions that must be true before invoking the operation
- Specifies post-conditions that are true after invoking the operation
- Can be a query -> retrieves data (get)
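The difference between a command and a query, and the role of pre- and post-conditions, can be sketched as follows. The `Bank` and `Account` classes are hypothetical and not taken from the course material.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    balance: float = 0.0


@dataclass
class Bank:
    accounts: dict = field(default_factory=dict)

    def deposit(self, account_id: str, amount: float) -> float:
        """Command: updates data and returns the new balance."""
        # Preconditions: must hold before the operation is invoked.
        if account_id not in self.accounts:
            raise KeyError(f"unknown account {account_id}")
        if amount <= 0:
            raise ValueError("amount must be positive")
        account = self.accounts[account_id]
        before = account.balance
        account.balance = before + amount
        # Post-condition: the balance grew by exactly the deposited amount.
        assert account.balance == before + amount
        return account.balance

    def get_balance(self, account_id: str) -> float:
        """Query: retrieves data without changing it."""
        return self.accounts[account_id].balance
```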
- Two steps:
- Create a high-level domain model (identify key classes)
- nouns of the user stories
- simpler than the final implementation
- The application won’t even have a single domain model because each service has its own domain model
- Useful for defining vocabulary for describing behaviour / system operations
- Describe operations using those classes
- verbs of the user stories
- Decomposition strategies
- by business capability pattern
- Organizes services around business capabilities
- by subdomain pattern
- Organizes services around domain-driven design (DDD) subdomains
- Decompose by business capability
- A business capability is something that a business does in order to generate value.
- Business capabilities define what an organization does independently of how it does it.
- Capabilities are stable over time
- How capabilities are executed may change over time
- e.g.: a deposit that was previously made with checks is now made at an ATM, but it is still a deposit
- Based on:
- Organization purpose
- Structure
- Business processes
- In our project we followed this decomposition
- From business capabilities to services
- Decompose by subdomain pattern
- DDD - domain driven design
- Aligned with the internal company organization
- Domain is related to the business
- One department/sub-domain -> One Service
- DDD concepts
- Subdomains
- A department can have multiple sections, and each section may be a sub-domain
- Bounded contexts (scope)
- Explicit boundary within which a domain model exists
- DDD vs global business modelling
- No single model for the entire business
- The domain model is private to the sub-domain and other sub-domains do not have to agree with the model developed.
- Notes
- A good design will scope out one microservice to a single bounded context
- The SOA (Service Oriented Architecture) approach would model the enterprise as a whole
- Communication between microservices can happen via events:
- Events are triggered as a result of state changes in bounded contexts
- Like we did in the project
- Decomposition guidelines
- Single Responsibility Principle
- A class/ service should have only one reason to change
- If a class/ service has multiple responsibilities that change independently, it won’t be stable
- Common Closure Principle
- The classes in a package should be closed together against the same kinds of changes. A change that affects a package affects all the classes in the package
- Two classes change in lockstep -> same package
- Improves the maintainability of application
- Decomposition obstacles
- Network latency
- Too many messages between two services
- Reduced availability due to synchronous communication
- REST is synchronous and may become a bottleneck
- Maintaining data consistency across services
- Some system operations need to update data in multiple services.
- atomic updates -> data reside within a single service
- Obtaining a consistent view of the data
- Although each service’s database is consistent, we can’t obtain a globally consistent view of the data.
- Consistent view needed -> the data must reside in a single service
- God classes preventing decomposition:
- Classes that are used throughout an application
- Typically implement business logic for many different aspects of the application
- Normally has a large number of fields mapped to a database table with many columns
- Central concept in the application domain
- DDD solution
- Treat each service as a separate sub-domain with its own domain model (i.e.: a version of the God class).
- e.g.: an Order is a ticket for the kitchen, an address for the delivery, ...
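A small sketch of the DDD solution to the God class problem: instead of one shared `Order` class, each service keeps its own narrow model of the same order. The class names, fields, and the event shape are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class KitchenTicket:
    """The kitchen's view of an order: what to cook."""
    order_id: str
    items: list


@dataclass
class Delivery:
    """The delivery service's view of the same order: where to take it."""
    order_id: str
    address: str


def on_order_created(event: dict) -> tuple:
    """Each service would build only its own model from the shared event;
    both are built in one function here purely for brevity."""
    ticket = KitchenTicket(event["order_id"], event["items"])
    delivery = Delivery(event["order_id"], event["address"])
    return ticket, delivery
```

Neither model carries the other's fields, so a change to delivery data does not ripple into the kitchen service.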
- Operations
- A system operation
- Collaborative operation between services
- Events
- State changes
- Notifications
- Step 1
- Assigning System Operations to Services
- Which service is the initial entry point for a request?
- Assign an operation to a service that needs the information provided by the operation
- Assign an operation to the service that has the information required to handle it
- Step 2
- Defining the APIs required to support collaboration between services
- Which services will I need to collaborate with?
- Context map
- Notes
- Each service should represent a Business Logic (Is it not this decomposition by BC? Confirm with them)
- A data service is probably a bad design
- High Cohesion and Loose Coupling
- AVOID: one microservice to address two or more unrelated problems
- Highly cohesive system is naturally loosely coupled. Coupling is a measure of the interdependence between different microservices
- Resilience
- Measure of the capacity of a system / individual components in a system to recover quickly from a failure
- Doesn’t cause the entire system to fail
- Implementation:
- Timeouts for calls over the network
- Circuit Breaker
- if a microservice keeps timing out against one endpoint, there is no point in retrying, at least for some time -> a circuit breaker wrapper enforces this
- automatic error responses for services exceeding a failure threshold in the recent past
- Observability
- combination of monitoring, distributed logging, distributed tracing, and visualization of a service’s runtime behavior and dependencies
- May track throughput of each microservice, the number of success/failed requests, utilization of CPU, memory, latency and some business-related metrics
- How to achieve?
- Logging
- record events
- Metrics
- latency, ... obtained by processing the logged events
- Tracing
- consider the event logs ordering
- Allows to trace a problem (e.g. high latency)
- Automation
- rationale behind a microservice architecture -> less time to production and shorter feedback cycles
- Two categories
- Continuous integration
- Focus on maintaining source code integrity
- Continuous deployment
- bundle applications, infrastructure, middleware, and the supporting installation processes and dependencies into release packages
- Services are autonomous
- Services communicate over the network
- A service based application can be considered a distributed system running multiple services on different network locations.
- Communication types:
- Synchronous
- The client sends a request and waits for response from the service. Both parties have to keep the connection open until response arrives.
- Can be a non-blocking IO implementation, using callbacks
- Asynchronous
- Send message and proceed without waiting for response
- REST (Representational State Transfer)
- Uses a navigational scheme to represent objects and services over a network, known as resources.
- Not protocol dependent
- With the HTTP protocol, a resource is accessed using a unique URL and the standard HTTP operations GET, PUT, DELETE, POST, and HEAD
- Stateless servers.
- RPC (Remote Procedure Calls)
- Key objective
- Make the process of executing code on a remote machine as simple and straightforward as calling a local function
- Lost popularity due to complexity and performance
- gRPC
- Developed by Google -> now Open Source
- Uses Protocol Buffers: a language-neutral, platform-neutral extensible mechanism for serializing structured data
- Interface Definition Language (IDL) describes both the service interface and the structure of the payload messages
- Uses server-side skeletons and client-side stubs to invoke the service
- Uses HTTP/2 as the transport protocol -> key reason for the success and wide adoption of gRPC
- Advantages
- Multiple requests in the same open connection
- Lower overhead due to less redundancy over several requests (e.g. Cookies)
- Avoids header repetition -> introduces header compression to optimize bandwidth use
- REST - Richardson Maturity Model
- Level 0
- Not considered RESTful at all
- Single URL for all resources and the content decides the operation
- Single HTTP method (in most cases, POST)
- SOAP web services are of this kind
- Level 1 - Resource URIs
- Has individual URIs for each resource, but
- The message still contains operation details
- Level 2 - HTTP verbs
- HTTP verbs specify the operation
- A service at this level is usually considered a proper REST API
- Level 3 - Hypermedia Controls
- Service responses have links that control the application state for the client (Hypermedia as the Engine of Application State, HATEOAS)
- Hypermedia controls tell us what we can do next, and the URI of the resource we need to manipulate to do it
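To illustrate Level 3, the sketch below builds a response that carries hypermedia controls next to the data. The `_links` key, the URI layout, and the order states are assumptions made for this example, not a prescribed format.

```python
def order_representation(order_id: str, state: str) -> dict:
    """Build a Level 3 style response: data plus links to the next actions."""
    links = {"self": {"href": f"/orders/{order_id}", "method": "GET"}}
    # Only advertise the transitions that are valid in the current state.
    if state == "PENDING":
        links["cancel"] = {"href": f"/orders/{order_id}", "method": "DELETE"}
        links["pay"] = {"href": f"/orders/{order_id}/payment", "method": "PUT"}
    return {"id": order_id, "state": state, "_links": links}
```

A client that follows the links does not need to hard-code which operations are allowed in each state.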
- REST - Message formats
- JSON
- GraphQL
- Improves on the REST model by allowing the client to retrieve multiple pieces of data in a single call
- JSON format
- Netflix Falcor provides similar function
- Allows the services to be more autonomous
- The client does not wait for a response
- The client may not receive a response at all, or the response will be received asynchronously via a different channel
- Middleware: lightweight and dumb message broker
- There is no business logic in the broker and it is a centralized entity with high-availability
- Message Protocols
- JSON
- XML
- Apache AVRO
- Compact, fast, binary data format
- Messaging styles
- Single receiver
- A given message is reliably delivered from a producer to exactly one consumer through a message broker
- Multiple receivers
- Message produced by a single producer is delivered to more than one consumer
- Publisher-subscriber pattern (pub-sub)
- AMQP-based brokers support pub-sub messaging, e.g. RabbitMQ
- Kafka is a widely used broker for pub-sub messaging between microservices
- AMQP - Advanced Message Queuing Protocol
- Protocol for interoperability between all messaging middleware
- Ensures reliable and fast message delivery, using message acknowledgements
- When a message is delivered to a consumer, the consumer notifies the broker
- The broker will only completely remove a message from a queue when it receives a notification for that message
- The queue ensures the ordered delivery and processing of the messages
- AMQP Message Brokers: Software that implements the protocol (RabbitMQ, ActiveMQ, ...)
- e.g.
- Features
- heartbeat / health check
- To ensure that the application layer promptly finds out about disrupted connections and completely unresponsive peers
- Broker failures
- AMQP standard defines a concept of durability for exchanges, queues, and persistent messages, requiring that a durable object or persistent message survive a restart
- Producer failures
- Retransmit any messages for which an acknowledgement has not been received from the broker
- Possibility of message duplication: consumer applications need to be implemented in a way that their internal state doesn’t change even if the same message is processed multiple times (idempotent consumers).
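One common way to tolerate duplicated messages is to remember which message IDs were already applied. The sketch below assumes every message carries a unique `id` field and keeps the set of processed IDs in memory; a real consumer would persist it together with its state.

```python
class IdempotentConsumer:
    def __init__(self) -> None:
        self.balance = 0
        self._processed_ids = set()

    def handle(self, message: dict) -> bool:
        """Apply the message once; return False if it was a duplicate."""
        message_id = message["id"]
        if message_id in self._processed_ids:
            return False  # retransmitted message: state is left untouched
        self.balance += message["amount"]
        self._processed_ids.add(message_id)
        return True
```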
- RabbitMQ
- Kafka
- Distributed pub-sub messaging system, designed for high volume messages
- Has its own messaging protocol
- Data is stored durably, in order and can be read deterministically
- Data is distributed for failover and scalability
- Unit of data is a message (key, value, timestamp)
- Value is an array of bytes
- Messages are organized in Topics
- Topics may be split into multiple partitions
- A producer can select a partition by using a key
- Partitions are the primary mechanism in Kafka for parallelizing consumption and scaling a topic beyond the throughput limits of a single node
- Each partition can be hosted in different nodes
- No need to specify a partition to write, by default
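The idea of selecting a partition by key can be sketched as a hash of the key modulo the number of partitions. This is a simplified illustration: the `Partitioner` class is hypothetical, CRC32 and round-robin are stand-ins, and real Kafka clients use their own hash function and their own strategy for messages without a key.

```python
import itertools
import zlib
from typing import Optional


class Partitioner:
    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions
        self._round_robin = itertools.cycle(range(num_partitions))

    def partition_for(self, key: Optional[bytes]) -> int:
        """Same key -> same partition, so per-key ordering is preserved."""
        if key is None:
            # No key given: spread messages over the partitions.
            return next(self._round_robin)
        return zlib.crc32(key) % self.num_partitions
```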
- Dumb broker and assumes smart consumers to read its buffers
- Use cases:
- Architecture Evolution
- Monolith
- SOA (2000s)
- Microservices (2010s)
- Contained in a single deployment
- Everything, from user interface to database calls, is included in the same codebase
- Good for relatively small applications
- Advantages
- Easier to pull down a single code base and start working
- Ramp up time may be less
- Creating test environments is as simple as providing a new copy
- It may be designed to include multiple components and applications
- Disadvantages
- Difficult to work in parallel in the same code base
- Any change, no matter how trivial, requires deploying a new version of the entire application
- Refactoring potentially impacts the entire application -> tight coupling
- Often the only solution to scale is to create multiple, resource-intensive copies of the monolith
- Integration can be difficult
- Difficulty to test due to the need to configure the entire monolith
- Code reuse is challenging and often other apps end up having their own copies of code
- Hard to apply agile development
- Single point of failure
- Difficult to adopt new technologies and frameworks, as all the functionalities have to be built on homogeneous technologies/frameworks
- N-Layer applications
- Partition application logic into specific layers
- Most common layers
- UI Layer
- Business Logic Layer
- Data access Layers
- Advantages
- Refactoring is isolated to a layer
- Teams can independently build, test, deploy, and maintain separate layers
- Layers can be swapped out
- Visual Schema
- Applications are composed of more loosely coupled components that use a messaging bus to communicate between themselves
- Services -> reusable, loosely coupled entities
- Self-contained implementation of a well-defined business functionality
- Accessible via calls over the network
- Software components with well-defined interfaces that are implementation-independent. Separation of the interface (the what) from its implementation (the how).
- Consumers are only concerned about the service interface and do not care about its implementation.
- Composite services can be built from aggregates of other services.
- Deployed inside an application server
- Requires an additional layer - Enterprise Service Bus (ESB)
- Integrates business capabilities (product, customer, ...) -> creates composite business capabilities, exposed to the consumers
- Contains a significant portion of the business logic of the entire application
- Monolithic entity where all developers share the same runtime to develop/deploy their service integrations.
- Smart Pipes
- API Gateway
- Difficulty in interacting with SOAP services, which leads to
- Layer on top of the existing SOA implementations
- Known as the API façade
- Exposes a simple API for a given business functionality and hides all the internal complexities of the ESB/Web Services layer
- Also used for security, throttling, caching and monetization
- Visual Schema
- Independent application services delivering one single functionality in a loosely connected and self-contained fashion, communicating through a light-weight protocol (e.g. HTTP, REST, ...)
- More details in previous chapters
- Visual schema
- Characteristics
- Business Capability Oriented
- Serves a specific business purpose with a well-defined set of responsibilities
- Each service does only one thing and does it well
- SOA will have more generic services, while here we have fine-grained services
- Autonomous: Develop, Deploy, and Scale Independently
- Microservices are developed, deployed, and scaled as independent entities
- Services do not share the same execution runtime
- Increases system resilience due to isolation of failures to service level;
- Can scale microservices according to each microservice traffic
- Elimination of the central ESB by breaking its functionalities into each service
- Services take care of the inter-service communication and composition logic
- Using smart endpoints and dumb pipes (lightweight protocols like REST)
- Smart endpoints
- All business logic resides at the microservice level
- Dumb pipes
- Only route messages
- Zero business logic
- Failure tolerance
- One microservice crashes -> only it collapses
- Need to apply all the resiliency-related capabilities, such as circuit breakers, disaster recovery, load balancing, fail-over, and dynamic scaling based on traffic patterns
- Decentralized data management
- Each microservice has its own database
- Several databases might need to be updated as a consequence of a single API request
- Microservices update each other using the APIs
- A microservice can only access its own database
- Service Governance
- Decentralized process -> each entity governs its own domain
- In SOA, this concept is discarded
- Design-time governance of services
- Technologies, protocols, ...
- Runtime governance
- Service definitions, service registry and discovery, service versioning, service runtime dependencies, service ownerships and consumers, enforcing QoS, and service observability
- Service Observability
- combination of monitoring, distributed logging, distributed tracing, and visualization of a service’s runtime behavior and dependencies (as seen in previous chapters)
- In a monolith approach, a request may be satisfied with a single call
- With microservices, we may need to access multiple services to satisfy the same request
- External access has higher latency
- Invoking the services directly (e.g. web service calling microservices) has the following problems:
- Multiple requests to retrieve the data needed
- inefficient
- can result in a poor user experience;
- The lack of encapsulation
- caused by clients knowing about each service and its API
- difficult to change the architecture and the APIs
- developers might change an API in a way that breaks existing clients
- Updating the client’s app is more cumbersome
- Services might use communication mechanisms that aren’t convenient or practical for clients to use
- Entry point service into the microservices-based application from external API clients
- Encapsulates the application’s internal architecture and provides an API to its clients
- May also have other responsibilities, such as authentication and monitoring
- Visual impact
- Responsible for
- Request routing (to a service)
- API composition (invokes multiple services)
- Protocol translation
- Client protocol may be different from the services (e.g. REST and RPC)
- May have 3 API modules
- Mobile API
- API for the mobile client
- Browser API
- API for the JavaScript app running in the browser
- Public API
- API for third party developers
- Why provide each client with its own API?
- e.g. -> mobile apps present less information than browsers
- Higher reliability
- Independently scalable
- Architecture Disadvantages
- yet another highly available component that must be developed, deployed, and managed
- risk that the API gateway becomes a development bottleneck
- Developers must update the API gateway to expose their service’s API
- The process for updating the API gateway must be as lightweight as possible
- Design issues
- Performance and scalability
- Every external request goes through the gateway
- Synchronous I/O model
- Each connection is handled by a dedicated thread
- OS threads are heavyweight
- Limit on the number of threads
- Asynchronous (nonblocking) I/O model
- Single thread dispatches I/O requests to event handlers
- More scalable but more complex programming
- Writing maintainable code by using reactive programming abstractions
- Calling services sequentially
- The response time is the sum of all services response times
- If there are no dependencies among services, all services should be called concurrently
- Solution: REACTIVE APPROACH
- Java 8 CompletableFutures
- RxJava (Reactive Extensions for Java) Observables, created by Netflix specifically to solve this problem
- Scala Futures
- JavaScript promises, RxJS (for NodeJS)
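The difference between calling services sequentially and concurrently can be sketched with Python's `asyncio`, used here as a stand-in for the reactive abstractions listed above. The service names and delays are hypothetical, and `call_service` simulates a network call with a sleep.

```python
import asyncio
import time


async def call_service(name: str, delay: float) -> str:
    """Stand-in for a network call to a downstream service."""
    await asyncio.sleep(delay)
    return f"{name}-data"


async def compose_sequential() -> list:
    # Response time is roughly the sum of the individual response times.
    return [await call_service("orders", 0.1), await call_service("customers", 0.1)]


async def compose_concurrent() -> list:
    # Independent calls run concurrently: response time is roughly the slowest call.
    results = await asyncio.gather(
        call_service("orders", 0.1), call_service("customers", 0.1)
    )
    return list(results)


def timed(coroutine_function):
    """Run a coroutine function to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coroutine_function())
    return result, time.perf_counter() - start
```

Both compositions return the same data; only the elapsed time differs.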
- Handling partial failure
- Properly handle failed requests and requests that have unacceptably high latency.
- Circuit breaker pattern when invoking services
- measure of the capacity of a system or individual components in a system to recover quickly from a failure
- attribute of a system that enables it to deal with failure in a way that doesn’t cause the entire system to fail (other definition)
- microservices architecture is naturally a distributed system
- collection of computers (or nodes) connected over the network, with no shared memory, that appears to its users as a single coherent system
- Network will always be unreliable
- Patterns
- Timeout
- deciding when to stop waiting for a response at the caller service level
- After that time some action is taken
- Isolate failures
- Service A (integrator) calls services X and Y
- A timeout should be defined separately for X and Y
- A failure of another service does not have to become your service’s problem
- Circuit Breaker
- Wrap the invocation with an object that monitors and prevents further damage to the system
- If the service invocation fails repeatedly and reaches a certain threshold, then the circuit breaker wrapper prevents any further invocation of the external service
- After a certain period of time (circuit reset timeout), a new request is allowed (Half Open state). If it succeeds the breaker goes to the closed state (allows requests), and to the open state (does not allow requests) otherwise
- Prevents cascade failures
- Visual schema
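The closed, open, and half-open states described above can be sketched as a small wrapper. This is a minimal single-threaded illustration, not a production implementation; the class name, parameters, and the injectable `clock` are assumptions.

```python
import time


class CircuitOpenError(Exception):
    """Raised instead of calling the service while the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # reset timeout elapsed: allow a trial request
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("failing fast: circuit is open")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial request, or reaching the threshold, (re)opens the circuit.
            if self.state == "half-open" or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open the caller gets an immediate error instead of waiting for another timeout, which is what prevents cascading failures.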
- Fail fast
- detect a failure as quickly as possible
- Concept
- a fast failure response is much better than a slow failure response
- Bulkhead
- Application is partitioned so that an error that occurs in a partition is localized to that one partition only, avoiding leading the entire system to a fail state
- Procedure
- Group similar operations into a microservice
- Independent business functionalities are implemented as separate microservices
- Following the microservices architecture ensures this pattern
- Load Balancing
- Distribute the load across multiple microservice instances
- Kubernetes (e.g.) provides this capability
- Failover
- Reroute requests to alternate services if a given service fails
- Kubernetes (e.g.) provides this capability
- Let it crash
- If service is unstable and recovery difficult -> start new server instance
- Having a service per host and a rapid server startup time is crucial (container based)
- Includes
- Monitoring
- Logging
- Tracing
- Visualization
- Aim
- be alerted if there is a problem, such as a failed service instance or a disk filling up
- ideally before it impacts a user
- in case of error
- Be able to troubleshoot and identify the problem's source
- Patterns
- Health check API
- Log aggregation
- Aggregate the logs of all services in a centralized database that supports searching and alerting
- Aggregate logs
- Store them
- Allow user searches
- When a request involves more than one service, log aggregation keeps all the log information related to that request in one central place
- Alerts can be configured to be triggered when log entries match given search criteria
- Popular infrastructures
- ElasticSearch
- Logstash
- AWS CloudWatch Logs
- ...
- Distributed tracing pattern
- Assign each external request a unique ID and record how it flows through the system from one service to the next in a centralized server that provides visualization and analysis
- Makes use of the logs
- A trace represents an external request and consists of one or more spans.
- A span represents an operation
- key attributes
- operation name
- start timestamp
- end timestamp
- Can have one or more child spans
- represent nested operations
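The trace and span structure can be sketched with two small data classes. The field names follow the key attributes listed above; the `Trace` class, the use of a UUID as the request ID, and the example operation names are assumptions for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Span:
    operation_name: str
    start: float  # start timestamp, in seconds
    end: float    # end timestamp, in seconds
    children: list = field(default_factory=list)  # nested operations

    @property
    def duration(self) -> float:
        return self.end - self.start


@dataclass
class Trace:
    """One external request: a unique ID plus the tree of spans it produced."""
    root: Span
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def spans(self):
        """Yield every span in the trace, parents before their children."""
        stack = [self.root]
        while stack:
            span = stack.pop()
            yield span
            stack.extend(span.children)
```

Comparing the durations of sibling spans is how a tracing tool points at the operation responsible for high latency.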
- Application metrics
- Services report metrics to a central server that provides aggregation, visualization, and alerting
- Metrics to provide critical information about the health of an application
- Infrastructure-level: CPU, memory, disk utilization
- Application-level: service request latency and number of requests executed
- Push model
- A service instance sends the metrics to the Metrics Service by invoking an API
- e.g. AWS CloudWatch metrics
- Pull model
- Metrics Service invokes a service API to retrieve the metrics from the service instance
- e.g. Prometheus
- Exception tracking
- Services report exceptions to a central service
- de-duplicates exceptions
- generates alerts
- manages the resolution of exceptions
- Exception might be a symptom of a failure or a programming bug
- A service should rarely log an exception
- when it does, it is important to identify the root cause.
- Audit logging
- Record user actions in a database
- Objectives
- help customer support
- ensure compliance
- detect suspicious behavior
- Each audit log entry records:
- identity of the user
- action they performed
- the business object(s)
- Includes
- Authentication
- Verifies the identity
- typically verifies credentials, such as a user ID and password
- Authorization
- Verify if allowed to perform the requested operation on the specified data
- Applications often use a combination of role-based security and access control lists (ACLs).
- Auditing (Repeated from previous chapter)
- Tracking operations performed
- Objectives
- help customer support
- ensure compliance
- detect suspicious behavior
- Secure interprocess communication
- Ideally, all communication in and out of services should be over Transport Layer Security (TLS)
- Interservice communication may need authentication
- Authentication
- Session Token
- Stores ID and roles of the user
- Security context
- Established by the system based on the Session Token
- Request handlers retrieve user information from the security context
- Client has to send session token on all requests
- Monolithic vs Microservices
- Monolithic
- In-memory security context
- Share context among several requests
- Centralized session
- Store session information in a database
- Microservices
- Services do not share memory
- Central database not compliant with microservices
- Handling authentication in the API gateway
- Avoids authentication by all services
- Centralizes the security issues in a single point
- Handling Authorization
- Should be handled by the services
- Otherwise the API Gateway becomes too coupled to the services
- API Gateway can only apply role-based authorization to URL paths
- Token Types
- UUID - universally unique identifier
- Opaque token: requires a synchronous RPC call by the service to a security server in order to retrieve user information
- JWT - JSON Web Token
- Transparent token
- Contains the user information, ID and Roles, and expiration date
- signed by the creator (API Gateway) and verified by the recipient (service) using a pair of keys
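To show why a JWT is "transparent", the sketch below issues and verifies a token using only the standard library. It assumes HMAC-SHA256 (`HS256`) with a secret shared by the API Gateway and the service, rather than a key pair, and it uses a non-standard `roles` claim; a real system should rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, user_id: str, roles: list, ttl_seconds: int, now=None) -> str:
    """Creator side (e.g. the API Gateway): embed user info and sign it."""
    issued_at = time.time() if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "roles": roles, "exp": int(issued_at) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part).encode("utf-8")) for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(secret: bytes, token: str, now=None) -> dict:
    """Recipient side (a service): check signature and expiry, no RPC needed."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("invalid signature")
    payload = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if payload["exp"] <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return payload
```

The service recovers the user ID and roles from the token itself, which is the main difference from an opaque token.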
- OAuth 2.0 standard protocol
- Authorization protocol that was originally designed to enable a user of a public cloud service, such as GitHub or Google, to grant a third-party application access to its information without revealing its password
- Can be used for authentication and authorization in a microservice application
- Uses HTTPS for communication between the client and the authorization server
- Key concepts
- Authorization Server
- Provides an API for authenticating users and obtaining an access token and a refresh token (e.g. Spring OAuth)
- Resource Owner
- The owner of the resources that needs to give authorization
- Resource Server
- A service that uses an access token to authorize access
- In a microservice architecture, the services are resource servers
- Client
- A client that wants to access a Resource Server
- In a microservice architecture, API Gateway is the OAuth 2.0 client
- Access Token
- A token that grants access to a Resource Server
- The format of the access token is implementation dependent
- Some implementations, such as Spring OAuth, use JWTs
- Refresh Token
- A long-lived yet revocable token that a Client uses to obtain a new Access Token
- Visual schema of a basic OAuth 2.0 sequence flow
- Use cases
- Implementing authentication and authorization correctly is challenging.
- Frameworks
- Passport (NodeJS)
- Spring Security
- ...