feup-cosn

Summary of the contents lectured in 'Cloud and Service Oriented Computing', a course from the Master in Software Engineering @FEUP.

Index

Introduction
- Basic Concepts
- Evolution of cloud platforms
Designing an application with a microservice architecture - Part I
Designing an application with a microservice architecture - Part II
Inter-Service Communication - Part I
- Synchronous communication
Inter-Service Communication - Part II
- Asynchronous communication
Software Architecture Patterns
Integrating Services - Part I
- API Gateway
- Resiliency
Integrating Services - Part II
- Observability
- Security

Introduction

Basic Concepts

Computing and storage as a service
- Computing and storage resources providing an application platform as a service
  - Utility Pricing
  - Elastic Resource capability
  - Virtualized Resources
  - Management Automation
  - Self-service provisioning
  - ...
Different types of services can be offered, namely:
- Infrastructure as a Service (IaaS)
  - Offer computing infrastructures (e.g. virtual machines)
  - High-level APIs used to dereference various low-level details of underlying network infrastructure
  - Hypervisor (Virtual Machine Monitor) is responsible for loading virtual machines
  - Other examples of IaaS resources: disk-image library, firewalls, load balancers, VLANs, software bundles
- Platform as a Service (PaaS)
  - Platform allowing customers to develop, run, and manage applications
  - Discards complexity of building and maintaining the infrastructure
  - Software deployment controlled with minimal configuration options
  - Provider provides the networks, servers, storage, operating system (OS), middleware (e.g. Java runtime, .NET runtime), database and other services to host the consumer's application.
- Software as a Service (SaaS) aka "on-demand software"
  - Access to application software and databases over the Internet
  - Providers manage the infrastructure and platforms that run the applications
  - Usually priced on a pay-per-use basis or using a subscription fee
- Function as a Service (FaaS)
  - Platform allowing customers to develop, run, and manage applications
  - Complete abstraction of servers away from the developer
  - FaaS vs PaaS
    - PaaS: deploy an entire application
    - FaaS: deploy what is essentially a single function, or part of an application
    - FaaS: designed to potentially be a serverless architecture
Serverless Computing
- Provider dynamically manages the allocation of machine resources
- Pricing based on the actual amount of resources consumed by an application
- Server management and capacity planning decisions are completely hidden from the developer
- Can be used in conjunction with code deployed in traditional styles, such as micro-services
Types of cloud
Cloud Features
- Elasticity
  - Self-managing system. Users only input the desired policies
  - Provides agility and adaptability to environment changes
  - Implies horizontal and vertical scalability
    - Horizontal: adding more machines/resources
    - Vertical: adding more power (e.g.: CPU, RAM) to existing machines
- Reliability and Availability
  - Ensures constant operation through redundant resource usage (e.g.: fault tolerance)
  - Load balancing -> Ability to deal with increasing concurrent access
- Quality of Service
  - Services meet users' requirements (e.g.: response time)
- Pay per use
  - Services sold as Utility Computing
  - Costs according to actual resource consumption
- Going Green
  - Reduce energy consumption -> Reduce costs & carbon footprint
Virtualization is essential in the Cloud
- Provides all the cloud features (e.g.: ease of use, flexibility and adaptability, location independence, etc.)

Summarizing in an image:

Evolution of cloud platforms

Serverless (or Functions as a Service (FaaS)) is the culmination of several iterations of cloud platforms.
The evolution began with physical metal in the data center and progressed through Infrastructure as a Service (IaaS) and Platform as a Service (PaaS).
Before the cloud, to deploy one had to answer:
- What hardware should be installed?
- How is the physical access to the machine secured?
- Where are storage backups sent?
- ...

1. IaaS

Still requires heavy overhead because staff are still responsible for various tasks
- Patching and backing up servers;
- Installing packages;
- Keeping the operating system up-to-date;
- Monitoring the application.

2. PaaS

Reduces the overhead
- Cloud provider handles operating systems, security patches, and even the required packages to support a specific platform
- Instead of building VMs, developers now use "platform targets"
Questions are reduced to:
- What size services are needed?
- How do the services scale horizontally?
- And vertically?

3. Serverless

Abstracts servers by focusing on event-driven code.
Developers focus on a microservice that does one thing (instead of platform)
Questions are:
- What triggers the code?
- What does the code do?
Billing
- IaaS and PaaS
  - Pay to host the endpoints even when they aren't being accessed
- Serverless
  - micro-billing
    - scale each endpoint independently
    - pay for usage
    - no costs are incurred when the APIs aren't being called
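
To make the billing difference concrete, here is a minimal sketch that compares an always-on endpoint with a pay-per-use one. The prices and request volumes are made-up assumptions chosen only for illustration; they are not any provider's actual rates.

```python
# Hypothetical prices, chosen only for illustration.
HOSTED_PRICE_PER_HOUR = 0.05       # always-on instance (IaaS/PaaS style)
PRICE_PER_MILLION_CALLS = 0.20     # pay-per-use (serverless style)
HOURS_PER_MONTH = 730


def hosted_monthly_cost() -> float:
    """Cost is independent of traffic: the endpoint is billed while idle."""
    return HOSTED_PRICE_PER_HOUR * HOURS_PER_MONTH


def serverless_monthly_cost(calls: int) -> float:
    """Cost grows with usage and is zero when nothing is called."""
    return calls / 1_000_000 * PRICE_PER_MILLION_CALLS


for calls in (0, 1_000_000, 500_000_000):
    print(calls, round(hosted_monthly_cost(), 2), round(serverless_monthly_cost(calls), 2))
```

Under these assumed prices, pay-per-use is cheaper for low traffic and more expensive for very high traffic, which is why the comparison depends on the workload.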

Designing an application with a microservice architecture - Part I

Key idea
- Application as a set of services instead of one large application
- A service is a standalone, independently deployable software component that implements some useful functionality
- Each service is deployed separately and they communicate through well-defined network-based interfaces
Hexagonal architecture style
- Alternative to the layered architectural style (UI Logic -> Business Logic -> Data Access Layer)
- Puts the business logic at the center
- Instead of the UI layer, the application has one or more inbound adapters that handle requests from outside and invoke the business logic (center of the hexagon)
- Business logic independent of the adapters
- Decouples the business logic from the UI and data access logic in the adapters
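
A minimal sketch of the hexagonal style, assuming a hypothetical order-placing use case. The names `OrderService`, `OrderRepository`, `InMemoryOrderRepository` and `http_like_adapter` are invented for illustration; the outbound port/adapter pair is an assumption added to show the data access side.

```python
from typing import Protocol


class OrderRepository(Protocol):
    """Outbound port: what the business logic needs from storage."""

    def save(self, order_id: str, total: float) -> None: ...


class OrderService:
    """Business logic at the centre of the hexagon; it only knows the port."""

    def __init__(self, repository: OrderRepository) -> None:
        self._repository = repository

    def place_order(self, order_id: str, total: float) -> str:
        if total <= 0:
            raise ValueError("total must be positive")
        self._repository.save(order_id, total)
        return f"order {order_id} accepted"


class InMemoryOrderRepository:
    """Outbound adapter: one possible implementation of the port."""

    def __init__(self) -> None:
        self.rows = {}

    def save(self, order_id: str, total: float) -> None:
        self.rows[order_id] = total


def http_like_adapter(service: OrderService, request: dict) -> dict:
    """Inbound adapter: translates an external request into a call."""
    message = service.place_order(request["id"], float(request["total"]))
    return {"status": 201, "body": message}


repo = InMemoryOrderRepository()
response = http_like_adapter(OrderService(repo), {"id": "42", "total": "9.5"})
print(response)
```

Swapping the in-memory adapter for a database-backed one, or adding a second inbound adapter, would not require touching `OrderService`.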
Designing with a microservice architecture
- Identify system operations -> Identify services -> Define APIs and collaborations
- Identify system operations
  - Identify the application's requirements (aka User Stories and associated user scenarios) -> macro-architecture
  - A requirement / external request will map to a system operation
  - A system operation is an abstraction of a request that the application must handle
    - Can be a command -> update data (create, update, delete)
      - specified by the parameters, return value and behaviour
      - Behaviour:
        
        Specifies the preconditions that must be true before invoking the operation
        
        Specifies post-conditions that are true after invoking the operation
    - Can be a query -> retrieves data (get)
  - Two steps:
    - Create a high-level domain model (identify key classes)
      - nouns of the user stories
      - simpler than the final implementation
      - The application won’t even have a single domain model because each service has its own domain model
      - Useful for defining vocabulary for describing behaviour / system operations
    - Describe operations using those classes
      - verbs of the user stories
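
The difference between a command and a query, and the role of preconditions and post-conditions, can be sketched as follows. The `Account` class and the `deposit` / `get_balance` operations are hypothetical examples, not operations from the course project.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    balance: int = 0
    history: list = field(default_factory=list)


def deposit(account: Account, amount: int) -> int:
    """Command: updates data and returns the new balance."""
    # Precondition: must hold before the operation is invoked.
    if amount <= 0:
        raise ValueError("precondition failed: amount must be positive")
    old_balance = account.balance
    account.balance += amount
    account.history.append(("deposit", amount))
    # Post-condition: must hold after the operation has run.
    assert account.balance == old_balance + amount
    return account.balance


def get_balance(account: Account) -> int:
    """Query: retrieves data without changing it."""
    return account.balance


account = Account()
deposit(account, 50)
print(get_balance(account))
```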

Designing an application with a microservice architecture - Part II

Decomposition

Decomposition strategies
- by business capability pattern
  - Organizes services around business capabilities
- by subdomain pattern
  - Organizes services around domain-driven design (DDD) subdomains
Decompose by business capability
- A business capability is something that a business does in order to generate value.
- Business capabilities define what an organization does independently of how it does it.
  - Capabilities are stable over time
  - How capabilities are executed may change over time
  - e.g.: a deposit that was previously made with checks is now made at an ATM, but it is still a deposit
- Based on:
  - Organization purpose
  - Structure
  - Business processes
- In our project we followed this decomposition
- From business capabilities to services
  - A capability -> A Service, AND/OR
  - A group of capabilities -> A Service
  - The process is subjective
    - Depends on granularity and similarity among domain objects
  - Iterative process
    - As we learn more about the application domain, we may change the architecture
  - e.g.:
Decompose by subdomain pattern
- DDD - domain driven design
  - Aligned with the internal company organization
  - Domain is related to the business
  - One department/sub-domain -> One Service
- DDD concepts
  - Subdomains
    - A department can have multiple sections, and each section may be a sub-domain
  - Bounded contexts (scope)
    - Explicit boundary within which a domain model exists
  - DDD vs global business modelling
    - No single model for the entire business
    - The domain model is private to the sub-domain and other sub-domains do not have to agree with the model developed.
Notes
- A good design will scope out one microservice to a single bounded context
- The SOA (Service Oriented Architecture) approach would model the enterprise as a whole
- Communication between microservices can happen via events:
  - Events are triggered as a result of state changes in bounded contexts
  - Like we did in the project
Decomposition guidelines
- Single Responsibility Principle
  - A class/service should have only one reason to change
  - If a class/service has multiple responsibilities that change independently, it won’t be stable
- Common Closure Principle
  - The classes in a package should be closed together against the same kinds of changes. A change that affects a package affects all the classes in the package
  - Two classes change in lockstep -> same package
  - Improves the maintainability of application
Decomposition obstacles
- Network latency
  - Too many messages between two services
- Reduced availability due to synchronous communication
  - REST is synchronous and may become a bottleneck
- Maintaining data consistency across services
  - Some system operations need to update data in multiple services.
  - atomic updates -> data reside within a single service
- Obtaining a consistent view of the data
  - Although each service’s database is consistent, we can’t obtain a globally consistent view of the data.
  - Consistent view needed -> data must reside in a single service
- God classes preventing decomposition:
  - Classes that are used throughout an application
  - Typically implement business logic for many different aspects of the application
  - Normally has a large number of fields mapped to a database table with many columns
  - Central concept in the application domain
  - DDD solution
    - Treat each service as a separate sub-domain with its own domain model (i.e.: a version of the God class).
    - e.g.: an Order is a ticket for the kitchen, an address for the delivery, ...
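
A minimal sketch of the DDD solution to a God class, assuming a hypothetical "order created" event shared between services. The class names, event fields and handler functions are invented for illustration; each handler would live in its own service.

```python
from dataclasses import dataclass


@dataclass
class KitchenTicket:
    """Kitchen service model: what has to be prepared."""
    order_id: str
    items: list


@dataclass
class Delivery:
    """Delivery service model: where the order has to go."""
    order_id: str
    address: str


def kitchen_on_order_created(event: dict) -> KitchenTicket:
    # The kitchen keeps only the fields it cares about.
    return KitchenTicket(event["order_id"], event["items"])


def delivery_on_order_created(event: dict) -> Delivery:
    # The delivery service ignores the items and keeps the address.
    return Delivery(event["order_id"], event["address"])


event = {"order_id": "o-1", "items": ["pizza"], "address": "Rua X 10"}
ticket = kitchen_on_order_created(event)
delivery = delivery_on_order_created(event)
print(ticket, delivery)
```

Only the order identifier is shared, so neither model needs the many fields a single `Order` God class would carry.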

Service API

Operations
- A system operation
- Collaborative operation between services
Events
- State changes
- Notifications
Step 1
- Assigning System Operations to Services
  - Which service is the initial entry point for a request?
  - Assign an operation to a service that needs the information provided by the operation
  - Assign an operation to the service that has the information required to handle it
Step 2
- Defining the APIs required to support collaboration between services
  - Which services will I need to collaborate with?
Context map
- Help visualize the relationships between bounded contexts.
- The upstream context defines the domain model passed between the two contexts.
- The downstream context should be well aware of any changes happening to the upstream context.
- e.g.:
Notes
- Each service should represent a Business Logic (Is it not this decomposition by BC? Confirm with them)
- A data service is probably a bad design

Microservices Design Principles

High Cohesion and Loose Coupling
- AVOID: one microservice to address two or more unrelated problems
- Highly cohesive system is naturally loosely coupled. Coupling is a measure of the interdependence between different microservices
Resilience
- Measure of the capacity of a system / individual components in a system to recover quickly from a failure
- Doesn’t cause the entire system to fail
- Implementation:
  - Timeouts for calls over the network
  - Circuit Breaker
    - a microservice keeps timing out against one endpoint -> no point in retrying, at least for some time -> a circuit breaker wrapper handles this
    - automatic error responses for services exceeding a failure threshold in the recent past
Observability
- combination of monitoring, distributed logging, distributed tracing, and visualization of a service’s runtime behavior and dependencies
- May track throughput of each microservice, the number of success/failed requests, utilization of CPU, memory, latency and some business-related metrics
- How to achieve?
  - Logging
    - record events
  - Metrics
    - latency, ... obtained by processing the logged events
  - Tracing
    - consider the event logs ordering
    - Allows tracing a problem (e.g. high latency)
Automation
- Rationale behind a microservice architecture -> less time to production and shorter feedback cycles
- Two categories
  - Continuous integration
    - Focuses on maintaining source code integrity
  - Continuous deployment
    - bundle applications, infrastructure, middleware, and the supporting installation processes and dependencies into release packages

Inter-Service Communication - Part I

Services are autonomous
Services communicate over the network
A service based application can be considered a distributed system running multiple services on different network locations.
Communication types:
- Synchronous
  - The client sends a request and waits for response from the service. Both parties have to keep the connection open until response arrives.
  - Can be a non-blocking IO implementation, using callbacks
- Asynchronous
  - Send message and proceed without waiting for response

Synchronous communication

REST (Representational State Transfer)
- Uses a navigational scheme to represent objects and services over a network, known as resources.
- Not protocol dependent
- With the HTTP protocol, a resource is accessed using a unique URL and the standard HTTP operations GET, PUT, DELETE, POST, and HEAD
- Stateless servers.
RPC (Remote Procedure Calls)
- Key objective
  - Make the process of executing code on a remote machine as simple and straightforward as calling a local function
- Lost popularity due to complexity and performance
- gRPC
  - Developed by Google -> now Open Source
  - Uses Protocol Buffers: a language-neutral, platform-neutral extensible mechanism for serializing structured data
  - Interface Definition Language (IDL) describes both the service interface and the structure of the payload messages
  - Uses server-side skeletons and client-side stubs to invoke the service
  - Uses HTTP/2 as the transport protocol -> key reason for the success and wide adoption of gRPC
    - Advantages
      - Multiple requests in the same open connection
      - Lower overhead due to less redundancy over several requests (e.g. Cookies)
      - Avoids header repetition -> introduces header compression to optimize bandwidth use
REST - Richardson Maturity Model
- Level 0
  - Not considered RESTful at all
  - Single URL for all resources and the content decides the operation
  - Single HTTP method (in most cases, POST)
  - SOAP web services are of this kind
- Level 1 - Resource URIs
  - Has individual URIs for each resource, but
  - The message still contains operation details
- Level 2 - HTTP verbs
  - HTTP verbs specify the operation
  - A service at this level is considered a proper RESTful API
- Level 3 - Hypermedia Controls
  - Service responses have links that control the application state for the client (Hypermedia as the Engine of Application State, HATEOAS)
  - Hypermedia controls tell us what we can do next, and the URI of the resource we need to manipulate to do it
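
As an illustration of Level 3, the sketch below builds a resource representation whose links change with the resource state. The `/orders/...` URIs, the `_links` key and the `order_resource` helper are assumptions made for this example; real APIs use various link formats.

```python
def order_resource(order_id: str, status: str) -> dict:
    """Build a Level 3 style representation: data plus hypermedia links."""
    links = {"self": {"href": f"/orders/{order_id}", "method": "GET"}}
    # The available next actions depend on the current state.
    if status == "pending":
        links["cancel"] = {"href": f"/orders/{order_id}", "method": "DELETE"}
        links["pay"] = {"href": f"/orders/{order_id}/payment", "method": "PUT"}
    return {"id": order_id, "status": status, "_links": links}


print(order_resource("7", "pending"))
print(order_resource("7", "shipped"))
```

A client following these links does not need to hard-code which operations are valid in each state; it only reads what the response offers.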
REST - Message formats
- JSON
- GraphQL
  - Improves on the REST model by allowing the client to retrieve multiple pieces of data in a single call
  - JSON format
  - Netflix Falcor provides similar function

Inter-Service Communication - Part II

Asynchronous communication

Allows the services to be more autonomous
The client does not wait for a response
The client may not receive a response at all or the response will be received asynchronously via a different channel
Middleware: lightweight and dumb message broker
- There is no business logic in the broker and it is a centralized entity with high-availability
Message Protocols
- JSON
- XML
- Apache Avro
  - Compact, fast, binary data format
Messaging styles
- Single receiver
  - A given message is reliably delivered from a producer to exactly one consumer through a message broker
- Multiple receivers
  - Message produced by a single producer is delivered to more than one consumer
  - Publisher-subscriber pattern (pub-sub)
  - AMQP-based brokers support pub-sub messaging, e.g. RabbitMQ
  - Kafka is the most widely used broker for pub-sub messaging between microservices
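
The two messaging styles can be contrasted with a toy in-memory broker. This is only a sketch: `InMemoryBroker` and its method names are invented here and do not mirror the API of RabbitMQ, Kafka or any real broker, and round-robin is just one assumed way of picking the single receiver.

```python
from collections import defaultdict
from itertools import cycle


class InMemoryBroker:
    """Toy broker with no business logic: it only routes messages."""

    def __init__(self) -> None:
        self._queue_consumers = defaultdict(list)
        self._round_robin = {}
        self._topic_subscribers = defaultdict(list)

    def consume(self, queue: str, handler) -> None:
        self._queue_consumers[queue].append(handler)
        # Rebuild the rotation whenever a consumer joins the queue.
        self._round_robin[queue] = cycle(self._queue_consumers[queue])

    def subscribe(self, topic: str, handler) -> None:
        self._topic_subscribers[topic].append(handler)

    def send(self, queue: str, message: str) -> None:
        """Single receiver: exactly one consumer gets the message."""
        next(self._round_robin[queue])(message)

    def publish(self, topic: str, message: str) -> None:
        """Multiple receivers: every subscriber gets a copy."""
        for handler in self._topic_subscribers[topic]:
            handler(message)


received = defaultdict(list)
broker = InMemoryBroker()
broker.consume("orders", lambda m: received["worker-1"].append(m))
broker.consume("orders", lambda m: received["worker-2"].append(m))
broker.subscribe("order-events", lambda m: received["billing"].append(m))
broker.subscribe("order-events", lambda m: received["email"].append(m))

broker.send("orders", "order-1")
broker.send("orders", "order-2")
broker.publish("order-events", "order-1-created")
print(dict(received))
```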
AMQP - Advanced Message Queuing Protocol
- Protocol for interoperability between all messaging middleware
- Ensures reliable and fast message delivery through message acknowledgements
  - When a message is delivered to a consumer, the consumer notifies the broker
  - The broker will only completely remove a message from a queue when it receives a notification for that message
  - The queue ensures the ordered delivery and processing of the messages
- AMQP Message Brokers: Software that implements the protocol (RabbitMQ, ActiveMQ, ...)
- e.g.
- Features
  - heartbeat / healthcheck
    - To ensure that the application layer promptly finds out about disrupted connections and completely unresponsive peers
  - Broker failures
    - AMQP standard defines a concept of durability for exchanges, queues, and persistent messages, requiring that a durable object or persistent message survive a restart
  - Producer failures
    - Retransmit any messages for which an acknowledgement has not been received from the broker
    - Possibility of message duplication: consumer applications need to be implemented so that their internal state doesn’t change even if the same message is processed multiple times (idempotent consumers).
- RabbitMQ
  - General purpose message broker, employing several variations of point-to-point, request/reply and pub-sub communication patterns.
  - Smart broker / dumb consumer model
  - e.g
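
The duplicate-handling requirement noted under producer failures can be sketched as a consumer that remembers which messages it has already applied. The message shape (an `id` and an `amount`) and the `IdempotentConsumer` class are assumptions for illustration; a real service would persist the processed ids together with its state.

```python
class IdempotentConsumer:
    """Applies each message at most once, keyed by a message id."""

    def __init__(self) -> None:
        self.balance = 0
        self._processed_ids = set()

    def handle(self, message: dict) -> bool:
        """Return True if applied, False if it was a duplicate."""
        if message["id"] in self._processed_ids:
            return False  # redelivery: state must not change again
        self.balance += message["amount"]
        self._processed_ids.add(message["id"])
        return True


consumer = IdempotentConsumer()
message = {"id": "m-1", "amount": 10}
first = consumer.handle(message)
second = consumer.handle(message)  # simulated retransmission
print(first, second, consumer.balance)
```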
Kafka
- Distributed pub-sub messaging system, designed for high volume messages
- Has its own messaging protocol
- Data is stored durably, in order and can be read deterministically
- Data is distributed for failover and scalability
- Unit of data is a message (key, value, timestamp)
  - Value is an array of bytes
  - Messages are organized in Topics
    - Topics may be split into multiple partitions
      - A producer can select a partition by using a key
- Partitions are the primary mechanism in Kafka for parallelizing consumption and scaling a topic beyond the throughput limits of a single node
- Each partition can be hosted in different nodes
- By default, there is no need to specify a partition when writing
- Dumb broker that assumes smart consumers read its buffers
- Use cases:
  - Distribute a message to multiple receivers
  - Similar receivers consume messages, from the same topic, for scaling the consumer side

Software Architecture Patterns

Architecture Evolution
- Monolith
- SOA (2000s)
- Microservices (2010s)

Monolith Architecture

Contained in a single deployment
Everything, from user interface to database calls, is included in the same codebase
Good for relatively small applications
Advantages
- Easier to pull down a single code base and start working
- Ramp up time may be less
- Creating test environments is as simple as providing a new copy
- It may be designed to include multiple components and applications
Disadvantages
- Difficult to work in parallel in the same code base
- Any change, no matter how trivial, requires deploying a new version of the entire application
- Refactoring potentially impacts the entire application -> tight coupling
- Often the only solution to scale is to create multiple, resource-intensive copies of the monolith
- Integration can be difficult
- Difficult to test due to the need to configure the entire monolith
- Code reuse is challenging and often other apps end up having their own copies of code
- Hard to apply agile development
- Single point of failure
- Difficult to adopt new technologies and frameworks, as all the functionalities have to be built on homogeneous technologies/frameworks
N-Layer applications
- Partition application logic into specific layers
- Most common layers
  - UI Layer
  - Business Logic Layer
  - Data Access Layer
- Advantages
  - Refactoring is isolated to a layer
  - Teams can independently build, test, deploy, and maintain separate layers
  - Layers can be swapped out
Visual Schema

Service Oriented Architecture (SOA)

Applications are composed of more loosely coupled components that use a messaging bus to communicate between themselves
Services -> reusable, loosely coupled entities
- Self-contained implementation of a well-defined business functionality
- Accessible via calls over the network
- Software components with well-defined interfaces that are implementation-independent. Separation of the interface (the what) from its implementation (the how).
- Consumers are only concerned about the service interface and do not care about its implementation.
- Composite services can be built from aggregates of other services.
- Deployed inside an application server
Requires an additional layer - Enterprise Service Bus (ESB)
- Integrates business capabilities (product, customer, ...) -> creates composite business capabilities, exposed to the consumers
- Contains a significant portion of the business logic of the entire application
- Monolithic entity where all developers share the same runtime to develop/deploy their service integrations.
- Smart Pipes
API Gateway
- Motivated by the difficulty of interacting with SOAP services
- Layer on top of the existing SOA implementations
- Known as the API façade
- Exposes a simple API for a given business functionality and hides all the internal complexities of the ESB/Web Services layer
- Also used for security, throttling, caching and monetization
Visual Schema

Microservices Architecture

Independent application services delivering one single functionality in a loosely connected and self-contained fashion, communicating through a light-weight protocol (e.g. HTTP, REST, ...)
More details in previous chapters
Visual schema
Characteristics
- Business Capability Oriented
  - Each service serves a specific business purpose and has a well-defined set of responsibilities
  - Each service does only one thing and does it well
  - SOA will have more generic services, while here we have fine-grained services
- Autonomous: Develop, Deploy, and Scale Independently
  - Microservices are developed, deployed, and scaled as independent entities
  - Services do not share the same execution runtime
  - Increases system resilience due to isolation of failures to service level;
  - Can scale microservices according to each microservice traffic
- Elimination of the central ESB by breaking its functionalities into each service
- Services take care of the inter-service communication and composition logic
  - Using smart endpoints and dumb pipes (lightweight protocols like REST)
    - Smart endpoints
      - All business logic resides at the microservice level
    - Dumb pipes
      - Only route messages
      - Zero business logic
- Failure tolerance
  - One microservice crashes -> only it collapses
  - Need to apply all the resiliency-related capabilities, such as circuit breakers, disaster recovery, load balancing, fail-over, and dynamic scaling based on traffic patterns
- Decentralized data management
  - Each microservice has its own database
  - Several databases might need to be updated as a consequence of a single API request
  - Microservices update each other using the APIs
  - A microservice can only access its own database
- Service Governance
  - Decentralized process -> each entity governs its own domain
    - In SOA these concepts are discarded
  - Design-time governance of services
    - Technologies, protocols, ...
  - Runtime governance
    - Service definitions, service registry and discovery, service versioning, service runtime dependencies, service ownerships and consumers, enforcing QoS, and service observability
- Service Observability
  - combination of monitoring, distributed logging, distributed tracing, and visualization of a service’s runtime behavior and dependencies (as seen in previous chapters)

Integrating Services - Part I

In a monolith approach, a request may be satisfied with a single call
With microservices, we may need to access multiple services to satisfy the same request
- External access has higher latency
Invoking the services directly (e.g. web service calling microservices) has the following problems:
- Multiple requests to retrieve the data needed
  - inefficient
  - can result in a poor user experience;
- The lack of encapsulation
  - caused by clients knowing about each service and its API
  - difficult to change the architecture and the APIs
  - developers might change an API in a way that breaks existing clients
  - Updating client’s app is more cumbersome
- Services might use communication mechanisms that aren’t convenient or practical for clients to use

API Gateway

Entry point service into the microservices-based application from external API clients
Encapsulates the application’s internal architecture and provides an API to its clients
May also have other responsibilities, such as authentication and monitoring
Visual impact
Responsible for
- Request routing (to a service)
- API composition (invokes multiple services)
- Protocol translation
  - Client protocol may be different from the services (e.g. REST and RPC)
May have 3 API modules
- Mobile API
  - API for the mobile client
- Browser API
  - API for the JS app running in the browser
- Public API
  - API for third party developers
Why provide each client with its own API?
- e.g. -> mobile apps present less information than browsers
- Higher reliability
- Independently scalable
Architecture Disadvantages
- yet another highly available component that must be developed, deployed, and managed
- risk that the API gateway becomes a development bottleneck
  - Developers must update the API gateway to expose their services’ APIs
  - The process for updating the API gateway must be as lightweight as possible
Design issues
- Performance and scalability
  - Every external request goes through the gateway
  - Synchronous I/O model
    - Each connection is handled by a dedicated thread
    - OS threads are heavyweight
    - Limit on the number of threads
  - Asynchronous (nonblocking) I/O model
    - Single thread dispatches I/O requests to event handlers
    - More scalable but more complex programming
- Writing maintainable code by using reactive programming abstractions
  - Calling services sequentially
    - The response time is the sum of all services response times
    - If there are no dependencies among services, all services should be called concurrently
      - Solution: REACTIVE APPROACH
        
        Java 8 CompletableFutures
        
        RxJava (Reactive Extensions for Java) Observables, created by Netflix specifically to solve this problem
        
        Scala Futures
        
        JavaScript promises, RxJS (for NodeJS)
- Handling partial failure
  - Properly handle failed requests and requests that have unacceptably high latency.
    - Circuit breaker pattern when invoking services
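
The benefit of calling independent services concurrently can be sketched with Python's `asyncio`, used here as a stand-in for the reactive abstractions listed above. The service names and the 0.1 s delays are made-up assumptions; `call_service` only simulates a non-blocking network call.

```python
import asyncio
import time


async def call_service(name: str, delay: float) -> str:
    """Stand-in for a non-blocking network call to a service."""
    await asyncio.sleep(delay)
    return f"{name}-data"


async def compose_sequentially() -> list:
    # Response time is roughly the sum of the individual delays.
    return [await call_service("orders", 0.1), await call_service("users", 0.1)]


async def compose_concurrently() -> list:
    # Independent calls run together; time is roughly the slowest call.
    return list(await asyncio.gather(
        call_service("orders", 0.1), call_service("users", 0.1)
    ))


def timed(coroutine_function) -> tuple:
    start = time.perf_counter()
    result = asyncio.run(coroutine_function())
    return result, time.perf_counter() - start


sequential_result, sequential_time = timed(compose_sequentially)
concurrent_result, concurrent_time = timed(compose_concurrently)
print(sequential_time, concurrent_time)
```

Both variants return the same composed data; only the elapsed time differs, which is the point of API composition with concurrent calls.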

Resiliency

measure of the capacity of a system or individual components in a system to recover quickly from a failure
attribute of a system that enables it to deal with failure in a way that doesn’t cause the entire system to fail (other definition)
microservices architecture is naturally a distributed system
- collection of computers (or nodes) connected over the network, with no shared memory, that appears to its users as a single coherent system
- Network will always be unreliable
Patterns
- Timeout
  - deciding when to stop waiting for a response at the caller service level
  - After that time some action is taken
  - Isolate failures
    - Service A (integrator) calls services X and Y
      
      A timeout should be defined separately for X and Y
    - A failure of another service does not have to become your service’s problem
- Circuit Breaker
  - Wrap the invocation with an object that monitors and prevents further damage to the system
  - If the service invocation fails repeatedly and reaches a certain threshold, then the circuit breaker wrapper prevents any further invocation of the external service
  - After a certain period of time (circuit reset timeout), a new request is allowed (Half Open state). If it succeeds the breaker goes to the closed state (allows requests), and to the open state (does not allow requests) otherwise
  - Prevents cascade failures
  - Visual schema
- Fail fast
  - detect a failure as quickly as possible
  - Concept
    - a fast failure response is much better than a slow failure response
- Bulkhead
  - Application is partitioned so that an error that occurs in a partition is localized to that one partition only, preventing the entire system from failing
  - Procedure
    - Group similar operations into a microservice
    - Independent business functionalities are implemented as separate microservices
  - Following the microservices architecture ensures this pattern
- Load Balancing
  - Distribute the load across multiple microservice instances
  - Kubernetes (e.g.) provides this capability
- Failover
  - Reroute requests to alternate services if a given service fails
  - Kubernetes (e.g.) provides this capability
- Let it crash
  - If service is unstable and recovery difficult -> start new server instance
  - Having a service per host and a rapid server startup time is crucial (container based)
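
The closed, open and half-open states of the circuit breaker pattern listed above can be sketched as a small wrapper. This is a simplified illustration, not a production implementation: the parameter names `failure_threshold` and `reset_timeout`, the injectable `clock`, and the choice to count consecutive failures are all assumptions.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without invoking the service."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int, reset_timeout: float, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None
        self.state = "closed"

    def call(self, operation):
        if self.state == "open":
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("failing fast")
            self.state = "half-open"  # allow one trial request
        try:
            result = operation()
        except Exception:
            self._failures += 1
            if self.state == "half-open" or self._failures >= self.failure_threshold:
                self.state = "open"
                self._opened_at = self._clock()
            raise
        self._failures = 0
        self.state = "closed"
        return result


def failing():
    raise RuntimeError("service down")


now = [0.0]  # fake clock so the example does not have to sleep
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=5.0, clock=lambda: now[0])
for _ in range(2):
    try:
        breaker.call(failing)
    except RuntimeError:
        pass
state_after_failures = breaker.state

try:
    breaker.call(lambda: "never invoked")
    rejected = False
except CircuitOpenError:
    rejected = True

now[0] = 6.0  # the reset timeout has elapsed
recovered = breaker.call(lambda: "ok")
print(state_after_failures, rejected, recovered, breaker.state)
```

While the breaker is open, callers get an immediate error instead of waiting on a timeout, which is also how it supports the fail fast pattern.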

Integrating Services - Part II

Observability

Includes
- Monitoring
- Logging
- Tracing
- Visualization
Aim
- be alerted if there is a problem, such as a failed service instance or a disk filling up
- ideally before it impacts a user
- in case of error
  - Be able to troubleshoot and identify the problem's source
Patterns
- Health check API
  - A service exposes a health check API endpoint, such as GET /health, which returns the health of the service
    - Verifies database access and communication server
  - Invoked periodically to determine the health of the service.
  - Visual schema
- Log aggregation
  - Aggregate the logs of all services in a centralized database that supports searching and alerting
    - Aggregate logs
    - Store them
    - Allow user searches
  - When a request involves more than one service, log aggregation keeps all the log information related to that request in one place
  - Alerts can be configured to be triggered when log entries match a given search criteria
  - Popular infrastructures
    - ElasticSearch
    - Logstash
    - AWS CloudWatch Logs
    - ...
- Distributed tracing pattern
  - Assign each external request a unique ID and record how it flows through the system from one service to the next in a centralized server that provides visualization and analysis
    - Makes use of the logs
  - A trace represents an external request and consists of one or more spans.
  - A span represents an operation
    - key attributes
      - operation name
      - start timestamp
      - end time
    - Can have one or more child spans
      - represent nested operations
- Application metrics
  - Services report metrics to a central server that provides aggregation, visualization, and alerting
    - Metrics to provide critical information about the health of an application
      - Infrastructure-level: CPU, memory, disk utilization
      - Application-level: service request latency and number of requests executed
  - Push model
    - A service instance sends the metrics to the Metrics Service by invoking an API
      - e.g. AWS CloudWatch metrics
  - Pull model
    - Metrics Service invokes a service API to retrieve the metrics from the service instance
      - e.g. Prometheus
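The difference between the push and pull models is only in who initiates the call. The sketch below shows both directions with plain method calls standing in for the network APIs. `MetricsRegistry`, `MetricsService`, `report` and `scrape` are hypothetical names, not the API of CloudWatch or Prometheus.

```python
from collections import Counter
from typing import Dict


class MetricsRegistry:
    """In-process counters kept by a service instance."""

    def __init__(self) -> None:
        self._counters: Counter = Counter()

    def increment(self, name: str, amount: int = 1) -> None:
        self._counters[name] += amount

    def snapshot(self) -> Dict[str, int]:
        return dict(self._counters)


class MetricsService:
    """Stand-in for the central metrics server."""

    def __init__(self) -> None:
        self.samples: Dict[str, Dict[str, int]] = {}

    def report(self, instance: str, metrics: Dict[str, int]) -> None:
        # Push model: the service instance invokes this API itself.
        self.samples[instance] = metrics

    def scrape(self, instance: str, registry: MetricsRegistry) -> None:
        # Pull model: the metrics service asks the instance for its metrics.
        self.samples[instance] = registry.snapshot()


registry = MetricsRegistry()
registry.increment("requests_total")
registry.increment("requests_total")

server = MetricsService()
server.report("orders-1", registry.snapshot())  # push
registry.increment("requests_total")
server.scrape("orders-1", registry)  # pull
print(server.samples)
```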
- Exception tracking
  - Services report exceptions to a central service
    - de-duplicates exceptions
    - generates alerts
    - manages the resolution of exceptions
  - Exception might be a symptom of a failure or a programming bug
  - Services rarely log exceptions
    - when one does, it is important to identify the root cause.
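De-duplication and alerting, as listed above, could look roughly like this. The sketch assumes that two exceptions are "the same" when they share the exception type and the line where they were raised. That fingerprint rule, and the names `ExceptionTracker` and `failing_operation`, are assumptions for illustration.

```python
import hashlib
import traceback
from collections import defaultdict
from typing import Callable


class ExceptionTracker:
    """Stand-in for a central exception tracking service."""

    def __init__(self, alert: Callable[[str], None]) -> None:
        self._alert = alert
        self.occurrences = defaultdict(int)

    @staticmethod
    def fingerprint(exc: BaseException) -> str:
        # Group by exception type and the innermost frame that raised it.
        frames = traceback.extract_tb(exc.__traceback__)
        location = f"{frames[-1].filename}:{frames[-1].lineno}" if frames else "unknown"
        raw = f"{type(exc).__name__}|{location}"
        return hashlib.sha256(raw.encode()).hexdigest()[:12]

    def report(self, service: str, exc: BaseException) -> str:
        key = self.fingerprint(exc)
        self.occurrences[key] += 1
        if self.occurrences[key] == 1:
            # Alert only the first time a given exception is seen.
            self._alert(f"[{service}] new exception {type(exc).__name__}: {exc}")
        return key


alerts = []
tracker = ExceptionTracker(alerts.append)


def failing_operation() -> None:
    raise ValueError("order not found")


for _ in range(3):
    try:
        failing_operation()
    except ValueError as exc:
        tracker.report("order-service", exc)

print(alerts, dict(tracker.occurrences))
```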
- Audit logging
  - Record user actions in a database
    - Objectives
      - help customer support
      - ensure compliance
      - detect suspicious behavior
  - Each audit log entry records:
    - identity of the user
    - action they performed
    - the business object(s)

Security

Includes
- Authentication
  - Verifies the identity
    - typically verifies credentials, such as a user ID and password
- Authorization
  - Verify if allowed to perform the requested operation on the specified data
  - Applications often use a combination of role-based security and access control lists (ACLs).
- Auditing (Repeated from previous chapter)
  - Tracking operations performed
    - Objectives
      - help customer support
      - ensure compliance
      - detect suspicious behavior
- Secure interprocess communication
  - Ideally, all communication in and out of services should be over Transport Layer Security (TLS)
  - Interservice communication may need authentication
Session Token
- Stores ID and roles of the user
Security context
- Established by the system based on the Session Token
- Request handlers retrieve user information from the security context
  - Client has to send session token on all requests
Monolithic vs Microservices
- Monolithic
  - In-memory security context
    - Share context among several requests
  - Centralized session
    - Store session information in a database
- Microservices
  - Services do not share memory
  - A central session database does not fit the microservice architecture
Handling authentication in the API gateway
- Services no longer have to authenticate each request themselves
- Centralizes security concerns in a single point
Handling Authorization
- Should be handled by the services
  - Otherwise the API Gateway becomes too coupled to the services
  - API Gateway can only apply role-based authorization to URL paths
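The split described above can be sketched as two checks: a coarse role check on the URL path, which is all the gateway can do, and a fine-grained check on the business object, which only the owning service can do. The route table, the role names and the ownership rule are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical gateway configuration: path prefix -> roles allowed to call it.
ROUTE_ROLES = {
    "/orders": {"CUSTOMER", "ADMIN"},
    "/admin": {"ADMIN"},
}


@dataclass(frozen=True)
class User:
    user_id: str
    roles: FrozenSet[str]


@dataclass(frozen=True)
class Order:
    order_id: str
    owner_id: str


def gateway_allows(user: User, path: str) -> bool:
    """Role-based check on the URL path, as an API gateway could apply it."""
    for prefix, allowed_roles in ROUTE_ROLES.items():
        if path.startswith(prefix):
            return bool(user.roles & allowed_roles)
    return False  # deny paths that are not configured


def service_allows(user: User, order: Order) -> bool:
    """Object-level check that needs domain knowledge held by the service."""
    return "ADMIN" in user.roles or order.owner_id == user.user_id


alice = User("alice", frozenset({"CUSTOMER"}))
bobs_order = Order("7", owner_id="bob")

print(gateway_allows(alice, "/orders/7"))  # the path is open to customers
print(service_allows(alice, bobs_order))   # but the order belongs to bob
```

The gateway lets the request through because it only sees the path and the role, while the service rejects it because it knows who owns the order.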
Token Types
- UUID - universally unique identifier
  - requires a synchronous RPC call by the service to a security server in order to retrieve user information
- JWT - JSON Web Token
  - Transparent token
  - Contains the user information (ID and roles) and an expiration date
  - signed by the creator (API Gateway) and verified by the recipient (service) using a pair of keys
OAuth 2.0 standard protocol
- Authorization protocol that was originally designed to enable a user of a public cloud service, such as GitHub or Google, to grant a third-party application access to their information without revealing their password
- Can be used for authentication and authorization in a microservice application
- Uses HTTPS for communication between the client and the authorization server
- Key concepts
  - Authorization Server
    - Provides an API for authenticating users and obtaining an access token and a refresh token (e.g. Spring OAuth)
  - Resource Owner
    - The owner of the resources, who has to grant the authorization
  - Resource Server
    - A service that uses an access token to authorize access
    - In a microservice architecture, the services are resource servers
  - Client
    - A client that wants to access a Resource Server
    - In a microservice architecture, API Gateway is the OAuth 2.0 client
  - Access Token
    - A token that grants access to a Resource Server
    - The format of the access token is implementation dependent
    - Some implementations, such as Spring OAuth, use JWTs
  - Refresh Token
    - A long-lived yet revocable token that a Client uses to obtain a new Access Token
- Visual schema of a basic OAuth 2.0 sequence flow
- Use cases
  - API client
  - session-oriented clients
- Implementing authentication and authorization correctly is challenging.
- Frameworks
  - Passport (NodeJS)
  - Spring Security
  - ...
