Implement support for additional load balancing algorithms

hyperioxx opened this issue a year ago · comments

Currently, Frontman supports only the round-robin load balancing algorithm. However, there are several other algorithms that could be useful for certain use cases, such as least connections or IP hash. It would be great to add support for these additional algorithms to give users more flexibility in configuring their load balancing.

Proposed Solution: We could add an option to the configuration file for users to specify the load balancing algorithm they want to use. Then, we would need to implement the necessary logic in Frontman to support the selected algorithm. This could involve changes to the load balancing code, as well as updates to the API and documentation.

Additional Considerations: We should also consider how the new load balancing algorithms will impact Frontman's performance and scalability, as well as any potential compatibility issues with existing setups. We may need to conduct additional testing and optimization to ensure that Frontman remains stable and performant with the new features.

Amit Yahav · Answer 1 · Mon Mar 06 2023 23:09:39 GMT+0800 (China Standard Time)

Hey, I would like to tackle this. I will deep dive into the project in the next couple of days!

Aaron Parfitt · Answer 2 · Tue Mar 07 2023 00:12:14 GMT+0800 (China Standard Time)

@amityahav Absolutely dude! I've already "sketched" out the interface for a load balancer policy here https://github.com/Frontman-Labs/frontman/blob/main/loadbalancer/loadbalancer.go and a simple implementation of round robin here https://github.com/Frontman-Labs/frontman/blob/main/loadbalancer/roundrobin.go. Have a look and tell me your thoughts, as the interface might have to change because we may need extra info for a weighted policy. I'm thinking of using the strategy pattern for this, but what are your thoughts?
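
For readers following along, here is a minimal sketch of what a strategy-pattern policy could look like. The `Policy` interface and `RoundRobin` type below are hypothetical illustrations, not the actual definitions in loadbalancer.go or roundrobin.go, which may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// Policy is a hypothetical strategy interface: each algorithm decides
// which upstream target receives the next request.
type Policy interface {
	// Next returns the chosen target, or false if no target is available.
	Next(targets []string) (string, bool)
}

// RoundRobin cycles through the targets in order.
type RoundRobin struct {
	mu   sync.Mutex
	next int
}

// Next returns the target at the current position and advances the cursor.
func (r *RoundRobin) Next(targets []string) (string, bool) {
	if len(targets) == 0 {
		return "", false
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	t := targets[r.next%len(targets)]
	r.next = (r.next + 1) % len(targets)
	return t, true
}

func main() {
	// The gateway only depends on the interface, so policies are swappable.
	var p Policy = &RoundRobin{}
	targets := []string{"localhost:8000", "localhost:8001", "localhost:8002"}
	for i := 0; i < 4; i++ {
		t, _ := p.Next(targets)
		fmt.Println(t)
	}
}
```

A weighted policy would likely need per-target metadata rather than plain strings, which is the kind of interface change mentioned above.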

Amit Yahav · Answer 3 · Thu Mar 09 2023 03:14:27 GMT+0800 (China Standard Time)

Hey, did you test the gateway? I'm getting weird behavior and I can't get it working properly.
I've created a service with 3 instances.
I've encountered a problem inside findBackendService when my s.Domain=localhost but r.Host=127.0.0.1:8081, which is the proxy's IP address. Also, I would expect r.URL.Host to equal the target address.
Also, targetURL, err := url.Parse(backendService.Scheme + "://" + upstreamTarget + backendService.Path)
does not seem to parse the URL correctly.

Aaron Parfitt · Answer 4 · Thu Mar 09 2023 03:32:39 GMT+0800 (China Standard Time)

@amityahav Thank you dude. I haven't had the chance to test it extensively yet, and tbh I need to put the time into fleshing out the tests. I'll be happy to help you troubleshoot the issue, but I won't be available till tomorrow. You're welcome to triage the problem and raise an issue if you want :)

Aaron Parfitt · Answer 5 · Thu Mar 09 2023 09:05:23 GMT+0800 (China Standard Time)

@amityahav can you rebase? I think I've solved your issue :)

Amit Yahav · Answer 6 · Thu Mar 09 2023 16:16:38 GMT+0800 (China Standard Time)

Nope, the problem still occurs. I will investigate later on.

Aaron Parfitt · Answer 7 · Thu Mar 09 2023 16:40:32 GMT+0800 (China Standard Time)

Hmmmm, that's annoying. I started adding more tests to the gateway handler last night. It would be good to get this scenario as a test case for regression. Also, thank you dude!

Aaron Parfitt · Answer 8 · Thu Mar 09 2023 16:51:53 GMT+0800 (China Standard Time)

@amityahav I've created a test case here for what I think your problem is. Please confirm:

{
	name:               "Test Case 8 - Multiple backend targets with localhost domain",
	domain:             "localhost",
	path:               "/api",
	scheme:             "http",
	stripPath:          true,
	maxIdleConns:       100,
	maxIdleTime:        10,
	timeout:            5,
	upstreamTargets:    []string{"http://localhost:8000", "http://localhost:8001", "http://localhost:8002"},
	requestURL:         "http://localhost/api/anything?test",
	expectedStatusCode: http.StatusOK,
	expectedHeader:     "plugin",
},

Amit Yahav · Answer 9 · Thu Mar 09 2023 17:23:38 GMT+0800 (China Standard Time)

My services.yaml looks like this:

- name: test_service
  scheme: http
  upstreamTargets:
    - http://localhost:5000
    - http://localhost:5001
    - http://localhost:5002
  path: /api
  domain: localhost
  healthCheck: /api
  retryAttempts: 3
  timeout: 10ns
  maxIdleConns: 100
  maxIdleTime: 30ns
  stripPath: false

and I'm sending a GET request to http://127.0.0.1:8081/api/

Aaron Parfitt · Answer 10 · Thu Mar 09 2023 18:06:36 GMT+0800 (China Standard Time)

@amityahav Ah OK, so there are a few things. Upstream targets shouldn't have a scheme, but I'm now thinking that they should (I would like your thoughts on that). Also, there's a bug in the creation of clients, which I'm doing a hotfix for.
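
To illustrate why a target that already carries a scheme breaks the string concatenation quoted earlier in the thread, here is a small sketch. `buildTargetURL` is a hypothetical helper, not part of Frontman; it assumes the gateway wants to accept targets both with and without a scheme.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildTargetURL is a hypothetical helper. It accepts an upstream target
// with or without a scheme and falls back to the service-level scheme.
func buildTargetURL(scheme, target, path string) (*url.URL, error) {
	if !strings.Contains(target, "://") {
		target = scheme + "://" + target
	}
	u, err := url.Parse(target)
	if err != nil {
		return nil, err
	}
	if u.Host == "" {
		return nil, fmt.Errorf("upstream target %q has no host", target)
	}
	u.Path = path
	return u, nil
}

func main() {
	// Naive concatenation with a target that already has a scheme yields
	// "http://http://localhost:5000/api", whose host is not the upstream.
	naive, err := url.Parse("http" + "://" + "http://localhost:5000" + "/api")
	if err != nil {
		fmt.Println("naive parse error:", err)
	} else {
		fmt.Printf("naive host: %q\n", naive.Host)
	}

	// The helper handles both spellings of the target.
	for _, target := range []string{"http://localhost:5000", "localhost:5000"} {
		u, err := buildTargetURL("http", target, "/api")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(u.String())
	}
}
```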

Aaron Parfitt · Answer 11 · Thu Mar 09 2023 18:07:24 GMT+0800 (China Standard Time)

@simonchapman1986 @nonsoike What do you think ?

Simon Chapman · Answer 12 · Thu Mar 09 2023 18:29:35 GMT+0800 (China Standard Time)

So we have the scheme defined on the service, but not on the target. I think a scheme on the target would be useful, as it could let us transpose schemes at a later date should that be desired. Also, from a security standpoint, it's probably good to be explicit. @hyperioxx @nonsoike

nonsoike · Answer 13 · Thu Mar 09 2023 18:44:02 GMT+0800 (China Standard Time)

@hyperioxx, it sounds good to me.
I like that the interface for the load balancing is well-defined.

In summary, my understanding is that we will support multiple load-balancing options using the following steps:

1. Support a set of load balancing options: round-robin, least connections, etc.
2. Give users the ability to select any of the supported load balancing options.
3. Use round-robin as a default.
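
The three steps above could be sketched roughly as follows. Everything here is an assumption for illustration: the `Policy` interface, the `NewPolicy` factory, and the `"least_connections"` config value are hypothetical names, not Frontman's actual API or config keys.

```go
package main

import (
	"fmt"
	"sync"
)

// Policy is a hypothetical interface. Done lets connection-aware
// policies learn that a request has finished.
type Policy interface {
	Pick(targets []string) string
	Done(target string)
}

type roundRobin struct {
	mu   sync.Mutex
	next int
}

func (r *roundRobin) Pick(targets []string) string {
	if len(targets) == 0 {
		return ""
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	t := targets[r.next%len(targets)]
	r.next = (r.next + 1) % len(targets)
	return t
}

func (r *roundRobin) Done(string) {} // round robin keeps no per-target state

type leastConn struct {
	mu     sync.Mutex
	active map[string]int // in-flight requests per target
}

// Pick returns the target with the fewest in-flight requests
// (the first one wins on ties) and counts the new request.
func (l *leastConn) Pick(targets []string) string {
	if len(targets) == 0 {
		return ""
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	best := targets[0]
	for _, t := range targets[1:] {
		if l.active[t] < l.active[best] {
			best = t
		}
	}
	l.active[best]++
	return best
}

func (l *leastConn) Done(target string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.active[target] > 0 {
		l.active[target]--
	}
}

// NewPolicy maps a config value to a policy. Unknown or empty names fall
// back to round robin here; returning an error instead is equally valid.
func NewPolicy(name string) Policy {
	switch name {
	case "least_connections":
		return &leastConn{active: map[string]int{}}
	default:
		return &roundRobin{}
	}
}

func main() {
	p := NewPolicy("least_connections")
	targets := []string{"a", "b"}
	first := p.Pick(targets)
	second := p.Pick(targets)
	p.Done(first) // the first request finishes, freeing its target
	third := p.Pick(targets)
	fmt.Println(first, second, third)
}
```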

Aaron Parfitt · Answer 14 · Thu Mar 09 2023 19:04:53 GMT+0800 (China Standard Time)

@amityahav rebase and let me know :)

Amit Yahav · Answer 15 · Fri Mar 10 2023 07:19:01 GMT+0800 (China Standard Time)

The problem still occurs. It fails in the findBackendService method because r.Host != s.Domain (s.Domain = localhost, while r.Host is whatever I put in my request, in my case 127.0.0.1:8081). I mainly struggle to understand the logic behind this function.
r.Host is the proxy's host, while we are looking for the target host. What is the purpose of domain in that case when using raw IP addresses?
In addition, I currently can't see how this supports multiple services: if two distinct services have the same path, the first one in the services slice will get the request.
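
As a side note on the mismatch described above, here is a sketch of a host comparison that ignores the port. `hostOnly` and `hostMatches` are hypothetical helpers, not Frontman's findBackendService. It assumes the request host is the value of the HTTP Host header, which is what Go exposes as r.Host on the server side.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// hostOnly strips an optional ":port" suffix from a Host header value.
// If there is no port, the input is returned unchanged.
func hostOnly(hostport string) string {
	host, _, err := net.SplitHostPort(hostport)
	if err != nil {
		return hostport
	}
	return host
}

// hostMatches compares the request host with a configured domain,
// ignoring the port and letter case.
func hostMatches(requestHost, domain string) bool {
	return strings.EqualFold(hostOnly(requestHost), domain)
}

func main() {
	// A request sent to http://localhost:8081/api/ matches domain "localhost".
	fmt.Println(hostMatches("localhost:8081", "localhost"))
	// A request sent to http://127.0.0.1:8081/api/ does not, even without
	// the port: the Host header holds the literal IP, not the name.
	fmt.Println(hostMatches("127.0.0.1:8081", "localhost"))
}
```

With host-based routing, the request would need to be sent to localhost:8081 (or carry a Host header naming the configured domain) for the lookup to succeed.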

I came across a nice blog post where they implement a reverse proxy for multiple hosts. In their case, they distinguish among hosts by using the service's name as the prefix of the request path sent to the proxy, i.e. /service_name/foo, and then extract the suffix as the path to the target host.
https://blog.charmes.net/post/reverse-proxy-go/

Let me know what you think :)

Aaron Parfitt · Answer 16 · Fri Mar 10 2023 08:07:43 GMT+0800 (China Standard Time)

The domain value in our current implementation is used for host-based routing, which is a common feature in load balancers and gateways.
Example:

client1.foo.com/myservice/ -> myserviceV0.1:8000
client2.foo.com/myservice/ -> myserviceV0.2:8000

However, this implementation has limitations and may not be optimal for our needs. Using the service name as a prefix in the incoming request can be a simple way to implement service name routing, but it has its own limitations, such as namespace conflicts and coupling of the reverse proxy to the service implementation.

Fortunately, there are other techniques we can use for routing, such as a trie-based approach. This allows us to efficiently match incoming requests to the appropriate backend service based on their path, without relying on the service name as a prefix. By decoupling the reverse proxy from the service implementation, we can make our system more flexible and easier to maintain in the long run in my opinion.

https://en.wikipedia.org/wiki/Trie

Let me know your thoughts dude !
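
A minimal sketch of the trie idea, assuming routes are matched per path segment and the longest registered prefix wins. The `Router` type and the service names are hypothetical, chosen only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// node is one path segment in the trie.
type node struct {
	children map[string]*node
	service  string // non-empty when a route ends at this node
}

// Router is a hypothetical segment-based path trie.
type Router struct{ root *node }

func NewRouter() *Router {
	return &Router{root: &node{children: map[string]*node{}}}
}

// segments splits "/a/b/" into ["a", "b"], dropping empty parts.
func segments(path string) []string {
	return strings.FieldsFunc(path, func(r rune) bool { return r == '/' })
}

// Add registers a service under a path prefix.
func (rt *Router) Add(path, service string) {
	n := rt.root
	for _, seg := range segments(path) {
		child, ok := n.children[seg]
		if !ok {
			child = &node{children: map[string]*node{}}
			n.children[seg] = child
		}
		n = child
	}
	n.service = service
}

// Lookup walks the trie and returns the service registered at the
// longest matching prefix of path, if any.
func (rt *Router) Lookup(path string) (string, bool) {
	n := rt.root
	best := n.service
	for _, seg := range segments(path) {
		child, ok := n.children[seg]
		if !ok {
			break
		}
		n = child
		if n.service != "" {
			best = n.service
		}
	}
	return best, best != ""
}

func main() {
	rt := NewRouter()
	rt.Add("/myservice", "Service1")
	rt.Add("/myservice2", "Service2")
	rt.Add("/myservice/admin", "Admin")

	for _, p := range []string{"/myservice/users/42", "/myservice2", "/myservice/admin/x", "/other"} {
		svc, ok := rt.Lookup(p)
		fmt.Println(p, svc, ok)
	}
}
```

Because matching is per segment, /myservice2 is not mistaken for a child of /myservice, which a plain string-prefix check would get wrong.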

Amit Yahav · Answer 17 · Fri Mar 10 2023 15:19:40 GMT+0800 (China Standard Time)

So in the example you wrote what would be the correct service configuration in order to make it work? And will it support multiple hosts as well?

Aaron Parfitt · Answer 18 · Fri Mar 10 2023 18:08:28 GMT+0800 (China Standard Time)

The way most services do it, you would create a new backend per domain. (I would also love to implement wildcard domains, e.g. *.foo.com.)

- name: Client1
  scheme: https
  upstreamTargets:
    - http://myserviceV0.1:8000
  path: /myservice
  domain: "client1.foo.com"
  healthCheck: http://example.com/health
  timeout: 10ns
  maxIdleConns: 100
  maxIdleTime: 60ns
  stripPath: true
- name: Client2
  scheme: https
  upstreamTargets:
    - http://myserviceV0.2:8000
  path:  /myservice
  domain: "client2.foo.com"
  healthCheck: http://example.com/health
  timeout: 10ns
  maxIdleConns: 100
  maxIdleTime: 60ns
  stripPath: true

Simon Chapman · Answer 19 · Fri Mar 10 2023 20:25:40 GMT+0800 (China Standard Time)

Agreed @hyperioxx, domain/namespace-based routing like this, where Client1 is attached to n backend targets, is often required. I would also like to point out the obvious: while we are confirming this as a feature, the other, more common pattern is the reverse, where several services share one domain and each owns its backend:

- name: Service1
  scheme: https
  upstreamTargets:
    - http://myserviceV0.1:8000
  path: /myservice
  domain: "api.foo.com"
  healthCheck: http://example.com/health
  timeout: 10ns
  maxIdleConns: 100
  maxIdleTime: 60ns
  stripPath: true
- name: Service2
  scheme: https
  upstreamTargets:
    - http://myservice2V0.2:8000
  path:  /myservice2
  domain: "api.foo.com"
  healthCheck: http://example.com/health
  timeout: 10ns
  maxIdleConns: 100
  maxIdleTime: 60ns
  stripPath: true

I do wonder, however, whether it might be cleaner to structure the YAML so that the domain is the uppermost parent, effectively a list per domain (just an idea). This would also help with things like wildcard domains, and make path-based routing easier to work out when looking at a trie. Something like:

- domain: "api.foo.com"
  name: MySaaSAPI
  scheme: https
  backends:
    - name: Service1
      upstreamTargets:
        - http://myservice-v0.1:8000
      path: /myservice
      healthCheck: http://example.com/health
      timeout: 10ns
      maxIdleConns: 100
      maxIdleTime: 60ns
      stripPath: true
    - name: Service2
      upstreamTargets:
        - http://myservice2-v0.1:8000
      path: /myservice2
      healthCheck: http://example.com/health
      timeout: 10ns
      maxIdleConns: 100
      maxIdleTime: 60ns
      stripPath: true
- domain: "bar.foo.com"
  name: OtherAPI
  scheme: https
  backends:
    - name: Service3
      upstreamTargets:
        - http://myservice3-v0.1:8000
      path: /myservice
      healthCheck: http://example.com/health
      timeout: 10ns
      maxIdleConns: 100
      maxIdleTime: 60ns
      stripPath: true
- domain: "*.foo.com"
  name: Global
  scheme: https
  backends:
    - name: Contact
      upstreamTargets:
        - https://contactus-v0.1:8000
      path: /contact-us
      healthCheck: http://example.com/health
      timeout: 10ns
      maxIdleConns: 100
      maxIdleTime: 60ns
      stripPath: true

this would give us:

api.foo.com/myservice
api.foo.com/myservice2
bar.foo.com/myservice
api.foo.com/contact-us
bar.foo.com/contact-us
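
The routes listed above can be reproduced with a small sketch of wildcard domain matching. `matchDomain`, `route`, and `resolve` are hypothetical names, and the semantics are an assumption: "*.foo.com" matches any subdomain of foo.com but not foo.com itself, and routes are checked in order with exact domains listed first.

```go
package main

import (
	"fmt"
	"strings"
)

// matchDomain reports whether host matches pattern. A pattern of the form
// "*.foo.com" matches any subdomain of foo.com, but not foo.com itself.
func matchDomain(pattern, host string) bool {
	pattern = strings.ToLower(pattern)
	host = strings.ToLower(host)
	if strings.HasPrefix(pattern, "*.") {
		suffix := pattern[1:] // ".foo.com"
		return len(host) > len(suffix) && strings.HasSuffix(host, suffix)
	}
	return pattern == host
}

// route ties a domain pattern and path prefix to a backend name.
type route struct{ domain, path, backend string }

// resolve returns the first route whose domain and path prefix match.
func resolve(routes []route, host, path string) (string, bool) {
	for _, r := range routes {
		pathOK := path == r.path || strings.HasPrefix(path, r.path+"/")
		if matchDomain(r.domain, host) && pathOK {
			return r.backend, true
		}
	}
	return "", false
}

func main() {
	// Mirrors the domain-first config: exact domains first, wildcard last.
	routes := []route{
		{"api.foo.com", "/myservice", "Service1"},
		{"api.foo.com", "/myservice2", "Service2"},
		{"bar.foo.com", "/myservice", "Service3"},
		{"*.foo.com", "/contact-us", "Contact"},
	}
	for _, req := range [][2]string{
		{"api.foo.com", "/myservice"},
		{"api.foo.com", "/myservice2"},
		{"bar.foo.com", "/myservice"},
		{"api.foo.com", "/contact-us"},
		{"bar.foo.com", "/contact-us"},
	} {
		backend, ok := resolve(routes, req[0], req[1])
		fmt.Println(req[0]+req[1], backend, ok)
	}
}
```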

Thinking beyond just APIs here, we have to consider frontend web apps such as SSR React/Vue apps. Being able to use constructive techniques to simplify this and keep it clean, while also making it easy to see and understand, would be quite impressive. Just an idea though... happy to be barraged with abuse 🤣

thoughts? :)

Amit Yahav · Answer 20 · Sat Mar 11 2023 04:21:40 GMT+0800 (China Standard Time)

I have no concrete opinion regarding what you wrote since I'm new to proxies, but it seems cleaner that way.

Btw, since I'm a newbie, is this correct according to your example?

api.foo.com/myservice will be routed to myservice-v0.1:8000

Thanks @simonchapman1986

Aaron Parfitt · Answer 21 · Tue Mar 14 2023 17:11:51 GMT+0800 (China Standard Time)

Awesome work @amityahav, thank you!

Amit Yahav · Answer 22 · Tue Mar 14 2023 17:22:53 GMT+0800 (China Standard Time)

Thanks, I wouldn't close the issue just yet, as I want to implement other algorithms as well.