Broker field `spec.delivery.retry` is ignored
norbjd opened this issue
Describe the bug
When configuring a broker with fields under spec.delivery (retry, backoffPolicy, backoffDelay, ...), those fields seem to be ignored. There are no retries if the Trigger object fails to deliver the message to the subscriber. In my case, the subscriber is a ksvc, and there are no retries if that ksvc returns an error.
Expected behavior
When configuring a broker with fields under spec.delivery, the message should be redelivered (multiple retries) if the receiver (ksvc) returns an error.
To Reproduce
- Have a fresh Kubernetes cluster with Knative Serving and Eventing installed via YAML files (version 1.4.0).
- Install components to work with NATS (natsjsm.yaml, channel messaging layer eventing-natss.yaml, broker layer mt-channel-broker.yaml):
# https://github.com/knative-sandbox/eventing-natss/blob/release-1.4/config/broker/README.md
kubectl apply -f https://raw.githubusercontent.com/knative-sandbox/eventing-natss/knative-v1.4.0/config/broker/natsjsm.yaml
# https://knative.dev/docs/install/yaml-install/eventing/install-eventing-with-yaml/#optional-install-a-default-channel-messaging-layer
kubectl apply -f https://github.com/knative-sandbox/eventing-natss/releases/download/knative-v1.4.0/eventing-natss.yaml
# https://knative.dev/docs/install/yaml-install/eventing/install-eventing-with-yaml/#optional-install-a-broker-layer
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.4.0/mt-channel-broker.yaml
- Use NatsJetStreamChannel as default channel:
kubectl patch configmap/config-br-default-channel \
  --namespace knative-eventing \
  --patch '{"data":{"channel-template-spec": "apiVersion: messaging.knative.dev/v1alpha1\nkind: NatsJetStreamChannel"}}'
- Deploy a simple ksvc (kn service create error-service --image=myimage --port=8080) that always returns a 503 error, to test delivery failure handling. The code is the following:
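package main

import (
	"log"
	"net/http"
)

// handler logs the request and always fails with 503 Service Unavailable.
func handler(w http.ResponseWriter, req *http.Request) {
	log.Println("It does not work, returning 503")
	http.Error(w, "Does not work", http.StatusServiceUnavailable)
}

func main() {
	http.HandleFunc("/", handler)
	// ListenAndServe only returns on error, so log it and exit.
	log.Fatal(http.ListenAndServe(":8080", nil))
}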
- Create a Broker, a Trigger pointing to that ksvc, and a PingSource to send an event every minute:
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: default
  namespace: default
  annotations:
    eventing.knative.dev/broker.class: MTChannelBasedBroker
spec:
  config:
    apiVersion: v1
    kind: ConfigMap
    name: config-br-default-channel # NatsJetStreamChannel defined in the ConfigMap
    namespace: knative-eventing
  delivery:
    retry: 5
    backoffPolicy: exponential
    backoffDelay: "PT1S"
---
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: my-service-trigger
  namespace: default
spec:
  broker: default
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: error-service
---
apiVersion: sources.knative.dev/v1
kind: PingSource
metadata:
  name: ping
  namespace: default
spec:
  schedule: "* * * * *"
  contentType: "application/json"
  data: '{"msg": "ping"}'
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: default
- Wait for the logs of error-service:
2022/05/29 14:22:00 It does not work, returning 503
2022/05/29 14:23:00 It does not work, returning 503
2022/05/29 14:24:00 It does not work, returning 503
As you can see, in the logs of the error-service pod, there is no sign that the message is sent again after an error: there is one log line per minute, not 1 + 5 retries as defined in the Broker spec.delivery.retry field.
Knative release version
v1.4.0
Additional context
I have also tried with NatssChannel (https://github.com/knative-sandbox/eventing-natss/blob/release-1.4/config/broker/README.md#1-nats-streaming-deprecation-notice-) and got the same result.
The docs (https://github.com/knative-sandbox/eventing-natss/blob/knative-v1.4.0/config/README.md#nats-streaming-channels) state:
If downstream rejects an event, that request is attempted again.
And the Knative docs state that NATS Channels do not support any of the delivery fields (https://knative.dev/docs/eventing/event-delivery/#channel-support).
So I don't know if this is the normal behavior or if I'm doing something wrong or misunderstanding something.
But what I would like to do is retry the PingSource event if the ksvc returns an error.
Thanks for your help 🙌
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.
/remove-lifecycle stale
MTChannelBasedBroker retries are delegated to the Channel implementation, so it depends on the implementation.
I see #376 is merged, maybe @astelmashenko can clarify where things are on the retry front for the Nats Channel ?
maybe @astelmashenko can clarify where things are on the retry front for the Nats Channel ?
Retries should be working, do you mean I need to test if it's working with retries configured?
@astelmashenko yes, to check whether this issue still reproduces with the newer NATS channel versions
@pierDipi, yes, it is working properly now on version 1.3.5. I see that this issue is for 1.4, for which there are no fixes.
Thanks!
so, are you planning to port the fix to any version 1.4+?
I can create an MR for 1.4/1.5 from #376 as a patch. I'm not sure I'll have time to test the 1.4/1.5 versions locally.