(feat): Add chaos experiments to test the functionalities of coreDNS components

Question

imrajdas opened this issue 5 years ago · comments

Raj Das commented 5 years ago

This is a request for chaos experiments targeting the coreDNS components. The failure cases to cover are:

Pod Delete
- Chaos injection can be done by deleting the coreDNS pod in the kube-system namespace using the litmus or powerfulseal lib
Spike of HTTP requests
- Chaos injection:
  - Create an external pod, e.g. nginx
  - Create a service by exposing the nginx pod
  - Send HTTP requests to the nginx service in a loop until the chaos duration ends
ndots problem
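- With the Kubernetes default of ndots:5, a name that is not fully qualified and has fewer dots than the ndots threshold is first tried against every search domain, which multiplies the DNS queries that reach coreDNS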
- For more info: https://pracucci.com/kubernetes-dns-resolution-ndots-options-and-why-it-may-affect-application-performances.html
Caching Failure
- This can be done easily by scaling the coreDNS deployment to 0 replicas and then scaling it back to the desired number:
  kubectl scale deployment.apps/coredns -n kube-system --replicas=0
  kubectl scale deployment.apps/coredns -n kube-system --replicas=2
Delay in the HTTP requests
- Chaos injection can be done using the Pumba network chaos lib
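
To make the ndots case above concrete, here is a minimal Python sketch of how a glibc-style stub resolver expands a name using the search list. The function name `candidate_queries` and the sample search domains are illustrative assumptions, not part of litmus or coreDNS, and other resolvers (musl, for example) may order or skip lookups differently.

```python
def candidate_queries(name, search_domains, ndots=5):
    """Return the names a glibc-style resolver would try, in order.

    Hypothetical helper for illustration only; it does not send any DNS query.
    """
    if name.endswith("."):
        # A trailing dot marks a fully qualified name: no search-list expansion.
        return [name]
    expanded = [f"{name}.{domain}." for domain in search_domains]
    absolute = name + "."
    if name.count(".") >= ndots:
        # Enough dots: the name is tried as-is first, then with each search domain.
        return [absolute] + expanded
    # Fewer dots than ndots: every search domain is tried before the name itself.
    return expanded + [absolute]


# Assumed search list for a pod in the "default" namespace.
search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
for query in candidate_queries("example.com", search):
    print(query)
```

With these assumed values, a lookup of `example.com` produces three search-domain attempts before the real name is tried, which is the extra load an ndots experiment would aim to exercise.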

Common Checks:

Deploy an external application and a service for it, then check the liveness of the application by calling the service while chaos is in progress.
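
The check described above could be sketched in Python roughly as follows. This is only an illustration under assumptions: `http_ok` and `run_liveness_check` are hypothetical names, not litmus APIs, and the probe treats any 2xx response as "alive". Because the service is addressed by name, a DNS resolution failure shows up as a failed probe.

```python
import http.client
import time
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True if a GET to url answers with a 2xx status within timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # URLError (which includes DNS resolution failures) and socket
        # timeouts are OSError subclasses; both count as "not alive".
        return False


def run_liveness_check(url, duration_s, interval_s=1.0, probe=http_ok,
                       clock=time.monotonic, sleep=time.sleep):
    """Probe url every interval_s seconds for duration_s seconds.

    Returns a (succeeded, failed) tuple of probe counts.
    """
    deadline = clock() + duration_s
    succeeded = failed = 0
    while clock() < deadline:
        if probe(url):
            succeeded += 1
        else:
            failed += 1
        sleep(interval_s)
    return succeeded, failed
```

A run during chaos might look like `run_liveness_check("http://nginx.default.svc.cluster.local", duration_s=60)`, where the service URL is an assumed example; a non-zero failed count would indicate that name resolution or the application was disrupted.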

Raj Das · Answer 1 · Wed Dec 18 2019 15:12:47 GMT+0800 (China Standard Time)

Issue added to the 1.0 project board

Karthik Satchitanand · Answer 2 · Wed Dec 18 2019 17:23:06 GMT+0800 (China Standard Time)

@rajdas98 -- a one-liner against each of the chaos experiments would be very helpful!

Karthik Satchitanand · Answer 3 · Sat Jan 11 2020 18:39:21 GMT+0800 (China Standard Time)

This issue is a blanket feature request for different coreDNS scenarios/experiments, which will span multiple milestones. The first step towards this is the coredns-pod-delete scenario, available here: https://github.com/litmuschaos/litmus/tree/master/experiments/coredns/pod_delete

Karthik Satchitanand · Answer 4 · Mon Mar 16 2020 16:36:31 GMT+0800 (China Standard Time)

@rajdas98 which experiments are we planning to consider next?

Raj Das · Answer 5 · Wed Mar 25 2020 01:57:07 GMT+0800 (China Standard Time)

@ksatchit We can pick the caching failure experiment.

Raj Das · Answer 6 · Wed Mar 25 2020 02:01:20 GMT+0800 (China Standard Time)

The coreDNS pod delete experiment was added by PR #1029

Karthik Satchitanand · Answer 7 · Tue Sep 27 2022 12:06:19 GMT+0800 (China Standard Time)

Litmus has enhanced its fault suite and the probe types it supports, which lets users validate app-native constraints and health checks for any application workload, largely removing the need for separate workload-specific experiments.