Behavior on overload
Infrasonics opened this issue
I came across a situation where YAFS reports a clean run, yet reality says otherwise.

This is based on the EEG-Tractor Beam example, modified only to fit the available testbed; the link parameters are as measured on that testbed. The model appears not to handle concurrent messages correctly when the arrival interval is shorter than the processing time.

The output (of the example below) shows a constant service time of 1.5 seconds, yet in reality the service time becomes unbounded because the server is overloaded with requests. The overload is already obvious on closer inspection: every message takes three billion instructions to process, yet the server/cloud computes only two billion instructions per second. So 1.5 s may be correct for the first message, but the client sends a new message every second. Even a single client therefore effectively DDoSes the server, and the results do not indicate that behaviour in any way.
A reimplementation of the shown scenario showed exactly that behavior.
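A back-of-the-envelope check of that arithmetic (values copied from the scenario below; plain Python, not YAFS code):

```python
# Values from the scenario: 3e9 instructions per message, 2e9 instructions/s.
instructions_per_msg = 3 * 10**9
ipt = 2.0 * 10**9          # instructions per second on the cloud node

service_time = instructions_per_msg / ipt   # 1.5 s to process one message
interarrival = 1.0                          # the client emits one message per second

# Utilization > 1 means the queue (and hence the response time) grows without bound.
rho = service_time / interarrival
print(service_time, rho)                    # 1.5 1.5

# Backlog after n arrivals: each message adds 0.5 s of unprocessed work.
n = 60
backlog = n * (service_time - interarrival)
print(backlog)                              # 30.0 s of queued work after one minute
```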
I used the following scenario file (intentionally not reduced to an MWE, to avoid incorrect use of the YAFS framework):
```python
import random
from yafs.core import Sim
from yafs.application import Application, Message
from yafs.population import *
from yafs.topology import Topology
from simpleSelection import MinimunPath
from simplePlacement import CloudPlacement
from yafs.stats import Stats
from yafs.distribution import deterministicDistribution
import time
import numpy as np
from sys import argv
from os import path

RANDOM_SEED = 1
RESULT_PATH = "Results_" + path.splitext(path.basename(argv[0]))[0]


def create_application():
    a = Application(name="SimpleCase")
    a.set_modules(
        [
            {"Sensor": {"Type": Application.TYPE_SOURCE}},
            {"ServiceA": {"RAM": 4000000000, "Type": Application.TYPE_MODULE}},
        ]
    )
    m_client = Message(
        "M.client", "Sensor", "ServiceA", instructions=3 * 10**9, bytes=1000
    )
    a.add_source_messages(m_client)
    a.add_service_module("ServiceA", m_client)
    return a


def create_json_topology():
    topology_json = {}
    topology_json["entity"] = []
    topology_json["link"] = []
    cloud_dev = {
        "id": 0,
        "model": "cloud",
        "mytag": "cloud",
        "IPT": 2.0 * 10**9,
        "RAM": 4 * 10**9,
        "COST": 3,
        "WATT": 20.0,
    }
    sensor_dev = {
        "id": 1,
        "model": "sensor-device",
        "IPT": 2.0 * 10**9,
        "RAM": 4 * 10**9,
        "COST": 3,
        "WATT": 20.0,
    }
    link1 = {"s": 0, "d": 1, "BW": 100 * 10**6, "PR": 0.000155}
    topology_json["entity"].append(cloud_dev)
    topology_json["entity"].append(sensor_dev)
    topology_json["link"].append(link1)
    return topology_json


# @profile
def main(simulated_time):
    random.seed(RANDOM_SEED)
    np.random.seed(RANDOM_SEED)
    t = Topology()
    t_json = create_json_topology()
    t.load(t_json)
    t.write("network.gexf")
    app = create_application()
    placement = CloudPlacement(
        "onCloud"
    )  # it defines the deployed rules: module-device
    placement.scaleService({"ServiceA": 1})
    pop = Statical("Statical")
    pop.set_sink_control(
        {
            "model": "actuator-device",
            "number": 1,
            "module": app.get_sink_modules(),
        }
    )
    dDistribution = deterministicDistribution(name="Deterministic", time=1)
    pop.set_src_control(
        {
            "model": "sensor-device",
            "number": 1,
            "message": app.get_message("M.client"),
            "distribution": dDistribution,
        }
    )
    selectorPath = MinimunPath()

    """ SIMULATION ENGINE """
    stop_time = simulated_time
    s = Sim(t, default_results_path=RESULT_PATH)
    s.deploy_app(app, placement, pop, selectorPath)
    s.run(stop_time, show_progress_monitor=False)


if __name__ == "__main__":
    import logging.config
    import os

    logging.config.fileConfig(os.getcwd() + "/logging.ini")
    start_time = time.time()
    main(simulated_time=60)
    print("\n--- %s seconds ---" % (time.time() - start_time))
    m = Stats(defaultPath=RESULT_PATH)  # same name as the results file
    time_loops = [["M.client"]]
    m.showResults2(1000, time_loops=time_loops)
```
Thank you for the comment.

I think the behaviour shown in the results is right. The implementation models an M/M/1 system: the queue length grows because the arrival rate (one message per second) is higher than the service rate (one message per 1.5 seconds), and the response time grows with it. The service time itself remains constant, and the network link is not overstressed either.

We can observe it in the results_.csv file (I have removed some columns for readability):
id | TOPO.src | TOPO.dst | service | time_in | time_out | time_emit | time_reception |
---|---|---|---|---|---|---|---|
1 | 1 | 0 | 1.5 | 1.00015500001 | 2.50015500001 | 1.0 | 1.00015500001 |
2 | 1 | 0 | 1.5 | 2.50015500001 | 4.00015500001 | 2.0 | 2.00015500001 |
3 | 1 | 0 | 1.5 | 4.00015500001 | 5.50015500001 | 3.0 | 3.00015500001 |
4 | 1 | 0 | 1.5 | 5.50015500001 | 7.00015500001 | 4.0 | 4.00015500001 |
5 | 1 | 0 | 1.5 | 7.00015500001 | 8.50015500001 | 5.0 | 5.00015500001 |
6 | 1 | 0 | 1.5 | 8.50015500001 | 10.00015500001 | 6.0 | 6.00015500001 |
7 | 1 | 0 | 1.5 | 10.00015500001 | 11.50015500001 | 7.0 | 7.00015500001 |
8 | 1 | 0 | 1.5 | 11.50015500001 | 13.00015500001 | 8.0 | 8.00015500001 |
9 | 1 | 0 | 1.5 | 13.00015500001 | 14.50015500001 | 9.0 | 9.00015500001 |
10 | 1 | 0 | 1.5 | 14.50015500001 | 16.00015500001 | 10.0 | 10.00015500001 |
11 | 1 | 0 | 1.5 | 16.00015500001 | 17.50015500001 | 11.0 | 11.00015500001 |
12 | 1 | 0 | 1.5 | 17.50015500001 | 19.00015500001 | 12.0 | 12.00015500001 |
13 | 1 | 0 | 1.5 | 19.00015500001 | 20.50015500001 | 13.0 | 13.00015500001 |
14 | 1 | 0 | 1.5 | 20.50015500001 | 22.00015500001 | 14.0 | 14.00015500001 |
15 | 1 | 0 | 1.5 | 22.00015500001 | 23.50015500001 | 15.0 | 15.00015500001 |
16 | 1 | 0 | 1.5 | 23.50015500001 | 25.00015500001 | 16.0 | 16.00015500001 |
17 | 1 | 0 | 1.5 | 25.00015500001 | 26.50015500001 | 17.0 | 17.00015500001 |
18 | 1 | 0 | 1.5 | 26.50015500001 | 28.00015500001 | 18.0 | 18.00015500001 |
19 | 1 | 0 | 1.5 | 28.00015500001 | 29.50015500001 | 19.0 | 19.00015500001 |
20 | 1 | 0 | 1.5 | 29.50015500001 | 31.00015500001 | 20.0 | 20.00015500001 |
21 | 1 | 0 | 1.5 | 31.00015500001 | 32.50015500001 | 21.0 | 21.00015500001 |
22 | 1 | 0 | 1.5 | 32.50015500001 | 34.00015500001 | 22.0 | 22.00015500001 |
23 | 1 | 0 | 1.5 | 34.00015500001 | 35.50015500001 | 23.0 | 23.00015500001 |
24 | 1 | 0 | 1.5 | 35.50015500001 | 37.00015500001 | 24.0 | 24.00015500001 |
25 | 1 | 0 | 1.5 | 37.00015500001 | 38.50015500001 | 25.0 | 25.00015500001 |
… | | | | | | | |
40 | 1 | 0 | 1.5 | 59.50015500001 | 61.00015500001 | 40.0 | 40.00015500001 |
For example:

- at the 25th row/request:
  - service time = 38.50015500001 - 37.00015500001 = 1.5
  - waiting time = 37.00015500001 - 25.00015500001 = 12.0
  - response time = 38.50015500001 - 25.0 = 13.50015500001
- at the 40th row:
  - service time = 61.00015500001 - 59.50015500001 = 1.5
  - waiting time = 59.50015500001 - 40.00015500001 = 19.5
  - response time = 61.00015500001 - 40.0 = 21.00015500001
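As a sanity check, the table above can be reproduced outside YAFS with a simple single-server queue recurrence (a sketch assuming one message emitted per second, a 0.000155 s link latency, and a 1.5 s service time; this is not YAFS internals):

```python
LATENCY = 0.000155   # link propagation delay (PR)
SERVICE = 1.5        # 3e9 instructions / 2e9 instructions per second

def schedule(n_msgs):
    """time_in = max(arrival at server, previous departure); time_out = time_in + SERVICE."""
    rows = []
    time_out_prev = 0.0
    for n in range(1, n_msgs + 1):
        time_emit = float(n)                  # client emits at t = 1, 2, 3, ...
        time_reception = time_emit + LATENCY  # arrival at the server
        time_in = max(time_reception, time_out_prev)
        time_out = time_in + SERVICE
        rows.append({"id": n, "time_in": time_in, "time_out": time_out,
                     "time_emit": time_emit, "time_reception": time_reception})
        time_out_prev = time_out
    return rows

rows = schedule(40)
r25, r40 = rows[24], rows[39]
print(r25["time_in"], r25["time_out"])      # ~37.000155  ~38.500155, matching row 25
print(r40["time_out"] - r40["time_emit"])   # response time ~21.000155 and still growing
```

Each departure time grows by 1.5 s while arrivals come 1.0 s apart, so the waiting time increases by 0.5 s per request, exactly as in the CSV.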
I hope this clarifies the interpretation of the results recorded for each request, and of the proposed model (an M/M/1).

In any case, I will leave the thread open in case I misunderstood your question.

Best

- Note: the response time includes the network latency, but the results are still coherent with this interpretation.