acsicuib / YAFS

Yet Another Fog Simulator (YAFS)

Behavior on overload

Infrasonics opened this issue

I came across a situation where YAFS indicates a proper run, yet reality says otherwise.

This is based on the EEG-Tractor Beam example and was only modified to fit the available testbed. The links are given as measured on the testbed. The model appears not to handle concurrent messages correctly when the arrival interval is shorter than the processing time.

The output (of the example below) shows a constant service time of 1.5 seconds, yet in reality the time to serve each request grows without bound because the server gets overloaded with requests. The overload is already obvious on closer inspection: every message takes three billion instructions to process, but the server/cloud only computes two billion instructions per second. So 1.5 seconds may be correct for the first message, but the client sends a new message every second, which produces a DDoS-like scenario even with a single client; the results do not indicate that behavior in any way.

A reimplementation of the shown scenario showed exactly that behavior.
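
To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch, independent of YAFS, that replays the same numbers (one request per second, 3e9 instructions per request, 2e9 instructions per second of capacity) through a single FIFO server; the constant names and the printed request indices are illustrative only:

 # Single FIFO server fed by deterministic arrivals, ignoring network latency.
 INTER_ARRIVAL = 1.0        # the client emits a new request every second
 SERVICE = 3e9 / 2e9        # 3e9 instructions at 2e9 instructions/s -> 1.5 s

 server_free_at = 0.0
 for n in range(1, 61):
     emitted = n * INTER_ARRIVAL             # request n leaves the client
     start = max(emitted, server_free_at)    # it may have to wait in the queue
     server_free_at = start + SERVICE        # server is busy until then
     if n in (1, 10, 60):
         print(f"request {n}: waits {start - emitted:.1f} s, "
               f"finishes {server_free_at - emitted:.1f} s after emission")

The backlog grows by 0.5 s per request, so the per-request delay increases without bound, which is exactly the overload observed on the testbed.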

I used the following scenario file -- intentionally not reduced to an MWE, to avoid using the YAFS framework incorrectly:

 import random
 
 from yafs.core import Sim
 from yafs.application import Application, Message
 
 from yafs.population import *
 from yafs.topology import Topology
 
 from simpleSelection import MinimunPath
 from simplePlacement import CloudPlacement
 from yafs.stats import Stats
 from yafs.distribution import deterministicDistribution
 import time
 import numpy as np
 
 from sys import argv
 from os import path
 
 RANDOM_SEED = 1
 
 RESULT_PATH = "Results_" + path.splitext(path.basename(argv[0]))[0]
 
 
 def create_application():
     a = Application(name="SimpleCase")
 
     a.set_modules(
         [
             {"Sensor": {"Type": Application.TYPE_SOURCE}},
             {"ServiceA": {"RAM": 4000000000, "Type": Application.TYPE_MODULE}},
         ]
     )
     # Each request carries 3e9 instructions; with the cloud's IPT of 2e9
     # instructions per time unit, processing one message takes 1.5 time units.
     m_client = Message(
         "M.client", "Sensor", "ServiceA", instructions=3 * 10**9, bytes=1000
     )
 
     a.add_source_messages(m_client)
     a.add_service_module("ServiceA", m_client)
 
     return a
 
 
 def create_json_topology():
     topology_json = {}
     topology_json["entity"] = []
     topology_json["link"] = []
 
     cloud_dev = {
         "id": 0,
         "model": "cloud",
         "mytag": "cloud",
         "IPT": 2.0 * 10**9,
         "RAM": 4 * 10**9,
         "COST": 3,
         "WATT": 20.0,
     }
     sensor_dev = {
         "id": 1,
         "model": "sensor-device",
         "IPT": 2.0 * 10**9,
         "RAM": 4 * 10**9,
         "COST": 3,
         "WATT": 20.0,
     }
 
     link1 = {"s": 0, "d": 1, "BW": 100 * 10**6, "PR": 0.000155}
 
     topology_json["entity"].append(cloud_dev)
     topology_json["entity"].append(sensor_dev)
     topology_json["link"].append(link1)
 
     return topology_json
 
 
 # @profile
 def main(simulated_time):
 
     random.seed(RANDOM_SEED)
     np.random.seed(RANDOM_SEED)
 
     t = Topology()
     t_json = create_json_topology()
     t.load(t_json)
     t.write("network.gexf")
 
     app = create_application()
 
     placement = CloudPlacement(
         "onCloud"
     )  # it defines the deployed rules: module-device
     placement.scaleService({"ServiceA": 1})
 
     pop = Statical("Statical")
     pop.set_sink_control(
         {
             "model": "actuator-device",
             "number": 1,
             "module": app.get_sink_modules(),
         }
     )
 
     # The source emits one M.client message per time unit (deterministic period = 1).
     dDistribution = deterministicDistribution(name="Deterministic", time=1)
     pop.set_src_control(
         {
             "model": "sensor-device",
             "number": 1,
             "message": app.get_message("M.client"),
             "distribution": dDistribution,
         }
     )
 
     selectorPath = MinimunPath()
 
     """ SIMULATION ENGINE """
     stop_time = simulated_time
     s = Sim(t, default_results_path=RESULT_PATH)
     s.deploy_app(app, placement, pop, selectorPath)
     s.run(stop_time, show_progress_monitor=False)
 
 
 if __name__ == "__main__":
     import logging.config
     import os
 
     logging.config.fileConfig(os.getcwd() + "/logging.ini")
 
     start_time = time.time()
     main(simulated_time=60)
 
     print("\n--- %s seconds ---" % (time.time() - start_time))
 
     m = Stats(defaultPath=RESULT_PATH)  # Same name of the results
     time_loops = [["M.client"]]
     m.showResults2(1000, time_loops=time_loops)

Thank you for the comment.
I think the behaviour shown in the results is right. The implementation models an M/M/1 system: the length of the service queue grows because the arrival rate is higher than the service rate, and as a consequence the response time grows as well. The service time itself remains constant, and the network link is not overstressed either.
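
As a quick check with the deterministic parameters of this scenario (one arrival per time unit, 1.5 time units of service per request), the waiting and response times of the n-th request in a FIFO queue can be reproduced directly; a small sketch, where the indices 25 and 40 are simply the rows examined further below:

 # The backlog grows by (service - inter-arrival) = 0.5 time units per request.
 SERVICE, INTER_ARRIVAL = 1.5, 1.0
 for n in (25, 40):
     waiting = (n - 1) * (SERVICE - INTER_ARRIVAL)   # time spent in the queue
     response = waiting + SERVICE                    # plus ~0.000155 of link latency
     print(n, waiting, response)                     # 25 -> 12.0 13.5, 40 -> 19.5 21.0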

We can observe it in the results_.csv file (I have removed some columns for readability):

id TOPO.src TOPO.dst service time_in time_out time_emit time_reception
1 1 0 1.5 1.00015500001 2.50015500001 1.0 1.00015500001
2 1 0 1.5 2.50015500001 4.00015500001 2.0 2.00015500001
3 1 0 1.5 4.00015500001 5.50015500001 3.0 3.00015500001
4 1 0 1.5 5.50015500001 7.00015500001 4.0 4.00015500001
5 1 0 1.5 7.00015500001 8.50015500001 5.0 5.00015500001
6 1 0 1.5 8.50015500001 10.00015500001 6.0 6.00015500001
7 1 0 1.5 10.00015500001 11.50015500001 7.0 7.00015500001
8 1 0 1.5 11.50015500001 13.00015500001 8.0 8.00015500001
9 1 0 1.5 13.00015500001 14.50015500001 9.0 9.00015500001
10 1 0 1.5 14.50015500001 16.00015500001 10.0 10.00015500001
11 1 0 1.5 16.00015500001 17.50015500001 11.0 11.00015500001
12 1 0 1.5 17.50015500001 19.00015500001 12.0 12.00015500001
13 1 0 1.5 19.00015500001 20.50015500001 13.0 13.00015500001
14 1 0 1.5 20.50015500001 22.00015500001 14.0 14.00015500001
15 1 0 1.5 22.00015500001 23.50015500001 15.0 15.00015500001
16 1 0 1.5 23.50015500001 25.00015500001 16.0 16.00015500001
17 1 0 1.5 25.00015500001 26.50015500001 17.0 17.00015500001
18 1 0 1.5 26.50015500001 28.00015500001 18.0 18.00015500001
19 1 0 1.5 28.00015500001 29.50015500001 19.0 19.00015500001
20 1 0 1.5 29.50015500001 31.00015500001 20.0 20.00015500001
21 1 0 1.5 31.00015500001 32.50015500001 21.0 21.00015500001
22 1 0 1.5 32.50015500001 34.00015500001 22.0 22.00015500001
23 1 0 1.5 34.00015500001 35.50015500001 23.0 23.00015500001
24 1 0 1.5 35.50015500001 37.00015500001 24.0 24.00015500001
25 1 0 1.5 37.00015500001 38.50015500001 25.0 25.00015500001
...
40 1 0 1.5 59.50015500001 61.00015500001 40.0 40.00015500001

For example (the sketch after this list computes the same figures directly from the CSV):

  • at the 25th row/request:
    the service time = 38.50015500001 - 37.00015500001 = 1.5
    the waiting time = 37.00015500001 - 25.00015500001 = 12.0
    the response time = 38.50015500001 - 25.0 = 13.50015500001

  • at the 40th row:
    the service time = 61.00015500001 - 59.50015500001 = 1.5
    the waiting time = 59.50015500001 - 40.00015500001 = 19.5
    the response time = 61.00015500001 - 40.0 = 21.00015500001
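
The same per-request figures can be computed directly from the trace; a minimal sketch using pandas, assuming the trace is the CSV written by the scenario above as RESULT_PATH + ".csv" (the file name below is illustrative, so adjust it to the actual output):

 import pandas as pd

 # Illustrative path: the scenario writes its trace as RESULT_PATH + ".csv".
 df = pd.read_csv("Results_overload_scenario.csv")

 # Column names as shown in the table above.
 df["service"] = df["time_out"] - df["time_in"]          # constant 1.5
 df["waiting"] = df["time_in"] - df["time_reception"]    # grows by 0.5 per request
 df["response"] = df["time_out"] - df["time_emit"]       # waiting + service + latency

 print(df[["id", "service", "waiting", "response"]].tail())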

I hope this clarifies how the results recorded for each request should be interpreted under the proposed model (an M/M/1 queue).
In any case, I will leave the thread open in case I have misunderstood your question.
Best

  • Note: the network latency is included in the response time, but the results are still broadly consistent with this interpretation.