ACP is a bifactor approximation algorithm for the heterogeneous cloudlet placement problem. It guarantees a bounded latency and placement cost while fully mapping user applications to appropriate cloudlets.
We aim to efficiently place cloudlets at specific locations in a region to serve the demands of all the end devices (IoT) that require edge services. We model the region as a two-dimensional space (grid) in which cloudlets and devices can exist. The devices can be at any point in the space. Cloudlets, on the other hand, can be placed only at a given set of candidate points within the grid, from which the devices can be best served.
IEEE Transactions on Parallel and Distributed Systems
- Authors
- Dixit Bhatta
- Lena Mashayekhy
- IP and LP (solved using CPLEX library)
- OCP Cost - Optimal Cost Placement
- OCP Latency - Optimal Latency Placement
- LP Cost - LP Cost Placement
- GACP - Genetic Algorithm Based Cloudlet Placement
- ACP - Approximate Cloudlet Placement (our approach)
- Core Classes: Cloudlet, CandidatePoint, EndDevice
- Extended Classes (for implementing the ACP algorithm): NewCloudlet, NewCandidatePoint, NewEndDevice
- CPLEX Model: CplexCloudletPlacement, CplexLPCloudletPlacement
- Genetic Algorithm: GeneticCloudletPlacement
- Approximation Algorithm: ApproxLPRounding
- The complete dataset used in the experiments is available in the datasets directory
- Datasets can be reproduced by running classes in OpenDataToDataset
- Pass the integer argument (1-5) to the getDevices() and getCandidates() methods to indicate different boroughs of NYC and generate the respective base_device.csv and base_points.csv files.
- The samples can be generated by running DatasetSampling. (Note: change the root and total_num variables to set the root path and the number of rows in the base files. Also, change the outpath and filename depending on the dataset samples you want to create, i.e., devices.csv or points.csv.)
- For each sample, you can generate cloudlet.csv and cost.csv by running CalcCandidateCosts, passing the path of the directory containing the sub-sample to the setCloudletsAndCandidateCosts() method.
- Likewise, latencies.csv can be produced by setting the source directory in the main() method of CalcLatency.
- Please note that datasets produced using random sampling might lead to slightly different results. Using the provided dataset reproduces the exact results.
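The note on random sampling can be illustrated with a minimal sketch, assuming a sampler that draws row indices from a base file. This is not the repository's DatasetSampling class, and whether that class fixes a seed is not stated here; the class name `SeededSampling`, the method `sampleRowIndices`, and the seed value are hypothetical. The point is only that a fixed seed makes a random sample repeatable, while an unseeded one changes from run to run.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of the repository.
class SeededSampling {

    // Returns sampleSize distinct row indices from [0, totalRows), in ascending order.
    // The same (totalRows, sampleSize, seed) always yields the same indices.
    static List<Integer> sampleRowIndices(int totalRows, int sampleSize, long seed) {
        if (sampleSize < 0 || sampleSize > totalRows) {
            throw new IllegalArgumentException("sampleSize must be in [0, totalRows]");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < totalRows; i++) {
            indices.add(i);
        }
        // Shuffling with a seeded Random makes the permutation reproducible.
        Collections.shuffle(indices, new Random(seed));
        List<Integer> sample = new ArrayList<>(indices.subList(0, sampleSize));
        Collections.sort(sample);
        return sample;
    }

    public static void main(String[] args) {
        // Both lines print the same five indices because the seed is identical.
        System.out.println(sampleRowIndices(1000, 5, 42L));
        System.out.println(sampleRowIndices(1000, 5, 42L));
    }
}
```

Without a fixed seed, each run would select different rows, which is why regenerated samples can give slightly different cost and latency values than the provided dataset.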
- Setup and Dependencies
- You can import the code as a standard Java Project into any IDE, or build it from the command line in a local directory.
- The main dependency required is cplex.jar, available in the external_lib directory. Add this as an external JAR to your project. A CPLEX installation is also required to run the code. (The process differs depending on the IDE or command-line setup.)
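Before running the CPLEX-based approaches, it can help to confirm that cplex.jar is actually on the classpath. The following is a minimal sketch of such a check and is not part of the repository; the class name `CplexPreflight` and the method `isClassAvailable` are hypothetical. It looks up `ilog.cplex.IloCplex`, the main class of the CPLEX Java API, without initializing it. It does not verify the CPLEX installation itself: the native CPLEX library must still be discoverable by the JVM (typically through `java.library.path`).

```java
// Hypothetical preflight check, not part of the repository.
class CplexPreflight {

    // Returns true if the named class can be found on the current classpath.
    // The class is located but not initialized, so no native code is loaded here.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, CplexPreflight.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        boolean found = isClassAvailable("ilog.cplex.IloCplex");
        System.out.println(found
                ? "cplex.jar found on the classpath"
                : "cplex.jar not found: add external_lib/cplex.jar to the classpath");
    }
}
```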
- MainRunner
- The main method runs different approaches based on the first input argument (1-4) to the run() method, in the order below:
- OCP Cost
- OCP Latency
- ACP
- GACP
- These have been broken down into individual methods for smooth running. No need to change or update the code.
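The mapping from the first argument to an approach can be pictured with the sketch below. It is an illustration of the ordering listed above, not the actual MainRunner code; the class name `ApproachDispatch` and the method `approachFor` are hypothetical.

```java
// Hypothetical illustration of the argument-to-approach ordering, not the actual MainRunner.
class ApproachDispatch {

    // Maps the first input argument (1-4) to the approach it selects.
    static String approachFor(int option) {
        switch (option) {
            case 1: return "OCP Cost";
            case 2: return "OCP Latency";
            case 3: return "ACP";
            case 4: return "GACP";
            default:
                throw new IllegalArgumentException("Expected an option in 1-4, got " + option);
        }
    }

    public static void main(String[] args) {
        for (int option = 1; option <= 4; option++) {
            System.out.println(option + " -> " + approachFor(option));
        }
    }
}
```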
- Special runtime considerations
- OCP Latency and ACP can be run for all samples without any special changes to the code.
- OCP Cost and GACP need additional arguments to run across different samples due to the machine limitations and convergence behavior of the algorithms.
- The OCP Cost approach requires node-limited solutions for larger instances, since CPLEX can take a very long time to converge to an optimal solution. The code is already configured with these node limits for the OCP Cost run, so there is no need to modify the code. [More details in next section]
- GACP requires coverage values less than 1.0 for complex or larger instances, since it may not converge for a long time when full coverage is expected. This value needs to be adjusted for specific datasets. The code is already configured with these thresholds for the GACP run, so there is no need to modify the code. [More details in next section]
- OCP Node Limit Value
- Staten Island: Not Needed
- Bronx: 75,000
- Queens: 70,000
- Brooklyn: 30,000
- Manhattan: 5,500
- GACP Coverage threshold Value
- Staten Island: 0.90 - 1.0 (For samples 1-30, in order: [1.0,1.0,0.90,0.95,0.95,0.95,0.95,0.95,0.95,1.0, 0.95,0.95,0.95,0.95,1.0,0.95,1.0,0.95,0.95,1.0, 0.95,1.0,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.95])
- Bronx: 0.92 (for all)
- Queens: 0.90 (for all)
- Brooklyn: 0.87 (for all)
- Manhattan: 0.87 (for all)
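For reference, the node limits and coverage thresholds listed above can be restated as a lookup table. The sketch below only encodes the values from the two lists; it is not how the repository stores them, and the class name `RunSettings` and the methods `nodeLimitFor` and `coverageFor` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup of the per-borough settings listed above, not the repository's code.
class RunSettings {

    // OCP Cost node limits. Staten Island is absent because no limit is needed there.
    static final Map<String, Integer> NODE_LIMITS = new LinkedHashMap<>();

    // GACP coverage thresholds for boroughs that use one value for all samples.
    static final Map<String, Double> GACP_COVERAGE = new LinkedHashMap<>();

    // Staten Island thresholds for samples 1-30, in order.
    static final double[] STATEN_ISLAND_COVERAGE = {
        1.0, 1.0, 0.90, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 1.0,
        0.95, 0.95, 0.95, 0.95, 1.0, 0.95, 1.0, 0.95, 0.95, 1.0,
        0.95, 1.0, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95, 0.95
    };

    static {
        NODE_LIMITS.put("Bronx", 75_000);
        NODE_LIMITS.put("Queens", 70_000);
        NODE_LIMITS.put("Brooklyn", 30_000);
        NODE_LIMITS.put("Manhattan", 5_500);

        GACP_COVERAGE.put("Bronx", 0.92);
        GACP_COVERAGE.put("Queens", 0.90);
        GACP_COVERAGE.put("Brooklyn", 0.87);
        GACP_COVERAGE.put("Manhattan", 0.87);
    }

    // Returns the node limit for a borough, or null when no limit is needed or listed.
    static Integer nodeLimitFor(String borough) {
        return NODE_LIMITS.get(borough);
    }

    // Returns the GACP coverage threshold for a borough and a sample number (1-30).
    static double coverageFor(String borough, int sample) {
        if (sample < 1 || sample > 30) {
            throw new IllegalArgumentException("sample must be in 1-30, got " + sample);
        }
        if ("Staten Island".equals(borough)) {
            return STATEN_ISLAND_COVERAGE[sample - 1];
        }
        Double value = GACP_COVERAGE.get(borough);
        if (value == null) {
            throw new IllegalArgumentException("Unknown borough: " + borough);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println("Bronx node limit: " + nodeLimitFor("Bronx"));
        System.out.println("Staten Island, sample 3 coverage: " + coverageFor("Staten Island", 3));
    }
}
```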
- results_summary in each method for the given approaches of MainRunner contains the results. It prints multiple lines to the console or log file, where each row contains (in order): approach, cost, latency, total runtime, and other relevant results.
- The following results must be tabulated for all 30 samples for 5 boroughs and their mean value must be compared to our results.
- Cost
- Latency
- Additional results such as coverage and solution gap will be consistent if the cost and latency values are reproducible. There is no need to additionally check them.
- The runtime depends on the machine where the code is run. The machine specifications used in our experiments are given in the paper.
- GACP is fed with the LP Cost and Number of Cloudlets to speed up its runtime. To find the total runtime of a GACP run, add the LP Time from the corresponding ACP results.
- Results can be plotted using Python (Seaborn) scripts available in scripts.
- Each file is descriptive of the output it plots. Some of them already contain the results data, so you can simply run them to visualize the results.
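The tabulation described above (mean cost and latency over the samples, and GACP total runtime as GACP time plus the LP Time from ACP) amounts to simple arithmetic, sketched below. The class name `ResultsSummary` and its methods are hypothetical, and the numbers in `main` are placeholders chosen for illustration, not results from the paper.

```java
// Hypothetical helper for tabulating results, not part of the repository.
class ResultsSummary {

    // Arithmetic mean of the per-sample values (e.g., cost or latency over 30 samples).
    static double mean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Total GACP runtime for one sample: the GACP run itself plus the LP time
    // reported by the corresponding ACP run.
    static double gacpTotalRuntime(double gacpRuntime, double lpTimeFromAcp) {
        return gacpRuntime + lpTimeFromAcp;
    }

    public static void main(String[] args) {
        // Placeholder values only; real values come from the results_summary output.
        double[] placeholderCosts = {100.0, 110.0, 90.0};
        System.out.println("Mean cost: " + mean(placeholderCosts));
        System.out.println("GACP total runtime: " + gacpTotalRuntime(10.0, 2.5));
    }
}
```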