How to improve performance of Rholang map
hilltracer opened this issue
Overview
This investigation is part of the work to speed up the primitive structures (List, Set, Map), #3607. Here we investigate RhoMap.
The purpose is to find the slowest Map operations and try to optimize them.
Testing methodology (link to commit)
Timings were measured with the following Rholang code:
new return, loop in {
  contract loop(@n, @acc, ret) = {
    if (n == 0) ret!(acc)
    else loop!(n - 1, acc.set(n, n), *ret)
  } |
  loop!($number, {}, *return)
}
Here $number is the number of elements added to the Map (100, 200, ..., 500).
Description of optimization steps
Step 0: Basic implementation (link to commit)
This is the current implementation of the map, which contains no changes.
Step 1: Change SortedParMap implementation (link to commit)
- Sort each new par as it is added.
- Remove the re-sort and re-hash operations from the basic map operations (append, concatenation, remove, etc.).
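The idea behind this step can be sketched in Scala as follows. This is only an illustration, not the actual SortedParMap code: `SortedInsertSketch`, `appendAndResort` and `insertSorted` are hypothetical names, plain `Int` values stand in for pars, and Scala 2.13 or later is assumed for `search`.

```scala
// Hypothetical stand-in for a sorted collection of pars; Int values replace real Par terms.
object SortedInsertSketch {
  // Baseline: append the element, then re-sort the whole collection on every operation.
  def appendAndResort(sorted: Vector[Int], x: Int): Vector[Int] =
    (sorted :+ x).sorted

  // Alternative: rely on the existing order and place only the new element.
  def insertSorted(sorted: Vector[Int], x: Int): Vector[Int] = {
    // Binary search for the position that keeps the vector sorted.
    val idx = sorted.search(x).insertionPoint
    // Insert x at idx without replacing any existing element.
    sorted.patch(idx, Vector(x), 0)
  }
}
```

Both functions return the same sorted vector; the second one avoids sorting elements that are already in order.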
Step 2: Optimize sorting, evaluate and substitute (link to commit)
- Remove the sort() operation on pars in Map, because the data in a Map is already sorted when the Map is created.
- Skip the eval() operation if the map contains only simple data (nothing to evaluate).
- Skip the substitute() operation if the map contains only simple data (nothing to substitute).
Step 3: Remove serializing from produce (link to commit)
The next optimization step is removing the protobuf serialization of map data from produce. Serialization is required to calculate the hash that is used to form the event log.
Step 4: Remove charging (link to commit)
Charging is a slow operation because serialization is used to calculate the cost.
Step 5: Remove sorting from Map (link to commit)
The final optimization step is removing sorting from Map. Each element is still sorted when it is added to the map; only the ordering of the elements within the map is no longer maintained.
Steps 3, 4, and 5 break the current implementation, but they can be implemented as part of a new version of Rholang 1.1.
Main results ($number vs. evaluation time (ms))
$number | Step 0 | Step 1 | Step 2 | Step 3 | Step 4 | Step 5 |
---|---|---|---|---|---|---|
100 | 2786 | 2808 | 1671 | 1178 | 821 | 788 |
200 | 9194 | 9086 | 3731 | 2544 | 1097 | 853 |
300 | 20002 | 18982 | 7241 | 3628 | 1503 | 879 |
400 | 35650 | 32019 | 12118 | 5647 | 1997 | 1002 |
500 | 53707 | 48972 | 18387 | 8828 | 3017 | 1197 |
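To read the table as speedup factors, the ratios between steps can be computed directly from the measured values. The snippet below is a small helper for that arithmetic only; `SpeedupSketch` and `speedup` are hypothetical names, and the numbers are copied from the table above.

```scala
object SpeedupSketch {
  // Evaluation times in ms from the table above: $number -> (Step 0, ..., Step 5).
  val timesMs: Map[Int, Vector[Int]] = Map(
    100 -> Vector(2786, 2808, 1671, 1178, 821, 788),
    200 -> Vector(9194, 9086, 3731, 2544, 1097, 853),
    300 -> Vector(20002, 18982, 7241, 3628, 1503, 879),
    400 -> Vector(35650, 32019, 12118, 5647, 1997, 1002),
    500 -> Vector(53707, 48972, 18387, 8828, 3017, 1197)
  )

  // Ratio of the time at fromStep to the time at toStep for a given $number.
  def speedup(n: Int, fromStep: Int, toStep: Int): Double =
    timesMs(n)(fromStep).toDouble / timesMs(n)(toStep)

  def main(args: Array[String]): Unit =
    timesMs.keys.toSeq.sorted.foreach { n =>
      println(f"$n%4d: Step 0 / Step 5 = ${speedup(n, 0, 5)}%.1fx")
    }
}
```

With these measurements the ratio of Step 0 to Step 5 grows from about 3.5x at 100 elements to about 45x at 500 elements, so the gain increases with the size of the map.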
Conclusion - ways to improve performance
Remove extra Eval(), Substitute() and Sort() operations
- To do this, add three flags to each Par: needEval, needSubstitute and needSort.
- If a flag is not set, skip the corresponding operation (Eval(), Substitute() or Sort()).
- Fill these flags when Pars are created. For multi-element pars, set a flag if at least one element has that flag set. This speeds up the analysis of nested Pars.
- When analyzing, take into account that data loaded from RSpace is already sorted, substituted, and evaluated.
- After substituting or evaluating a multi-element par, clear the flag only for the elements that changed.
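The flag mechanism described in the list above could look roughly like this. It is a minimal sketch on a made-up term model, not the real Par structure: `Term`, `Lit`, `Add` and `Multi` are hypothetical, and only needEval is modelled (needSubstitute and needSort would follow the same pattern).

```scala
// Hypothetical, heavily simplified term model; real Rholang Pars are far richer.
object FlagSketch {
  sealed trait Term { def needEval: Boolean }

  // A literal is already a value, so it never needs evaluation.
  final case class Lit(value: Int) extends Term { val needEval: Boolean = false }

  // An unevaluated sum always needs evaluation.
  final case class Add(l: Term, r: Term) extends Term { val needEval: Boolean = true }

  // A multi-element term computes its flag once, at creation:
  // it is set if at least one element has the flag set.
  final case class Multi(elems: Vector[Term]) extends Term {
    val needEval: Boolean = elems.exists(_.needEval)
  }

  def eval(t: Term): Term =
    if (!t.needEval) t // fast path: nothing to evaluate, return the same instance
    else t match {
      case Add(l, r) =>
        (eval(l), eval(r)) match {
          case (Lit(a), Lit(b)) => Lit(a + b)
          case (el, er)         => Add(el, er)
        }
      // Rebuilding the Multi recomputes the flag from the evaluated elements.
      case Multi(es) => Multi(es.map(eval))
      case other     => other
    }
}
```

A map that contains only literals is returned untouched by `eval`, without visiting its elements, which is the saving this section aims for.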
Remove extra send() and receive() operations
- Send() is the slowest operation because it requires sorting, serializing, hashing, and appending to the eventLog.
- Use send() and receive() only to work with data in RSpace.
- For local operations (for example a loop, or creating and using a local value), don't use send() and receive().
- There are two ways to achieve this:
  - Implement a new version of Rholang 1.1.
  - Add the ability to use third-party code in Rholang scripts.
Speedup charging
- Develop a new, faster algorithm that calculates the cost without serialization.
- For multi-element pars, calculate the cost of each element when it is created and store this cost.
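Storing the cost at creation time could be sketched as below. The cost units are invented for the example (string length, a fixed 8 for numbers) and are not the actual charging rules; `CostSketch`, `Str`, `Num` and `Coll` are hypothetical names.

```scala
// Hypothetical cost model: the units below are made up for illustration only.
object CostSketch {
  sealed trait Term { def cost: Long }

  final case class Str(value: String) extends Term { val cost: Long = value.length.toLong }
  final case class Num(value: Long) extends Term { val cost: Long = 8L }

  // A collection stores its total cost and updates it incrementally,
  // so no traversal or serialization is needed when charging.
  final class Coll private (val elems: Vector[Term], val cost: Long) extends Term {
    def add(t: Term): Coll = new Coll(elems :+ t, cost + t.cost)
  }

  object Coll {
    val empty: Coll = new Coll(Vector.empty, 0L)
  }
}
```

Reading `cost` is then a field access, independent of the size of the collection.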
Optimization of sorting
A multi-element par must be sorted before:
1) Storing in HotStore
2) Calculation of hash (for example for comparison with other pars)
3) Evaluation or substitution
Optimization for 1)
This cannot be optimized directly. The only option is to remove the extra send() operations.
Optimization for 2)
The possibility of optimization depends on the Scala types of the data structures used. For example, if Map is used, then the hash calculation will be the same regardless of the order of the elements. If a List is used, then sorting is necessary. So sorting for RhoMap may be removed because it uses Map.
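The difference between the two Scala types can be checked with the standard library. The snippet only demonstrates `equals` and `hashCode` of `Map` and `List`; whether the same holds for the hash computed over Rholang terms depends on how that hash is implemented, which is not shown here. `HashOrderSketch` is a hypothetical name.

```scala
object HashOrderSketch {
  // Same entries, written in a different order.
  val m1: Map[Int, String] = Map(1 -> "data1", 2 -> "data2", 3 -> "data3")
  val m2: Map[Int, String] = Map(3 -> "data3", 1 -> "data1", 2 -> "data2")

  // Same elements, different order.
  val l1: List[(Int, String)] = List(1 -> "data1", 2 -> "data2")
  val l2: List[(Int, String)] = l1.reverse

  def main(args: Array[String]): Unit = {
    println(m1 == m2)                   // true: Map equality ignores insertion order
    println(m1.hashCode == m2.hashCode) // true: equal maps have equal hash codes
    println(l1 == l2)                   // false: List equality depends on order
  }
}
```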
Optimization for 3)
Consider two unsorted RhoMaps with the same entries in a different order:
map1 = {1:"data1", 2:"data2", (1+1):"data3"}
map2 = {1:"data1", (1+1):"data3", 2:"data2"}
Eval(map1) // {1:"data1", 2:"data3"}
Eval(map2) // {1:"data1", 2:"data2"}
This example shows why sorting before evaluation matters: without it, the result depends on the order of the entries.
However, looking at the issue in detail, sorting is needed only if there are key conflicts after evaluation. Thus the sort can be performed lazily, only when a conflict occurs and only for the conflicting elements.
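A lazy variant along these lines might look like the following sketch. The key model (`Lit`, `Plus`) and the canonical order (literals before sums) are assumptions made for the example, not the actual Rholang sort order; the point is only that entries are sorted when, and only when, several of them evaluate to the same key.

```scala
object LazyConflictSketch {
  // Hypothetical unevaluated keys: a literal or a sum of two literals.
  sealed trait Key
  final case class Lit(n: Int) extends Key
  final case class Plus(a: Int, b: Int) extends Key

  def evalKey(k: Key): Int = k match {
    case Lit(n)     => n
    case Plus(a, b) => a + b
  }

  // Assumed canonical order: literals before sums, then by components.
  val canonical: Ordering[Key] = Ordering.by[Key, (Int, Int, Int)] {
    case Lit(n)     => (0, n, 0)
    case Plus(a, b) => (1, a, b)
  }

  def evalMap(entries: Vector[(Key, String)]): Map[Int, String] =
    entries.groupBy { case (k, _) => evalKey(k) }.map {
      // No conflict: the single entry is used as is, without any sorting.
      case (key, group) if group.size == 1 => key -> group.head._2
      // Conflict: sort only this group; the last entry in canonical order wins.
      case (key, group) => key -> group.sortBy(_._1)(canonical).last._2
    }
}
```

Under the assumed order, both map1 and map2 from the example evaluate to the same map, with "data3" stored under key 2, and the entry for key 1 is never sorted.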
Third-party code (the second way)
- Rholang is used for work with RSpace.
- Third-party code (Scala, Java, C++, etc.) is used to work with data in RAM.
Example:
new return in {
  return!(
    #SCALA_START
    val r = (1 to $number).foldLeft(Map.empty[Int, Int]) { (acc, i) =>
      acc + (i -> i)
    }
    #SCALA_END(r) // returning the value from the third-party block
  )
}
- A converter from Rholang types to third-party types is needed.
- A charging algorithm for the cost of third-party code is needed.
- Blocks are needed in both directions: third-party code in Rholang and Rholang in third-party code.
Appendix: Comparison with summing
This experiment compares the execution time of the main test code (see testing methodology) with the following summation code:
new return, loop in {
  contract loop(@n, @acc, ret) = {
    if (n == 0) ret!(acc)
    else loop!(n - 1, acc + n, *ret)
  } |
  loop!($number, 0, *return)
}
Here $number is the number of elements summed in the loop (100, 200, ..., 7000).
$number | map construction (ms) | sum (ms) |
---|---|---|
100 | 976 | 1128 |
200 | 840 | 977 |
300 | 869 | 1060 |
400 | 964 | 1040 |
500 | 1220 | 1490 |
600 | 1050 | 1230 |
700 | 1306 | 1479 |
800 | 1541 | 1429 |
900 | 1810 | 1569 |
1000 | 1958 | 1762 |
1100 | 1941 | 2072 |
1200 | 2271 | 2026 |
1300 | 2286 | 2042 |
1400 | 2655 | 2144 |
1500 | 2650 | 2218 |
1600 | 3095 | 2399 |
1700 | 3158 | 2710 |
1800 | 3464 | 2699 |
1900 | 3841 | 2838 |
2000 | 3983 | 3698 |
2100 | 4358 | 3502 |
2200 | 4536 | 3356 |
2300 | 4916 | 3820 |
2400 | 5186 | 3704 |
2500 | 5332 | 3859 |
2600 | 5980 | 4417 |
2700 | 6590 | 4290 |
2800 | 7526 | 4105 |
2900 | 6859 | 4339 |
3000 | 7629 | 4423 |
3100 | 8043 | 5099 |
3200 | 8177 | 4999 |
3300 | 9220 | 5167 |
3400 | 9688 | 5149 |
3500 | 9303 | 5824 |
3600 | 10606 | 5679 |
3700 | 10634 | 5977 |
3800 | 10687 | 6133 |
3900 | 10381 | 6192 |
4000 | 10950 | 7027 |
4100 | 11487 | 6936 |
4200 | 14072 | 7109 |
4300 | 13326 | 7251 |
4400 | 14241 | 6772 |
4500 | 16116 | 6745 |
4600 | 14159 | 8152 |
4700 | 14350 | 7173 |
4800 | 14862 | 7297 |
4900 | 15039 | 7003 |
5000 | 15552 | 8287 |
5100 | 15450 | 7498 |
5200 | 16289 | 7866 |
5300 | 16674 | 8710 |
5400 | 17213 | 8239 |
5500 | 17995 | 8955 |
5600 | 19605 | 9384 |
5700 | 20052 | 9393 |
5800 | 21167 | 10485 |
5900 | 21728 | 10791 |
6000 | 22343 | 9385 |
6100 | 22819 | 8753 |
6200 | 22815 | 9830 |
6300 | 23904 | 9884 |
6400 | 23233 | 10463 |
6500 | 25853 | 10151 |
6600 | 24536 | 10782 |
6700 | 26438 | 9988 |
6800 | 28750 | 10457 |
6900 | 28572 | 11221 |
7000 | 29008 | 10640 |