How can I use this code to profile my GC?
guimaluf opened this issue
Hi all,
I read the article 'An Experimental Evaluation of GC on Big Data Applications' and I'd like to reproduce part of it in my setup.
It isn't clear to me how to use the SparkProfile.jar package: how it collects GC stats, where it prints its output, etc.
Thank you for the research; I'd appreciate any help.
Hi guimaluf, thanks for your interest in our work.
I'm sorry that this profiler is a little complex to use: I developed a number of parsers and analyzers to obtain statistics from task logs, GC logs, CPU logs, etc. Some of them produce the statistical results presented in our paper, while others are obsolete. The profiler is used as follows.
After running a Spark application (e.g., app-20170623113634-0010), we first run SparkAppJsonSaver.java to save the application's performance metrics (e.g., application execution time, stage metrics, task metrics in each stage, executor metrics, etc.) via the REST APIs (see http://spark.apache.org/docs/latest/monitoring.html) into a directory (e.g., APPdir). SparkAppJsonSaver.java also fetches the GC log from each executor into a file (e.g., APPdir/executors/executor-id/stdout), provided we enable GC logging on the executors via JVM options such as spark.executor.extraJavaOptions="-XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime". Note that each executor is a JVM, so its GC activities are logged in the executor's stdout log.
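To make the REST step concrete, here is a minimal Python sketch of the kind of fetch-and-save loop that SparkAppJsonSaver.java performs. The endpoint paths follow Spark's documented monitoring REST API; the function names (`metric_urls`, `save_metrics`) and the history-server address are illustrative, not the actual code in this repository.

```python
import json
import os
import urllib.request

def metric_urls(server, app_id):
    """Build the documented Spark REST endpoints for one application."""
    base = f"{server}/api/v1/applications/{app_id}"
    return {
        "application": base,
        "jobs": f"{base}/jobs",
        "stages": f"{base}/stages",
        "executors": f"{base}/executors",
    }

def save_metrics(server, app_id, out_dir):
    """Fetch each endpoint and save its JSON under out_dir (e.g., APPdir)."""
    os.makedirs(out_dir, exist_ok=True)
    for name, url in metric_urls(server, app_id).items():
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        with open(os.path.join(out_dir, name + ".json"), "w") as f:
            json.dump(data, f, indent=2)
```

For example, `save_metrics("http://master:18080", "app-20170623113634-0010", "APPdir")` would populate APPdir with one JSON file per endpoint.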
After that, we use SparkAppProfiler.java to analyze and output the interesting statistics. In particular, for GC analysis, we use the GC log parsers in src/main/generalGC to parse the GC log of each executor into formatted statistics, such as:
[Young](YGC) time = 2.083, beforeGC = 126.4658203125, afterGC = 14.2392578125, allocated = 151.25, gcPause = 0.0663858s, gcCause = GC (Allocation Failure)
[Young](YGC) time = 2.877, beforeGC = 141.3876953125, afterGC = 9.5703125, allocated = 151.25, gcPause = 0.1134074s, gcCause = GC (Allocation Failure)
[Young](FGC) time = 3.01, beforeGC = 26.33984375, afterGC = 26.33984375, allocated = 151.25, gcPause = 0.0014977s, gcCause = GC (CMS Initial Mark)
[Young](YGC) time = 4.527, beforeGC = 144.0703125, afterGC = 10.8642578125, allocated = 151.25, gcPause = 0.1209985s, gcCause = GC (Allocation Failure)
These formatted statistics record the GC pause time and the related memory usage for each young/old GC pause.
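For reference, the formatted lines above are regular enough to parse mechanically. The real parsers live in src/main/generalGC (in Java); the following is only a hypothetical Python sketch whose field names mirror the sample output:

```python
import re

# Matches lines like:
# [Young](YGC) time = 2.083, beforeGC = 126.46..., afterGC = 14.23...,
#   allocated = 151.25, gcPause = 0.0663858s, gcCause = GC (Allocation Failure)
LINE_RE = re.compile(
    r"\[(?P<gen>\w+)\]\((?P<kind>\w+)\)\s+time = (?P<time>[\d.]+), "
    r"beforeGC = (?P<before>[\d.]+), afterGC = (?P<after>[\d.]+), "
    r"allocated = (?P<alloc>[\d.]+), gcPause = (?P<pause>[\d.]+)s, "
    r"gcCause = (?P<cause>.+)"
)

def parse_gc_line(line):
    """Parse one formatted statistics line into a dict, or None if it
    does not match."""
    m = LINE_RE.match(line)
    if not m:
        return None
    rec = m.groupdict()
    for key in ("time", "before", "after", "alloc", "pause"):
        rec[key] = float(rec[key])
    return rec
```

Applied to the first sample line, this yields a record with `pause = 0.0663858` and `cause = "GC (Allocation Failure)"`.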
Finally, we can use the Python code in src/python to plot the GC curves, as in Figure 7 of our paper.
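As a rough illustration of the curve data behind such plots, the young-GC samples shown earlier can be interleaved into a sawtooth heap-usage series. This is only a sketch, assuming `time` is the JVM uptime in seconds and the heap figures are in MB; the actual plotting scripts are in src/python.

```python
# Young-GC records taken from the sample formatted statistics above.
records = [
    {"time": 2.083, "before": 126.4658203125, "after": 14.2392578125, "pause": 0.0663858},
    {"time": 2.877, "before": 141.3876953125, "after": 9.5703125,     "pause": 0.1134074},
    {"time": 4.527, "before": 144.0703125,    "after": 10.8642578125, "pause": 0.1209985},
]

# Interleave (time, beforeGC) and (time + pause, afterGC) points so the
# curve shows the heap dropping at each collection, as in Figure 7.
times, heap = [], []
for r in records:
    times += [r["time"], r["time"] + r["pause"]]
    heap += [r["before"], r["after"]]

total_pause = sum(r["pause"] for r in records)
# The (times, heap) series can then be plotted, e.g. with matplotlib:
#   plt.plot(times, heap)
```

Summing the pauses also gives a quick estimate of total stop-the-world time per executor.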
In general, this profiler covers almost all the fine-grained metrics of a Spark application, including application, stage, task, and executor metrics. If you want to analyze the GC logs of executors, please refer to the parsers in src/main/generalGC. If you only want to inspect the GC metrics of a few executors, you can also use https://gceasy.io/, a GUI-based general-purpose GC analyzer.