thdiaman / SequenceExtractor

SequenceExtractor: Statement Sequence Extractor for Java Source Code Snippets

SequenceExtractor is a statement sequence extractor for Java source code snippets. The tool exports the sequence of statements in a snippet as a list. It can be used as a library from Java, or from Python through the provided Python binding.

Using as a library

Import the library in your code. Then, you can parse snippets as follows:

String sequence = SequenceExtractor.extractSequence(""
		+ "JFrame frame = new JFrame(\"myframe\");\n"
		+ "JPanel panel = new JPanel();\n"
		+ "Container pane = frame.getContentPane();\n"
		+ "GridLayout layout = new GridLayout(2,2);\n"
		+ "panel.setLayout(layout);\n"
		+ "panel.add(upperLeft);\n"
		+ "panel.add(upperRight);\n"
		+ "panel.add(lowerLeft);\n"
		+ "panel.add(lowerRight);\n"
		+ "pane.add(panel);\n"
		);

The result is a list with the sequence of calls for the snippet. For the above example the result is:

[CI_JFrame, CI_JPanel, FC_Container, CI_GridLayout, FC_void, FC_void, FC_void, FC_void, FC_void, FC_void]

There are three types of commands:

  • object instantiations (CI)
  • assignments (AM)
  • function calls (FC)
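
To work with the result programmatically, each entry can be split into its command type and the name that follows it. The following is a minimal sketch in Python, assuming the result arrives as the bracketed, comma-separated text shown above; the helper name split_sequence is hypothetical and not part of the library:

```python
# Hypothetical helper, not part of SequenceExtractor: it only assumes the
# "[TYPE_Name, TYPE_Name, ...]" text format shown in the example output.
COMMAND_TYPES = {
    "CI": "object instantiation",
    "AM": "assignment",
    "FC": "function call",
}


def split_sequence(sequence):
    """Split a result string into (command type, name) pairs."""
    body = sequence.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    pairs = []
    for entry in body.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Split on the first underscore only, so names that themselves
        # contain underscores stay intact.
        kind, _, name = entry.partition("_")
        pairs.append((kind, name))
    return pairs


result = "[CI_JFrame, CI_JPanel, FC_Container, CI_GridLayout, FC_void]"
for kind, name in split_sequence(result):
    print(COMMAND_TYPES.get(kind, "unknown"), name)
```

Splitting on commas is a simplification: it would break on names that contain commas themselves, so treat this as a starting point rather than a complete parser.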

The extractSequence function also accepts several options as parameters that control the extraction. These are:

  • keepFunctionCallTypes: whether to also output the objects performing the function calls, instead of only the return types; default is false.
  • keepLiterals: whether commands with literals (primitive types) are extracted or discarded; default is false.
  • keepBranches: whether all branch paths are kept, or only the first path of each branch.
  • outputTree: whether the output is a tree or a sequence.
  • flattenOutput: whether the output is flattened, i.e. all paths are merged into a single sequence, or the different paths are retained.
  • addUniqueIDs: whether statements are given unique IDs; default is false.

Using in Python

SequenceExtractor also has Python bindings. To use the Python wrapper, first import the library and initialize the SequenceExtractor object, passing the path to the jar of the library followed by the options: whether to keep function call types (keep_function_call_types), keep literals (keep_literals), keep branches (keep_branches), output a tree or a sequence (output_tree), flatten the output (flatten_output), and add unique IDs (add_unique_ids):

sequence_extractor = SequenceExtractor("path/to/SequenceExtractor-0.4.jar", False, False, True, False, True)

After that, you can parse snippets as follows:

sequence = sequence_extractor.parse_snippet(
		"JFrame frame = new JFrame(\"myframe\");\n" +
		"JPanel panel = new JPanel();\n" +
		"Container pane = frame.getContentPane();\n" +
		"GridLayout layout = new GridLayout(2,2);\n" +
		"panel.setLayout(layout);\n" +
		"panel.add(upperLeft);\n" +
		"panel.add(upperRight);\n" +
		"panel.add(lowerLeft);\n" +
		"panel.add(lowerRight);\n" +
		"pane.add(panel);\n"
)

Note that when you are done with the library, you have to close the SequenceExtractor object by calling its close function, i.e.:

sequence_extractor.close()
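
One way to make sure close is always called, even if parsing raises an exception, is Python's standard contextlib.closing, which calls close() when a with block exits. The sketch below uses a stand-in class (FakeExtractor, hypothetical) so that it runs without the jar; with the real library you would pass the SequenceExtractor object instead, assuming only that it exposes parse_snippet and close as described above.

```python
from contextlib import closing


class FakeExtractor:
    """Stand-in for illustration only; not the real SequenceExtractor."""

    def __init__(self):
        self.closed = False

    def parse_snippet(self, snippet):
        # The real object would return the extracted sequence; this stub
        # just reports how many lines it was given.
        return len(snippet.splitlines())

    def close(self):
        self.closed = True


extractor = FakeExtractor()
# closing() guarantees extractor.close() runs when the block exits,
# whether normally or through an exception.
with closing(extractor) as ex:
    line_count = ex.parse_snippet("JFrame frame = new JFrame();\n")

print(line_count, extractor.closed)
```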
