#   This file is part of the software similarity tester SIM.
#   Written by Dick Grune, Vrije Universiteit, Amsterdam.
#   $Id: README,v 2.18 2017-03-19 09:42:55 dick Exp $

These programs test for similar or equal stretches in one or more program or
text files and can be used to detect common code or plagiarism. See sim.pdf.
Checkers are available for C, C++, Java, Pascal, Modula-2, Lisp, Miranda and
natural language text.

>>>> NEW, March 2017:
    - C++ testing added
>>>> NEW, May 2016:
    - better percentage computation
>>>> NEW, Jan 2014:
    - 64-bit compatible
    - works also on 32-bit machines with software 64-bit emulator
    - accepts | as new-old separator

====    To install on any system with gcc, flex, cp, ln, echo, rm, and wc, or
        their equivalents, for example UNIX/Linux or MSDOS+MinGW:

Unpack the archive sim_3_*.zip

To compile and test, edit the Makefile to fit the local situation, and call:
    make test
This will generate one executable called sim_c, the checker for C, and
will run two small tests to show sample output.

To install, examine the Makefile, edit BINDIR and MAN1DIR to sensible paths,
and call
    make install

To change defaults, adjust the file settings.par and recompile.

====    To install on MSDOS, if you don't have a C compiler,
        the archive sim_exe_3_*.zip contains:

SIM_C.EXE       similarity tester for C
SIM_C++.EXE     similarity tester for C++
SIM_JAVA.EXE    similarity tester for Java
SIM_PASC.EXE    similarity tester for Pascal
SIM_M2.EXE      similarity tester for Modula-2
SIM_LISP.EXE    similarity tester for Lisp
SIM_MIRA.EXE    similarity tester for Miranda
SIM_TEXT.EXE    similarity tester for text

====    To extend:

To add another language L, write a file Llang.l along the lines of clang.l
or the other *lang.l files, extend the Makefile and recompile.
All knowledge about a given language L is located in Llang.l; the rest of
the program expects each token to be a 16-bit character.

                                        Dick Grune
                                        email: dick@dickgrune.com
                                        http://www.dickgrune.com