jgera / plagiarism

Plagiarism detection

plagiarism

Requirements:

Usage:

To search Google for plagiarized copies of a document:

main.py localfile.txt
main.py localfile.pdf
main.py http://example.ru/somefile.txt
main.py ftp://example.com/somefile.pdf

Press Ctrl-C to skip any file you don't want to test.

To compare two documents:

main.py localfile.txt ftp://example.com/somefile.pdf
main.py http://example.ru/somefile.txt localfile.pdf
main.py localfile1.pdf localfile2.pdf
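
Both modes accept local paths as well as http:// and ftp:// URLs. Below is a minimal sketch of how such arguments could be told apart. The helper name is_remote and the REMOTE_SCHEMES set are hypothetical and are not taken from main.py. Accepting https is also an assumption, since the examples above only show http and ftp.

```python
from urllib.parse import urlparse

# Schemes treated as remote. "https" is an assumption; the README only
# shows http:// and ftp:// examples.
REMOTE_SCHEMES = {"http", "https", "ftp"}


def is_remote(source: str) -> bool:
    """Return True if the argument looks like a URL rather than a local path."""
    return urlparse(source).scheme.lower() in REMOTE_SCHEMES


for arg in ("localfile.pdf", "http://example.ru/somefile.txt"):
    kind = "remote" if is_remote(arg) else "local"
    print(f"{arg}: {kind}")
```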

Note that text files such as somefile.txt must be UTF-8 encoded. To use a different encoding, search for data.decode('utf-8') in plagiarism.py and change the codec name there.
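
As an illustration of the encoding requirement, here is a minimal sketch of decoding file bytes with a configurable codec. The function decode_text and its encoding parameter are hypothetical. plagiarism.py itself is only known to call data.decode('utf-8').

```python
def decode_text(data: bytes, encoding: str = "utf-8") -> str:
    """Decode raw file bytes, raising a clear error if the codec does not match."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        raise ValueError(f"input is not valid {encoding}: {exc}") from exc


# Sample Russian text stored as Windows-1251 instead of UTF-8.
raw = "Пример текста".encode("cp1251")
text = decode_text(raw, encoding="cp1251")
print(text)
```

Calling decode_text(raw) with the default codec on these bytes raises ValueError, which is the kind of failure to expect when a non-UTF-8 file is passed in unchanged.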

Documents are expected to be in English or Russian. For other languages, change the global langs variable in main.py.
