Bangla_Plagiarism_Checker

This plagiarism checker follows a microservice architecture. The backend consists of five services:

Data Scraper Service: Collects data from different sources. It is responsible only for web scraping and for sending the scraped data on for classification.
Data Classification Service: Hosts the incremental model that classifies the scraped data.
Input Data Classification Service: Classifies user input and determines which family it belongs to.
Main Server: Accepts data from the client/user.
Plagiarism Service: Implements multiple plagiarism-checking algorithms so that their performance can be analyzed.
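
The flow between these services can be sketched in a single process. This is only an illustration: the names `Corpus`, `ingest`, `candidates_for`, and `best_match` are hypothetical, the classifier and similarity function are passed in as placeholders, and the idea that a document's family is used to narrow the set of documents compared is an assumption, not something the project states.

```python
class Corpus:
    """In-memory stand-in for the store of classified documents (hypothetical)."""

    def __init__(self, classify):
        # classify: placeholder for the classification model, text -> family label
        self.classify = classify
        self.by_family = {}

    def ingest(self, document):
        # Scraped text is classified, then stored under its family.
        family = self.classify(document)
        self.by_family.setdefault(family, []).append(document)

    def candidates_for(self, user_text):
        # User input is classified the same way; only documents of the
        # same family are returned for comparison (an assumed design).
        return list(self.by_family.get(self.classify(user_text), []))


def best_match(user_text, corpus, similarity):
    """Return the highest similarity score against same-family documents."""
    candidates = corpus.candidates_for(user_text)
    return max((similarity(user_text, doc) for doc in candidates), default=0.0)
```

In a real deployment each step would be a separate service call rather than a method call, but the order of operations would be the same under these assumptions.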

Description

Three different approaches are used to check for plagiarism:

Cosine Similarity
Jaccard Similarity
BERT
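
The first two measures can be illustrated on word tokens. The following is a minimal sketch, not the project's implementation: the `tokenize` helper, the punctuation list, and the use of raw term counts for the cosine vectors are assumptions. Text is split on whitespace rather than with the `\w` regex class, because Python's `re` does not treat Bangla vowel signs as word characters and would break words apart.

```python
import math
from collections import Counter

# Punctuation to trim from tokens, including the Bangla danda.
PUNCTUATION = "।,.!?;:\"'()"


def tokenize(text):
    """Split on whitespace and trim surrounding punctuation."""
    tokens = (word.strip(PUNCTUATION) for word in text.split())
    return [token for token in tokens if token]


def jaccard_similarity(a, b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(a, b):
    """Cosine of the angle between the term-count vectors of two texts."""
    freq_a, freq_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(freq_a[t] * freq_b[t] for t in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


first = "আমি ভাত খাই।"
second = "আমি মাছ খাই।"
print(jaccard_similarity(first, second))  # 2 shared words out of 4 distinct: 0.5
print(cosine_similarity(first, second))   # 2 / (sqrt(3) * sqrt(3)), about 0.667
```

Jaccard ignores how often a word occurs, while cosine over term counts gives repeated words more weight, so the two scores can rank the same pair of documents differently.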

For the approach labelled BERT, we use the Universal Sentence Encoder, a sentence-embedding model (distinct from BERT itself) that encodes text into 512-dimensional embeddings. It runs on tensorflow/tfjs-node, which provides native TensorFlow execution for backend JavaScript applications under the Node.js runtime.
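
Once sentences have been encoded, comparing them comes down to cosine similarity between embedding vectors. The sketch below shows that step in Python with plain lists standing in for the embeddings. It does not call tfjs-node or the encoder: `embed` is a hypothetical placeholder for whatever produces the vectors, and `flag_similar_sentences` and the `0.8` threshold are illustrative assumptions.

```python
import math


def vector_cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def flag_similar_sentences(suspect, sources, embed, threshold=0.8):
    """Return (suspect_sentence, source_sentence, score) for pairs at or above threshold.

    embed: placeholder callable, list of sentences -> list of vectors
    threshold: assumed cut-off; it would need tuning on real data
    """
    suspect_vectors = embed(suspect)
    source_vectors = embed(sources)
    matches = []
    for sentence, vector in zip(suspect, suspect_vectors):
        for source, source_vector in zip(sources, source_vectors):
            score = vector_cosine(vector, source_vector)
            if score >= threshold:
                matches.append((sentence, source, score))
    return matches
```

Unlike the word-overlap measures, this comparison can score two sentences as similar even when they share few exact words, provided the encoder places them close together in the embedding space.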

Languages

Jupyter Notebook 92.8%, JavaScript 3.8%, Python 2.4%, CSS 0.8%, HTML 0.2%