saro0307 / AI-detector

Python program designed to distinguish between AI-generated and human-authored text, helping identify the source of written content based on unique characteristics.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

AI-detector

Machine Learning Model for Distinguishing Human vs. ChatGPT Text

Introduction

This repository contains the code and resources to build a machine learning model that can distinguish between text written by humans and text generated by ChatGPT or a similar AI model. This README file will guide you through the process of setting up and running the model.

Prerequisites

Before you begin, make sure you have the following installed on your system:

  • Python (version 3.6 or higher)
  • Required Python libraries (e.g., scikit-learn, pandas, numpy)
  • Jupyter Notebook (optional but recommended for code exploration)

You can install Python libraries using pip:

pip install scikit-learn pandas numpy

Getting Started

  1. Clone the Repository: Start by cloning this repository to your local machine:

    git clone https://github.com/your-username/chatgpt-human-detection.git
    cd chatgpt-human-detection
  2. Data Preparation: Prepare your dataset containing both human-written and ChatGPT-generated text. Ensure the data is well-structured and labeled appropriately (e.g., 'human' and 'chatgpt').

  3. Data Preprocessing: Use Jupyter Notebook or your preferred Python environment to preprocess the data. You may need to tokenize, vectorize, and split the dataset into training and testing sets.

  4. Model Building: Build and train your machine learning model. You can explore various algorithms such as logistic regression, support vector machines, or neural networks. Refer to the provided code and documentation for guidance.

  5. Model Evaluation: Evaluate the model's performance using metrics like accuracy, precision, recall, and F1-score. Fine-tune the model if necessary to achieve the desired accuracy.

Usage

Once you've built and trained your model, you can use it to classify text as either human-written or ChatGPT-generated. Here's how to make predictions with your model:

# Load your trained model (replace 'model_file.pkl' with your model file)
import pickle
model = pickle.load(open('model_file.pkl', 'rb'))

# Use the model to classify text
text_to_classify = "This is a test sentence."
prediction = model.predict([text_to_classify])

if prediction[0] == 'human':
    print("The text is likely human-written.")
else:
    print("The text is likely generated by ChatGPT.")

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

Python program designed to distinguish between AI-generated and human-authored text, helping identify the source of written content based on unique characteristics.

License:MIT License


Languages

Language:Jupyter Notebook 100.0%