AustinBoyuJiang / ArtifAI

ArtifAI is designed to detect the potential origins of an artwork, whether generated by AI or created by a human. It is dedicated to addressing copyright issues raised by generative artificial intelligence. It also provides a GPT-based consultant for questions about copyright.

ArtifAI Detector

Check out the live deployment of the project on the ArtifAI website to see it in action!

ArtifAI website deployment

Project Description

ArtifAI is designed to detect the potential origins of an artwork, whether generated by AI or created by humans. It is dedicated to addressing copyright issues brought about by generative artificial intelligence. It also provides a GPT-based consultant for questions about copyright.

ArtifAI uniquely combines the power of artificial intelligence with an appreciation for human creativity, offering a robust platform for distinguishing between artworks created by humans and those generated by AI. As we navigate the complexities of copyright in the digital age, ArtifAI emerges as an essential tool for artists and the broader art community.

Project Roadmap

/ArtifAI - endpoint programs

/src - trained models for detecting images

/train - programs to train the model

/data - programs to fetch image data from online sources

Development Log

2023-09-10

  • Project Inception.

2023-09-18

  • Extracted image URLs from ArtStation with a web crawler.

2023-10-31

  • Downloaded images from the collected URLs.

  • Implemented a local JSON database to track the collected data for later use.

2023-11-09

  • Reformatted the JSON file to improve its data structure and readability.

  • Reorganized the data collection program to make it more efficient and easier to use.

2023-11-12

  • Standardized image sizes to 256x256 for optimal balance between detail retention and computational efficiency.

  • Adopted train_size (or training_size) as the variable name for the proportion of the data used for training, with the remainder used for testing.

  • Explored various image storage formats and decided on using the .npy format for its efficiency in machine learning operations.

  • Initiated the development of the training program for the AI image classification project, targeting efficient data handling and model training strategies.

  • Implemented a data batch loading mechanism to manage large image datasets effectively, overcoming the limitations of computer memory capacity.

  • Revised the Python training program to incorporate dynamic data loading, significantly reducing memory usage during model training.

  • Enhanced the data loading function to ensure consistency across images, including converting grayscale images to a uniform three-channel RGB format.

  • Implemented functionality for saving the trained model to a local file, facilitating model preservation and future usage.

  • Trained the AI model on a dataset of more than 12,000 images, yielding an accuracy of 91.23% on the testing set. This promising result highlights the model's capability in differentiating between AI-generated and human-created images. However, overfitting remains a potential concern and will be a focus for further investigation and model optimization in subsequent development phases.
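The train_size split and the batch loading described in this entry can be sketched as follows. This is a minimal illustration, not the project's actual training code: the helper names `split_indices` and `iter_batches`, the 0.8 ratio, the batch size, and the tiny 8x8 stand-in dataset are all assumptions. It relies on NumPy's memory-mapped loading so that only one batch of the .npy file is copied into RAM at a time.

```python
import os
import tempfile

import numpy as np


def split_indices(n_samples, train_size=0.8, seed=0):
    """Shuffle sample indices and split them by the train_size ratio."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    cut = int(n_samples * train_size)
    return order[:cut], order[cut:]


def iter_batches(npy_path, indices, batch_size=32):
    """Yield float32 batches read lazily from a memory-mapped .npy file."""
    data = np.load(npy_path, mmap_mode="r")  # maps the file, loads nothing yet
    for start in range(0, len(indices), batch_size):
        batch_idx = np.sort(indices[start:start + batch_size])
        # Indexing with an index array copies only this batch into memory.
        yield np.asarray(data[batch_idx], dtype=np.float32) / 255.0


# Hypothetical stand-in dataset: 10 small random "images" instead of the
# real 256x256 collection.
demo_path = os.path.join(tempfile.mkdtemp(), "images.npy")
demo_images = np.random.default_rng(0).integers(
    0, 256, size=(10, 8, 8, 3), dtype=np.uint8
)
np.save(demo_path, demo_images)

train_idx, test_idx = split_indices(10, train_size=0.8)
batch_shapes = [b.shape for b in iter_batches(demo_path, train_idx, batch_size=3)]
print(batch_shapes)
```

Lowering `batch_size` reduces peak memory further, at the cost of more read operations per epoch.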

2024-03-22 ~ 2024-04-01

  • Evaluated various storage methods to reduce the memory footprint of the image data. JPEG was ultimately determined to be the most efficient format.

  • Completed the development of the model loading program.

  • Explored the use of Grad-CAM (Gradient-weighted Class Activation Mapping), incorporating a feature that generates heatmaps highlighting the image regions that contribute most to a prediction.
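The core Grad-CAM computation can be sketched in plain NumPy. This is only an illustration of the arithmetic: it assumes the feature maps of a convolutional layer and the gradients of the class score with respect to them have already been extracted from the trained model with a deep learning framework, which is not shown here. The function name `grad_cam_heatmap` and the toy inputs are hypothetical.

```python
import numpy as np


def grad_cam_heatmap(feature_maps, gradients):
    """Combine conv feature maps and class-score gradients into a heatmap.

    Both inputs have shape (H, W, K) for a single image, where K is the
    number of channels in the chosen convolutional layer.
    """
    # Channel weights: average each channel's gradient over H and W.
    weights = gradients.mean(axis=(0, 1))  # shape (K,)
    # Weighted sum of the feature maps, then ReLU to keep positive evidence.
    cam = np.maximum((feature_maps * weights).sum(axis=-1), 0.0)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example: channel 0 supports the class, channel 1 argues against it.
feature_maps = np.zeros((2, 2, 2))
feature_maps[0, 0, 0] = 1.0
feature_maps[1, 1, 1] = 2.0
gradients = np.stack([np.ones((2, 2)), -np.ones((2, 2))], axis=-1)

heatmap = grad_cam_heatmap(feature_maps, gradients)
print(heatmap)
```

In the toy example only the top-left cell stays active, because the activation in channel 1 carries a negative weight and is removed by the ReLU. In practice the heatmap is resized to the input image size and overlaid on it.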

2024-04-01 ~ 2024-04-13

  • Developed a React application on Replit to construct the frontend.

  • Completed the design of web pages and their components.

  • Implemented API endpoints using Flask.

  • Programmed the frontend to enable connectivity with the backend.

  • Integrated an "ArtifAI" consultant feature using the OpenAI GPT API.

  • Deployed the application on the server.

  • Project completion achieved.

Challenges Encountered

  1. Encountered constraints in harvesting image data from ArtStation in adherence to robots.txt protocols.

    Solution | Employed the Selenium framework to emulate authentic user interactions, thereby ensuring compliance while effectively extracting data.

  2. Faced issues with inconsistent shapes of image data, leading to errors during the model training process.

    Solution | Standardized all images to have a uniform shape of (256,256,3). This involved converting grayscale images into three-channel RGB format, ensuring consistency across the dataset and smooth model training.

  3. Exceeded memory limits during data processing and model training.

    Solution | Addressed this issue by implementing data batch loading and reducing batch sizes, thereby optimizing memory usage and ensuring efficient processing.

  4. Observed a potential overfitting issue after training the AI model on a dataset of over 6000 images, as indicated by a 71.23% accuracy rate on the testing set. The model needs careful evaluation and refinement to ensure it generalizes well to unseen data.
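The shape standardization from challenge 2 can be sketched as follows. This is a hedged example rather than the project's own loader: the helper names `to_rgb`, `resize_nearest`, and `standardize` are hypothetical, and nearest-neighbor index sampling stands in for whatever resizing routine the project actually uses, which the log does not state.

```python
import numpy as np

TARGET_SIZE = 256  # side length of the standardized images


def to_rgb(image):
    """Return an (H, W, 3) array from a grayscale, RGB, or RGBA image."""
    image = np.asarray(image)
    if image.ndim == 2:  # grayscale (H, W): copy the channel three times
        return np.stack([image] * 3, axis=-1)
    if image.ndim == 3 and image.shape[2] == 1:  # grayscale (H, W, 1)
        return np.repeat(image, 3, axis=2)
    if image.ndim == 3 and image.shape[2] == 4:  # RGBA: drop the alpha channel
        return image[:, :, :3]
    if image.ndim == 3 and image.shape[2] == 3:  # already RGB
        return image
    raise ValueError(f"unsupported image shape: {image.shape}")


def resize_nearest(image, size=TARGET_SIZE):
    """Nearest-neighbor resize to (size, size) by sampling row/column indices."""
    height, width = image.shape[:2]
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    return image[rows][:, cols]


def standardize(image):
    """Convert any supported image to shape (256, 256, 3)."""
    return resize_nearest(to_rgb(image))


gray = (np.arange(100 * 50) % 256).astype(np.uint8).reshape(100, 50)
print(standardize(gray).shape)  # (256, 256, 3)
```

Running every image through one function like `standardize` before stacking keeps the dataset a single array of uniform shape, which avoids the shape errors described above.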

License:MIT License


Languages

Language:Python 100.0%