Teo-KJ / LabelDat

Group 3 - Chua Chiah Soon, Wilson Thurman Teng, Goh Shing Ling, Choong Han Yi, Alfredo Ryelcius, Teo Kai Jie, Goh Jun Le

LabelDat

LabelDat is a webapp designed as a one-stop portal to manage, organise and facilitate the labelling of datasets. It was developed as a school project for Advanced Software Engineering.

Contributors

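This project is jointly developed by the following members: Chua Chiah Soon, Wilson Thurman Teng, Goh Shing Ling, Choong Han Yi, Alfredo Ryelcius, Teo Kai Jie and Goh Jun Le.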

About LabelDat

In recent years, a boom in data has enabled the increased use of machine learning and artificial intelligence. With such large amounts of data, a lot of time is needed to clean and process the data before machine learning techniques can be applied. In addition, poor cleaning and processing can produce low-quality data. Machine learning models struggle to learn from such data, which delays a project compared with one that uses higher-quality data.

We therefore developed LabelDat, a dataset labelling webapp for two groups of users: project owners and labellers. LabelDat allows project owners to upload an unlabelled dataset to the portal so that other project owners and labellers can label the data on the platform. The webapp speeds up the labelling process through its interface and through the use of machine learning to predict labels. After labelling, the datasets can be downloaded for further use. The features are elaborated below.

App Features

The following are some features of the application.

Create labelling project and task

Project owners can create a new project and outline instructions for performing the labelling task.

Dashboard

Track progress with a dashboard that summarises the labelling progress of each project.
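
As a rough illustration of the kind of summary such a dashboard could show, the sketch below computes the percentage of labelled tasks per project. The `Task` record, its fields and the project names are hypothetical and not taken from LabelDat's code.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical record for one item to be labelled (not LabelDat's schema)."""
    project: str
    labelled: bool


def progress_by_project(tasks):
    """Return {project: percentage of tasks labelled}, rounded to 1 decimal place."""
    totals, done = {}, {}
    for task in tasks:
        totals[task.project] = totals.get(task.project, 0) + 1
        done[task.project] = done.get(task.project, 0) + int(task.labelled)
    return {name: round(100 * done[name] / totals[name], 1) for name in totals}


tasks = [
    Task("cats-vs-dogs", True),
    Task("cats-vs-dogs", False),
    Task("sentiment", True),
]
print(progress_by_project(tasks))  # {'cats-vs-dogs': 50.0, 'sentiment': 100.0}
```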

Machine Learning assisted labelling

Speed up labelling tasks by using machine learning to suggest labels.
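
This README does not say which model LabelDat uses for suggestions. Purely to illustrate the idea of suggesting a label from data that has already been labelled, the sketch below uses a simple stand-in: a majority vote over the most similar labelled texts, with similarity measured by word overlap. All function names and sample data are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lower-case a text and split it into a set of words."""
    return set(text.lower().split())


def suggest_label(text, labelled_examples, k=3):
    """Suggest a label by majority vote over the k most similar labelled texts.

    Similarity is the Jaccard overlap between word sets. Returns None when
    there are no labelled examples to learn from.
    """
    if not labelled_examples:
        return None
    words = tokenize(text)

    def similarity(example):
        other = tokenize(example[0])
        union = words | other
        return len(words & other) / len(union) if union else 0.0

    nearest = sorted(labelled_examples, key=similarity, reverse=True)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


labelled = [
    ("great product, works well", "positive"),
    ("really great value", "positive"),
    ("broken on arrival", "negative"),
    ("stopped working, broken", "negative"),
]
print(suggest_label("great value and works well", labelled, k=3))  # positive
```

In a labelling tool, a suggestion like this would typically be shown as a pre-filled choice that the labeller can accept or override.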

User Profile and Leaderboard

Shows the profile of each user, whether labeller or project owner, and compares their labelling performance with that of other users through a leaderboard.
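
How LabelDat measures labelling performance is not stated here. Assuming, for illustration only, that users are ranked by the number of labels they have submitted, a leaderboard could be computed along these lines (the usernames and the `leaderboard` helper are hypothetical):

```python
from collections import Counter


def leaderboard(label_events, top_n=10):
    """Rank users by number of labels submitted, highest first.

    label_events is an iterable of usernames, one entry per submitted label.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    counts = Counter(label_events)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


events = ["alice", "bob", "alice", "carol", "bob", "alice"]
for rank, (user, count) in enumerate(leaderboard(events), start=1):
    print(rank, user, count)
```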

Export labelled dataset

An export function is available for the project owner to export and download the labelled dataset.

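
The export format is not described in this README. As a minimal sketch, assuming the labelled dataset is exported as CSV with one row per item, the serialisation step might look like this (the `item` and `label` column names and the file names are hypothetical):

```python
import csv
import io


def export_labels_csv(rows):
    """Serialise labelled rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["item", "label"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"item": "img_001.png", "label": "cat"},
    {"item": "img_002.png", "label": "dog"},
]
print(export_labels_csv(rows))
```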


App Development

Frontend: React.js

Setup instructions

  1. Install dependencies for frontend
    cd frontend
    npm install
    
  2. Run development server
    npm start
    

Backend: Flask

Setup instructions

In Progress

  1. Set up a virtual environment, activate it and install the necessary packages

    For Windows :

    cd backend
    python3 -m venv venv
    venv\Scripts\activate
    pip3 install -r requirements.txt
    

    For Mac :

    cd backend
    python3 -m venv venv
    source venv/bin/activate
    pip3 install -r requirements.txt
    
  2. If you install other packages, please add them to requirements.txt

    cd backend
    pip3 freeze > requirements.txt
    
  3. Start the Flask server

    cd backend
    python3 main.py
    

API Endpoints

Our API endpoints are documented in this link

Database: MySQL

Setup instructions

For Windows :

  1. Go to MySQL's Windows MSI Installer download page and follow the setup instructions (the default port should be 3306)
  2. Create a database for our data with the following commands on MySQL Shell.
    \sql
    \connect root@localhost
    
  3. Type in password for root.
  4. Continue with the command below on MySQL Shell.
    create schema ase;
    

For Mac :

  1. Install MySQL 8.0.x using Homebrew, then start the MySQL server
    brew install mysql@8.0
    brew services start mysql@8.0
    
  2. Create a database for our data.
    mysql -u root -e "create schema ase;"
    
  3. Change the password for the root user. First run mysql -u root to open the MySQL console, then run the following to set the password to toor (the standardised password for this project)
    ALTER USER 'root'@'localhost' IDENTIFIED BY 'toor';
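
With the settings above (user root, password toor on Mac, schema ase, default port 3306), a backend connection string can be assembled as in the sketch below. This README does not state which database library the Flask backend uses, so the SQLAlchemy-style URL format and the `pymysql` driver name are assumptions, and `mysql_url` is a hypothetical helper. On Windows, substitute the root password chosen during installation.

```python
from urllib.parse import quote_plus


def mysql_url(user="root", password="toor", host="localhost", port=3306, schema="ase"):
    """Build a SQLAlchemy-style MySQL URL; the pymysql driver is an assumption."""
    # quote_plus escapes characters such as '@' that would otherwise break the URL.
    return f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{schema}"


print(mysql_url())  # mysql+pymysql://root:toor@localhost:3306/ase
```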
    


