Edward-Jing / STA-221

Few-shot Image Classification for Breast Cancer Detection

STA221

Few-shot Image Classification for Breast Cancer Detection

Background Information

Breast cancer is a pressing concern worldwide, ranking as the second leading cause of cancer death in women. Fortunately, early detection and identification of breast cancer can lead to timely treatment, effectively reducing the risk of further deterioration or death. Breast ultrasound image classification serves as a primary method for such detection. However, traditional medical image classification often demands considerable human expertise and time, making it impractical for many underdeveloped countries and regions. Hence, there has been a shift towards pattern recognition and machine learning approaches to automate and streamline medical image classification and breast cancer detection. Numerous learning-based methods, ranging from traditional machine learning to modern deep learning, have been proposed.

Dataset

The dataset we use for our classification task consists of breast ultrasound images collected from 600 female patients aged between 25 and 75. It contains 780 black-and-white images, each 500*500 pixels, categorized into three classes: normal, benign, and malignant. One of the most intriguing features of this dataset is its exceptionally small size. With an 80%/20% training/testing split, only 624 images can be used to train a model for three-class classification. This small size not only mirrors the real-world scarcity of medical data but also makes it challenging to train machine learning models effectively. The dataset is available at https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset?rvi=1.
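The 80%/20% split described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function name `stratified_split`, the seed, and the equal class counts are assumptions made for the example (the real class counts are not stated here). Splitting class by class keeps the three classes in the same proportion in both parts, which matters when so few images are available.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train/test lists, one class at a time."""
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Hold out the same fraction of every class.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


# Hypothetical labels: 780 entries with made-up, equal class counts.
labels = ["normal", "benign", "malignant"] * 260
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

In practice, scikit-learn's `train_test_split` with its `stratify` argument serves the same purpose.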

Methods

By Xiaowei Zeng, Pengyu Chen and Jingzhi Sun

  • Logistic Regression
  • Naïve Bayes
  • Support Vector Machine
  • Decision Tree and Random Forest
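The classical methods listed above could be compared along the following lines. This is only a sketch under stated assumptions: it uses scikit-learn, synthetic feature vectors in place of the ultrasound images, a Gaussian variant of Naïve Bayes, and default-style hyperparameters. None of these choices is confirmed by the repository.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for flattened image features (assumption: the real
# pipeline would load and flatten the ultrasound images instead).
rng = np.random.default_rng(0)
n_samples, n_features = 150, 64
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, 3, size=n_samples)  # 0=normal, 1=benign, 2=malignant
X[np.arange(n_samples), y] += 3.0  # give each class a distinguishing feature

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# One estimator per listed method; hyperparameters are illustrative only.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "naive_bayes": GaussianNB(),
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model and record its accuracy on the held-out 20%.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)

for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.2f}")
```

The scores printed here describe the synthetic data only and say nothing about performance on the real ultrasound images.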

By Qinrun Dai

Structure of Files

  • nb: Naïve Bayes
  • rf: Random Forest
  • Rawdata: data without masks
  • new: data with masks

License:MIT License


Languages

Language:Python 100.0%