JimmyJiang666

K-means-Clustering-Image-Compression

In this exercise, we will use k-means clustering to compress an image. When an image is stored in the PNG format, the red, green, and blue components of each pixel are stored as 8-bit numbers. Thus, we need 24 bits per pixel. The idea for compression is the following. We think of each pixel with components (r, g, b) as a point in three dimensions, so there are as many points as there are pixels in the image. Then, using k-means clustering, we compute k cluster centers. Each cluster center is a point in three dimensions and thus represents a color. If a point p belongs to a cluster with center c, we change the color of the pixel corresponding to p to the color represented by c. We expect the error caused by this change to be small. Let us take k = 64, so that each cluster number can be encoded in 6 bits. We then use only 64 distinct colors, corresponding to the 64 cluster centers, and each pixel stores 6 bits indicating which of the 64 colors it uses. Apart from the overhead of storing the 64 colors in 64 × 3 bytes, we use only 6 bits per pixel instead of 24, reducing the space by a factor of about 4.

Language: Jupyter Notebook
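The compression scheme described for K-means-Clustering-Image-Compression can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: it uses plain Lloyd iterations written with NumPy, a synthetic random image as a stand-in for a real PNG, and the hypothetical helper names `nearest_center` and `kmeans_quantize`.

```python
import numpy as np

def nearest_center(points, centers):
    """Return, for each point, the index of the closest center."""
    # Squared distances between every point and every center: shape (n, k).
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans_quantize(pixels, k=64, iters=10, seed=0):
    """Cluster (n, 3) RGB points; return (palette, labels).

    Hypothetical helper: palette is (k, 3) uint8, labels is (n,) uint8.
    Assumes n >= k and k <= 256.
    """
    rng = np.random.default_rng(seed)
    points = pixels.astype(np.float64)
    # Initialise the centers from k randomly chosen pixels.
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = nearest_center(points, centers)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    palette = centers.round().astype(np.uint8)
    labels = nearest_center(points, palette.astype(np.float64))
    return palette, labels.astype(np.uint8)

# Synthetic 32 x 32 image standing in for a real picture (an assumption).
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

palette, labels = kmeans_quantize(image.reshape(-1, 3), k=64)
compressed = palette[labels].reshape(image.shape)  # each pixel -> its center color

n_pixels = image.shape[0] * image.shape[1]
raw_bits = n_pixels * 24                  # 3 channels x 8 bits per pixel
packed_bits = n_pixels * 6 + 64 * 3 * 8   # 6-bit indices plus the 64-color palette
print(raw_bits / packed_bits)
```

On an image this small the palette overhead is noticeable, so the ratio falls short of 4; it approaches 4 as the number of pixels grows. The all-pairs distance array takes memory proportional to pixels × k, so a real image would likely need to be processed in chunks or with a library implementation.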

Metal-Slug-via-Processing

For an Intro to CS course, I replicated the video game "Metal Slug" in Processing.

Language: Python

ML_Notebooks

A repository for public Machine Learning notebooks I have created


MOPTA2024

Code backup for MOPTA 2024 competition

Language: Jupyter Notebook

Movie-Analysis-via-Pyspark

Project assignment from the course "Algorithmic Foundation of Data Science"

Language: Python

Principal-Component-Analysis-for-Face-Recognition

Uses PCA to train and test a model for face recognition.

Language: Python

Project-in-Data-Structure-Class

Hotel finder application created as a course project for a Data Structures class.

Language: C++

Sequence-Pattern-Learning-with-RNN

Language: Jupyter Notebook

Similarity-Search-Using-LSH

This is an assignment from NYU Classes. Download the file data.zip from the ‘Resources’ section of NYU Classes and unzip it into a text file. Each of the n = 10^6 lines in the text file represents a set of about 100 numbers. Your task is to find at least three pairs of sets with Jaccard similarity greater than 0.85. There are at least five such pairs of sets.

Language: Python
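One common way to approach the Similarity-Search-Using-LSH task is MinHash signatures combined with banded locality-sensitive hashing, followed by an exact Jaccard check on the candidate pairs. The sketch below illustrates that general technique; it is not taken from the repository, the function names and the `bands`/`rows` settings are assumptions, and the tiny in-memory sets stand in for the contents of data.zip.

```python
import random
from collections import defaultdict
from itertools import combinations

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, modulus for the hash family

def jaccard(a, b):
    """Exact Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)

def minhash_signature(s, params):
    """One minimum per hash function h(x) = (a*x + b) mod PRIME."""
    return tuple(min((a * x + b) % PRIME for x in s) for a, b in params)

def similar_pairs(sets, threshold=0.85, bands=10, rows=5, seed=0):
    """Return sorted index pairs (i, j), i < j, with Jaccard > threshold.

    Pairs become candidates when their signatures agree on all rows of at
    least one band; every candidate is then verified exactly, so there are
    no false positives, though a similar pair can occasionally be missed.
    """
    rng = random.Random(seed)
    params = [(rng.randrange(1, PRIME), rng.randrange(PRIME))
              for _ in range(bands * rows)]
    signatures = [minhash_signature(s, params) for s in sets]

    candidates = set()
    for b in range(bands):
        buckets = defaultdict(list)
        for idx, sig in enumerate(signatures):
            buckets[sig[b * rows:(b + 1) * rows]].append(idx)
        for bucket in buckets.values():
            candidates.update(combinations(bucket, 2))

    return sorted((i, j) for i, j in candidates
                  if jaccard(sets[i], sets[j]) > threshold)

# Toy data (an assumption): two identical sets, one near-duplicate, one disjoint.
base = set(range(100))
sets = [base, set(base), (base - {0, 1}) | {1000, 1001}, set(range(5000, 5100))]
pairs = similar_pairs(sets)
print(pairs)
```

With 10 bands of 5 rows, a pair with similarity 0.85 lands in a shared bucket with probability of roughly 1 - (1 - 0.85^5)^10, so most but not all qualifying pairs are expected to surface; more bands raise recall at the cost of more candidates to verify.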

Test-Efficiency-of-Flajolet-Martin-FM-algorithm

The goal of this exercise is to test the efficacy of the Flajolet-Martin (FM) algorithm and its variants in estimating the number of distinct elements. We will first count distinct words and shingles in the text file http://norvig.com/big.txt and then test the algorithms on a larger stream of numbers generated according to a power law distribution.

Language: Python
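For orientation, the basic Flajolet-Martin idea can be sketched as below. This is a hedged illustration, not the repository's implementation: it keeps one bitmap per hash function, records the number of trailing zero bits of each hashed item, and estimates the distinct count as 2^R / 0.77351, where R is the position of the lowest unset bit averaged over the bitmaps. The salted BLAKE2b hash, the choice of 32 hash functions, and all function names are assumptions, and the synthetic word list stands in for big.txt.

```python
import hashlib

PHI = 0.77351  # correction factor from Flajolet and Martin's analysis

def trailing_zeros(n, width=32):
    """Number of trailing zero bits of n (width if n is 0)."""
    if n == 0:
        return width
    return (n & -n).bit_length() - 1

def lowest_zero_bit(bitmap):
    """Index of the least significant bit of bitmap that is 0."""
    r = 0
    while (bitmap >> r) & 1:
        r += 1
    return r

def hash32(item, salt):
    """Salted 32-bit hash; each salt acts as a separate hash function."""
    data = f"{salt}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=4).digest(), "big")

def fm_estimate(stream, num_hashes=32):
    """Estimate the number of distinct items in stream."""
    bitmaps = [0] * num_hashes
    for item in stream:
        for s in range(num_hashes):
            # Mark the position given by the trailing-zero count of the hash.
            bitmaps[s] |= 1 << trailing_zeros(hash32(item, s))
    mean_r = sum(lowest_zero_bit(b) for b in bitmaps) / num_hashes
    return 2 ** mean_r / PHI

# Synthetic stream (an assumption): 1000 distinct words.
words = [f"word{i}" for i in range(1000)]
estimate = fm_estimate(words)
print(round(estimate))
```

Because the bitmaps only record which positions were ever hit, repeating items in the stream does not change the estimate. Averaging R over several hash functions is one simple way to reduce variance; other variants combine the per-hash estimates differently, for example with medians of means.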

Test1

This is the first test.


Word-Cloud-of-My-Poems

This project creates a word cloud figure from my own poems written between 2016 and 2020. The text processing is done with the Jieba library in Python.

Language: Python

Jimmy Jiang's repositories

Document-Classification-for-COVID-19-Literature

AuntiesAssemble

char-rnn-cn

Classifiers-Performance-Test

CNN-image-classification-wit-Cifar-10

DocMetrics

Face-Detection-Using-Viola-Jones-Algo

Ham-Spam-Naive-Bayesian-Classifier

INDENG-262-HW-5-Code

INDENG-262-HW-6

jimmyjiang666.github.io

K-means-Clustering-Image-Compression

Metal-Slug-via-Processing

ML_Notebooks

MOPTA2024

Movie-Analysis-via-Pyspark

Principal-Component-Analysis-for-Face-Recognition

Project-in-Data-Structure-Class

Sequence-Pattern-Learning-with-RNN

Similarity-Search-Using-LSH

Test-Efficiency-of-Flajolet-Martin-FM-algorithm

Test1

Word-Cloud-of-My-Poems