Jimmy Jiang's repositories
AuntiesAssemble
SWE'20 Project
char-rnn-cn
Generating Jay Chou lyrics with char-rnn and TensorFlow
Classifiers-Performance-Test
Logistic Regression • Random Forest Classifier • Support Vector Classifier (example) • A soft voting classifier that combines the above three
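As a rough illustration of what soft voting means, the sketch below averages the per-class probabilities produced by several models and picks the class with the highest average. The function name `soft_vote` and the example probabilities are hypothetical and are not taken from the repository, which may well rely on a library implementation instead.

```python
def soft_vote(probabilities, weights=None):
    """Combine per-model class probabilities by (weighted) averaging.

    probabilities: one list of per-class probabilities for each model,
    all describing the same sample. Returns (predicted_class, averaged).
    """
    if weights is None:
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    num_classes = len(probabilities[0])
    # Weighted mean of each class probability across the models.
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(num_classes)
    ]
    # The soft-voting prediction is the class with the largest mean.
    predicted = max(range(num_classes), key=averaged.__getitem__)
    return predicted, averaged


# Hypothetical outputs of three models for one sample (classes 0 and 1).
example = [[0.9, 0.1], [0.45, 0.55], [0.45, 0.55]]
predicted, averaged = soft_vote(example)
```

In this made-up example two of the three models lean toward class 1, so a majority (hard) vote would pick class 1, while the averaged probabilities favor class 0 because the first model is much more confident.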
DocMetrics
A pipeline for comparing similarities of two documents with different metrics.
Ham-Spam-Naive-Bayesian-Classifier
The goal of this exercise is to create a basic spam detector using a Naive Bayes classifier.
INDENG-262-HW-6
Code for LP textbook exercise 10.4
jimmyjiang666.github.io
A beautiful, simple, clean, and responsive Jekyll theme for academics
K-means-Clustering-Image-Compression
In this exercise, we will use k-means clustering to compress an image. When an image is stored in the PNG format, the red, green, and blue components of each pixel are stored as 8-bit numbers. Thus, we need 24 bits per pixel. The idea behind the compression is the following. We think of each pixel with components (r, g, b) as a point in three dimensions, so there are as many points as there are pixels in the image. Then, using k-means clustering, we compute k cluster centers. Each cluster center is a point in three dimensions and thus represents a color. If a point p belongs to a cluster with center c, we change the color of the pixel corresponding to p to the color represented by c. We expect the error caused by this change to be small. Let us take k = 64 so that each cluster number is encoded by 6 bits. Thus, we will only use 64 distinct colors corresponding to the 64 cluster centers. Each pixel now stores 6 bits of information indicating which of the 64 colors it uses. Apart from the overhead of storing the 64 colors using 64 × 3 bytes, we therefore use only 6 bits per pixel, reducing the space by a factor of about 4.
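The procedure described above can be sketched in plain Python as follows. This is a minimal illustration using Lloyd's k-means on a list of (r, g, b) tuples, not the code from the repository; the function names `kmeans_quantize` and `compressed_bits`, the iteration count, and the seeding scheme are all assumptions made for the example.

```python
import random


def kmeans_quantize(pixels, k, iterations=10, seed=0):
    """Cluster (r, g, b) tuples with plain Lloyd's k-means.

    Returns (centers, labels), where labels[i] is the index of the
    center assigned to pixels[i].
    """
    rng = random.Random(seed)
    # Start from k distinct colors that occur in the image (assumption:
    # random initialisation; other schemes such as k-means++ also work).
    distinct = sorted(set(pixels))
    centers = rng.sample(distinct, min(k, len(distinct)))
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, p in enumerate(pixels):
            labels[i] = min(
                range(len(centers)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
        # Update step: move each center to the mean color of its members.
        for j in range(len(centers)):
            members = [pixels[i] for i, lab in enumerate(labels) if lab == j]
            if members:
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers, labels


def compressed_bits(num_pixels, k=64):
    """Bits needed for one palette index per pixel plus the k-color palette."""
    bits_per_index = max(1, (k - 1).bit_length())  # 6 bits when k = 64
    palette_bits = k * 3 * 8                       # k colors, 3 bytes each
    return num_pixels * bits_per_index + palette_bits


# Space saving for a hypothetical one-megapixel image.
ratio = (24 * 1_000_000) / compressed_bits(1_000_000, k=64)
```

For a real image one would typically load the pixels with an imaging library and use a vectorised k-means implementation; the pure-Python loop here is only meant to make the assignment and update steps explicit.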
Metal-Slug-via-Processing
For an Intro to CS course, I recreated the video game "Metal Slug" in Processing.
ML_Notebooks
A repository for public Machine Learning notebooks I have created
MOPTA2024
Code backup for MOPTA 2024 competition
Movie-Analysis-via-Pyspark
Project assignment from the course "Algorithmic Foundation of Data Science"
Principal-Component-Analysis-for-Face-Recognition
Uses PCA to train and test a model for face recognition tasks.
Project-in-Data-Structure-Class
Hotel finder application created for Data Structure course project.
Similarity-Search-Using-LSH
This is an assignment from NYU Classes. Download the file data.zip from the ‘Resources’ section of NYU Classes and unzip it into a text file. Each of the n = 10^6 lines in the text file represents a set of about 100 numbers. Your task is to find at least three pairs of sets with Jaccard similarity greater than 0.85. There are at least five such pairs of sets.
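One common way to approach a task like this is MinHash signatures combined with LSH banding, followed by an exact Jaccard check on the candidate pairs. The sketch below is an illustration under stated assumptions, not the repository's solution: the function names, the number of bands and rows, and the hash family are all hypothetical, and it assumes every set is non-empty and contains non-negative integers.

```python
import random
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Exact Jaccard similarity |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def minhash_signature(s, hash_params, prime):
    """One minimum per hash function h(x) = (a*x + b) mod prime."""
    return tuple(min((a * x + b) % prime for x in s) for a, b in hash_params)


def similar_pairs(sets, threshold=0.85, bands=10, rows=5, seed=0):
    """Return index pairs (i, j), i < j, whose Jaccard similarity exceeds threshold.

    Candidate pairs come from LSH banding, so pairs whose signatures never
    collide in any band can be missed; candidates are verified exactly.
    """
    rng = random.Random(seed)
    prime = 2_147_483_647  # 2**31 - 1, assumed larger than any element
    params = [(rng.randrange(1, prime), rng.randrange(prime))
              for _ in range(bands * rows)]
    signatures = [minhash_signature(s, params, prime) for s in sets]

    candidates = set()
    for band in range(bands):
        buckets = defaultdict(list)
        for idx, sig in enumerate(signatures):
            # Sets that agree on every row of this band share a bucket.
            buckets[sig[band * rows:(band + 1) * rows]].append(idx)
        for members in buckets.values():
            candidates.update(combinations(members, 2))

    # Exact verification removes false positives among the candidates.
    return sorted((i, j) for i, j in candidates
                  if jaccard(sets[i], sets[j]) > threshold)
```

With b bands of r rows each, a pair with similarity s becomes a candidate with probability 1 - (1 - s^r)^b, so the band and row counts would need tuning for the actual data size and the 0.85 threshold.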
Test-Efficiency-of-Flajolet-Martin-FM-algorithm
The goal of this exercise is to test the efficacy of the Flajolet-Martin (FM) algorithm and its variants in estimating the number of distinct elements. We will first count distinct words and shingles in the text file http://norvig.com/big.txt and then test the algorithm on a larger stream of numbers generated according to a power law distribution.
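For reference, a simplified Flajolet-Martin style estimator might look like the sketch below. It tracks, for each of several salted hash functions, the largest number of trailing zero bits seen, and reports the median of the resulting 2^R values. The function names, the use of BLAKE2b as the hash, the number of hashes, and the median combiner are all assumptions for illustration; the variants tested in the repository may differ, and this sketch omits the usual bias-correction constant.

```python
import hashlib
import statistics


def trailing_zeros(n, width=32):
    """Number of trailing zero bits in n (width if n is zero)."""
    if n == 0:
        return width
    return (n & -n).bit_length() - 1


def fm_estimate(stream, num_hashes=32):
    """Rough distinct-count estimate for the items in stream.

    Each item is hashed with num_hashes differently salted 32-bit hashes;
    the estimate is the median over hashes of 2**(max trailing zeros).
    """
    max_zeros = [0] * num_hashes
    for item in stream:
        data = str(item).encode("utf-8")
        for h in range(num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=4, salt=h.to_bytes(8, "little")
            ).digest()
            value = int.from_bytes(digest, "little")
            max_zeros[h] = max(max_zeros[h], trailing_zeros(value))
    return statistics.median(2 ** r for r in max_zeros)
```

Because the estimator only keeps a running maximum per hash function, repeating an item never changes the result, which is what makes it suitable for counting distinct elements in a stream. Any single run can still be off by a sizeable factor, which is the kind of behavior an experiment like this one is meant to measure.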
Test1
This is the first test.
Word-Cloud-of-My-Poems
Creates a word cloud figure from my own poems, written from 2016 to 2020. The language processing is done using the Jieba library in Python.