Zhecheng Li's repositories
RAG-ChatBot
A basic application that uses LangChain, Streamlit, and large language models to build a Retrieval-Augmented Generation (RAG) system over documents; it also shows how to use Groq and deploy your own applications.
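At its core, the retrieval step of such a system ranks document chunks by similarity to the query and pastes the best match into the prompt. A minimal pure-Python sketch, with bag-of-words cosine similarity standing in for the embedding-based retriever a LangChain app would normally use (the example documents are made up for illustration):

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    # Rank documents by similarity to the query, return the top k.
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

docs = [
    "Streamlit renders the chat interface in the browser.",
    "Groq serves low-latency LLM inference over an API.",
    "LangChain chains the retriever and the LLM together.",
]
context = retrieve("which library serves fast LLM inference", docs)[0]
# The retrieved context is then prepended to the user question
# before it is sent to the LLM.
prompt = f"Answer using this context:\n{context}\n\nQuestion: ..."
```

A production RAG app would replace the word-count vectors with dense embeddings and a vector store, but the retrieve-then-prompt shape stays the same.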
Kaggle-PII_Data_Detection
Implement named entity recognition (NER) using regex and a fine-tuned LLM, covering 15 categories in total. The ultimate goal is to apply the model to detect personally identifiable information (PII) in student writing.
Kaggle-Automated_Essay_Scoring_2.0
(1) Train large language models to help with automated essay scoring. (2) Extract essay features and train a new tokenizer to build tree models for score prediction.
Kaggle-Detect_Sleep_States
Predict changes in sleep states from sleep-monitoring data, mainly using the PrecTime model.
Kaggle-LLM-Detect_AI_Generated_Text
Detect whether text is AI-generated, either by training a new tokenizer and combining it with tree-based classifiers or by training language models on a large dataset of human- and AI-generated texts.
Kaggle-LLM_Science_Exam
Implementing science-related multiple-choice question answering based on LLMs and RAG.
Custom-ChatGPT
Use a question-answer dataset from Hugging Face to fine-tune ChatGPT and compare the fine-tuned model with the original ChatGPT.
Kaggle-CIBMTR
In this competition, you’ll develop models to improve the prediction of transplant survival rates for patients undergoing allogeneic Hematopoietic Cell Transplantation (HCT) — an important step in ensuring that every patient has a fair chance at a successful outcome, regardless of their background.
Kaggle-Eedi
Develop an NLP-based method to predict the affinity between misconceptions and incorrect answers (distractors) in multiple-choice questions.
Kaggle-LMSYS
Analyze a dataset of conversations from the Chatbot Arena, where various LLMs provide responses to user prompts. The goal is to develop a model that enhances chatbot interactions, ensuring they align more closely with human preferences.
Kaggle-Multilingual_Chatbot_Arena
This competition challenges you to predict which responses users will prefer in a head-to-head battle between chatbots powered by large language models (LLMs).
MultiModal
Basic implementation code for multimodal models and some applications or fine-tuning tasks based on them.
GUI-Python
Simple Python front-end mini-programs, mainly covering libraries such as Streamlit; helpful for understanding how to use various APIs.
Kaggle-CMI-Detect_Sleep_States
The goal of this competition is to detect sleep onset and wake. You will develop a model trained on wrist-worn accelerometer data in order to determine a person's sleep state.
Kaggle-Linking_Writing_Processes_to_Writing_Quality
Predict writing quality from statistical features of the writing process. The key lies in feature engineering and tree models.
Transformer-Compilation
Implementations of various transformer-architecture models, along with applications and fine-tuning code.
Kaggle-Dataset-API-Upload
How to use the Kaggle API to upload data from a server to Kaggle as a dataset.
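The upload flow can be sketched as follows; the directory name, title, and dataset slug below are placeholders, not values from the repo. The Kaggle CLI reads a `dataset-metadata.json` placed next to the files it should upload:

```python
import json
from pathlib import Path

# Hypothetical directory holding the files to upload; replace with your own.
dataset_dir = Path("my_dataset")
dataset_dir.mkdir(exist_ok=True)

# The Kaggle CLI expects this metadata file alongside the data files.
metadata = {
    "title": "My Server Outputs",             # display name on Kaggle
    "id": "your-username/my-server-outputs",  # <username>/<slug>
    "licenses": [{"name": "CC0-1.0"}],
}
(dataset_dir / "dataset-metadata.json").write_text(json.dumps(metadata, indent=2))

# With the metadata in place, the upload itself is done from the shell:
#   kaggle datasets create -p my_dataset               # first upload
#   kaggle datasets version -p my_dataset -m "update"  # later versions
```

`kaggle datasets init -p my_dataset` can also generate the metadata template for you instead of writing it by hand.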
Kaggle-LLM_Prompt_Recovery
LLMs are commonly used to rewrite or make stylistic changes to text. The goal is to recover the LLM prompt that was used to transform a given text.
Kaggle-The_Polytope_Permutation_Puzzle
Using reinforcement learning and recursive methods to solve three types of puzzles.
Lizhecheng02
Zhecheng Li's GitHub profile.
Reinforcement-Learning
Basic reinforcement learning code and small example programs.
UCSD-CSE256
CSE 256 LIGN 256 - Statistical Natural Lang Proc - Nakashole [FA24]
UCSD-CSE256-PA1
CSE 256 LIGN 256 - Statistical Natural Lang Proc - Nakashole [FA24] PA1
UCSD-CSE256-PA2
CSE 256 LIGN 256 - Statistical Natural Lang Proc - Nakashole [FA24] PA2
UCSD-CSE256-PA3
CSE 256 LIGN 256 - Statistical Natural Lang Proc - Nakashole [FA24] PA3
UCSD-CSE256-PA4
CSE 256 LIGN 256 - Statistical Natural Lang Proc - Nakashole [FA24] PA4
UCSD-CSE257-2048
Implement a game AI for the 2048 game based on expectimax search.
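The expectimax idea behind the 2048 AI can be sketched generically: the player's moves are max nodes, while random tile spawns are chance nodes whose child values are averaged. This is a minimal stand-in rather than the repo's actual code; a real 2048 agent would enumerate the four slide directions at max nodes and weight 2-tile and 4-tile spawns (typically 0.9/0.1) at chance nodes instead of averaging uniformly.

```python
def expectimax(node, maximizing):
    """Evaluate an explicit game tree: numbers are leaf heuristic
    values; nested lists alternate between max and chance levels."""
    if isinstance(node, (int, float)):
        return node  # leaf: heuristic evaluation of a board
    if maximizing:
        # Player (max) node: pick the move with the best expected value.
        return max(expectimax(child, False) for child in node)
    # Chance node: random tile spawn, averaged uniformly in this sketch.
    return sum(expectimax(child, True) for child in node) / len(node)

# Two candidate moves, each leading to a chance node over two spawns.
# The second move is better in expectation:
# max(avg(2, 10), avg(8, 8)) = max(6.0, 8.0) = 8.0
best = expectimax([[2, 10], [8, 8]], True)
```

Unlike minimax, the chance layer means pruning is limited, so practical 2048 agents cap the search depth and rely on a good board heuristic at the leaves.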
UCSD-CSE291J
UCSD CSE 291J: Fairness, Bias, and Transparency in Machine Learning (Winter 2025)