D0z1ngShark's repositories
algorithms
Minimal examples of data structures and algorithms in Python
baby-llama2-chinese
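A repository for pretraining and SFT of a small-parameter Chinese LLaMA2 from scratch; a single 24 GB GPU is enough to obtain a chat-llama2 capable of simple Chinese question answering.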
ChatLM-mini-Chinese
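A small 0.2B-parameter Chinese dialogue model (ChatLM-Chinese-0.2B). Open-sources all code for the full pipeline: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization.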
Chinese-LLaMA-Alpaca
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
chinese-poetry
A dataset of classical Chinese poetry, under construction: nearly 14,000 poets, 107,891 Tang poems, and 275,581 Song ci.
Classical-Modern
A very comprehensive parallel corpus of Classical Chinese and modern Chinese
Coloring-t-SNE
Exploration of methods for coloring t-SNE.
DeepLearning_LHY21_Notes
Study notes for Hung-yi Lee's 2021 Deep Learning course
fanqiang
Getting over the Great Firewall: circumventing internet censorship
ML2021-Spring
Hung-yi Lee's Machine Learning course, Spring 2021
movenet
Google's Next Gen Pose Estimation in PyTorch
fucking-algorithm
Solving algorithm problems is all about patterns, and labuladong is all you need! English version supported! Crack LeetCode: not only how, but also why.
lit-gpt
Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
nlp-tutorial
Natural Language Processing Tutorial for Deep Learning Researchers
openbilibili-go-common
🙈!🙉!🙊! I have no idea what any of this is… If you want to talk about ethics, please head out the door and turn right to 996.icu!
phx
9Android client that bypasses the limit of 10 video views per day for guest users and also supports downloading videos
Prompt-Engineering-Guide
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
Skywork
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model parameters, training data, evaluation data, and evaluation methods.
SparkInternals
Notes on the design and implementation of Apache Spark
the-algorithm
Source code for Twitter's Recommendation Algorithm
the-algorithm-ml
Source code for Twitter's Recommendation Algorithm
TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
weibo_terminater
The final Weibo crawler: scrape anything from Weibo, including comments, post contents, and followers. The Terminator.
XgboostAndLR
Uses XGBoost and a logistic regression (LR) model for text classification. XGBoost serves as a feature transform for LR.