frank-peng's starred repositories
LiveTalking
Real time interactive streaming digital human
transformers_tasks
⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SFT etc.
Awesome-Chinese-LLM
整理开源的中文大语言模型,以规模较小、可私有化部署、训练成本较低的模型为主,包括底座模型,垂直领域微调及应用,数据集与教程等。
awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
MedicalGPT
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. 训练医疗大模型,实现了包括增量预训练(PT)、有监督微调(SFT)、RLHF、DPO、ORPO。
DeepSpeedExamples
Example models using DeepSpeed
Source-Code-Notebook
关于一些经典论文源码的逐行中文笔记
llm-cookbook
面向开发者的 LLM 入门教程,吴恩达大模型系列课程中文版
DirectML
DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.
freeCodeCamp
freeCodeCamp.org's open-source codebase and curriculum. Learn to code for free.
generative-recommenders
Repository hosting code used to reproduce results in "Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations" (https://arxiv.org/abs/2402.17152).
cs-self-learning
计算机自学指南
FreeAskInternet
FreeAskInternet is a completely free, PRIVATE and LOCALLY running search aggregator & answer generate using MULTI LLMs, without GPU needed. The user can ask a question and the system will make a multi engine search and combine the search result to LLM and generate the answer based on search results. It's all FREE to use.
onerec
在常规推荐系统算法和系统双优化的范式下,一线公司针对单个任务或单个业务的效果挖掘几乎达到极限。从2019年我们开始关注多种信息的萃取融合,提出了OneRec算法,希望通过平台或外部各种各样的信息来进行知识集成,打破数据孤岛,极大扩充推荐的“Extra World Knowledge”。 已实践的算法包括行为数据,内容描述,社交信息,知识图谱等。在OneRec,每种信息和整体算法的集成是可插拔的,这样的话一方面方便大家在自己的平台数据下灵活组合各种信息,另一方面方便开源共建,大家可以在上边集成自己的各种算法。今天分享的都是之前在线上验证过效果的工作,相关代码和论文已经开源在:。欢迎大家开源共建。