Abel Senter's starred repositories

blogs

Jupyter notebooks that support my graph data science blog posts at https://bratanic-tomaz.medium.com/

Language: Jupyter Notebook | Stargazers: 1083 | Issues: 0

graphrag

A modular graph-based Retrieval-Augmented Generation (RAG) system

Language: Python | License: MIT | Stargazers: 13992 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-2-Clause | Stargazers: 10346 | Issues: 0

DocRes

[CVPR 2024] DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

Language: Python | License: MIT | Stargazers: 239 | Issues: 0

ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python | License: Apache-2.0 | Stargazers: 13403 | Issues: 0

lmdeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Language: Python | License: Apache-2.0 | Stargazers: 3713 | Issues: 0

vllm-gptq

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 117 | Issues: 0

underthesea

Underthesea - Vietnamese NLP Toolkit

Language: Python | License: GPL-3.0 | Stargazers: 1356 | Issues: 0

Awesome-LLMs-Datasets

A summary of existing representative LLM text datasets.

License: Apache-2.0 | Stargazers: 761 | Issues: 0

candle

Minimalist ML framework for Rust

Language: Rust | License: Apache-2.0 | Stargazers: 14797 | Issues: 0

natasha

Solves basic Russian NLP tasks, API for lower level Natasha projects

Language: Python | License: MIT | Stargazers: 1180 | Issues: 0

yuqing

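Sitong Yuqing (思通舆情) is a free, open-source public opinion monitoring system that supports on-premises deployment. It supports multidimensional cross-analysis and in-depth mining of massive volumes of public opinion data, giving users comprehensive public opinion data and professional public opinion analysis.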

Language: JavaScript | License: GPL-3.0 | Stargazers: 364 | Issues: 0

MarkTool

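DoTAT is a web-based, domain-oriented, general-purpose text annotation tool. It supports large-scale entity annotation, relation annotation, event annotation, text classification, automatic annotation based on dictionary matching and regular-expression matching, and standard-name annotation for normalization, as well as iterative annotation, nested entity annotation, and nested event annotation. Annotation schemas are customizable and can be "created once and reused many times" across tasks of the same type. Hierarchical entity sets expand the range of entity types, and a new, efficient annotation method improves user experience and annotation efficiency. The tool also adds a review stage in which annotations from multiple annotators can be checked for consistency, merged automatically, and adjusted manually, improving annotation accuracy.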

Language: Vue | License: Apache-2.0 | Stargazers: 580 | Issues: 0

Masquerade-23

An LLM-driven social bot dataset collected from Chirper.ai

License: Apache-2.0 | Stargazers: 9 | Issues: 0

ChatLM-mini-Chinese

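A small 0.2B-parameter Chinese dialogue model (ChatLM-Chinese-0.2B). All code for the full pipeline is open source: dataset sources, data cleaning, tokenizer training, model pretraining, SFT instruction fine-tuning, and RLHF optimization. Supports SFT fine-tuning for downstream tasks, with a fine-tuning example for triple information extraction.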

Language: Python | License: Apache-2.0 | Stargazers: 1035 | Issues: 0

OpenLLM

Run any open-source LLM, such as Llama 3.1 or Gemma, as an OpenAI-compatible API endpoint in the cloud.

Language: Python | License: Apache-2.0 | Stargazers: 9507 | Issues: 0

HelloGitHub

Share interesting, entry-level open source projects on GitHub.

Language: Python | Stargazers: 88867 | Issues: 0

bert-textcnn-for-multi-label-text-classfication

A demo that uses BERT and TextCNN for multi-label text classification.

Language: Python | License: Apache-2.0 | Stargazers: 27 | Issues: 0

pytextclassifier

pytextclassifier is a toolkit for text classification. It provides out-of-the-box implementations of classification models including LR, XGBoost, TextCNN, FastText, TextRNN, and BERT.

Language: Python | License: Apache-2.0 | Stargazers: 473 | Issues: 0

pytorch_bert_multi_classification

Chinese multi-label classification based on pytorch_bert.

Language: Python | Stargazers: 75 | Issues: 0

FlagEmbedding

Retrieval and Retrieval-augmented LLMs

Language: Python | License: MIT | Stargazers: 6335 | Issues: 0

FreeAI

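OpenAI should not be a closed AI. FreeAI keeps working toward open AI that is easier to use, more powerful, and cheaper, and can save a typical research group an expense that is hard to get reimbursed.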

Language: Python | Stargazers: 190 | Issues: 0

gpt_academic

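Provides a practical interactive interface for large language models such as GPT and GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins, project analysis and self-explanation for Python, C++, and other projects, PDF/LaTeX paper translation and summarization, parallel queries to multiple LLMs, and local models such as chatglm3. Integrates Tongyi Qianwen (通义千问), deepseekcoder, iFlytek Spark (讯飞星火), ERNIE Bot (文心一言), llama2, rwkv, claude2, moss, and others.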

Language: Python | License: GPL-3.0 | Stargazers: 62975 | Issues: 0

uniem

unified embedding model

Language: Python | License: Apache-2.0 | Stargazers: 801 | Issues: 0

awesome_LLMs_interview_notes

LLMs interview notes and answers: this repository mainly collects interview questions and reference answers for large language model (LLM) algorithm engineers.

License: MIT | Stargazers: 1103 | Issues: 0

SEESAW

Code, data, and models for "Generative Entity-to-Entity Stance Detection with Knowledge Graph Augmentation"

Language: Python | License: NOASSERTION | Stargazers: 9 | Issues: 0

directed_sentiment_analysis

Dataset and code for directed sentiment analysis in news text.

Language: Python | Stargazers: 17 | Issues: 0

POLITICS

Code, data, and models for "POLITICS: Pretraining with Same-story Article Comparison for Ideology Prediction and Stance Detection"

Language: Python | License: NOASSERTION | Stargazers: 28 | Issues: 0

ChatGLM3

ChatGLM3 series: Open Bilingual Chat LLMs

Language: Python | License: Apache-2.0 | Stargazers: 13212 | Issues: 0

Aspect-Based-Sentiment-Analysis

A paper list for aspect based sentiment analysis.

License: MIT | Stargazers: 449 | Issues: 0