There are 32 repositories under the reasoning-models topic.
Open-source Deep Research alternative for reasoning and search over private data. Written in Python.
MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model.
"LightReasoner: Can Small Language Models Teach Large Language Models Reasoning?"
Official repository for EXAONE Deep, built by LG AI Research.
Official implementation of the NeurIPS 2025 paper "Soft Thinking: Unlocking the Reasoning Potential of LLMs in Continuous Concept Space"
MedReason: Eliciting Factual Medical Reasoning Steps in LLMs via Knowledge Graphs
[NeurIPS 2025] A simple extension to vLLM that helps you speed up reasoning models without training.
Official Repository of OmniCaptioner
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations
An interactive thinking and deep reasoning model. It provides a cognitive reasoning paradigm for complex multi-hop problems.
[arXiv 2025] Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps
[NeurIPS 2025] VeriThinker: Learning to Verify Makes Reasoning Model Efficient
Lightweight replication study of DeepSeek-R1-Zero. Interesting findings include "No Aha Moment", "Longer CoT ≠ Accuracy", and "Language Mixing in Instruct Models".
Pure RL to post-train base models for social reasoning capabilities. Lightweight replication of DeepSeek-R1-Zero with Social IQa dataset.
Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. Papers, Code, Resources & Survey.
Code for the 2025 ACL publication "Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs"
Implementation and subsequent optimization for "Reviving DSP for Advanced Theorem Proving in the Era of Reasoning Models"
R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning
AI Lawyer is an intelligent legal reasoning assistant powered by DeepSeek, Ollama, RAG, and LangChain, designed to streamline legal research and document analysis. By leveraging retrieval-augmented generation (RAG), it provides precise legal insights and contract summarization, and its intuitive Streamlit-based UI makes it easy to analyze legal documents (a minimal sketch of the general RAG flow appears after this listing).
Using a reasoning LLM to learn a prompt from data
ReasoningShield: Safety Detection over Reasoning Traces of Large Reasoning Models
This repository hosts the instructions and workshop materials for Lab 333 - Evaluate Reasoning Models for Your Generative AI Solutions
Agentic Deep Graph Reasoning Implementation
Explore the evolution of AGI through historical context, reasoning models, and agent systems, while gaining hands-on experience with cutting-edge models like Claude 4, DeepSeek-R1, and OpenAI's o3. Learn to critically evaluate AGI benchmarks, understand their limitations, and identify where current models excel or struggle in reasoning tasks.
Turn stories, strategies, or systems into insight. Auto-generate Dialectical Wheels (DWs) from any text to reveal blind spots, surface polarities, and trace dynamic paths toward synthesis. DWs are semantic maps that expose tension, transformation, and coherence within a system—whether narrative, ethical, organizational, or technological.
State Sandbox is an experimental game for socioeconomic simulation. It uses Large Language Models (o3-mini) to simulate the world and complex policy impacts.
LLM finetuning for Sudoku solving
Predicting drug approval with reasoning.
Sudoku4LLM is a Sudoku dataset generator for training and evaluating reasoning in Large Language Models (LLMs). It offers customizable puzzles, difficulty levels, and 11 serialization formats to support structured data reasoning and Chain of Thought (CoT) experiments (an illustrative serialization sketch appears after this listing).
Simple AI source code with chat, reasoning, and image features, using public APIs such as xAI, OpenAI, HuggingFace, and Flux.
This repo contains evaluation code for the paper "MANBench: Is Your Multimodal Model Smarter than Human?" [ACL 2025 Findings]
Official code for our paper: "SketchThinker-R1: Towards Efficient Sketch-Style Reasoning in Large Multimodal Models".
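The AI Lawyer entry above describes a retrieval-augmented generation pipeline (a DeepSeek model served via Ollama, orchestrated with LangChain, fronted by a Streamlit UI). The following minimal, self-contained Python sketch only illustrates the general RAG flow; the tiny_corpus, retrieve, and build_prompt names are illustrative assumptions, not that repository's actual API, and a real setup would use embeddings, a vector store, and an Ollama-served model instead.

```python
# Minimal sketch of a retrieval-augmented generation (RAG) loop.
# All names here are hypothetical; they are not the AI Lawyer repository's API.

tiny_corpus = [
    "Clause 4.2: Either party may terminate the agreement with 30 days written notice",
    "Clause 7.1: The contractor's total liability is capped at fees paid in the prior 12 months",
    "Clause 9.3: Disputes are resolved by binding arbitration in the state of New York",
]

def retrieve(question: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank passages by naive keyword overlap; a real system would use embeddings."""
    q_terms = set(question.lower().split())
    scored = sorted(corpus, key=lambda p: len(q_terms & set(p.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Assemble the grounded prompt that would be sent to the reasoning LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the legal question using only the context below.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

if __name__ == "__main__":
    question = "How much notice is required to terminate the contract?"
    prompt = build_prompt(question, retrieve(question, tiny_corpus))
    print(prompt)  # In a real app, this prompt would go to the LLM via Ollama.
```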
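The Sudoku4LLM entry mentions multiple serialization formats for structured-data reasoning and CoT experiments. The sketch below shows two plausible text serializations of a single grid; the format choices and layout are assumptions for illustration and are not necessarily among the 11 formats that project provides.

```python
# Illustrative serialization of one Sudoku grid into two text formats for LLM prompts.
# 0 denotes an empty cell; the formats shown are assumptions, not Sudoku4LLM's own.

puzzle = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]

def to_flat_string(grid: list[list[int]]) -> str:
    """81-character string, row by row, with '.' for empty cells."""
    return "".join("." if v == 0 else str(v) for row in grid for v in row)

def to_row_lines(grid: list[list[int]]) -> str:
    """One space-separated line per row, which is easier for CoT prompts to reference."""
    return "\n".join(" ".join(str(v) for v in row) for row in grid)

if __name__ == "__main__":
    print(to_flat_string(puzzle))
    print(to_row_lines(puzzle))
```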