Kokyou (wellido)

Location: homeless

Kokyou's starred repositories

OpenHands

🙌 OpenHands: Code Less, Make More

Language: Python | License: MIT | Stargazers: 32818 | Issues: 291 | Issues: 1421

ultralytics

Ultralytics YOLO11 🚀

Language: Python | License: AGPL-3.0 | Stargazers: 29881 | Issues: 165 | Issues: 8965

Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.

ai-for-grant-writing

A curated list of resources for using LLMs to develop more competitive grant applications.

Language: Python | License: CC-BY-4.0 | Stargazers: 2120 | Issues: 18 | Issues: 0

OpenOOD

Benchmarking Generalized Out-of-Distribution Detection

Language: Python | License: MIT | Stargazers: 843 | Issues: 8 | Issues: 107

Agentless

Agentless 🐱: An agentless approach to automatically solving software development problems

Language: Python | License: MIT | Stargazers: 682 | Issues: 9 | Issues: 21

OpenAttack

An Open-Source Package for Textual Adversarial Attack.

Language: Python | License: MIT | Stargazers: 682 | Issues: 18 | Issues: 78

deita

Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]

Language: Python | License: Apache-2.0 | Stargazers: 475 | Issues: 6 | Issues: 27

rho

Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.

Agent4SE-Paper-List

Repository for the paper "Large Language Model-Based Agents for Software Engineering: A Survey".

torch-model-compression

A toolkit for automated analysis and modification of PyTorch model structures, including a model-compression algorithm library that analyzes model structure automatically

Language: Python | License: MIT | Stargazers: 236 | Issues: 12 | Issues: 20

LLM-Uncertainty-Bench

Benchmarking LLMs via Uncertainty Quantification

Language: Python | License: MIT | Stargazers: 210 | Issues: 3 | Issues: 1

bigcodebench

BigCodeBench: Benchmarking Code Generation Towards AGI

Language: Python | License: Apache-2.0 | Stargazers: 197 | Issues: 5 | Issues: 35

ShortcutsBench

ShortcutsBench: A Large-Scale Real-World Benchmark for API-Based Agents

Language: Python | License: Apache-2.0 | Stargazers: 72 | Issues: 1 | Issues: 1

UoT

[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models

R-Judge

R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024)

active-learning

Continuous Learning for Android Malware Detection (USENIX Security 2023)

S-Eval

S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models

tnpa-generalizability

IST'21 & SANER'22: Semantic-Preserving Program Transformations

Language: Java | License: MIT | Stargazers: 31 | Issues: 4 | Issues: 0

apbench

APBench: A Unified Availability Poisoning Attack and Defenses Benchmark (TMLR 08/2024)

Language: Python | License: MIT | Stargazers: 26 | Issues: 2 | Issues: 3

selforacle

The code of our paper "Misbehaviour Prediction for Autonomous Driving Systems", including our improved Udacity simulator

Language: Python | License: MIT | Stargazers: 21 | Issues: 8 | Issues: 5

autoeval_baselines

This repository includes various baseline techniques for the label-free model evaluation task of the VDU2023 competition.

Language: Python | License: MIT | Stargazers: 19 | Issues: 2 | Issues: 1

Poisoning-Attack-on-Code-Completion-Models

Paper "An LLM-Assisted Easy-to-Trigger Poisoning Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection"

Language: Python | Stargazers: 6 | Issues: 1 | Issues: 0

misbehaviour-prediction-with-uncertainty-quantification

Codebase of the MSc thesis by Ruben Grewal "Uncertainty Quantification for Failure Prediction in Autonomous Driving Systems" and replication package of the paper "Predicting Safety Misbehaviours in Autonomous Driving Systems using Uncertainty Quantification" (ICST 2024).

Language: Jupyter Notebook | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0