Kokyou (wellido)

Location: homeless

Kokyou's starred repositories

OpenHands

🙌 OpenHands: Code Less, Make More

Language: Python | License: MIT | Stargazers: 32818 | Issues: 291 | Issues: 1421

ultralytics

Ultralytics YOLO11 🚀

Language: Python | License: AGPL-3.0 | Stargazers: 29881 | Issues: 165 | Issues: 8965

Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.

ai-for-grant-writing

A curated list of resources for using LLMs to develop more competitive grant applications.

Language: Python | License: CC-BY-4.0 | Stargazers: 2120 | Issues: 18 | Issues: 0

OpenOOD

Benchmarking Generalized Out-of-Distribution Detection

Language: Python | License: MIT | Stargazers: 843 | Issues: 8 | Issues: 107

Agentless

Agentless 🐱: An agentless approach to automatically solving software development problems

Language: Python | License: MIT | Stargazers: 682 | Issues: 9 | Issues: 21

OpenAttack

An Open-Source Package for Textual Adversarial Attack.

Language: Python | License: MIT | Stargazers: 682 | Issues: 18 | Issues: 78

deita

Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]

Language: Python | License: Apache-2.0 | Stargazers: 475 | Issues: 6 | Issues: 27

rho

Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.

Agent4SE-Paper-List

Repository for the paper "Large Language Model-Based Agents for Software Engineering: A Survey".

torch-model-compression

A toolkit for automated analysis and modification of PyTorch model structures, including a model-compression algorithm library that analyzes model structure automatically

Language: Python | License: MIT | Stargazers: 236 | Issues: 12 | Issues: 20

LLM-Uncertainty-Bench

Benchmarking LLMs via Uncertainty Quantification

Language: Python | License: MIT | Stargazers: 210 | Issues: 3 | Issues: 1

bigcodebench

BigCodeBench: Benchmarking Code Generation Towards AGI

Language: Python | License: Apache-2.0 | Stargazers: 197 | Issues: 5 | Issues: 35

ShortcutsBench

ShortcutsBench: A Large-Scale Real-World Benchmark for API-Based Agents

Language: Python | License: Apache-2.0 | Stargazers: 72 | Issues: 1 | Issues: 1

UoT

[NeurIPS 2024] Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models

R-Judge

R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024)

active-learning

Continuous Learning for Android Malware Detection (USENIX Security 2023)

S-Eval

S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models

tnpa-generalizability

IST'21 & SANER'22: Semantic-Preserving Program Transformations

Language: Java | License: MIT | Stargazers: 31 | Issues: 4 | Issues: 0

apbench

APBench: A Unified Availability Poisoning Attack and Defenses Benchmark (TMLR 08/2024)

Language: Python | License: MIT | Stargazers: 26 | Issues: 2 | Issues: 3

selforacle

The code of our paper "Misbehaviour Prediction for Autonomous Driving Systems", including our improved Udacity simulator

Language: Python | License: MIT | Stargazers: 21 | Issues: 8 | Issues: 5

autoeval_baselines

This repository includes various baseline techniques for the label-free model evaluation task of the VDU2023 competition.

Language: Python | License: MIT | Stargazers: 19 | Issues: 2 | Issues: 1

Poisoning-Attack-on-Code-Completion-Models

Paper "An LLM-Assisted Easy-to-Trigger Poisoning Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection"

Language: Python | Stargazers: 6 | Issues: 1 | Issues: 0

misbehaviour-prediction-with-uncertainty-quantification

Codebase of the MSc thesis by Ruben Grewal "Uncertainty Quantification for Failure Prediction in Autonomous Driving Systems" and replication package of the paper "Predicting Safety Misbehaviours in Autonomous Driving Systems using Uncertainty Quantification" (ICST 2024).

Language: Jupyter Notebook | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0