Dongjun Lee (DongjunLee)

Company: @lbox-kr

Location: South Korea

Home Page: https://dongjunlee.github.io/


Organizations
hb-research
KLUE-benchmark
lbox-kr
naver

Dongjun Lee's starred repositories

Interview_Question_for_Beginner

:boy: :girl: Technical-Interview guidelines written for those who started studying programming. I wish you all the best. :space_invader:

sqlmodel

SQL databases in Python, designed for simplicity, compatibility, and robustness.

Language: Python | License: MIT | Stargazers: 13448 | Issues: 149 | Issues: 331

serverless-application-model

The AWS Serverless Application Model (AWS SAM) transform is an AWS CloudFormation macro that transforms SAM templates into CloudFormation templates.

Language: Python | License: Apache-2.0 | Stargazers: 9275 | Issues: 283 | Issues: 1352

BentoML

The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!

Language: Python | License: Apache-2.0 | Stargazers: 6738 | Issues: 73 | Issues: 1042

transitions

A lightweight, object-oriented finite state machine implementation in Python with many extensions

Language: Python | License: MIT | Stargazers: 5470 | Issues: 93 | Issues: 455

OpenPrompt

An Open-Source Framework for Prompt-Learning.

Language: Python | License: Apache-2.0 | Stargazers: 4213 | Issues: 42 | Issues: 254

SimCSE

[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821

Language: Python | License: MIT | Stargazers: 3297 | Issues: 27 | Issues: 265

ehcache3

Ehcache 3.x line

Language: Java | License: Apache-2.0 | Stargazers: 1977 | Issues: 154 | Issues: 1188

AWS-SAA-C02-Study-Guide

How to become a certified AWS Solutions Architect

Language: Python | License: Apache-2.0 | Stargazers: 1412 | Issues: 31 | Issues: 75

marisa-trie

Static memory-efficient Trie-like structures for Python based on marisa-trie C++ library.

Language: Cython | License: MIT | Stargazers: 1017 | Issues: 26 | Issues: 65

WeeklyArxivTalk

[Zoom & Facebook Live] Weekly AI Arxiv Season 2

wit

WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.

bagua

Bagua Speeds up PyTorch

Language: Python | License: MIT | Stargazers: 868 | Issues: 16 | Issues: 145

pySBD

🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detector that works out-of-the-box.

Language: Python | License: MIT | Stargazers: 745 | Issues: 13 | Issues: 74

fastformers

FastFormers - highly efficient transformer models for NLU

Language: Python | License: NOASSERTION | Stargazers: 696 | Issues: 19 | Issues: 18

Blackstone

:black_circle: A spaCy pipeline and model for NLP on unstructured legal text.

Language: Python | License: Apache-2.0 | Stargazers: 632 | Issues: 39 | Issues: 19

mistral

Mistral: A strong, northwesterly wind: Framework for transparent and accessible large-scale language model training, built with Hugging Face 🤗 Transformers.

Language: Python | License: Apache-2.0 | Stargazers: 545 | Issues: 16 | Issues: 95

voxpopuli

A large-scale multilingual speech corpus for representation learning, semi-supervised learning, and interpretation

Language: Python | License: NOASSERTION | Stargazers: 495 | Issues: 19 | Issues: 22

LegalPapers

Must-read Papers on Legal Intelligence

align_uniform

Open source code for the paper "Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere" (ICML 2020)

Language: Python | License: MIT | Stargazers: 423 | Issues: 13 | Issues: 10

g-mlp-pytorch

Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch

Language: Python | License: MIT | Stargazers: 417 | Issues: 8 | Issues: 6

kss

KSS: Korean String processing Suite

Language: Python | License: BSD-3-Clause | Stargazers: 390 | Issues: 4 | Issues: 57

genalog

Genalog is an open source, cross-platform Python package for generating synthetic document images with custom degradations and text alignment capabilities.

Language: Jupyter Notebook | License: MIT | Stargazers: 296 | Issues: 12 | Issues: 16

Condenser

EMNLP 2021 - Pre-training architectures for dense retrieval

Language: Python | License: Apache-2.0 | Stargazers: 242 | Issues: 6 | Issues: 25

sql-snippets

A curated collection of helpful SQL queries and functions, maintained by Count.

tokenizations

Robust and Fast tokenizations alignment library for Rust and Python https://tamuhey.github.io/tokenizations/

Language: Rust | License: MIT | Stargazers: 179 | Issues: 9 | Issues: 11

KLUE-baseline

Finetuning Pipeline

Language: Python | License: Apache-2.0 | Stargazers: 86 | Issues: 4 | Issues: 5