Dongjun Lee (DongjunLee)

Company: @lbox-kr

Location: South Korea

Home Page: https://dongjunlee.github.io/



Organizations
hb-research
KLUE-benchmark
lbox-kr
naver

Dongjun Lee's starred repositories

papers-we-love

Papers from the computer science community to read and discuss.

polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

Language: Rust | License: NOASSERTION | Stargazers: 28209 | Issues: 161 | Issues: 8136

pulumi

Pulumi - Infrastructure as Code in any programming language 🚀

Language: Go | License: Apache-2.0 | Stargazers: 20594 | Issues: 195 | Issues: 7253

pyscript

Try PyScript: https://pyscript.com Examples: https://tinyurl.com/pyscript-examples Community: https://discord.gg/HxvBtukrg2

Language: Python | License: Apache-2.0 | Stargazers: 17568 | Issues: 170 | Issues: 779

presto

The official home of the Presto distributed SQL query engine for big data

Language: Java | License: Apache-2.0 | Stargazers: 15789 | Issues: 862 | Issues: 6479

chaosmonkey

Chaos Monkey is a resiliency tool that helps applications tolerate random instance failures.

Language: Go | License: Apache-2.0 | Stargazers: 14825 | Issues: 670 | Issues: 23

trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Language: Java | License: Apache-2.0 | Stargazers: 9955 | Issues: 174 | Issues: 6432

datahub

The Metadata Platform for your Data Stack

Language: Java | License: Apache-2.0 | Stargazers: 9495 | Issues: 254 | Issues: 2089

xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Language: Python | License: NOASSERTION | Stargazers: 8087 | Issues: 79 | Issues: 499

gpt-neox

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Language: Python | License: Apache-2.0 | Stargazers: 6726 | Issues: 122 | Issues: 433

obsidian-dataview

A data index and query language over Markdown files, for https://obsidian.md/.

Language: TypeScript | License: MIT | Stargazers: 6635 | Issues: 41 | Issues: 1309

SPTAG

A distributed approximate nearest neighborhood search (ANN) library which provides a high quality vector index build, search and distributed online serving toolkits for large scale vector search scenario.

nvitop

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Language: Python | License: Apache-2.0 | Stargazers: 4334 | Issues: 25 | Issues: 82

quarto-cli

Open-source scientific and technical publishing system built on Pandoc.

Language: JavaScript | License: NOASSERTION | Stargazers: 3611 | Issues: 27 | Issues: 4669

BIG-bench

Beyond the Imitation Game collaborative benchmark for measuring and extrapolating the capabilities of language models

Language: Python | License: Apache-2.0 | Stargazers: 2773 | Issues: 50 | Issues: 149

transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀

Language: Python | License: Apache-2.0 | Stargazers: 1638 | Issues: 27 | Issues: 121

pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Language: Python | License: Apache-2.0 | Stargazers: 1567 | Issues: 19 | Issues: 533

dev-conf-replay

๐Ÿ€ ์ตœ๊ทผ ๊ตญ๋‚ด IT ์„ธ๋ฏธ๋‚˜ ๋ฐ ๊ฐœ๋ฐœ์ž๐Ÿ’ป ์ปจํผ๋Ÿฐ์Šค ์˜์ƒ์˜ ๋‹ค์‹œ ๋ณด๊ธฐ๐Ÿ‘€ ๋งํฌ๋ฅผ ํ•œ๊ณณ์— ์ •๋ฆฌํ–ˆ์Šต๋‹ˆ๋‹ค!

splade

SPLADE: sparse neural search (SIGIR21, SIGIR22)

Language: Python | License: NOASSERTION | Stargazers: 714 | Issues: 20 | Issues: 50

tevatron

Tevatron - A flexible toolkit for neural retrieval research and development.

Language: Python | License: Apache-2.0 | Stargazers: 443 | Issues: 10 | Issues: 87

DeepCT

DeepCT and HDCT use BERT to generate novel, context-aware bag-of-words term weights for documents and queries.

Language: Python | License: BSD-3-Clause | Stargazers: 310 | Issues: 8 | Issues: 19

DiffCSE

Code for the NAACL 2022 long paper "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings"

Language: Python | License: MIT | Stargazers: 288 | Issues: 4 | Issues: 21

TLM

ICML'2022: NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework

Language: Python | License: MIT | Stargazers: 255 | Issues: 5 | Issues: 19

CLIP-Caption-Reward

PyTorch code for "Fine-grained Image Captioning with CLIP Reward" (Findings of NAACL 2022)

Language: Python | License: NOASSERTION | Stargazers: 230 | Issues: 5 | Issues: 12

COIL

NAACL2021 - COIL Contextualized Lexical Retriever

Language: Python | License: Apache-2.0 | Stargazers: 142 | Issues: 2 | Issues: 21

Language: Python | License: NOASSERTION | Stargazers: 83 | Issues: 6 | Issues: 14

elasticsearch-jaso-analyzer

Korean Jaso Analyzer for Elasticsearch

Language: Java | License: MIT | Stargazers: 75 | Issues: 7 | Issues: 12

carecall-corpus

CareCall for Seniors: Role Specified Open-Domain Dialogue dataset generated by leveraging LLMs (NAACL 2022).

License: NOASSERTION | Stargazers: 59 | Issues: 3 | Issues: 0

betterkoreankotlin

A Kotlin library for Korean particles (은는이가)

Language: Kotlin | License: Apache-2.0 | Stargazers: 53 | Issues: 8 | Issues: 0