Khanh Vo Duc's repositories
arrayfire
ArrayFire: a general purpose GPU library.
cuDLA-samples
YOLOv5 on Orin DLA
Deep-Learning-Accelerator-SW
NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.
isaac-sim-jetson-hil-course-doc
Doc site for Isaac Sim + Jetson HIL hands-on course
Lidar_AI_Solution
A project demonstrating Lidar-related AI solutions, including three GPU-accelerated Lidar/camera DL networks (PointPillars, CenterPoint, BEVFusion) and the related libs (cuPCL, 3D SparseConvolution, YUV2RGB, cuOSD).
LLaVA
[NeurIPS 2023 Oral] Visual Instruction Tuning: LLaVA (Large Language-and-Vision Assistant) built towards GPT-4V level capabilities.
LLM-Finetuning-Hub
Repository that contains LLM fine-tuning and deployment scripts along with our research findings.
Megatron-LM
Ongoing research training transformer models at scale
mlx
MLX: An array framework for Apple silicon
NeMo-Aligner
Scalable toolkit for efficient model alignment
NeMo-Guardrails
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
neural-graphical-models
Neural Graphical Models are neural-network-based graphical models that offer richer representation and faster inference and sampling.
neuralangelo
Official implementation of "Neuralangelo: High-Fidelity Neural Surface Reconstruction" (CVPR 2023)
NeVA
The open source implementation of "NeVA: NeMo Vision and Language Assistant"
nlp-in-3-weeks
Repository of the NLP in 3 weeks series starting 2023-12-05
ODISE
Official PyTorch implementation of ODISE: Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [CVPR 2023 Highlight]
opencv
Open Source Computer Vision Library
pybind11
Seamless operability between C++11 and Python
slt-techwrite
O'Reilly Technical Writing Course
Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
VideoLLM
VideoLLM: Modeling Video Sequence with Large Language Models
VILA
VILA: a multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops)
VTimeLLM
Official PyTorch implementation of the paper "VTimeLLM: Empower LLM to Grasp Video Moments".