
Rafael J P D's repositories

multi-sensor-data-collection

MultiSensor Data Collection is a mobile app that seamlessly gathers data from various sensors, including GPS, camera, and audio, allowing you to create a comprehensive dataset locally and effortlessly transmit it to the cloud for further analysis and storage.

Language: Jupyter Notebook · 5 · 0 · 0

sideseeing-tools

Language: Python · MIT · 1 · 0 · 0

a-PyTorch-Tutorial-to-Image-Captioning

Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning

MIT · 0 · 0 · 0

BMT

Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)

MIT · 0 · 0 · 0

cited-references

Language: Jupyter Notebook · 0 · 0 · 0

city-surfaces

CitySurfaces semantic segmentation of sidewalk surfaces

BSD-3-Clause · 0 · 0 · 0

doccano

Open source annotation tool for machine learning practitioners.

MIT · 0 · 0 · 0

google-research

Google Research

Apache-2.0 · 0 · 0 · 0

languagetool

Style and Grammar Checker for 25+ Languages

LGPL-2.1 · 0 · 0 · 0

llama

Inference code for LLaMA models

GPL-3.0 · 0 · 0 · 0

manim

A community-maintained Python framework for creating mathematical animations.

MIT · 0 · 0 · 0

mdetr

Apache-2.0 · 0 · 0 · 0

MDVC

PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)

0 · 0 · 0

multimodal

TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

BSD-3-Clause · 0 · 0 · 0

ollama

Get up and running with Llama 3, Mistral, Gemma, and other large language models.

MIT · 0 · 0 · 0

open-webui

User-friendly WebUI for LLMs (Formerly Ollama WebUI)

MIT · 0 · 0 · 0

portuguese-bert

Portuguese pre-trained BERT models

NOASSERTION · 0 · 0 · 0

PraCegoVer

#PraCegoVer is a multi-modal dataset containing images associated with Portuguese captions based on posts from Instagram.

0 · 0 · 0

pytorch-image-models

PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more

Apache-2.0 · 0 · 0 · 0

rafaelpezzuto.github.io

My personal website

Language: HTML · 0 · 0 · 0

scielo_bibliometrics

Language: Jupyter Notebook · 0 · 0 · 0

scms-citations

Language: JavaScript · NOASSERTION · 0 · 0 · 0

tile2net

Automated mapping of pedestrian networks from aerial imagery tiles

BSD-3-Clause · 0 · 0 · 0

trax

Trax — Deep Learning with Clear Code and Speed

Apache-2.0 · 0 · 0 · 0

VideoClickCapture

A script that captures the x,y positions of the user's clicks on the displayed video

Language: Jupyter Notebook · 0 · 0 · 0

vision

Datasets, Transforms and Models specific to Computer Vision

BSD-3-Clause · 0 · 0 · 0

vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

MIT · 0 · 0 · 0

Xmodal-Ctx

Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning

0 · 0 · 0

xmodaler

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

NOASSERTION · 0 · 0 · 0

yolov10

YOLOv10: Real-Time End-to-End Object Detection

AGPL-3.0 · 0 · 0 · 0