Rafael J P D (rafaelpezzuto)

Location: {Santos, Jundiaí} - SP - Brazil

Home Page: http://rafaelpezzuto.github.io

Rafael J P D's repositories

multi-sensor-data-collection

MultiSensor Data Collection is a mobile app that seamlessly gathers data from various sensors, including GPS, camera, and audio, allowing you to create a comprehensive dataset locally and effortlessly transmit it to the cloud for further analysis and storage.

Language: Jupyter Notebook | Stargazers: 5 | Issues: 0
Language: Python | License: MIT | Stargazers: 1 | Issues: 0

a-PyTorch-Tutorial-to-Image-Captioning

Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning

License: MIT | Stargazers: 0 | Issues: 0

BMT

Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)

License: MIT | Stargazers: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

city-surfaces

CitySurfaces semantic segmentation of sidewalk surfaces

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

doccano

Open source annotation tool for machine learning practitioners.

License: MIT | Stargazers: 0 | Issues: 0

google-research

Google Research

License: Apache-2.0 | Stargazers: 0 | Issues: 0

languagetool

Style and Grammar Checker for 25+ Languages

License: LGPL-2.1 | Stargazers: 0 | Issues: 0

llama

Inference code for genesis models

License: GPL-3.0 | Stargazers: 0 | Issues: 0

manim

A community-maintained Python framework for creating mathematical animations.

License: MIT | Stargazers: 0 | Issues: 0
License: Apache-2.0 | Stargazers: 0 | Issues: 0

MDVC

PyTorch implementation of Multi-modal Dense Video Captioning (CVPR 2020 Workshops)

Stargazers: 0 | Issues: 0

multimodal

TorchMultimodal is a PyTorch library for training state-of-the-art multimodal multi-task models at scale.

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

ollama

Get up and running with Llama 3, Mistral, Gemma, and other large language models.

License: MIT | Stargazers: 0 | Issues: 0

open-webui

User-friendly WebUI for LLMs (Formerly Ollama WebUI)

License: MIT | Stargazers: 0 | Issues: 0

portuguese-bert

Portuguese pre-trained BERT models

License: NOASSERTION | Stargazers: 0 | Issues: 0

PraCegoVer

#PraCegoVer is a multi-modal dataset of images paired with Portuguese captions, built from Instagram posts.

Stargazers: 0 | Issues: 0

pytorch-image-models

PyTorch image models, scripts, pretrained weights -- ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more

License: Apache-2.0 | Stargazers: 0 | Issues: 0

rafaelpezzuto.github.io

My personal website

Language: HTML | Stargazers: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 0 | Issues: 0
Language: JavaScript | License: NOASSERTION | Stargazers: 0 | Issues: 0

tile2net

Automated mapping of pedestrian networks from aerial imagery tiles

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

trax

Trax: Deep Learning with Clear Code and Speed

License: Apache-2.0 | Stargazers: 0 | Issues: 0

VideoClickCapture

A script that captures the x,y position each time the user clicks on the displayed video.

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

vision

Datasets, Transforms and Models specific to Computer Vision

License: BSD-3-Clause | Stargazers: 0 | Issues: 0

vit-pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch

License: MIT | Stargazers: 0 | Issues: 0

Xmodal-Ctx

Official PyTorch implementation of our CVPR 2022 paper: Beyond a Pre-Trained Object Detector: Cross-Modal Textual and Visual Context for Image Captioning

Stargazers: 0 | Issues: 0

xmodaler

X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).

License: NOASSERTION | Stargazers: 0 | Issues: 0

yolov10

YOLOv10: Real-Time End-to-End Object Detection

License: AGPL-3.0 | Stargazers: 0 | Issues: 0