Inuwa Mobarak Abraham's repositories
Image-captioning-ViT
Image Captioning Vision Transformers (ViTs) are transformer models that generate descriptive captions for images by combining the power of Transformers and computer vision. This project leverages state-of-the-art pre-trained ViT models and employs technique
depth-estimation-DPT
This repository contains the implementation of Depth Prediction Transformers (DPT), a deep learning model for accurate depth estimation in computer vision tasks. DPT leverages the transformer architecture and an encoder-decoder framework to capture fine-grained details, model long-range dependencies, and generate precise depth predictions.
detecting-tables-in-documents
This repository contains code and resources for detecting tables in various types of documents using machine learning and computer vision techniques.
GraphQL-agric-management-sys
A GraphQL project for an agriculture management system. The system will handle data related to farms, crops, weather information, and equipment. It is more or less a practice project using Python with Flask for the backend, SQLAlchemy for database management, and Graphene for GraphQL integration.
KOSMOS-2
KOSMOS-2 is designed to handle text and images simultaneously and to redefine the way we perceive and interact with multimodal data. It is built on a Transformer-based causal language model architecture, similar to other well-known models such as Llama 2 and Mistral AI's 7B model.
Meta-Llama-3-8B
Experiments with the Meta-Llama-3-8B model
MoE-LLaVA-inference
The ever-evolving landscape of artificial intelligence has presented an intersection of visual and linguistic data through large vision-language models (LVLMs). MoE-LLaVA is one such model, standing at the forefront of revolutionizing how machines interpret and understand the world, mirroring human-like perception. However, the challenge s
TikTok-depth-anything
TikTok, in collaboration with the University of Hong Kong, Zhejiang Lab, and Zhejiang University, has open-sourced Depth Anything, a state-of-the-art monocular depth estimation (MDE) model, inviting collaboration from the community!
mobius-text-2-img
A Flask API endpoint that receives a prompt and returns the generated image
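The prompt-in, image-out flow described above can be sketched without any third-party dependency. The example below is a minimal illustration, not the repository's code: it uses a plain WSGI callable (the interface Flask itself builds on) instead of Flask, reads the prompt from a `prompt` query parameter (an assumption; the real endpoint may accept JSON or form data), and replaces the text-to-image model with a hypothetical `placeholder_image` function that returns a small solid-colour PNG.

```python
import struct
import zlib
from urllib.parse import parse_qs

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    body = kind + data
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + body + struct.pack(">I", crc)


def placeholder_image(prompt: str, size: int = 8) -> bytes:
    """Hypothetical stand-in for a text-to-image model.

    Returns a size x size solid-colour RGB PNG whose colour is derived
    deterministically from the prompt text.
    """
    seed = zlib.crc32(prompt.encode("utf-8"))
    pixel = bytes((seed & 0xFF, (seed >> 8) & 0xFF, (seed >> 16) & 0xFF))
    row = b"\x00" + pixel * size  # filter type 0, then the row's pixels
    # IHDR: width, height, 8-bit depth, colour type 2 (RGB), no interlace
    ihdr = struct.pack(">IIBBBBB", size, size, 8, 2, 0, 0, 0)
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(row * size))
        + _chunk(b"IEND", b"")
    )


def app(environ, start_response):
    """WSGI endpoint: read the 'prompt' query parameter, return a PNG."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    prompt = query.get("prompt", [""])[0].strip()
    if not prompt:
        start_response(
            "400 Bad Request",
            [("Content-Type", "text/plain; charset=utf-8")],
        )
        return [b"missing 'prompt' parameter"]
    png = placeholder_image(prompt)
    start_response(
        "200 OK",
        [("Content-Type", "image/png"), ("Content-Length", str(len(png)))],
    )
    return [png]
```

To try it locally, the callable can be served with the standard library, for example `wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever()`, and queried at `/?prompt=a+red+barn`. In a Flask version, the same logic would sit inside a route function, with the model call taking the place of `placeholder_image`.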
pest-prediction
This project aims to detect pests in images using a deep learning model trained with fastai.
transformer-models-publications
A collection of links and references to my published articles on transformer models
article-images-generator
Generate images for articles, blogs, and publications using NLP, the Segmind API, BLIP captioning, and llama.cpp
cookbook
Open-source AI cookbook
crow-cpp-docAI
This standalone C++ application will include an HTML form for user input and a C++ server to handle the form submission, perform inference, and return the result to the client.
data-analysis-with-pyhon-course
My complete course on data analysis with Python, from statistics to the end
Docker_Served_File_IO_API
Document for a talk on how to serve a file I/O API with Docker
falcon-task-management-API
A task management API built with the Falcon framework, where users can create, retrieve, update, and delete tasks, with task support and tips from AI. It also includes validation and error handling.
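The create, retrieve, update, and delete operations with validation and error handling can be sketched independently of any web framework. The example below is a minimal illustration under stated assumptions, not the repository's code: the names `TaskStore`, `Task`, `ValidationError`, and `TaskNotFound` are hypothetical, tasks are kept in memory, the only fields are `title` and `done`, and the AI tips are left out.

```python
from dataclasses import asdict, dataclass
from itertools import count


class ValidationError(ValueError):
    """Raised when a task payload is malformed (would map to HTTP 400)."""


class TaskNotFound(KeyError):
    """Raised when a task id does not exist (would map to HTTP 404)."""


@dataclass
class Task:
    id: int
    title: str
    done: bool = False


class TaskStore:
    """In-memory task storage with basic payload validation."""

    def __init__(self):
        self._tasks = {}
        self._ids = count(1)  # ids start at 1 and only increase

    @staticmethod
    def _validate(payload):
        # Reject anything that is not a dict with a usable title.
        if not isinstance(payload, dict):
            raise ValidationError("payload must be a JSON object")
        title = payload.get("title")
        if not isinstance(title, str) or not title.strip():
            raise ValidationError("'title' must be a non-empty string")
        done = payload.get("done", False)
        if not isinstance(done, bool):
            raise ValidationError("'done' must be a boolean")
        return title.strip(), done

    def create(self, payload):
        title, done = self._validate(payload)
        task = Task(next(self._ids), title, done)
        self._tasks[task.id] = task
        return asdict(task)

    def retrieve(self, task_id):
        try:
            return asdict(self._tasks[task_id])
        except KeyError:
            raise TaskNotFound(task_id) from None

    def update(self, task_id, payload):
        self.retrieve(task_id)  # raises TaskNotFound for unknown ids
        title, done = self._validate(payload)
        self._tasks[task_id] = Task(task_id, title, done)
        return asdict(self._tasks[task_id])

    def delete(self, task_id):
        self.retrieve(task_id)  # raises TaskNotFound for unknown ids
        del self._tasks[task_id]
```

In a Falcon application, each resource responder (`on_get`, `on_post`, `on_put`, `on_delete`) would call the matching store method and translate `ValidationError` and `TaskNotFound` into 400 and 404 responses. Keeping the validation in the store means the rules can be tested without starting a server.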
huggingface-models-collections
Inference, demos, blog posts, and descriptions of some popular transformer models
inuwamobarak
Config files for my GitHub profile.
MoE-LLaVA
Mixture-of-Experts for Large Vision-Language Models