inuwamobarak

Inuwa Mobarak Abraham's repositories

Image-captioning-ViT

Image Captioning Vision Transformers (ViTs) are transformer models that generate descriptive captions for images by combining the power of Transformers and computer vision. It leverages state-of-the-art pre-trained ViT models and employs technique…

Language: Jupyter Notebook | 28 2 1

nougat

Nougat is Meta AI's OCR model designed to transcribe scientific PDFs into an easy-to-use Markdown format.

Language: Jupyter Notebook | 22 10

OWLv2

Introducing OWLv2: Google's Breakthrough in Zero-Shot Object Detection

Language: Jupyter Notebook | 11 10

depth-estimation-DPT

This repository contains the implementation of Depth Prediction Transformers (DPT), a deep learning model for accurate depth estimation in computer vision tasks. DPT leverages the transformer architecture and an encoder-decoder framework to capture fine-grained details, model long-range dependencies, and generate precise depth predictions.

Language: Jupyter Notebook | 6 20

detecting-tables-in-documents

This repository contains code and resources for detecting tables in various types of documents using machine learning and computer vision techniques.

Language: Jupyter Notebook | 6 20

GraphQL-agric-management-sys

A GraphQL project for an agriculture management system. The system will handle data related to farms, crops, weather information, and equipment. It is more or less a practice project using Python with Flask for the backend, SQLAlchemy for database management, and Graphene for GraphQL integration.

Language: Python | Apache-2.0 | 4 10

KOSMOS-2

KOSMOS-2 is designed to handle text and images simultaneously and to redefine the way we perceive and interact with multimodal data. It is built on a Transformer-based causal language model architecture, similar to other well-known models such as Llama 2 and Mistral AI's 7B model.

Language: Jupyter Notebook | 3 10

Meta-Llama-3-8B

Experiments with the Meta-Llama-3-8B model

Language: Jupyter Notebook | 3 10

MoE-LLaVA-inference

The ever-evolving landscape of artificial intelligence has brought visual and linguistic data together through large vision-language models (LVLMs). MoE-LLaVA is one of these models, standing at the forefront of how machines interpret and understand the world with human-like perception. However, the challenge s…

Language: Jupyter Notebook | 3 10

TikTok-depth-anything

TikTok, in collaboration with the University of Hong Kong, Zhejiang Lab, and Zhejiang University, has open-sourced Depth Anything, a state-of-the-art monocular depth estimation (MDE) model, inviting collaboration from the community.

Language: Jupyter Notebook | 3 10

ViTMatte

ViTMatte is a state-of-the-art image matting model. It leverages plain Vision Transformers (ViTs) to accurately estimate the foreground object in images and videos. We look at ViTMatte, its architecture, practical implementation steps, and its contributions to th…

Language: Jupyter Notebook | 3 1 1

Granite-3-0

Language: Jupyter Notebook | 100

mobius-text-2-img

A Flask API endpoint that receives a prompt and returns the generated image

Language: Jupyter Notebook | 1 10

pest-prediction

This project aims to detect pests in images using a deep learning model trained with fastai.

Language: Jupyter Notebook | 1 10

transformer-models-publications

A collection of links and references to my published transformer model articles

Language: Jupyter Notebook | Apache-2.0 | 1 10

article-images-generator

Generate images for articles, blogs, and publications using NLP, the Segmind API, BLIP captioning, and llama.cpp

Language: Python | 000

celeryRabbitMQ

Language: Python | 000

ckd

The CKD (Chronic Kidney Disease) App is a mobile application designed to help people with chronic kidney disease manage their condition more effectively.

Language: Jupyter Notebook | 010

cookbook

Open-source AI cookbook

Apache-2.0 | 000

crow-cpp-docAI

This standalone C++ application will include an HTML form for user input and a C++ server to handle the form submission, perform inference, and return the result to the client.

Language: C++ | 010

data-analysis-with-pyhon-course

My complete course on data analysis with Python, from statistics to the end.

Language: Jupyter Notebook | CC0-1.0 | 010

Docker_Served_File_IO_API

Document for a talk on how to serve a file I/O API with Docker

Language: HTML | 000

falcon-task-management-API

We create a task management API with the Falcon framework where users can create, retrieve, update, and delete tasks, with task support and tips from AI. We'll also include validation and error handling.

Language: Python | Apache-2.0 | 020