Omar Moustafa's starred repositories

Hi-SAM

[arXiv preprint] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation

Language: Python | License: Apache-2.0 | Stargazers: 168 | Issues: 0

DIS

This is the repo for our new project Highly Accurate Dichotomous Image Segmentation

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2126 | Issues: 0

Cradle

The Cradle framework is a first attempt at General Computer Control (GCC). Cradle helps agents ace any computer task by enabling strong reasoning abilities, self-improvement, and skill curation, in a standardized general environment with minimal requirements.

Language: Python | License: MIT | Stargazers: 1331 | Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | Stargazers: 12579 | Issues: 0

RADAM

We propose a new method named Random encoding of Aggregated Deep Activation Maps (RADAM) for feature extraction from pre-trained deep CNNs. The technique consists of encoding the output at different depths of the CNN using a Randomized Autoencoder, producing a single image descriptor.

Language: Python | License: MIT | Stargazers: 32 | Issues: 0

Marigold

[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Language: Python | License: Apache-2.0 | Stargazers: 2067 | Issues: 0

Depth-Anything-V2

Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Language: Python | License: Apache-2.0 | Stargazers: 2640 | Issues: 0

Dlib_Windows_Python3.x

Dlib compiled binary (.whl) for Python 3.7-3.12 and Windows x64

License: BSL-1.0 | Stargazers: 105 | Issues: 0

EALPR

A New Benchmark Dataset for Egyptian License Plate Detection and Recognition

Stargazers: 14 | Issues: 0
Language: Python | License: MIT | Stargazers: 121 | Issues: 0

AniTalker

[ACM MM 2024] This is the official code for "AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding"

License: Apache-2.0 | Stargazers: 1195 | Issues: 0

ultralyticsplus

Huggingface utilities for Ultralytics/YOLOv8

Language: Python | License: GPL-3.0 | Stargazers: 76 | Issues: 0

tensorflow-onnx

Convert TensorFlow, Keras, TensorFlow.js and TFLite models to ONNX

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2267 | Issues: 0

tensorflow

An Open Source Machine Learning Framework for Everyone

Language: C++ | License: Apache-2.0 | Stargazers: 184256 | Issues: 0

bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

Language: Python | License: MIT | Stargazers: 5795 | Issues: 0

optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools

Language: Python | License: Apache-2.0 | Stargazers: 2337 | Issues: 0

clustertabnet

Implementation of the table detection and table structure recognition deep learning model described in the paper "ClusterTabNet: Supervised clustering method for table detection and table structure recognition".

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2 | Issues: 0

RADIO

Official repository for "AM-RADIO: Reduce All Domains Into One"

Language: Python | License: NOASSERTION | Stargazers: 521 | Issues: 0

nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents

Language: Python | License: MIT | Stargazers: 8514 | Issues: 0

ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

Language: Python | License: Apache-2.0 | Stargazers: 12541 | Issues: 0

geo-clip

This is an official PyTorch implementation of our NeurIPS 2023 paper "GeoCLIP: Clip-Inspired Alignment between Locations and Images for Effective Worldwide Geo-localization"

Language: Python | License: MIT | Stargazers: 94 | Issues: 0

PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Language: Python | License: AGPL-3.0 | Stargazers: 4677 | Issues: 0

unitable

UniTable: Towards a Unified Table Foundation Model

Language: Jupyter Notebook | License: MIT | Stargazers: 296 | Issues: 0

yolov9

Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Language: Python | License: GPL-3.0 | Stargazers: 8649 | Issues: 0

deepdoctection

A Repo For Document AI

Language: Python | License: Apache-2.0 | Stargazers: 2396 | Issues: 0

table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.

Language: Python | License: MIT | Stargazers: 2065 | Issues: 0

supervision

We write your reusable computer vision tools. 💜

Language: Python | License: MIT | Stargazers: 18103 | Issues: 0

segformer-tf-transformers

This repository demonstrates how to use the TensorFlow-based SegFormer model in the 🤗 transformers package.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 31 | Issues: 0

Transformers-Tutorials

This repository contains demos I made with the Transformers library by HuggingFace.

Language: Jupyter Notebook | License: MIT | Stargazers: 8643 | Issues: 0

surya

OCR, layout analysis, reading order, line detection in 90+ languages

Language: Python | License: GPL-3.0 | Stargazers: 9288 | Issues: 0