Yotam (yotamnahum)

Company: @Samplead

Yotam's starred repositories

code2flow

Pretty good call graphs for dynamic languages

Language: Python; License: MIT; Stargazers: 3930; Issues: 78; Issues: 72

intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Language: Python; License: Apache-2.0; Stargazers: 2109; Issues: 28; Issues: 165

html2text

Convert HTML to Markdown-formatted text.

Language: Python; License: GPL-3.0; Stargazers: 1801; Issues: 26; Issues: 222

CrossLinked

LinkedIn enumeration tool to extract valid employee names from an organization through search engine scraping

Language: Python; License: GPL-3.0; Stargazers: 1243; Issues: 29; Issues: 17

Transformers-for-NLP-2nd-Edition

Transformer models from BERT to GPT-4, environments from Hugging Face to OpenAI. Fine-tuning, training, and prompt engineering examples. A bonus section with ChatGPT, GPT-3.5-turbo, GPT-4, and DALL-E, including jump-starting GPT-4, speech-to-text, text-to-speech, text-to-image generation with DALL-E, Google Cloud AI, HuggingGPT, and more.

Language: Jupyter Notebook; License: MIT; Stargazers: 783; Issues: 22; Issues: 4

MergeLM

Codebase for Merging Language Models (ICML 2024)

FlashRank

Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cross-encoders and more. Created by Prithivi Da, open for PRs & Collaborations.

Language: Python; License: Apache-2.0; Stargazers: 581; Issues: 6; Issues: 25

AnglE

Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard

Language: Python; License: MIT; Stargazers: 446; Issues: 10; Issues: 46

xllm

🦖 X—LLM: Cutting Edge & Easy LLM Finetuning

Language: Python; License: Apache-2.0; Stargazers: 372; Issues: 3; Issues: 11

TLM

ICML'2022: NLP From Scratch Without Large-Scale Pretraining: A Simple and Efficient Framework

Language: Python; License: MIT; Stargazers: 256; Issues: 5; Issues: 19

tokenization

A comprehensive deep dive into the world of tokens

Language: Python; License: MIT; Stargazers: 212; Issues: 3; Issues: 0

d3graph

Creation of interactive networks using d3 Javascript

Language: Jupyter Notebook; License: NOASSERTION; Stargazers: 171; Issues: 7; Issues: 33

geograpy3

Extract place names from a URL or text, and add context to those names -- for example distinguishing between a country, region or city.

Language: Python; License: Apache-2.0; Stargazers: 122; Issues: 5; Issues: 59

factoid-wiki

Dense X Retrieval: What Retrieval Granularity Should We Use?

LinkedIn-Job-Scraper

LinkedIn scraper to retrieve and store a live stream of job postings

flaxkv

🗲 A high-performance on-disk dictionary.

Language: Python; License: Apache-2.0; Stargazers: 27; Issues: 3; Issues: 12

gbswt5

CharFormer (Tay et al., 2022; Gradient-based Subword Tokenizer + T5) model implementation for Hugging Face Transformers

Language: Python; License: Apache-2.0; Stargazers: 19; Issues: 0; Issues: 0

llm-embedding

Finetune Malaysian LLM for Malaysian context embedding task.

Language: Jupyter Notebook; Stargazers: 19; Issues: 4; Issues: 0

CharacterBERT-DR

The official repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGIR 2022

Language: Python; License: Apache-2.0; Stargazers: 14; Issues: 5; Issues: 4

longBert

Long-text similarity model

Language: Python; License: Apache-2.0; Stargazers: 14; Issues: 1; Issues: 1

X-RetroMAE

Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder

Language: Python; Stargazers: 8; Issues: 2; Issues: 0

BiPFT

This is the implementation of our AAAI 2024 paper: BiPFT: Binary Pre-trained Foundation Transformer with Low-rank Estimation of Binarization Residual Polynomials.

Language: Python; Stargazers: 8; Issues: 3; Issues: 0

MURAL

A Multi-Granularity-Aware Aspect Learning Model for Multi-Aspect Dense Retrieval

Language: Python; License: Apache-2.0; Stargazers: 6; Issues: 0; Issues: 0

irelease

Library that automates releasing your GitHub Python package on PyPI.

Language: Python; License: NOASSERTION; Stargazers: 2; Issues: 4; Issues: 0

company-profile-scrapper

POC of a multiprocess web scraper for Google Search and LinkedIn

Language: Python; License: MIT; Stargazers: 1; Issues: 2; Issues: 0

hubness-reduction-improves-sbert-semantic-spaces

Hubness Reduction Improves Sentence-BERT Semantic Spaces

Language: Python; License: Apache-2.0; Stargazers: 1; Issues: 2; Issues: 0

clustsum

Unsupervised extractive summary using sentence embeddings and clustering.

Language: Python; License: MIT; Stargazers: 1; Issues: 2; Issues: 0

repo-tree

A simple Python package to display a repository tree structure. A Pythonic alternative to the Linux tree command.

Language: Python; License: Apache-2.0; Stargazers: 1; Issues: 1; Issues: 0