Yotam's starred repositories
intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
CrossLinked
LinkedIn enumeration tool to extract valid employee names from an organization through search engine scraping
Transformers-for-NLP-2nd-Edition
Transformer models from BERT to GPT-4, environments from Hugging Face to OpenAI. Fine-tuning, training, and prompt engineering examples. A bonus section with ChatGPT, GPT-3.5-turbo, GPT-4, and DALL-E including jump starting GPT-4, speech-to-text, text-to-speech, text-to-image generation with DALL-E, Google Cloud AI, HuggingGPT, and more
tokenization
A comprehensive deep dive into the world of tokens
factoid-wiki
Dense X Retrieval: What Retrieval Granularity Should We Use?
LinkedIn-Job-Scraper
LinkedIn scraper to retrieve and store a live stream of job postings
llm-embedding
Fine-tune a Malaysian LLM for a Malaysian-context embedding task.
CharacterBERT-DR
The official repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGIR 2022
X-RetroMAE
Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
company-profile-scrapper
POC of a multiprocess web scraper for Google search and LinkedIn
hubness-reduction-improves-sbert-semantic-spaces
Hubness Reduction Improves Sentence-BERT Semantic Spaces