Alymostafa

Aly Mostafa's repositories

Instruction_based_attack

Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs

Language: Python · 400

DeMemorization

[EMNLP 2023] Preserving Privacy Through Dememorization: An Unlearning Technique For Mitigating Memorization Risks In Language Models

Language: Jupyter Notebook · 100

GOF-Qur-an-QA-2022-Shared-Task-Code

Language: Jupyter Notebook · 000

arabic-stop-words

Largest list of Arabic stop words on GitHub.

MIT · 000

bustub

The BusTub Relational Database Management System (Educational)

Language: C++ · MIT · 010

Clustring_Practice

Language: Jupyter Notebook · 010

DeepCASE

Original implementation and resources of DeepCASE as in the S&P '22 paper

Language: Python · MIT · 000

DeepLog

PyTorch implementation of DeepLog.

MIT · 000

emoji-regex

A regular expression to match all Emoji-only symbols as per the Unicode Standard.

MIT · 000

fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

MIT · 000

fastai

The fastai deep learning library

Language: Jupyter Notebook · Apache-2.0 · 010

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Apache-2.0 · 000

knowledge-unlearning

[ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models

000

llm-attacks

Universal and Transferable Attacks on Aligned Language Models

Language: Python · MIT · 000

LMTracker

This repository implements the LMTracker model, based on the paper "LMTracker: Lateral movement path detection based on heterogeneous graph embedding".

MIT · 000

metaseq

Repo for external large-scale work

Language: Python · MIT · 000

mimir

Python package for measuring memorization in LLMs.

Language: Jupyter Notebook · MIT · 000

notebooks

Notebooks using the Hugging Face libraries 🤗

Apache-2.0 · 000

Nystromformer

000

OpenNMT-py

Open Source Neural Machine Translation in PyTorch

MIT · 000

parsers

GPL-2.0 · 000

privacy

Library for training machine learning models with privacy for training data

Apache-2.0 · 000

Sa_2_CRM

Language: JavaScript · 020

scattertext

Beautiful visualizations of how language differs among document types.

Language: Python · Apache-2.0 · 010

scholarly

Retrieve author and publication information from Google Scholar in a friendly, Pythonic way

Unlicense · 000

stweet

Advanced Python library to scrape Twitter (tweets, users) from the unofficial API, fully covered by integration tests

MIT · 000

text-image-binarization

An implementation of the paper 'Efficient illumination compensation techniques for text images'

Language: Python · 010

torch-cam

Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM)

Language: Python · MIT · 010

trl

Train transformer language models with reinforcement learning.

Language: Jupyter Notebook · Apache-2.0 · 000

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

MIT · 000