rykerdz / algerian-names-dataset

DZNS: Largest Dataset of Algerian names and surnames

Algerian Names Dataset

This repository contains two datasets providing a rich collection of Algerian names and surnames:

  • Surnames: A dataset of over 19,000 unique Algerian surnames in both Latin and Arabic script.
  • Names: A dataset of over 12,000 unique Algerian names in both Latin and Arabic script. Name variations (e.g., Mohamed, Mohammed) are included.

These datasets offer a fantastic resource for anyone with an interest in Algerian names!

Things to Keep in Mind:

  • Designed for Learning: These datasets are primarily for educational purposes. If you're exploring language analysis, researching Algerian names, or undertaking similar projects, you'll find these datasets useful.
  • Let's Chat: To ensure responsible use, please contact the repository maintainer at youcef.amoura@etu.usthb.dz before using the datasets. Briefly describe your project, and we'll get you set up!

Data Format:

Both datasets are in CSV format with the following columns:

  • latin_name
  • arabic_name

Sample Data (100 entries each):

Names (algerian_names.csv):

Latin Script | Arabic Script
MONCEF ISLEM | منصف إسلام
MELISSA | ميليسا
CHAHD | شهد
RIHAM | رهام
CERINE YASMINE | سيرين ياسمين
... | ...

Surnames (algerian_surnames.csv):

Latin Script | Arabic Script
HAMSI | حمسي
BOULKEDRA | بولقدرة
BERICHI | بريشي
MEHALI | محالي
BOUSSAKOU | بوساقو
... | ...

The sample files are located in the 'sample_data' folder.

Collaborators Welcome!

We'd love to expand this dataset with contributions from the community. If you have additional Algerian names and surnames to share, here's how to get involved:

  1. Fork this repository.
  2. Make additions to the dataset file(s).
  3. Submit a pull request.
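Since the datasets are described as containing unique entries, it may help to check your additions for duplicates before opening a pull request. The sketch below is one possible approach, not a check this repository is known to run. It assumes a duplicate means an identical (latin_name, arabic_name) pair, and that Latin names are compared in upper case, as in the samples above; the function name find_duplicate_pairs is hypothetical.

```python
from collections import Counter


def find_duplicate_pairs(pairs):
    """Return the sorted (latin, arabic) pairs that occur more than once.

    Assumption: Latin names are compared case-insensitively (upper-cased)
    and surrounding whitespace is ignored in both columns.
    """
    normalized = [(latin.strip().upper(), arabic.strip()) for latin, arabic in pairs]
    counts = Counter(normalized)
    return sorted(pair for pair, n in counts.items() if n > 1)


# Rows already in the dataset (taken from the surname sample) and a
# hypothetical set of proposed additions.
existing = [("HAMSI", "حمسي"), ("BERICHI", "بريشي")]
proposed = [("Hamsi ", "حمسي"), ("MEHALI", "محالي")]

duplicates = find_duplicate_pairs(existing + proposed)
print(duplicates)
```

Spelling variations such as Mohamed and Mohammed are distinct pairs under this rule, so they are not flagged.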
