anki-code / job-titles

Normalized dataset of 70k job titles

job-titles

Data Normalizations

The data is normalized in the following ways:

  • All titles are converted to lowercase
  • Hyphens (-) are replaced with a space
  • Commas (,) are removed
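
The three rules above can be sketched in a few lines of Python. This is only an illustration of the listed normalizations, not the repository's own formatting script; the function name `normalize_title` is hypothetical.

```python
def normalize_title(title: str) -> str:
    """Apply the three normalizations listed above to a single job title."""
    title = title.lower()            # convert to lowercase
    title = title.replace("-", " ")  # replace each hyphen with a space
    title = title.replace(",", "")   # remove commas
    return title


if __name__ == "__main__":
    print(normalize_title("Senior Full-Stack Developer, Backend"))
    # prints: senior full stack developer backend
```

The sketch does nothing beyond the three listed rules, so it does not trim or collapse whitespace.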

Caveats

  • Near-duplicate entries remain, such as "a and p mechanic" and "a&p mechanic"
  • Non-English titles such as ab initio etl developer
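
One possible way to surface duplicates of the first kind is to group titles under a comparison key that treats "&" and the word "and" as equivalent. The following is a minimal sketch under that assumption; `duplicate_key` and `find_duplicate_groups` are hypothetical helpers, not part of this repository, and other kinds of duplicates would need other rules.

```python
import re
from collections import defaultdict


def duplicate_key(title: str) -> str:
    """Build a comparison key in which "&" and "and" are equivalent."""
    key = re.sub(r"\s*&\s*", " and ", title)  # "a&p" -> "a and p"
    return " ".join(key.split())              # collapse repeated whitespace


def find_duplicate_groups(titles):
    """Return lists of titles that share the same comparison key."""
    groups = defaultdict(list)
    for title in titles:
        groups[duplicate_key(title)].append(title)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    sample = ["a and p mechanic", "a&p mechanic", "welder"]
    print(find_duplicate_groups(sample))
    # prints: [['a and p mechanic', 'a&p mechanic']]
```

Each reported group still needs a manual decision about which spelling to keep in job-titles.txt.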

See also

Contribute

Feel free to open a pull request that fixes the caveats listed above or makes any other enhancement.

Only edit job-titles.txt. After doing so, run ./format.sh.

Attribution

This dataset is a collection of the following sources:
