aahashemi / Udemy-Course-Price-Predictor

About The Project

This project predicts the price of a Udemy course from a list of keywords (words that may appear in the course title). See the Project Description to learn what each file in this repository is used for, or go straight to the How To Use It? section to use the AI tool.


  1. Project Description

  2. How To Use It?

Project Description

Database

The database contains data on 10000 Udemy courses, including id, title, is_paid, price, currency, and course URL. This repository contains two versions of the database.

  1. raw_udemy_databse.csv

  2. cleaned_udemy_databse.csv

    • The only difference between this CSV file and the raw_udemy_databse.csv file is that the stop words in the title column are removed.
STOP WORDS: Words that are so commonly used that they carry very little useful information. They are routinely filtered out in Text Mining and Natural Language Processing (NLP).

Examples of stop words in English are “a”, “the”, “is”, and “are”.

The two tables below illustrate the difference between raw_udemy_databse.csv and cleaned_udemy_databse.csv:

raw_udemy_databse.csv

id      title                                                      is_paid  price  currency  price_string
567828  2022 Complete Python Bootcamp From Zero to Hero in Python  1        89.99  EUR       €89.99

cleaned_udemy_database.csv

id      title                                      is_paid  price  currency  price_string
567828  complete python bootcamp zero hero python  1        89.99  EUR       €89.99

The tokens “2022”, “From”, “to”, and “in” are removed, and the remaining words are lowercased.
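The cleaning step illustrated by the two tables can be sketched in a few lines of Python. This is only a minimal sketch: the function name clean_title and the short STOP_WORDS set are assumptions made for illustration, and the stop-word list actually used to build the cleaned CSV file is not given in this README.

```python
import re

# Small illustrative stop-word list (an assumption). The list actually used
# for the cleaned CSV file is not documented in this README.
STOP_WORDS = {"a", "the", "is", "are", "from", "to", "in"}


def clean_title(title: str) -> str:
    """Lowercase a title, then drop stop words and purely numeric tokens."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    kept = [t for t in tokens if t not in STOP_WORDS and not t.isdigit()]
    return " ".join(kept)


cleaned = clean_title("2022 Complete Python Bootcamp From Zero to Hero in Python")
print(cleaned)  # complete python bootcamp zero hero python
```

Dropping numeric tokens is what removes “2022” here, since a year is not a stop word in the usual sense.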


word2vec_udemy.model

The word2vec_udemy.model is a vector representation model trained on the Udemy course titles. It captures the similarity between words that appear in the course titles in the dataset, and it is the model used for price prediction.
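To make "similarity" concrete: Word2Vec implementations commonly compare word vectors with cosine similarity. The sketch below shows that computation on made-up 3-dimensional vectors. The vectors, the toy_vectors dictionary, and the most_similar helper are hypothetical and are not taken from word2vec_udemy.model, whose vectors have many more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up vectors for illustration only (not from the real model).
toy_vectors = {
    "python": [0.9, 0.1, 0.0],
    "data": [0.8, 0.3, 0.1],
    "guitar": [0.0, 0.2, 0.9],
}


def most_similar(word, vectors, topn=2):
    """Rank every other word by cosine similarity to `word`."""
    scores = [
        (other, cosine_similarity(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:topn]


for other, score in most_similar("python", toy_vectors):
    print(f"{other}: {score:.3f}")
```

With these toy vectors, "data" ranks well above "guitar" as a neighbour of "python", which is the kind of ranking the "Top N key words" output relies on.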


Ceate_Word2Vec_Model_From_Scratch.ipynb

A notebook tutorial that builds a Word2Vec model from scratch in order to predict keyword similarities in the Udemy course titles. This tutorial is for those who are interested in how I built the word2vec_udemy.model.


Udemy_Course_Price_Predictor.ipynb

A notebook tutorial that uses the pre-trained word2vec_udemy.model to predict the price of a Udemy course based on a list of keywords. Follow this tutorial for a step-by-step guide.
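This README does not describe how the price is computed from the keywords, so the following is only one plausible sketch and not the notebook's actual method. It assumes the estimate is the average price of courses whose cleaned title contains at least one of the keywords. The estimate_price function and the toy_courses data are hypothetical.

```python
from statistics import mean

# Hypothetical mini-dataset of (cleaned title, price in EUR).
toy_courses = [
    ("complete python bootcamp zero hero python", 89.99),
    ("machine learning python data science", 79.99),
    ("english speaking course beginners", 59.99),
    ("english writing grammar", 69.99),
]


def estimate_price(keywords, courses):
    """Average the price of courses whose title shares a word with `keywords`."""
    wanted = set(keywords)
    matches = [price for title, price in courses if wanted & set(title.split())]
    if not matches:
        raise ValueError("no course title contains any of the keywords")
    return round(mean(matches), 2)


# In the real tool, the keyword list would presumably be extended with the
# most similar words returned by the Word2Vec model before this step.
print(estimate_price(["english", "speaking", "writing"], toy_courses))
```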

How To Use It?

To use this tool, call the predictPrice() function in Udemy_Course_Price_Predictor.ipynb.
Follow the Udemy_Course_Price_Predictor.ipynb tutorial to learn how to use the function.

Example 1 - single keyword

predictPrice(keyWords='english',topn=4)

Output:

Top 4 key words with the highest similarity:  ['speaking', 'language', 'start', 'writing']
Predicted price:  74 Euro

Example 2 - multiple keywords

predictPrice(keyWords=['machine','learning'],topn=6)

Output:

Top 6 key words with the highest similarity: ['data', 'python', 'r', 'deep', 'science', 'tableau'] 
Predicted price:  76 Euro

Example 3 - keyword not in the vocabulary

predictPrice(keyWords=['instagram','hamburger'],topn=2)

Output:

Error message: "word 'hamburger' not in vocabulary"
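The error in Example 3 appears because 'hamburger' does not occur in the vocabulary the model was trained on. One way to avoid it is to check the keywords against the vocabulary before calling predictPrice(). The sketch below uses a hypothetical helper, split_by_vocabulary, and a made-up toy_vocabulary. If the model is a gensim 4.x Word2Vec model (an assumption), the real vocabulary could be read from model.wv.key_to_index instead.

```python
def split_by_vocabulary(keywords, vocabulary):
    """Separate keywords into those the model knows and those it does not."""
    known = [word for word in keywords if word in vocabulary]
    unknown = [word for word in keywords if word not in vocabulary]
    return known, unknown


# Made-up vocabulary for illustration; the real one comes from the trained model.
toy_vocabulary = {"instagram", "english", "machine", "learning"}

known, unknown = split_by_vocabulary(["instagram", "hamburger"], toy_vocabulary)
print("usable keywords:", known)     # usable keywords: ['instagram']
print("skipped keywords:", unknown)  # skipped keywords: ['hamburger']
```

Only the usable keywords would then be passed on, for example predictPrice(keyWords=known, topn=2).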

Languages

Language:Jupyter Notebook 100.0%