We created a database based on the public dataset MovieLens-1M.
In addition to the public data, we also retrieve some additional information from websites using web crawlers.
To review the dataset in PostgreSQL, we provide an alternative raw-data input: a SQL file that imports the dataset into PostgreSQL (NOT RECOMMENDED).
psql -f " ml1m" -U username
1. Crawl the additional information
The presentation can be previewed here
We crawled additional information such as:
movie poster
directors
writers
stars
introductions & storylines
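
The crawled fields listed above could be held in one record per movie. The sketch below is only an illustration: the class name `MovieExtras`, the field names, and the helper `to_csv_row` are hypothetical and are not taken from the repo's crawler or from `get_csv.py`. List fields are joined with `|`, the separator MovieLens-1M uses for genres.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MovieExtras:
    """Crawled fields for one movie (hypothetical schema, not the repo's)."""
    movie_id: int                      # MovieLens-1M movie ID, assumed join key
    poster_url: str = ""
    directors: List[str] = field(default_factory=list)
    writers: List[str] = field(default_factory=list)
    stars: List[str] = field(default_factory=list)
    storyline: str = ""


def to_csv_row(extras: MovieExtras) -> List[str]:
    """Flatten a record into one CSV row, joining list fields with '|'."""
    return [
        str(extras.movie_id),
        extras.poster_url,
        "|".join(extras.directors),
        "|".join(extras.writers),
        "|".join(extras.stars),
        extras.storyline,
    ]
```

A row built this way can be passed directly to `csv.writer(...).writerow(...)`.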
We keep the full dataset in our GitHub repo here :)
usage:
cd preprocess
python get_csv.py && python get_graph.py
Download the data from the Google Drive link.
It is a .sqlite file; just put it under the sql_api_server directory.
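
After copying the file, a quick sanity check is to list the tables it contains. This is a minimal sketch using Python's built-in `sqlite3` module; the function name `list_tables` and the path in the usage note are placeholders, since the actual file name is whatever the download provides.

```python
import sqlite3
from typing import List


def list_tables(db_path: str) -> List[str]:
    """Return the table names stored in a SQLite file, sorted by name."""
    # Note: sqlite3.connect() creates an empty file if db_path does not
    # exist, so double-check the path if this returns an empty list.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

For example, `print(list_tables("sql_api_server/your_file.sqlite"))`, with `your_file.sqlite` replaced by the downloaded file's real name.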
Website Service Instructions
A purely handmade MVC-structured website. Tornado on Python provides the Controller, the web pages are the Templates (Views), and the SQL API server corresponds to the Model.
Frontend is composed of Bootstrap + jQuery (#javascript).
Backend is provided by Tornado (#python).
SQL queries are safe against SQL injection :)
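
A common way to get this property is to use parameterized queries, so user input is never spliced into the SQL text. The sketch below shows the idea with Python's built-in `sqlite3` module. The table `movies`, its columns, and the function `find_movies_by_title` are assumptions made for illustration; the project's own query code may be organized differently.

```python
import sqlite3
from typing import List, Tuple


def find_movies_by_title(conn: sqlite3.Connection, title: str) -> List[Tuple[int, str]]:
    """Look up movies by exact title using a bound parameter."""
    # The '?' placeholder makes sqlite3 pass `title` as data, not as SQL,
    # so quotes or keywords inside it cannot change the statement.
    cur = conn.execute(
        "SELECT id, title FROM movies WHERE title = ?",
        (title,),
    )
    return cur.fetchall()
```

With this approach, an input such as `x' OR '1'='1` is compared as a literal title and matches nothing, instead of turning the `WHERE` clause into a condition that is always true.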
python version >= 3.6
tornado >= 6.0.0
pip3 install -r requirements.txt
On Windows, simply double-click website_server/start.bat.
On other operating systems:
cd website_server
python startserver.py
By default, the website is served at http://127.0.0.1:8889.
To change the service IP address (default=localhost):
cd website_server
python set_server_ip.py 123.456.789.012
Then the website will be available at http://123.456.789.012:8889
On Windows, simply double-click sql_api_server/start.bat.
On other operating systems:
cd sql_api_server
python correct_md5.py && python startserver.py
Note that, for security reasons, this SQL API service can only be used on localhost (http://127.0.0.1:8888).
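
One usual way to keep a service local-only is to bind it to the loopback address rather than to all interfaces. The sketch below illustrates this with Python's standard `http.server` as a stand-in. It is not the project's Tornado code, and `PingHandler` and `make_local_server` are hypothetical names. In Tornado, the same effect is typically achieved by passing `address="127.0.0.1"` to `Application.listen`.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Tiny placeholder handler that answers every GET with 'ok'."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_local_server(port: int = 8888) -> HTTPServer:
    """Create a server reachable only from the same machine."""
    # Binding to 127.0.0.1 (instead of 0.0.0.0) means the OS only accepts
    # connections that originate on this host.
    return HTTPServer(("127.0.0.1", port), PingHandler)
```

Calling `make_local_server().serve_forever()` would then accept requests only from programs running on the same machine, such as the website server.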