ihorm5 / Task

Task

To use this project you need Python 3 installed. The first task (emoticons.py) needs no additional libraries. For the second task, install the dependencies listed in requirements.txt. You will also need a running PostgreSQL server with the tables created as defined in the file create_db.sql.

Additional notes:

I solved the second problem by first splitting the original file into smaller chunks of 400K lines each. To split the data, use the command from the file splitter. After that, I used Python multiprocessing to parallelize the calculations. The most challenging part was probably "uniting" broken groups whose rows ended up in two different files. I did this using a second table, unsorted_info. You can check the implementation in the function process_splitted_elements().
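The chunk-and-merge idea described above can be illustrated with a minimal, in-memory sketch. This is not the code from this repository: it assumes the rows are (key, value) pairs already sorted by key, it sums the values per key as a stand-in for the real calculation, and it keeps the boundary groups in a plain list where the project uses the unsorted_info table. The function names split_into_chunks, process_chunk, merge_boundary and aggregate are hypothetical.

```python
from itertools import groupby
from multiprocessing import Pool


def split_into_chunks(rows, chunk_size):
    """Cut the row list into consecutive chunks of at most chunk_size rows."""
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def process_chunk(chunk):
    """Sum values per key inside one chunk.

    Returns (complete, boundary). The first and last group of a chunk may
    continue in a neighbouring chunk, so they are set aside as boundary
    partial results instead of being reported as finished.
    """
    totals = [(key, sum(value for _, value in group))
              for key, group in groupby(chunk, key=lambda row: row[0])]
    complete = totals[1:-1]
    boundary = totals[:1] + (totals[-1:] if len(totals) > 1 else [])
    return complete, boundary


def merge_boundary(partials):
    """Unite partial totals that belong to the same key."""
    merged = {}
    for key, total in partials:
        merged[key] = merged.get(key, 0) + total
    return list(merged.items())


def aggregate(rows, chunk_size, workers=2):
    """Process all chunks in parallel, then repair the broken groups."""
    chunks = split_into_chunks(rows, chunk_size)
    with Pool(workers) as pool:
        results = pool.map(process_chunk, chunks)
    complete = [item for done, _ in results for item in done]
    partials = [item for _, edge in results for item in edge]
    return sorted(complete + merge_boundary(partials))


if __name__ == "__main__":
    # Hypothetical sample data: group "b" is broken across two chunks.
    rows = [("a", 1), ("a", 2), ("b", 3), ("b", 4), ("b", 5), ("c", 6)]
    print(aggregate(rows, chunk_size=2))  # [('a', 3), ('b', 12), ('c', 6)]
```

Setting aside the edge groups of every chunk costs a small extra merge step, but it lets each worker run without knowing anything about its neighbours.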

Languages

Python 100.0%