ta-dd / cmzf-redataprocessing

Real Estate Data Processing - Project for Data Processing in Python (JEM207)

Vojtěch Kania, Lukáš Novotný

Project summary

This project aims to automate scraping data from sreality using its public API. For this purpose, we published a package on PyPI that sends requests to the API, decodes the responses, and stores the results in an SQLite database. We also performed an exploratory data analysis (EDA) on data obtained with this package.
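The decode-and-store part of such a pipeline can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the package's actual code: the payload layout, the `listings` table, and the helper names `decode_listings` and `store_listings` are all hypothetical, and the network request is replaced by a hard-coded sample so the example runs offline.

```python
import json
import sqlite3

# Hypothetical payload; the real sreality response layout may differ.
SAMPLE_RESPONSE = json.dumps({
    "items": [
        {"id": 1, "name": "Flat 2+kk", "price": 5200000},
        {"id": 2, "name": "Flat 3+1", "price": 7900000},
    ]
})


def decode_listings(raw):
    """Turn a JSON response body into a list of (id, name, price) rows."""
    payload = json.loads(raw)
    return [(item["id"], item["name"], item["price"]) for item in payload["items"]]


def store_listings(conn, rows):
    """Create the (hypothetical) table if needed and upsert rows by listing id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS listings "
        "(id INTEGER PRIMARY KEY, name TEXT, price INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO listings VALUES (?, ?, ?)", rows)
    conn.commit()


# An in-memory database keeps the sketch side-effect free.
conn = sqlite3.connect(":memory:")
store_listings(conn, decode_listings(SAMPLE_RESPONSE))
count = conn.execute("SELECT COUNT(*) FROM listings").fetchone()[0]
print(count)
```

Keying the table on the listing id and using `INSERT OR REPLACE` means that re-running a download updates existing rows instead of duplicating them, which is one plausible choice for repeated scraping runs.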

Where to find the package

The source code is currently hosted on GitHub at: https://github.com/vojtechkaniaedu/re_data_processing/redadataprocessing

Detailed information about the package can be found in the README.md file on that site.

Binary installers for the latest released version are available at the Python Package Index (PyPI).

# PyPI
pip install redataprocessing

Where to find the EDA

We have also created an example EDA for the flats-for-sale category. It can be found in the EDA - flats for sale file, which uses an SQLite database containing the data; the database is also uploaded on the main branch. The data were downloaded using our source code. The EDA was done before the last version of the package was finished. The purpose of the EDA notebook is to show users an example of how to process this kind of data. Note, however, that an EDA would differ for other categories (different columns) as well as for other data (different treatment of outliers and missing values).
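As a rough illustration of the kind of missing-value and outlier treatment mentioned above, the sketch below reads prices from an SQLite table, skips missing values, and filters outliers with the common 1.5 × IQR rule. The `flats` table, its `price` column, and the sample figures are made up for the example, and the IQR rule is only one possible choice; the notebook itself may treat the data differently.

```python
import sqlite3
import statistics

# Hypothetical table and made-up prices, including one missing value
# and one extreme value.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flats (id INTEGER PRIMARY KEY, price REAL)")
conn.executemany(
    "INSERT INTO flats VALUES (?, ?)",
    [
        (1, 4.5e6), (2, 4.9e6), (3, 5.2e6), (4, None), (5, 5.4e6),
        (6, 5.6e6), (7, 5.9e6), (8, 6.1e6), (9, 6.5e6), (10, 95e6),
    ],
)

# Missing values: here they are simply excluded in the query.
prices = [p for (p,) in conn.execute(
    "SELECT price FROM flats WHERE price IS NOT NULL"
)]

# Outliers: keep only prices within 1.5 * IQR of the quartiles.
q1, _, q3 = statistics.quantiles(prices, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
kept = [p for p in prices if low <= p <= high]

print(len(prices), len(kept))
```

Dropping rows is the simplest treatment; for other categories or columns, imputing missing values or capping extreme ones may be more appropriate.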

Structure of the whole repo
