This project is part of Part 3, "The Backend: Databases & Applications", of the Udacity Full Stack Nanodegree course.
NOTE:
This project was built on Windows 10 OS. All the interaction with the Virtual Machine was done through Command Prompt on Windows.
(Do not use Git Bash for this project. It simply won't work.)
- Python
The source code for this project is written in Python 3.6.1. For direct download of version 3.6.1 click here.
- Code Validators
The source code was checked for bugs and code quality using Pylint, the pep8 tool and the PEP8 online checker.
To install Pylint:
pip3 install pylint
To check Python file using Pylint:
pylint fileName.py
To install pep8:
pip3 install pep8
To check Python file using pep8:
pep8 fileName.py
- VirtualBox
VirtualBox is the software that runs the virtual machine, so download and install it first. VirtualBox can be downloaded from here.
- Vagrant
Vagrant is the software that configures the VM and lets you share files between your host computer and the VM's filesystem. Vagrant can be downloaded from here.
- psycopg2
The psycopg2 Python module is required. To install:
pip3 install psycopg2
This project is a reporting tool that queries a news database and reports the most popular articles, the most popular authors and the most logged errors in a day.
The following views were created in the news database:
- MostViewedPaths
create view MostViewedPaths as (
select path, count(*) as views
from log
where path like '/_%'
group by path
order by views desc);
- articleShortInfo
create view articleShortInfo as (
select a.title, c.name, b.views
from articles as a join MostViewedPaths as b
on concat('/article/', a.slug) = b.path
join authors as c
on a.author = c.id
order by views desc);
- LogRequests
create view LogRequests as (
select time::timestamp::date, count(*) as total
from log
group by time::timestamp::date
order by total desc);
- ErrorRequests
create view ErrorRequests as (
select time::timestamp::date, count(*) as errors
from log
where status like '4%' or status like '5%'
group by time::timestamp::date
order by errors desc);
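How the report functions combine these views is not shown here, so the following is only a sketch of one possible approach. It assumes that rows from articleShortInfo arrive as (title, name, views) tuples and that rows from LogRequests and ErrorRequests arrive as (date, count) tuples. The helper names author_totals and daily_error_rates, and the sample rows, are hypothetical and used only for illustration.

```python
from collections import Counter
from datetime import date


def author_totals(article_rows):
    """Sum views per author from (title, author_name, views) rows."""
    totals = Counter()
    for _title, name, views in article_rows:
        totals[name] += views
    # most_common() returns (author, total_views) pairs, largest first.
    return totals.most_common()


def daily_error_rates(total_rows, error_rows):
    """Return (day, percent_errors) pairs, highest error rate first."""
    errors_by_day = dict(error_rows)
    rates = [
        (day, 100.0 * errors_by_day.get(day, 0) / total)
        for day, total in total_rows
        if total  # skip days without requests to avoid dividing by zero
    ]
    return sorted(rates, key=lambda pair: pair[1], reverse=True)


# Invented sample rows shaped like the view output (not real data).
sample_articles = [("A", "Ann", 10), ("B", "Bob", 7), ("C", "Ann", 5)]
sample_totals = [(date(2016, 7, 1), 200), (date(2016, 7, 2), 100)]
sample_errors = [(date(2016, 7, 2), 3), (date(2016, 7, 1), 2)]

print(author_totals(sample_articles))
print(daily_error_rates(sample_totals, sample_errors))
```

The same aggregation could also be done in SQL with group by and a join between the two log views. Doing it in Python here only keeps the sketch independent of a running database.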
The newsdb.py file contains the implementations for the three functions:
get_most_popular_articles()
get_most_popular_authors()
get_most_logged_errors()
In each of these functions, a new Connection object and Cursor object are created as:
DBNAME = "news"
conn = psycopg2.connect(database=DBNAME)
cursor = conn.cursor()
Then, the query is run using the cursor.execute() function.
The data is fetched using cursor.fetchall() and stored in a local variable which is returned by the respective functions.
Finally, the connection is closed using conn.close() within each function.
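The connect, execute, fetch and close sequence could be factored into a single helper. The following is a minimal sketch and not the code in the project file: run_query and its connect parameter are hypothetical names introduced here. The connect parameter exists only so that the helper can be tried without a PostgreSQL server, and psycopg2 is imported lazily for the same reason.

```python
def run_query(query, dbname="news", connect=None):
    """Run one query on a fresh connection and return all rows."""
    if connect is None:
        # Default path: connect to PostgreSQL through psycopg2.
        import psycopg2
        connect = psycopg2.connect
    conn = connect(database=dbname)
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        # Close the connection even if the query raises an error.
        conn.close()
```

Inside the VM, a call such as run_query("select * from articleShortInfo") would use psycopg2 and the news database by default. The try/finally block is a design choice of this sketch: it guarantees that conn.close() runs even when cursor.execute() fails.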
- Copy the Python files to the folder which contains the newsdata.sql file.
- If the folder doesn't already contain a Vagrantfile, run the following command to create one:
vagrant init
- To start the virtual machine, from your local directory, run the following command:
vagrant up
- Then, to drop into a full-fledged SSH session, run the following command:
vagrant ssh
- Type
psql
to switch to the interactive terminal for working with PostgreSQL.
- Now create a new database (if it doesn't exist already):
create database news;
- Then exit with Ctrl + D.
- Run
psql -d news -f newsdata.sql
to create the tables authors, articles and log. This will exit the psql terminal.
- Start the psql terminal with
psql
and move into the news database with
\c news
- Now open up another Command Prompt. Move to the project directory. Run
vagrant ssh
to move into the VM.
- Run the newsdata.py file with the following command after moving into the file's location to get the output:
python newsdata.py
To see what the output looks like, check here.
The content of this repository is licensed under MIT.