-
Install python3 per your OS instructions.
-
Create the venv:
python3 -m venv venv
Activate the venv:
source venv/bin/activate
-
Run the following command to install the project’s required libraries:
python -m pip install -r requirements.txt
pip updates frequently, but versions greater than 10.x.x should work with this project.
To verify that everything is set up correctly, run the following command from the project root: `pytest`
python -m budget.FrequentExpenses
To run tests run: `pytest -k "module1" -s`
To run the file: `python -m budget.FrequentExpenses`
-
Last month’s spending data is in `data/spending_data.csv`, which is a spreadsheet with 3 columns: Location, Category, and Amount. For example, the first row contains `Alaska Air,Travel,-$115.75`. We want to analyze our spending habits in a few different ways. In this module, we are going to read in this file and display the categories with the most purchases in a graph.

To read in the data, we’ll use the classes in the file named `Expense.py`. There are 2 classes: `Expense` (which has a vendor, category, and amount) and `Expenses` (which has a list of type `Expense` and a sum of the amounts). `Expenses` also has a method `read_expenses()`, which we’ll use to read the .csv file.
To start, open the file named `FrequentExpenses.py` in the `budget` directory, and add `import Expense` to the top of the file.
-
Create a variable named `expenses` and set it equal to the result of calling the `Expense.Expenses()` constructor. Then call the `read_expenses()` method on `expenses` and pass in the name of the file, `data/spending_data.csv`.
-
Create an empty list called `spendingCategories`. Then, create a for loop that iterates over each `Expense` in `expenses.list`. Inside the loop, `append()` `expense.category` to `spendingCategories`.
-
To use the `Counter` collection, add `import collections` at the top of the file. Then, after the for loop, create a new variable called `spendingCounter` and set it equal to the result of passing `spendingCategories` to the `collections.Counter()` constructor.

If you printed the Counter with `print(spendingCounter)`, you would see the following output:
Counter({'Eating Out': 8, 'Subscriptions': 6, 'Groceries': 5, 'Auto and Gas': 5, 'Charity': 2, 'Gear and Clothing': 2, 'Phone': 2, 'Travel': 1, 'Classes': 1, 'Freelance': 1, 'Stuff': 1, 'Mortgage': 1, 'Paycheck': 1, 'Home Improvements': 1, 'Parking': 1, 'Utilities': 1})
It shows each category as a key and the number of times it was used as the value. 'Eating Out' is the most common expense, with 8 purchases.
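As a small illustration of how `Counter` tallies a list, here is a minimal sketch. The category list is a hypothetical sample, not the contents of `data/spending_data.csv`, and the snake_case variable names are only for this example.

```python
from collections import Counter

# Hypothetical sample of categories, standing in for the values
# collected from the CSV in the loop above.
spending_categories = [
    "Eating Out", "Groceries", "Eating Out",
    "Travel", "Groceries", "Eating Out",
]

# Counter maps each distinct item to the number of times it appears.
spending_counter = Counter(spending_categories)

print(spending_counter)
# Counter({'Eating Out': 3, 'Groceries': 2, 'Travel': 1})
```

A category that never appears simply has a count of 0, so `spending_counter["Parking"]` does not raise a `KeyError`.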
-
We can get only the top 5 most common categories by calling the `most_common()` method on `spendingCounter` and passing in the value `5`. Set the result equal to a variable called `top5`.
-
If you’ve used the `zip()` function before, you know it combines 2 iterables (for example, it pairs up two lists into a sequence of tuples). We can also use `zip(*list_of_pairs)` to do the reverse and split a list of (key, value) tuples, such as the one returned by `most_common()`, into separate sequences. Since we want 2 separate sequences for the categories and their counts for the bar graph, let’s call `zip(*top5)` and set the result equal to two variables: `categories, count`.
-
Add `import matplotlib.pyplot as plt` to the top of the file. Then at the end of the file, call `fig,ax=plt.subplots()` to initialize `fig` as the Figure, or top-level container for our graph, and `ax` as the Axes, which contains the actual figure elements.
-
Next, call `ax.bar()` with the `categories` and `count` sequences as parameters. To add a title, call `ax.set_title()` and pass in the string `'# of Purchases by Category'`.
-
Finally, to display the graph, call `plt.show()`. The resulting bar graph should then be displayed in a new window.
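To make the `most_common()` and `zip(*top5)` steps from this module concrete, here is a minimal sketch. The counts are copied from the first seven entries of the Counter output shown earlier; building the Counter from a dict literal (rather than from the CSV) is an assumption made to keep the example self-contained.

```python
from collections import Counter

# Counts taken from the printed Counter earlier in this module
# (first seven entries only), not read from the CSV.
spending_counter = Counter({
    "Eating Out": 8, "Subscriptions": 6, "Groceries": 5, "Auto and Gas": 5,
    "Charity": 2, "Gear and Clothing": 2, "Phone": 2,
})

# most_common(5) returns a list of (category, count) tuples,
# ordered from the highest count down.
top5 = spending_counter.most_common(5)

# zip(*top5) "unzips" the pairs into one tuple of names and one of counts.
categories, count = zip(*top5)

print(categories)
print(count)  # (8, 6, 5, 5, 2)
```

Both `categories` and `count` are tuples here, which `ax.bar()` accepts just like lists.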
To run tests run: `pytest -k "module2" -s`
To run the file: `python -m budget.BudgetList`
-
In the `budget` directory, open the `BudgetList.py` file. Inside that file, create a class called `BudgetList` with only `pass` inside the class for now.
-
Replace `pass` with a constructor that has two parameters: `self, budget`. Then initialize the following instance variables:
`self.budget` to the passed-in `budget`
`self.sum_expenses` to `0`
`self.expenses` to an empty list
`self.sum_overages` to `0`
`self.overages` to an empty list
-
Define an `append` method that has two parameters: `self` and `item`. Put `pass` inside the method for now.
-
Replace `pass` with an `if` statement that checks if `self.sum_expenses` plus the passed-in `item` is less than `self.budget`. Inside the `if` block, call `append()` on `self.expenses` and pass in `item`. Also inside the `if` block, add `item` to `self.sum_expenses`.
-
After the `if` block, add an `else` block that calls `append()` on `self.overages` and passes in `item`. Also, increase `self.sum_overages` by `item`.
-
Define a method called `__len__` that takes in `self` as a parameter. Inside the method, return the sum of the length of `self.expenses` and the length of `self.overages`.
-
After the `BudgetList` class, define a `main()` function. Inside of `main()`, create a `myBudgetList` variable and assign it the result of calling the `BudgetList` constructor with a budget argument of `1200`.
-
Before we can use the `Expense` class to read in spending data, add `import Expense` at the top of `BudgetList.py`.
-
Next, create a variable named `expenses` and set it equal to the result of calling the `Expense.Expenses()` constructor. On the next line, call the `read_expenses()` method on `expenses` and pass in the name of the file, `data/spending_data.csv`. This relies on `import Expense` being at the top of the file.
-
After reading the expenses, create a `for` loop that has an iterator called `expense` and loops through `expenses.list`. Inside the for loop, call `append()` on `myBudgetList` with `expense.amount` as an argument.
-
Call `print()` to print out the string `'The count of all expenses: '` concatenated with the length of `myBudgetList`. Hint: Call the `len()` function with `myBudgetList` as an argument, then wrap that in a call to `str()` to convert it to a string.
-
After the `main()` function, create an `if` statement that checks if `__name__` is equal to `"__main__"`. If so, call `main()`.

Now we can test that `append()` and `len()` are working for our `BudgetList`. Run `python -m budget.BudgetList` and the output should be `"The count of all expenses: 37"`.
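Putting the steps of this module together, the class could look roughly like the sketch below. It is not the project's reference solution: the budget of `100` and the three amounts are hypothetical stand-ins for the values read from `data/spending_data.csv`, and the `overages` / `sum_overages` names follow the wording used in the later modules.

```python
class BudgetList:
    def __init__(self, budget):
        self.budget = budget
        self.sum_expenses = 0
        self.expenses = []
        self.sum_overages = 0
        self.overages = []

    def append(self, item):
        # Items that still fit under the budget count as expenses;
        # everything else is recorded as an overage.
        if self.sum_expenses + item < self.budget:
            self.expenses.append(item)
            self.sum_expenses += item
        else:
            self.overages.append(item)
            self.sum_overages += item

    def __len__(self):
        return len(self.expenses) + len(self.overages)


def main():
    my_budget_list = BudgetList(100)   # hypothetical budget
    for amount in [60, 30, 20]:        # hypothetical amounts
        my_budget_list.append(amount)
    print('The count of all expenses: ' + str(len(my_budget_list)))
    return my_budget_list


if __name__ == "__main__":
    main()
```

With these sample values, `60` and `30` fit under the budget and `20` is recorded as an overage, so the reported length is 3.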
To run tests run: `pytest -k "module3" -s`
To run the file: `python -m budget.BudgetList`
-
Next, we want to create an iterator for `BudgetList` by implementing `__iter__()` and `__next__()` to iterate the expenses list first and then continue iterating the overages list. Once those are implemented and you can get an iterator from `BudgetList`, it will be an iterable. Inside the `BudgetList` class, at the bottom, define an `__iter__` method that has `self` as a parameter. Put `pass` inside the body of the method for now.
-
Inside `__iter__()`, remove `pass` and replace it with setting `self.iter_e` to the result of calling the built-in `iter()` function with `self.expenses` as an argument. On the next line, set `self.iter_o` to the result of calling `iter()` with `self.overages` as an argument. Finally, to finish the method, `return self`.
-
After the `__iter__` method, define the method `__next__()` with `self` as a parameter. Put `pass` inside the body of the method for now.
-
Inside `__next__()`, remove `pass` and replace it with a `try:` block. Inside the `try:` block, `return` a call to `__next__()` on `self.iter_e`. On the next line, add an `except` block that catches `StopIteration as stop`. Inside the `except` block, `return` a call to `__next__()` on `self.iter_o`.
-
We can now test that `BudgetList` works as an iterable by using it in a for loop. In `main()`, after the print statement, create a `for` loop that has an iterator called `entry` and loops through `myBudgetList`. Inside the for loop, call `print()` with `entry` as an argument.

If we run `python -m budget.BudgetList`, the output should be `"The count of all expenses: 37"` followed by each of the 37 amounts.
-
Now we want to show a bar graph comparing the expenses, overages, and budget totals. First, we need to add `import matplotlib.pyplot as plt` to the top of the file after `import Expense`.
-
Then at the end of `main()`, call `fig,ax=plt.subplots()` to initialize `fig` as the Figure, or top-level container for our graph, and `ax` as the Axes, which contains the actual figure elements.
-
Create a variable called `labels` and set it equal to a list with the following values: `'Expenses', 'Overages', 'Budget'`.
-
Create a variable called `values` and set it equal to a list with the following properties from `myBudgetList`: `sum_expenses`, `sum_overages`, and `budget`.
-
Next, call `ax.bar()` with the `labels` and `values` lists as parameters.
-
To add a title, call ax.set_title() and pass in the string 'Your total expenses vs. total budget'.
-
Finally, to display the graph, call `plt.show()`.
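The iterator steps from earlier in this module can be sketched in isolation as follows. The constructor here is deliberately simplified and hypothetical (it takes the two lists directly rather than a budget) so that only the `__iter__()` / `__next__()` behaviour is on display.

```python
class BudgetList:
    def __init__(self, expenses, overages):
        # Simplified, hypothetical constructor for illustration only.
        self.expenses = expenses
        self.overages = overages

    def __iter__(self):
        # Create a fresh iterator over each list every time iteration starts.
        self.iter_e = iter(self.expenses)
        self.iter_o = iter(self.overages)
        return self

    def __next__(self):
        try:
            # Walk the expenses first...
            return self.iter_e.__next__()
        except StopIteration as stop:
            # ...then fall through to the overages. When these run out too,
            # StopIteration propagates and ends the loop.
            return self.iter_o.__next__()


entries = list(BudgetList([60, 30], [20]))
print(entries)  # [60, 30, 20]
```

Because `__iter__()` resets both inner iterators, the same object can be looped over more than once.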
To run tests run: `pytest -k "module4" -s`
To run the file: `python -m budget.ExpenseCategories`
We want to create a pie chart that compares different spending categories. But first, we need to categorize our spending data. We went ahead and wrote a method to do this using a for loop in the `Expenses` class called `categorize_for_loop()`. But now we’re wondering if this would be faster using set comprehension. Let’s write a method called `categorize_set_comprehension()` to test this. Then we can use the `timeit` module to test which one is faster.
In `Expense.py`, inside the `Expenses` class, after `categorize_for_loop()`, create a method called `categorize_set_comprehension()` that has `self` as a parameter, and put `pass` inside the body of the method for now.
Inside `categorize_set_comprehension()`, create a variable called `necessary_expenses` set equal to a pair of curly braces, which is where we’ll create the set comprehension. Inside the curly braces, we want `x for x in self.list`, then on the next line a conditional that checks if `x.category` is equal to `'Phone'` or `x.category` is equal to `'Auto and Gas'` or `x.category` is equal to `'Classes'` or `x.category` is equal to `'Utilities'` or `x.category` is equal to `'Mortgage'`.
On the next line, create a variable called `food_expenses` set equal to a similar set comprehension that checks if each `category` is equal to `'Groceries'` or `'Eating Out'`.
Then, to categorize the remaining expenses, create a variable called `unnecessary_expenses`. Set it equal to calling `set()` with `self.list` as a parameter. Then use set subtraction to subtract `necessary_expenses` and `food_expenses` from that set. This should all be on one line of code.
Finally, to return the sets together, return a list with the following variables inside: `necessary_expenses, food_expenses, unnecessary_expenses`.
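The method described above could be sketched like this. The `Expense` and `Expenses` classes below are minimal, hypothetical stand-ins for the ones in `Expense.py` (the real `Expenses` constructor and `read_expenses()` are not reproduced), and the three sample expenses are made up.

```python
class Expense:
    """Hypothetical stand-in for the project's Expense class."""
    def __init__(self, vendor, category, amount):
        self.vendor = vendor
        self.category = category
        self.amount = amount


class Expenses:
    """Hypothetical stand-in: takes the list directly instead of reading a CSV."""
    def __init__(self, items):
        self.list = items

    def categorize_set_comprehension(self):
        necessary_expenses = {x for x in self.list
                              if x.category == 'Phone' or x.category == 'Auto and Gas'
                              or x.category == 'Classes' or x.category == 'Utilities'
                              or x.category == 'Mortgage'}
        food_expenses = {x for x in self.list
                         if x.category == 'Groceries' or x.category == 'Eating Out'}
        # Whatever is left after removing the two sets above.
        unnecessary_expenses = set(self.list) - necessary_expenses - food_expenses
        return [necessary_expenses, food_expenses, unnecessary_expenses]


# Made-up sample data.
phone = Expense("Phone Co", "Phone", -50.0)
lunch = Expense("Cafe", "Eating Out", -12.5)
flight = Expense("Alaska Air", "Travel", -115.75)

expenses = Expenses([phone, lunch, flight])
necessary, food, unnecessary = expenses.categorize_set_comprehension()
print(len(necessary), len(food), len(unnecessary))  # 1 1 1
```

Objects can be placed in a set here because instances of a plain class are hashable by identity.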
In `ExpenseCategories.py`, we went ahead and called `expenses.categorize_for_loop()`. Now we want to call `categorize_set_comprehension()` and see if we get the same results.
Create a variable named `divided_set_comp` and set it equal to `expenses.categorize_set_comprehension()`.
Add an `if` statement that checks if `divided_set_comp` and `divided_for_loop` are not equal. If they are not equal, print the following: `'Sets are NOT equal by == test'`.
If we run this, we should see nothing printed to the screen since all of the sets within the list should be equal.
We can also perform mathematical set operations in Python. For instance, another way of showing that two sets are equal is to check if both sets are subsets of each other.
To demonstrate this, let’s create a `for` loop to look at each set in `divided_set_comp` and `divided_for_loop`. We can use `zip()` to pair up the sets from both lists as tuples for a for loop like so: `for a,b in zip(divided_for_loop, divided_set_comp)`. Put `pass` inside the for loop for now.
Inside the `for` loop, replace `pass` with a conditional that checks if `a` is a subset of `b` and `b` is a subset of `a` using the `issubset()` method. Add a `not` operator in front of the conditional, since we only want to print something if the equality test fails. Make sure you have parentheses around the whole expression, otherwise `not` will only apply to the first part. Inside the `if` statement, print the following: `"Sets are NOT equal by subset test"`.
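Both equality checks can be tried on their own with the sketch below. The two lists of integer sets are hypothetical placeholders for the results of `categorize_for_loop()` and `categorize_set_comprehension()`; the `mismatches` list is an extra added here only to make the outcome easy to inspect.

```python
# Hypothetical placeholder results; in the project these hold Expense objects.
divided_for_loop = [{1, 2}, {3}, {4, 5}]
divided_set_comp = [{2, 1}, {3}, {5, 4}]

# Lists of sets compare equal when each pair of sets has the same members.
if divided_set_comp != divided_for_loop:
    print('Sets are NOT equal by == test')

mismatches = []
for a, b in zip(divided_for_loop, divided_set_comp):
    # Two sets are equal exactly when each is a subset of the other.
    # The parentheses make `not` apply to the whole `and` expression.
    if not (a.issubset(b) and b.issubset(a)):
        print("Sets are NOT equal by subset test")
        mismatches.append((a, b))
```

With this sample data nothing is printed and `mismatches` stays empty, because element order does not matter for sets.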
To run tests run: `pytest -k "module5" -s`
To run the file: `python -m budget.ExpenseCategories`
We want to use the Python `timeit` module to time whether categorizing expenses was faster using a for loop or set comprehension. First, we need to `import timeit` at the top of the `ExpenseCategories.py` file.
After the for loop for the subset test, call `timeit.timeit()` with the following 4 arguments:
`stmt = "pass"` (This will eventually be the line of code whose execution we want to time.)
`setup = ''' '''` (This multi-line string will eventually hold the lines of code that are required for `stmt` to run.)
`number = 100000` (This is the number of executions to time.)
`globals = globals()` (This lets the timed code see the names defined in the current module.)
Now that we know how to use `timeit.timeit()`, let’s pass in the actual code we want to time. Replace `stmt = "pass"` with `stmt = "expenses.categorize_for_loop()"`. Also set `setup` equal to the following multi-line string:
'''
from . import Expense
expenses = Expense.Expenses()
expenses.read_expenses('data/spending_data.csv')
'''
Wrap the entire `timeit.timeit()` call from the previous task in a `print()` statement. It will then print out the total number of seconds taken to execute the statement the specified `number` of times.
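For reference, here is a self-contained sketch of the same `timeit.timeit()` call shape. The statement and setup are hypothetical (sorting a small list) so that it runs without the project files, and `number` is reduced to `1000` to keep it quick.

```python
import timeit

elapsed = timeit.timeit(
    stmt="sorted(values)",      # the code being timed
    setup='''
values = list(range(100, 0, -1))
''',                            # runs once, before the timed executions
    number=1000,                # how many times stmt is executed
    globals=globals(),          # namespace the code runs in
)

# Total seconds for all 1000 executions, as a float.
print(elapsed)
```

The returned value is the total for all executions, not an average, so divide by `number` if you want a per-call figure.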
If you test this by running `python -m budget.ExpenseCategories`, you should see roughly 1.5 seconds printed out. The exact figure depends on your machine.
Now that we’ve set up the timer for `expenses.categorize_for_loop()`, let’s set up the timer to time `expenses.categorize_set_comprehension()`. Copy and paste the entire `print(timeit.timeit(...))` code from the previous tasks. Then replace `stmt = "expenses.categorize_for_loop()"` with `stmt = "expenses.categorize_set_comprehension()"`.
If you test this by running `python -m budget.ExpenseCategories`, you should see roughly 1.6 seconds printed out for the set comprehension method.
A set comprehension may be faster than a for loop in general for a single loop. However, our method used 2 set comprehensions, each looping over the list to check its own conditionals, whereas the for loop method checked all of the conditionals in a single pass.
Now that we’ve determined which categorization method was faster, we want to create a pie chart comparing the expense totals for each category.
After the `timeit()` code, call `fig,ax=plt.subplots()` to initialize `fig` as the Figure and `ax` as the Axes.
Create a variable called `labels` and set it equal to a list with the following values: `'Necessary', 'Food', 'Unnecessary'`.
Inside the `divided_set_comp` list we have three sets of expenses divided by category. Now we want to create a list that holds the sum of the expense amounts in each of those sets. Create a variable called `divided_expenses_sum` and set it equal to an empty list.
Create a `for` loop that has an iterator called `category_exps` and loops through `divided_set_comp`. Inside the for loop, we want to sum the expense amounts for each set using a list comprehension and append that sum to the `divided_expenses_sum` list. Inside the for loop, call `divided_expenses_sum.append()`. Then inside the `append()`, call `sum()`. Inside `sum()`, we want the list comprehension `x.amount for x in category_exps`, wrapped in square brackets.
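The summing loop can be tried out with the sketch below. The `Expense` namedtuple and the sample amounts are hypothetical stand-ins for the project's `Expense` objects and the real spending data.

```python
from collections import namedtuple

# Hypothetical stand-in for the project's Expense class.
Expense = namedtuple("Expense", ["vendor", "category", "amount"])

# Made-up sets in the same order as the labels: necessary, food, unnecessary.
divided_set_comp = [
    {Expense("Phone Co", "Phone", 50.0), Expense("Power Co", "Utilities", 75.0)},
    {Expense("Market", "Groceries", 40.5)},
    {Expense("Cinema", "Stuff", 12.0)},
]

divided_expenses_sum = []
for category_exps in divided_set_comp:
    # The list comprehension pulls out each amount; sum() adds them up.
    divided_expenses_sum.append(sum([x.amount for x in category_exps]))

print(divided_expenses_sum)  # [125.0, 40.5, 12.0]
```

The resulting list lines up index by index with `labels`, which is what `ax.pie()` needs.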
Next, call `ax.pie()` with the following arguments:
`divided_expenses_sum`
`labels = labels`
`autopct = '%1.1f%%'` (This will format the percentage.)
Finally, to display the graph, call `plt.show()`.
To see the results yourself, you can run `python -m budget.ExpenseCategories` from the top-level directory. You should see the pie chart pop up in another window automatically.