fillerwriter / hhdcpython

Scripts and documentation for Hacks/Hackers DC workshop.

Hacks/Hackers DC Python class instructions

It's your first week as a newsroom coder. A city reporter needs you to clean and summarize some data. She doesn't need anything fancy, but she does need it in a few hours.

Fortunately, one of your colleagues set up a repository with the dataset and an unfinished Python script with some notes. Unfortunately, the data is a real mess. Oh, and did I mention the data is in Spanish?

Now it's up to you to finish the job.

You'll make a copy of your colleague's work using GitHub's fork feature, install your copy on PythonAnywhere, then modify it to process the data.

By the end, you'll produce a clean CSV version of the data you can share with the reporter so she can analyze it in Excel. You'll also produce some basic analysis of the cleaned data to help her get started.
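To give you a feel for the kind of cleanup and summary involved, here is a minimal sketch using only Python's standard library. Everything specific in it is an assumption made up for illustration: the sample rows, the column names (`barrio`, `quejas`), and the helper functions `clean_rows`, `to_csv`, and `totals_by_barrio`. The real dataset and the unfinished script in the repository may look quite different.

```python
import csv
import io
from collections import Counter

# Hypothetical messy input: stray whitespace, inconsistent capitalization,
# and a blank line. The real workshop data will look different.
RAW = """barrio,quejas
  san josé ,3
SAN JOSÉ,2

La Unión, 5
"""


def clean_rows(text):
    """Return a list of dicts with trimmed, consistently cased values."""
    cleaned = []
    # csv.DictReader skips completely blank lines on its own.
    for row in csv.DictReader(io.StringIO(text)):
        barrio = (row["barrio"] or "").strip().title()
        quejas = (row["quejas"] or "").strip()
        # Drop rows with a missing name or a non-numeric count.
        if not barrio or not quejas.isdigit():
            continue
        cleaned.append({"barrio": barrio, "quejas": int(quejas)})
    return cleaned


def to_csv(rows):
    """Serialize cleaned rows as CSV text that Excel can open."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["barrio", "quejas"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def totals_by_barrio(rows):
    """Add up the counts for each cleaned neighborhood name."""
    totals = Counter()
    for row in rows:
        totals[row["barrio"]] += row["quejas"]
    return dict(totals)


rows = clean_rows(RAW)
clean_csv = to_csv(rows)
totals = totals_by_barrio(rows)
print(clean_csv)
print(totals)
```

When you read from a real file instead of a string, open it with `open(path, newline="", encoding="utf-8")` so accented characters such as "é" and "ó" survive the round trip, assuming the file is UTF-8 encoded.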

Prerequisites

You'll need to sign up with a couple of services to complete this job: GitHub, where you'll fork the code, and PythonAnywhere, where you'll edit and run it.

Getting set up for the first time

At least your charmingly cryptic co-worker David left you a place to start: a Git repository with some data and the bones of a script to process it.

Of course, David was only following the practices established for any small data project at our company.

Fork the code

Forking is a way of making your own version of someone else's code, usually because you want to improve it or modify it for your own purposes. At our company, you should always fork before you work on a project.

Fork the repository for the class project so you have your own copy of the code to hack on.

Animation of forking a repository

(You might think it's weird to digress into Git and version control so quickly. But this tutorial is meant to mimic a professional setting, and you're going to struggle a bit with version control in a professional setting.)

Install the code on PythonAnywhere

Log in to PythonAnywhere.

The first screen is the "Consoles" tab. Under the "Start a new console" section at the top of the page, click "Bash".

This creates a basic terminal session where you can run and manage code.

Once you see a prompt that looks like 03:11 ~ $, you're ready to start typing in commands. Now it's time to get your copy of the code. Type the following into the terminal on your screen:

git clone https://github.com/<MYGITHUBUSERNAME>/hhdcpython

Note that you must replace <MYGITHUBUSERNAME> with your GitHub username.
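If the placeholder substitution is unclear, this small Python sketch spells it out. The helper `clone_command` and the username `yourname` are hypothetical, invented for illustration; the function only builds the command text and does not run Git.

```python
def clone_command(username, repo="hhdcpython"):
    """Build the git clone command for a user's fork (hypothetical helper)."""
    username = username.strip()
    # Catch the common mistake of leaving the placeholder or its brackets in.
    if not username or "<" in username or ">" in username:
        raise ValueError("Replace <MYGITHUBUSERNAME> with your GitHub username")
    return f"git clone https://github.com/{username}/{repo}"


# "yourname" is a stand-in; use your own GitHub username here.
print(clone_command("yourname"))
```

The angle brackets are part of the placeholder, so they should not appear in the command you type.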

Animation of creating a bash console

Editing and running the script

  • If you are in a console, click the "PythonAnywhere" logo in the upper left to go back to the main screen.
  • Click the "Files" tab.
  • Click the hhdcpython directory on the left side of the screen.
  • Click the process.py file on the right-hand side of the screen.

A file editor will appear. You can now edit and run the script.

Animation of editing and running

Try running it now. We'll walk through editing the script and processing the data when we meet.
