tblount / bash-workshop

Example implementations of code written for "How To Be an Effective Programmer: Linux, Command Lines, and Editors"

Workshop Reference

This repository contains example implementations for the UAB ACM workshop "How To Be an Effective Programmer: Linux, Command Lines, and Editors" in both Python 3 and Java 8. This portion of the workshop showcases three simple programs and how they might be composed together with the command line to achieve some interesting and useful behaviour.

You may safely disregard this paragraph if you intend to mostly copy and paste commands as you follow along. Python and Java implementations are in the python and java folders, respectively. All commands listed, unless otherwise stated, work in both bash and Windows cmd. All commands assume that your present working directory is the root of this repository. If you have no interest in eventually composing and interchanging the Python and Java programs, you may save yourself some typing by changing into either directory. In that case, you may drop the python/ prefix in python commands, and drop the -cp java portion of java commands. For example, python -u python/generator.py becomes python -u generator.py, and java -cp java Generator becomes java Generator.

A note on python

The Python code in this repository was written for Python 3. If you're running a system with multiple installations of Python, there's a good chance that python will execute Python 2, in which case you'll likely need to use python3 instead for the remainder of the workshop. Check your version of Python with

python --version
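The same check can be made from inside a script. The sketch below is only an illustration; the helper name is_python3 is hypothetical and is not part of the repository.

```python
import sys

def is_python3(version_info=sys.version_info):
    """Return True when the major version is 3 or newer (hypothetical helper)."""
    return version_info[0] >= 3

# Stop early with a hint if the script was started by a Python 2 interpreter.
if not is_python3():
    sys.exit("These examples expect Python 3; try the python3 command instead.")
```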

A note on java

We won't touch on compiling Java programs below. If you intend to write the code described here yourself, you'll have to recompile your files whenever they change. If you're simply following along, compile all of the Java source files in the java/ directory with the command below.

javac java/*.java

stdin, stdout, and stderr

Command-line programs interface via three streams of text: stdin, stdout, and stderr. A program gets its input via stdin, its typical results are printed to stdout, and its errors are directed to stderr. The command-line can redirect these streams between programs, and between programs and files.
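As a rough illustration of the three streams from a program's point of view, here is a minimal Python sketch. The function echo_streams and its blank-line rule are assumptions made up for this example; they do not come from the workshop code.

```python
import io
import sys

def echo_streams(stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    """Copy each input line to stdout; report blank lines on stderr instead."""
    for line in stdin:
        text = line.rstrip("\n")
        if text:
            print(text, file=stdout)                 # typical results
        else:
            print("error: blank line", file=stderr)  # errors

# In-memory demonstration so the sketch does not wait on a real terminal.
demo_out, demo_err = io.StringIO(), io.StringIO()
echo_streams(io.StringIO("a\n\nb\n"), demo_out, demo_err)
```

Calling echo_streams() with no arguments would use the real stdin, stdout, and stderr of the process.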

Following are brief introductions to a few stream operators.

Redirecting output to a file

The > operator can be used to route stdout or stderr to a file. This is a good time to introduce our first program, the generator.


Program: Generator

This program lives in generator.py and Generator.java. Its job is simply to generate test data. This implementation prints the numbers in the range [0, 29] to stdout, occasionally printing an error to stderr instead. Invoke with

python python/generator.py

or

java -cp java Generator

For brevity, later commands will primarily invoke the python variant. They are interchangeable.
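For readers who want a feel for what such a program might look like, here is a minimal sketch. It is not the repository's generator.py: the function name, the error message, and the rule deciding which numbers become errors are all assumptions chosen for illustration.

```python
import io
import sys

def generate(count=30, is_error=lambda n: n % 7 == 6, out=sys.stdout, err=sys.stderr):
    """Print 0..count-1 to out; for some values, print an error to err instead."""
    for n in range(count):
        if is_error(n):  # hypothetical rule: every seventh value fails
            print("error: could not produce value {}".format(n), file=err)
        else:
            print(n, file=out)

# In-memory demonstration of which lines land on which stream.
demo_out, demo_err = io.StringIO(), io.StringIO()
generate(out=demo_out, err=demo_err)
demo_numbers = demo_out.getvalue().splitlines()
demo_errors = demo_err.getvalue().splitlines()
```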


Imagine the data produced by Generator might be useful to us later. Rather than manually copying its output and pasting it into a file, we can simply redirect its output to a file, in this case named output.txt.

python python/generator.py > output.txt

Note that the errors were still printed to the terminal instead of being written to the file. By default, the > operator only redirects stdout. If you are interested in capturing errors, you may prepend 2 as shown below:

python python/generator.py 2> errors.txt
  • but what if you want both? No problem.
python python/generator.py > output.txt 2> errors.txt

A note: on the command line, stdin is referred to as 0, stdout as 1, and stderr as 2. So command > file is shorthand for command 1> file.
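A program can set up the same redirections when it launches another program. The sketch below uses Python's subprocess module to do roughly what command > output.txt 2> errors.txt does. The child program is a stand-in invented for this example, and the files are written to a temporary directory so nothing in the repository is touched.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in child (hypothetical): one line to stdout, one line to stderr.
CHILD = "import sys; print('data'); print('oops', file=sys.stderr)"

workdir = tempfile.mkdtemp()
out_path = os.path.join(workdir, "output.txt")
err_path = os.path.join(workdir, "errors.txt")

# Roughly: command > output.txt 2> errors.txt
with open(out_path, "w") as out_file, open(err_path, "w") as err_file:
    subprocess.run([sys.executable, "-c", CHILD],
                   stdout=out_file, stderr=err_file, check=True)

with open(out_path) as f:
    captured_out = f.read()
with open(err_path) as f:
    captured_err = f.read()
```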

Imagine you're writing the output of some long-running process to a log for later review. It would probably be a mistake to use >, as it overwrites the entire file each time. Enter the >> operator. It appends output to the end of the file instead, which is much more suitable for a long-running process.

python python/generator.py 2>> errors.txt
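The difference between the two operators mirrors the "w" and "a" file modes in Python. The sketch below is an analogy only, with made-up log messages, and writes to a temporary file.

```python
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "errors.txt")

# Mode "w" behaves like >: each open truncates the file first.
for message in ("first run", "second run"):
    with open(log_path, "w") as log:
        log.write(message + "\n")
with open(log_path) as log:
    after_overwrite = log.read()

# Mode "a" behaves like >>: each open adds to the end of the file.
for message in ("third run", "fourth run"):
    with open(log_path, "a") as log:
        log.write(message + "\n")
with open(log_path) as log:
    after_append = log.read()
```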

Redirecting a file into stdin

So we have our test data from last time saved. How do we use it? We could always copy the contents of our file to our clipboard and paste it into another running program. But why would we? The < operator redirects the file on the right into the stdin of the program on the left. Now would be a good time to have such a program to consume our data.


Program: Transformer

This program lives in transformer.py and Transformer.java. Its job is to process numbers fed into its stdin and print the result to stdout. For simplicity, it just squares the numbers. It stops running when anything other than a number is fed into it. Invoke with

python python/transformer.py

or

java -cp java Transformer

We can now use our test data with Transformer by executing the following:

python python/transformer.py < output.txt
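A program like Transformer could be sketched as follows. This is not the repository's transformer.py; the function name and the in-memory input are assumptions for illustration, while the behavior (square each number, stop at the first line that is not an integer) follows the description above.

```python
import io
import sys

def transform(lines, out=sys.stdout):
    """Square each integer line; stop at the first line that is not an integer."""
    for line in lines:
        try:
            number = int(line.strip())
        except ValueError:
            break  # anything other than a number ends processing
        print(number * number, file=out)

# In-memory demonstration; a real script would call transform(sys.stdin).
demo_out = io.StringIO()
transform(io.StringIO("2\n3\nnot a number\n4\n"), demo_out)
```

Note that the 4 after the bad line is never squared, because reading stops at the first failure.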

Piping stdout into another program's stdin

As of right now, our pipeline looks like the following:

  1. Generator => a file
  2. that file => Transformer

If we don't need to keep our test data around for later use, we can skip the file entirely! Enter the | (pipe) operator. It takes the left program's stdout and pipes it directly into the right program's stdin. Only the last program's stdout reaches the terminal, while the stderr of every program in the pipeline is still printed there. See below.

python -u python/generator.py | python python/transformer.py

Now our pipeline looks a bit shorter:

Generator => Transformer
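What the shell does for a pipe can be approximated in Python with subprocess.Popen. In the sketch below, PRODUCER and CONSUMER are small stand-in programs invented for this example, not the workshop's Generator and Transformer.

```python
import subprocess
import sys

# Stand-in programs (hypothetical): one prints 0..4, the other squares its input.
PRODUCER = "for n in range(5): print(n)"
CONSUMER = "import sys\nfor line in sys.stdin: print(int(line) ** 2)"

# Roughly: producer | consumer
producer = subprocess.Popen([sys.executable, "-u", "-c", PRODUCER],
                            stdout=subprocess.PIPE)
consumer = subprocess.Popen([sys.executable, "-c", CONSUMER],
                            stdin=producer.stdout,
                            stdout=subprocess.PIPE, text=True)
producer.stdout.close()  # let the producer notice if the consumer exits early
piped_output, _ = consumer.communicate()
producer.wait()
```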

An aside: you may have noticed the additional `-u` switch above. The streams we cover here can be *buffered*, which means that data is saved and sent in batches. The size of said buffer is completely up to the program in question. By default, python's buffer size is large enough that the small amount of data being printed here results in the programs in our pipeline being processed serially, rather than concurrently. Future python commands will use the `-u` switch if they're on the sending end of a pipe operator. Java users need not worry here.
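If you are writing the sending program yourself, flushing after each line is one alternative to passing -u on the command line. The sketch below assumes a hypothetical emit helper; CountingStream exists only so the demonstration can show that a flush happens on every call.

```python
import io
import sys

def emit(value, out=sys.stdout):
    """Write one value and flush so a downstream reader sees it right away."""
    print(value, file=out, flush=True)

class CountingStream(io.StringIO):
    """In-memory stream that counts flush() calls (demonstration only)."""
    def __init__(self):
        super().__init__()
        self.flushes = 0

    def flush(self):
        self.flushes += 1
        super().flush()

demo_stream = CountingStream()
for n in range(3):
    emit(n, demo_stream)
```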

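If you want to pipe both stdout and stderr into another program, you may use the following. The |& operator is bash shorthand (version 4 and later) for 2>&1 |, and the longer form also works in shells that lack |&.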

python -u python/generator.py |& python python/transformer.py

Do note, however, that Transformer stops reading input after the first failure to parse an integer.
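When launching a program from Python, a similar merge can be requested with stderr=subprocess.STDOUT. The child below is a stand-in invented for this sketch; it is run with -u so its output is not held back in a buffer.

```python
import subprocess
import sys

# Stand-in child (hypothetical): two numbers on stdout, one error on stderr.
CHILD = "import sys; print('1'); print('boom', file=sys.stderr); print('2')"

# Roughly: command 2>&1 | reader, with this script acting as the reader.
merged = subprocess.run([sys.executable, "-u", "-c", CHILD],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT,
                        text=True).stdout
merged_lines = merged.splitlines()
```

A reader that stops at the first non-number would stop at the error line once the two streams are merged like this.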

Combining pipes and redirection

Following are some examples combining pipes and redirection:

  • redirect transformed output to a file
python -u python/generator.py | python python/transformer.py > output2.txt
  • redirect the stderr of multiple programs to the same file (the parentheses group the pipeline so that 2> applies to both programs, not just the last one)
(python -u python/generator.py | python python/transformer.py) 2> errors.txt
  • redirect the stderrs of multiple programs individually to separate files
python -u python/generator.py 2> errors1.txt | python python/transformer.py 2> errors2.txt

Piping stderr

Here's where things really start to heat up. Imagine you have a long-running process running on a production server. It could be a web server, a Minecraft server, or anything with a command-line interface. It may occasionally encounter a non-fatal error and print a message to stderr. Unless you want to manually check server logs, it'd probably be a good idea to be notified in such a scenario. If the process wasn't written by you, it may not be feasible to modify it to send an email or notification in the case of such an error.

Here's where we'll introduce the final program, ErrorHandler.


Program: ErrorHandler

This program lives in error_handler.py and ErrorHandler.java. Its responsibility is to receive errors via stdin and notify us. For simplicity's sake, the workshop version simply reprints each line given to it. A more robust version is provided in text_errors_threaded.py, but it won't be covered here. For now, use your imagination.
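A handler of this kind might be sketched as below. This is not the repository's error_handler.py; the function name and the notify parameter are assumptions. By default it reprints each line, and a different callable could be passed in to send a message somewhere else.

```python
import io
import sys

def handle_errors(lines, notify=print):
    """Pass each incoming error line to notify; return how many were handled."""
    handled = 0
    for line in lines:
        notify(line.rstrip("\n"))
        handled += 1
    return handled

# In-memory demonstration; a real script would call handle_errors(sys.stdin).
collected = []
handled_count = handle_errors(io.StringIO("disk full\ntimeout\n"), collected.append)
```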


To accomplish this, we'll need to be able to pipe stderr into another program's stdin. As covered earlier, the pipe operator can only send either stdout alone, or stdout and stderr together, to another program. We can work around this by swapping stdout and stderr.

Consider the pseudocode below for swapping two variables:

a = 1
b = 2

temp = a
a = b
b = temp

Note that we must assign a to a temporary variable in order to accomplish the swap. We'll perform a similar operation to swap stdout and stderr:

<command1> 3>&1 1>&2 2>&3 | <command2>

The > operator doesn't only work on files! Instead, it operates on file descriptors. Without getting into too much detail, that means we aren't limited to files on our hard drive. 3>&1 is synonymous with the temp = a line in our pseudocode, and reads as 'make stream 3 point to wherever file descriptor 1 currently points'. Stream 3 doesn't exist yet, so it is created on the fly. File descriptor 1 is stdout, as mentioned previously. 1>&2 points stdout at wherever stderr points, and 2>&3 points stderr at wherever 3 points, which is the original stdout. That was a lot to digest, so let's apply it.

python -u python/generator.py 3>&1 1>&2 2>&3 | python python/error_handler.py
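The same three-step swap can be written out with os.dup and os.dup2, which may make the parallel with the temp variable easier to see. In the sketch below, the swap runs inside a small child program (invented for this example) so the parent can capture both streams and confirm that they traded places.

```python
import subprocess
import sys

# Child program (hypothetical): swap descriptors 1 and 2, then write to each.
CHILD = """
import os
saved = os.dup(1)      # temp = a   (like 3>&1)
os.dup2(2, 1)          # a = b      (like 1>&2)
os.dup2(saved, 2)      # b = temp   (like 2>&3)
os.close(saved)
os.write(1, b"written to fd 1\\n")
os.write(2, b"written to fd 2\\n")
"""

swapped = subprocess.run([sys.executable, "-c", CHILD],
                         capture_output=True, text=True)
# After the swap, what the child wrote to fd 1 arrives on the parent's
# stderr capture, and what it wrote to fd 2 arrives on the stdout capture.
```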

Notice that every error emitted by Generator is immediately picked up and 'handled' by ErrorHandler. We could now implement any error-handling logic we want within ErrorHandler. Do you want to send a text to your phone? Do you want a Slack notification in your team's bugs channel? Do you want an email sent to the CTO of whatever company wrote the buggy software? The possibilities are endless.


Following are more examples that tie the previous ones together.

  • error handling for transformed data
(python -u python/generator.py | python python/transformer.py) 3>&1 1>&2 2>&3 | python python/error_handler.py
  • redirect transformed data to file
((python -u python/generator.py | python python/transformer.py) 3>&1 1>&2 2>&3 | python python/error_handler.py) 2> output.txt
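When the launching program is itself written in Python, the swap is not needed, because each stream can be routed separately. The sketch below sends a child's stdout to a file while reading its stderr line by line. The child program and the handled list are stand-ins invented for this example, and the data file lives in a temporary directory.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in child (hypothetical): squares 0..3, but reports an error for 2.
CHILD = """
import sys
for n in range(4):
    if n == 2:
        print("error at 2", file=sys.stderr)
    else:
        print(n * n)
"""

data_path = os.path.join(tempfile.mkdtemp(), "output.txt")
handled = []

with open(data_path, "w") as data_file:
    child = subprocess.Popen([sys.executable, "-u", "-c", CHILD],
                             stdout=data_file,        # data goes to the file
                             stderr=subprocess.PIPE,  # errors come to us
                             text=True)
    for line in child.stderr:       # loop ends when the child closes stderr
        handled.append(line.rstrip("\n"))
    child.stderr.close()
    child.wait()

with open(data_path) as f:
    saved_data = f.read()
```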
