- tags
python, shell
- author
Roland Smith
This is a collection of small utilities that I've written over the years. Some of them are simple front-ends for a utility with some standard options, to save me from having to recall the options every time I need them.
Another portion are basically Python front-ends to run a utility in parallel on different files.
All the functions in the Python scripts come with documentation strings to explain what they do. The shell scripts have comments where necessary. They use basic sh syntax and do not use bash extensions.
All these programs are tested and in use on the FreeBSD operating system. The shell scripts use the plain old sh that comes with FreeBSD, but should work with bash. Bug reports and patches welcome. Most of it should work on other BSD systems, Linux or OS-X without major problems.
The following scripts use Python 3.x specific features (like os.cpu_count, subprocess.DEVNULL and concurrent.futures):
- checkfor.py
- dicom2jpg.py
- dicom2png.py
- dvd2webm.py
- foto4lb.py
- git-check-all.py
- gitdates.py
- img4latex.py
- make-flac.py
- make-mp3.py
- missing-libs.py
- tifftopdf.py
- vid2mkv.py
- vid2mp4.py
Other Python scripts are written for Python 3 but could be usable on Python 2.x with some changes. The Python scripts should work on other BSD systems, Linux and OS-X. They might work on MS-windows as well, provided that the external programs and modules they use are available. This has not been tested, however. Patches welcome.
All of these programs are in the public domain. Use them as you wish. See LICENSE.txt for the full text of the license.
Backs up mount points to other mount points. This script is designed to be run from cron as root.
Note
You should not run this script as-is!
Change the __mkbackup calls at the end of the script to reflect your situation.
This is more of a snippet of Python code. It provides a function called checkfor
that detects the availability of a required program. It is designed to be called from the main part of a script and terminates the script if the required program is not found.
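A minimal sketch of such a check, assuming that looking the program up on the PATH with shutil.which is sufficient. The real checkfor in this collection may work differently (for example by actually running the program); the message and the exit code used here are illustrative choices.

```python
import shutil
import sys


def checkfor(program):
    """Terminate the script if `program` cannot be found on the PATH.

    Assumption: a PATH lookup is enough; nothing is executed.
    """
    if shutil.which(program) is None:
        print(f"Required program '{program}' not found.", file=sys.stderr)
        sys.exit(1)
```

It would be called early in the main part of a script, once for every external program that the script depends on.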
This script removes several types of generated files from the directory it is called from.
Prints the time in several timezones that interest me in my locale. You should probably change the timezone and locale to suit your preferences.
Convert a CSV file to a LaTeX table.
This script reads /var/log/security or any other file that contains ipfw log messages, and makes an overview of incoming packets that have been logged.
This of course requires that blocked packets are logged!
If you are writing your own firewall script, make sure to use deny log instead of just deny.
A modification of the dicom2png program mentioned below to produce JPEG output. This is meant for situations where lossy compression is acceptable.
Convert DICOM files from an x-ray machine to PNG format, remove blank areas. The blank area removal is based on the image size of a Philips flat detector. The image goes from 2048x2048 pixels to 1574x2048 pixels.
Since version 1.1.0, this program requires the py-wand library, which in turn requires the ImageMagick shared library libMagickWand-6. Previous versions used the convert program from ImageMagick directly.
Multiple images are processed in parallel using a ProcessPoolExecutor from the concurrent.futures module, using as many worker processes as your CPU has cores. This number is determined by the os.cpu_count function, so this program requires at least Python 3.4.
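The parallel processing described above could look roughly like the sketch below. The convert function is a hypothetical stand-in that only derives the output name; the real script does the actual image conversion there.

```python
import concurrent.futures as cf
import os


def convert(path):
    """Hypothetical stand-in for the real per-file conversion.

    It only derives the name of the PNG file from the input name.
    """
    root, _ = os.path.splitext(path)
    return root + ".png"


def convert_all(paths):
    """Run convert() on all paths, using one worker process per core."""
    with cf.ProcessPoolExecutor(max_workers=os.cpu_count()) as executor:
        # map() hands the paths to the workers and keeps the input order.
        return list(executor.map(convert, paths))
```

When used in a script, convert_all should be called from under an `if __name__ == "__main__":` guard, because the worker processes may import the main module again.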
When I buy DVDs, I generally transfer their contents to my computer for easier viewing. However, the video and audio format used on DVD is not very compact. So I tend to use ffmpeg to convert it to smaller formats without losing quality. As of 2016, my favorite storage format is a webm container with a VP9 video stream and vorbis audio.
Initially I used the simple webm.sh script mentioned below. This had some shortcomings. It does not crop the video and cannot incorporate subtitles. It does enable multiple quality settings, but I seldom used those.
The dvd2webm.py
script performs a 2-pass encoding in constrained quality mode. Optionally it also adds subtitles to the video, and starts from an offset.
Small helper script to start mutt in an urxvt terminal for a mailto
link.
Front-end for find to locate all files under the current directory that have been modified up to a given number of days ago.
Corrects the BoundingBox
for single-page PostScript documents. It requires the ghostscript program.
Scales photos for including them into LaTeX documents. The standard configuration sets the width to 886 pixels and sets the resolution to 300 dpi. This gives an image 75 mm (about 3 in) wide.
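The relation between pixel width, resolution and printed width is simple arithmetic. The helper names below are made up for this illustration and are not taken from the script.

```python
MM_PER_INCH = 25.4


def print_width_mm(pixels, dpi):
    """Return the printed width in mm of an image `pixels` wide at `dpi`."""
    return pixels / dpi * MM_PER_INCH


def pixels_for_width(width_mm, dpi):
    """Return the number of pixels needed for `width_mm` at `dpi`."""
    return round(width_mm / MM_PER_INCH * dpi)


print(round(print_width_mm(886, 300), 1))  # 75.0
print(pixels_for_width(75, 300))  # 886
```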
Generates a backup of the directory it is called from in the form of a tar-file. The name of the backup file generally consists of:
- the word backup,
- the date in the form YYYYMMDD,
- the short commit hash if the directory is managed by git.
These parts are separated by dashes, and the file gets the .tar
extension. It requires the tar
program. Tested with FreeBSD's tar. Should work with GNU tar as long as you don't use the -x
option; the exclude syntax is different between BSD tar and GNU tar.
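The naming scheme can be sketched as follows. How the script obtains the date and the git hash is not shown here; both are simply passed in, and the function name is hypothetical.

```python
import datetime


def backup_name(date, short_hash=None):
    """Build the backup file name from a date and an optional git hash."""
    parts = ["backup", date.strftime("%Y%m%d")]
    if short_hash:
        # Only present when the directory is managed by git.
        parts.append(short_hash)
    return "-".join(parts) + ".tar"


print(backup_name(datetime.date(2016, 3, 5), "3452c8a"))
# backup-20160305-3452c8a.tar
```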
Generates an old-fashioned one-time pad; 65 lines of 12 groups of 5 random capital letters. Each pad has a header line containing a random identifier. It was inspired by reading Neal Stephenson's Cryptonomicon.
It uses random numbers from the operating system via Python's os.urandom
function.
A partial example:
+++++ KWSNKYJLFF +++++
01 WAGGB HJVHQ TTQPD LQUMD KFRFS GGCKA SVLLA WEUCS HTXNI DITNW RBZKM SEGGW
02 GDSBB XECBL AUVLQ TUDPO DTXKW MWGAV DLRXT NRYAH HTGII YXEJJ JLNRC BIVDX
03 JDQUJ QPAUT CUEHN RHIHT QYBGV WOVAQ MKVZQ WPRGL QJAVA RPLRS AXIII FKLEP
04 WXYAD JNSAQ LBRXE QLCUX ZCLIE WPHSO OZBNH ZQLVN FAUEZ IDAJY VPQJN WVCAD
05 BEYRE WORKU CPEGE JKKWZ XUVYU WSZXQ NOULH QOFDQ PREMG YJBIT GMOAM USKLV
06 ZVATP YSRWH EEQDV LIPVQ FVYSY CIICG JKMOA RFJYE RUDJG HHJXI NNPNU VERMN
07 WAHFD WGGGN GHIUM BCJNN CVBCK QXYGZ PEYLW XOGMT SJFQJ NWEBE BFBPJ IDHDB
08 NPPEG HNONE YCJTG BFSFA NFYUR CMCGD XSKRO NSRBX WSDDX MEMLX BBMLC IMDJL
09 PZNAK OCOXA PEGNL UAWQW YCVDM WBNZZ YQICH MTLBG LDQTW TQMCS KUYBN RUNXT
...
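The pad layout described above (a header line with a random identifier, followed by 65 numbered lines of 12 groups of 5 capital letters) can be sketched like this. The way bytes are turned into letters is an assumption: this sketch discards bytes of 234 and higher so that every letter is equally likely, which is not necessarily what the script does. The function names are hypothetical.

```python
import os


def random_letters(count):
    """Return `count` capital letters derived from os.urandom."""
    letters = []
    while len(letters) < count:
        for byte in os.urandom(count):
            # 234 = 9 * 26; larger bytes are skipped to avoid modulo bias.
            if byte < 234 and len(letters) < count:
                letters.append(chr(ord("A") + byte % 26))
    return "".join(letters)


def make_pad(lines=65, groups=12, size=5):
    """Return the text of a one-time pad, including the header line."""
    rows = [f"+++++ {random_letters(10)} +++++"]
    for number in range(1, lines + 1):
        text = random_letters(groups * size)
        chunks = [text[i:i + size] for i in range(0, len(text), size)]
        rows.append(f"{number:02d} " + " ".join(chunks))
    return "\n".join(rows)
```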
My impression is that the random data device on FreeBSD is pretty good:
> ./ent -u
ent -- Calculate entropy of file. Call
with ent [options] [input-file]
Options: -b Treat input as a stream of bits
-c Print occurrence counts
-f Fold upper to lower case letters
-t Terse output in CSV format
-u Print this message
By John Walker
http://www.fourmilab.ch/
January 28th, 2008
> dd if=/dev/random of=rdata.bin bs=1K count=1K
1024+0 records in
1024+0 records out
1048576 bytes transferred in 0.086200 secs (12164455 bytes/sec)
> ./ent rdata.bin
Entropy = 7.999857 bits per byte.
Optimum compression would reduce the size
of this 1048576 byte file by 0 percent.
Chi square distribution for 1048576 samples is 208.12, and randomly
would exceed this value 98.57 percent of the times.
Arithmetic mean value of data bytes is 127.5057 (127.5 = random).
Monte Carlo value for Pi is 3.137043522 (error 0.14 percent).
Serial correlation coefficient is 0.000771 (totally uncorrelated = 0.0).
According to the manual page, Wikipedia and other sources I could find, the FreeBSD random device is intended to provide cryptographically secure pseudorandom data.
Generates random passwords. Like genotp, it uses random numbers from the operating system via Python's os.urandom function and converts them to text using base64 encoding. On FreeBSD I think this is secure enough given the previous section.
An example:
> python3 genpw.py -l 24 -g 4
BU_7 7RcI jjce zAKo 83v8 RAk_
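A rough sketch of how such a password can be produced. The example output contains underscores, so this sketch assumes the URL-safe base64 alphabet; the function name and the mapping of the -l and -g options to the parameters are assumptions as well.

```python
import base64
import math
import os


def genpw(length=24, group=4):
    """Return `length` random characters, split in groups of `group`."""
    # Every 3 random bytes yield 4 base64 characters.
    nbytes = math.ceil(length * 3 / 4)
    text = base64.urlsafe_b64encode(os.urandom(nbytes)).decode("ascii")
    text = text[:length]  # This also drops any '=' padding.
    return " ".join(text[i:i + group] for i in range(0, length, group))


print(genpw(24, 4))
```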
Find all directories in the user's home directory that are managed with git, and run git gc
on them unless they have uncommitted changes.
For all command-line arguments, print out when they were first checked into git.
For each file in a directory managed by git, get the short hash and date of the most recent commit of that file.
Makes a histogram of the bytes in each input file, and calculates the entropy in each file.
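The entropy meant here is presumably the Shannon entropy in bits per byte, as also reported by ent. A minimal sketch, with a function name chosen for this illustration:

```python
import collections
import math


def entropy(data):
    """Return the Shannon entropy of `data` in bits per byte."""
    counts = collections.Counter(data)  # The histogram of byte values.
    total = len(data)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


print(entropy(bytes(range(256))))  # 8.0
```

Data in which every byte value occurs equally often reaches the maximum of 8 bits per byte, while data consisting of a single repeated byte has an entropy of zero.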
A program to check a PDF, PNG or JPEG file and return a suitable LaTeX figure environment for it.
Since version 1.2, this program requires the py-wand library, which in turn requires the ImageMagick shared library libMagickWand-6. Previous versions used the identify program from ImageMagick directly.
This program also requires the ghostscript interpreter to determine the size of PDF files.
As of version 1.4 it reads the text block width and height in mm from an INI-style configuration file named ~/.img4latexrc. A valid example is shown below.
[size]
width = 100
height = 200
The image is scaled so that it fits within the text block. If a bitmapped image does not have a defined resolution, 300 pixels/inch is assumed.
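Reading such a configuration and deriving a scale factor could be done along these lines. The function names are hypothetical, and the sketch assumes that images are only scaled down, never enlarged; the script itself may decide differently.

```python
import configparser

MM_PER_INCH = 25.4
EXAMPLE = """
[size]
width = 100
height = 200
"""


def read_text_block(text):
    """Return (width, height) of the text block in mm from INI text."""
    config = configparser.ConfigParser()
    config.read_string(text)
    return config.getfloat("size", "width"), config.getfloat("size", "height")


def scale_factor(pixels, dpi, block_mm):
    """Return the factor that makes an image fit inside the text block.

    pixels is (width, height); a missing resolution is taken as 300 dpi.
    Assumption: images that already fit are left alone (factor 1.0).
    """
    dpi = dpi or 300
    size_mm = [p / dpi * MM_PER_INCH for p in pixels]
    return min(1.0, *(b / s for b, s in zip(block_mm, size_mm)))


block = read_text_block(EXAMPLE)
# A 3000x1500 pixel image at 300 dpi is 254 mm x 127 mm.
print(round(scale_factor((3000, 1500), None, block), 3))  # 0.394
```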
Lock down files or directories.
This makes files read-only for the owner and inaccessible for the group and others. Then it sets the user immutable and user undeletable flag on the files. For directories, it recursively treats the files as mentioned above. It then sets the directories to read/execute only for the owner and inaccessible for the group and others. Then it sets the user immutable and undeletable flag on the directories as well.
Using the -u flag unlocks the files or directories, making them writable for the owner only.
As usual, I wrote this to automate and simplify something that I was doing on a regular basis; safeguarding important but not often changed files.
The os.chflags function that is used in this script is only available on UNIX-like operating systems. So this doesn't work on ms-windows.
Encodes WAV files from cdparanoia to FLAC format. Processing is done in parallel using as many subprocesses as the machine has cores. Album information is gathered from a text file called album.json.
This file has the following format:
{
"title": "title of the album",
"artist": "name of the artist",
"year": 1985,
"genre": "rock",
"tracks": [
"foo",
"bar",
"spam",
"eggs"
]
}
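A sketch of how this file can be turned into per-track information. The keys of the resulting dictionaries are chosen for this illustration only; they are not necessarily the tag names that the script passes to the encoder.

```python
import json

EXAMPLE = """
{
    "title": "title of the album",
    "artist": "name of the artist",
    "year": 1985,
    "genre": "rock",
    "tracks": ["foo", "bar", "spam", "eggs"]
}
"""


def track_tags(text):
    """Return a list with one dictionary of tags for every track."""
    album = json.loads(text)
    tags = []
    # Track numbers start at 1 and follow the order of the "tracks" list.
    for number, title in enumerate(album["tracks"], start=1):
        tags.append({
            "album": album["title"],
            "artist": album["artist"],
            "year": album["year"],
            "genre": album["genre"],
            "tracknumber": number,
            "title": title,
        })
    return tags


print(track_tags(EXAMPLE)[2]["title"])  # spam
```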
Works like make-flac.py
but uses lame to encode to variable bitrate MP3 files. It uses the same album.json
file as make-flac.
Use montage
from the ImageMagick suite to create an index picture of all the files given on the command-line.
Use convert
from the ImageMagick suite to convert scanned images to PDF files.
It assumes that images are scanned at 150 PPI, and the target page is A4.
Replaces whitespace in filenames with underscores.
Reads an SRT file and applies the given offset to all times in the file. This time-shifts all subtitles.
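Time-shifting can be sketched as below, assuming the usual SRT time format of HH:MM:SS,mmm and an offset given in milliseconds. The function names are hypothetical, and clamping negative results to zero is a choice made for this sketch.

```python
import re

TIME = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")


def shift(match, offset_ms):
    """Return the matched time stamp, shifted by offset_ms."""
    h, m, s, ms = map(int, match.groups())
    total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
    total = max(0, total)  # Assumption: times never become negative.
    s, ms = divmod(total, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def offset_srt(text, offset_ms):
    """Shift all times on the '-->' lines of SRT text by offset_ms."""
    lines = []
    for line in text.splitlines():
        if "-->" in line:
            line = TIME.sub(lambda mt: shift(mt, offset_ms), line)
        lines.append(line)
    return "\n".join(lines)


print(offset_srt("1\n00:00:59,500 --> 00:01:01,000\nHello", 1500))
```

Only the lines that contain an arrow are changed, so subtitle text that happens to look like a time stamp is left alone.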
Renames a directory by prefixing the name with old-, unless that directory already exists. If the directory name starts with a period, it removes the period and prefixes it with old-dot.
This Python script is a small helper to open files from the command line. It was inspired by an OS X utility of the same name.
A lot of my interaction with the files on my computers is done through a command-line shell, even though I use the X Window System. One of the things I like about the gvim editor is that it forks and detaches from the shell it was started from. With other programs one usually has to explicitly add an & to the end of the command.
Then I read about the OS X open program, and I decided to write a simple program like it in Python.
The result is open.py. Note that it is pretty simple, and the programs that it uses to open files are geared towards common use. So text files are opened in an editor, while photos and most other types are opened in a viewer. This simplicity is by design. It has no options and it only opens files and directories. I have no intention of it becoming like OS X's open or plan9's plumb.
This utility requires the python-magic module.
The filetypes
and othertypes
dictionaries in the beginning of this script should be changed to suit your preferences.
Select consecutive pages from a PDF document and put them in a separate document. Requires ghostscript.
Rewrite a PDF file using ghostscript.
Front-end for POV-ray with a limited amount of choices for picture size and quality.
List or set the __version__
string in all Python files given on the command line or recursively in all directories given on the command line.
Renames files given on the command line to <prefix><number>, keeping the extension of the original file. Example:
> ls
img_3240.jpg img_3246.jpg img_3252.jpg img_3258.jpg img_3264.jpg
img_3271.jpg img_3277.jpg img_3241.jpg img_3247.jpg img_3253.jpg
img_3259.jpg img_3265.jpg img_3272.jpg img_3278.jpg img_3242.jpg
img_3248.jpg img_3254.jpg img_3260.jpg img_3266.jpg img_3273.jpg
img_3279.jpg img_3243.jpg img_3249.jpg img_3255.jpg img_3261.jpg
img_3267.jpg img_3274.jpg img_3280.jpg img_3244.jpg img_3250.jpg
img_3256.jpg img_3262.jpg img_3269.jpg img_3275.jpg img_3245.jpg
img_3251.jpg img_3257.jpg img_3263.jpg img_3270.jpg img_3276.jpg
> rename -p holiday2014- -w 3 img_32*
> ls
holiday2014-001.jpg holiday2014-009.jpg holiday2014-017.jpg
holiday2014-025.jpg holiday2014-033.jpg holiday2014-002.jpg
holiday2014-010.jpg holiday2014-018.jpg holiday2014-026.jpg
holiday2014-034.jpg holiday2014-003.jpg holiday2014-011.jpg
holiday2014-019.jpg holiday2014-027.jpg holiday2014-035.jpg
holiday2014-004.jpg holiday2014-012.jpg holiday2014-020.jpg
holiday2014-028.jpg holiday2014-036.jpg holiday2014-005.jpg
holiday2014-013.jpg holiday2014-021.jpg holiday2014-029.jpg
holiday2014-037.jpg holiday2014-006.jpg holiday2014-014.jpg
holiday2014-022.jpg holiday2014-030.jpg holiday2014-038.jpg
holiday2014-007.jpg holiday2014-015.jpg holiday2014-023.jpg
holiday2014-031.jpg holiday2014-039.jpg holiday2014-008.jpg
holiday2014-016.jpg holiday2014-024.jpg holiday2014-032.jpg
holiday2014-040.jpg
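How the new names in the example are formed can be sketched as follows. Numbering from 1 in the order of the arguments matches the example; the function name and the mapping of -p and -w to the parameters are assumptions, and the actual renaming is left out.

```python
import os


def new_names(paths, prefix, width):
    """Return the new names: prefix, zero-padded number, old extension."""
    names = []
    for number, path in enumerate(paths, start=1):
        ext = os.path.splitext(path)[1]
        names.append(f"{prefix}{number:0{width}d}{ext}")
    return names


print(new_names(["img_3240.jpg", "img_3241.jpg"], "holiday2014-", 3))
# ['holiday2014-001.jpg', 'holiday2014-002.jpg']
```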
This is just a collection of tests for functions from the different Python scripts.
Start a git daemon
for every directory under the current working directory that is under git control.
Set the title of the current terminal window to the hostname or to the first argument given on the command line.
Sets the resolution of pictures to the provided value in dots per inch. Uses the mogrify
program from the ImageMagick suite.
A utility written in pure Python to calculate the SHA-256 checksum of files, for systems that don't come with such a utility.
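A minimal sketch of such a checksum calculation using the hashlib module from the standard library. Reading the file in blocks keeps memory use low for large files; the block size is an arbitrary choice and the function name is hypothetical.

```python
import hashlib


def sha256sum(path, blocksize=65536):
    """Return the SHA-256 checksum of a file as a hexadecimal string."""
    digest = hashlib.sha256()
    with open(path, "rb") as infile:
        # iter() with a sentinel stops when read() returns empty bytes.
        for block in iter(lambda: infile.read(blocksize), b""):
            digest.update(block)
    return digest.hexdigest()
```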
This small shell script finds OpenType fonts in my TeXlive installation and installs symbolic links to those font files in a single directory. This directory is then scanned by fc-cache to make the fonts available to all programs that use fontconfig.
Convert TIFF files to PDF format using the utilities tiffinfo
and tiff2pdf
from the libtiff package.
Changes the names of all the files that it is given on the command-line to lower case.
Convert all video files given on the command line to theora / vorbis streams in a matroška container using ffmpeg. As of 3452c8a it uses a ThreadPoolExecutor.
Analogous to vid2mkv.py, but converts to H.264 (using the x264 encoder) / AAC streams in an MP4 container.
Convert video files to VP9 video and Vorbis audio streams in a webm container, using a 2-pass process.