Josh Smith's repositories
pandas
Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
bing-search-sdk-for-python
Bing Search APIs SDK for Python
text
Data loaders and abstractions for text and NLP
playingGod
A project to evolve synthesised sound through natural selection
keras-training
Preparatory work for text classification with Keras and TensorFlow, guided by F. Chollet
pytorch-processing
Creation of training, test and validation datasets for model training, using PyTorch
wikipick
A lightweight Python implementation of https://github.com/earwig/mwparserfromhell to read from Wikipedia
not-drowning
Visualise and play waves and music samples in browser. We hope.
latinum-bonds
A quick blockchain demo created for Demos at Latitude
polisServer
:nut_and_bolt: nuts and bolts of the system
polis-loadtesting
A series of Locust tests to meaningfully load-test Pol.is instances
pygmalitron
A neural net learning to generate samples from simple waves
neuralnets_questionmark
Personal: Going through a tutorial to work out whether it's worth using neural nets for the next iteration of PlayingGod
topic-galaxy
Takes a free-text dataset and a word list, and outputs Gephi-mappable data allowing you to see relationships between those words and the people using them.
words-in-tweets
Counts words in Tweets. That's... that's it.
whatsapp-parser
A lightweight script to convert WhatsApp .txt exports to .csv. NB: Developed for and tested on WhatsApp's 2018 export format
demos-space
A small static page for Demos Space
sheridan-site
Site for Sheridan Tongue.
soup-scrape
Playing around with JSoup to do some web scraping work
svg-icons
A storage bin for small, simple SVG icons I've created for various projects. Use away.
data-loop
Downloads data from a URL, runs it through a pre-built QlikView structure and sends it via email.
docker-django
A dockerised generic Django instance, with MySQL and Apache
twitter-image-cloud
Intention: Create what may or may not look like a cloud of images shared on Twitter, sized by the number of times they've been shared
sanitise-and-move
Sanitise all files in a directory, removing any characters from filenames which are illegal on Windows as well as problematic characters, then move them to another location, logging everything fully. This was written for Hogarth in 2013 as an archiving solution.
Usage:
  -c, --casesensitive: For use on case sensitive filesystems. Default - off.
  -d, --dorename: Actually rename the files - otherwise just log and output to standard output.
  -h, --help: Print this help and exit.
  -l, --logstashDir: A directory on the archive box containing a set of files sent by rsyslog to logstash.
  -r, --renameLogDir: Directory, usually on the destination, for logs of files which have been renamed to be stored.
  -o, --oversizelog: Log to write files with overlong path names in - otherwise don't log.
  -p, --passdir: Directory to which clean files should be moved.
  -q, --quiet: Don't output to standard out.
  -t, --target: The location of the hot folder
  --temp-log-file: A file to write log information to
parallel_rsyncs
Uses GNU Parallel and rsync to move data around, hopefully very quickly (Current project - in progress)
swisspy
Ever heard a collection of functions being described as a 'swiss army knife'? Well, it's a useful metaphor, so there.
template_unittests
A template for creating projects with unittests
move-by-regex
A laser-guided system for moving files which match regex strings from within structured directories (Projects servers and the like) to a destination. Useful when archiving all jobs with a given number, but differing naming conventions.
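
To illustrate the idea in the move-by-regex description above, here is a minimal sketch in Python. It is not the repository's actual code: the function name `move_by_regex`, its parameters, and the dry-run default are all assumptions made for this example. It matches file names against a regex and moves the matches to a destination, keeping the relative directory layout.

```python
import re
import shutil
from pathlib import Path


def move_by_regex(source_root, destination, pattern, dry_run=True):
    """Move every file under source_root whose name matches pattern.

    Hypothetical helper for illustration only; returns the target paths.
    With dry_run=True nothing is moved and the planned targets are returned.
    """
    regex = re.compile(pattern)
    source_root = Path(source_root)
    destination = Path(destination)
    moved = []
    # Snapshot the listing first so moving files does not disturb the walk.
    candidates = sorted(p for p in source_root.rglob("*") if p.is_file())
    for path in candidates:
        if not regex.search(path.name):
            continue
        # Preserve the relative directory layout under the destination.
        target = destination / path.relative_to(source_root)
        if not dry_run:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

A call such as `move_by_regex("projects", "archive", r"(?i)job[_-]1234", dry_run=False)` would, under these assumptions, gather files named like `job_1234_final.txt` and `JOB-1234.mov` despite their differing naming conventions.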