pyracing
Python horse racing class library
Installation
To install the latest release of pyracing to your current Python environment, execute the following command:
pip install pyracing
Alternatively, to install pyracing from a source distribution, execute the following command from the root directory of the pyracing repository:
python setup.py install
To install pyracing from a source distribution as a symlink for development purposes, execute the following command from the root directory of the pyracing repository instead:
python setup.py develop
Usage
To use pyracing, you must first import the pyracing package into your Python interpreter and initialize the pyracing module dependencies. pyracing depends on a database connection and a compatible horse racing web scraper.
The database connection must conform to the pymongo API, supporting code such as the following:
documents = database[collection_name].find(filter)
The web scraper must conform to the pypunters API, supporting code such as the following:
meets = scraper.scrape_meets(date)
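Because both dependencies are described only by the calls they must support, the required interface can be pictured with small in-memory stand-ins. The sketch below is a hypothetical illustration: `FakeCollection`, `FakeDatabase`, `FakeScraper`, the `'meets'` collection name and the sample data are all invented here, the filter matching is a simplification of pymongo's query language, and pyracing has not been confirmed to work with such stand-ins.

```python
from datetime import datetime


class FakeCollection:
    """Hypothetical in-memory stand-in for a pymongo collection."""

    def __init__(self, documents):
        self.documents = list(documents)

    def find(self, filter=None):
        # Simplification: only exact-match filters are supported, and a list
        # is returned where pymongo would return a cursor.
        filter = filter or {}
        return [
            document for document in self.documents
            if all(document.get(key) == value for key, value in filter.items())
        ]


class FakeDatabase:
    """Hypothetical stand-in supporting database[collection_name]."""

    def __init__(self, collections):
        self.collections = {
            name: FakeCollection(documents)
            for name, documents in collections.items()
        }

    def __getitem__(self, collection_name):
        # Unknown collections behave as empty collections.
        return self.collections.setdefault(collection_name, FakeCollection([]))


class FakeScraper:
    """Hypothetical stand-in supporting scraper.scrape_meets(date)."""

    def __init__(self, meets):
        self.meets = list(meets)

    def scrape_meets(self, date):
        return [meet for meet in self.meets if meet['date'] == date]


date = datetime(2016, 2, 1)
sample_meet = {'track': 'Example Park', 'date': date}  # invented sample data

database = FakeDatabase({'meets': [sample_meet]})
documents = database['meets'].find({'track': 'Example Park'})

scraper = FakeScraper([sample_meet])
meets = scraper.scrape_meets(date)
```

Stand-ins of this kind are mainly useful for reasoning about the interface or for isolated experiments; the supported configuration remains pymongo plus pypunters.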
pyracing has only been tested using pymongo and pypunters. To implement these dependencies using the same packages, execute the following code:
>>> import pymongo
>>> database = pymongo.MongoClient()[database_name]
>>> import cache_requests
>>> http_client = cache_requests.Session()
>>> from lxml import html
>>> html_parser = html.fromstring
>>> import pypunters
>>> scraper = pypunters.Scraper(http_client, html_parser)
With these dependencies in place, the pyracing package can be imported and initialized as follows:
>>> import pyracing
>>> pyracing.initialize(database, scraper)
Meets
A meet represents a collection of races occurring at a given track on a given date.
To get a list of meets occurring on a given date, call the Meet.get_meets_by_date method as follows:
>>> from datetime import datetime
>>> date = datetime(2016, 2, 1)
>>> meets = pyracing.Meet.get_meets_by_date(date)
The get_meets_by_date method will return a list of Meet objects. The Meet class itself is derived from Python's built-in dict type, so a meet's details can be accessed as follows:
>>> track = meets[index]['track']
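The mix of dictionary-style access for stored details and dot-notation for derived values, which recurs throughout this document, can be pictured with a plain dict subclass. The class below is a hypothetical stand-in and not pyracing's Meet implementation; the `label` property and the sample values are invented for illustration.

```python
from datetime import datetime


class ExampleMeet(dict):
    """Hypothetical dict subclass: stored details via keys, derived values via properties."""

    @property
    def label(self):
        # Derived from stored keys rather than stored itself.
        return '{} on {:%Y-%m-%d}'.format(self['track'], self['date'])


meet = ExampleMeet(track='Example Park', date=datetime(2016, 2, 1))
track = meet['track']   # dictionary-style access to a stored detail
label = meet.label      # dot-notation access to a derived value
```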
Races
A race represents a collection of runners competing in a single event at a given meet.
To get a list of races occurring at a given meet, call the Race.get_races_by_meet method as follows:
>>> races = pyracing.Race.get_races_by_meet(meet)
Alternatively, a list of races occurring at a given meet can be obtained by accessing the meet's races property as follows:
>>> races = meet.races
The get_races_by_meet method will return a list of Race objects. The Race class itself is derived from Python's built-in dict type, so a race's details can be accessed as follows:
>>> number = races[index]['number']
To get the meet at which a given race occurs, access the race's meet property as follows:
>>> meet = races[index].meet
In addition, race objects expose an 'importance' property that returns the product of the starting prices of all runners in the race that finished in the first four, or 1.0 if such data is not yet available. The importance property is accessible using dot-notation as follows:
>>> importance = races[index].importance
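As a worked illustration of the importance calculation described above, the sketch below multiplies the starting prices of the runners that finished in the first four. It is not pyracing's implementation: the `importance` function, the `'result'` and `'starting_price'` keys and the sample prices are assumptions, and runners with missing data are simply skipped, which yields 1.0 when no results are available.

```python
def importance(runners):
    """Product of the starting prices of runners finishing in the first four."""
    product = 1.0
    for runner in runners:
        result = runner.get('result')
        price = runner.get('starting_price')
        # Skip runners with no recorded result or price (assumed behaviour).
        if result is not None and price is not None and result <= 4:
            product *= price
    return product


finished_race = [
    {'result': 1, 'starting_price': 2.0},
    {'result': 2, 'starting_price': 4.0},
    {'result': 3, 'starting_price': 5.0},
    {'result': 4, 'starting_price': 10.0},
    {'result': 5, 'starting_price': 3.0},   # outside the first four, ignored
]
upcoming_race = [{'result': None, 'starting_price': None}]

finished_importance = importance(finished_race)   # 2.0 * 4.0 * 5.0 * 10.0
upcoming_importance = importance(upcoming_race)   # no data yet
```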
Runners
A runner represents a combination of horse, jockey and trainer competing in a given race.
To get a list of runners competing in a given race, call the Runner.get_runners_by_race method as follows:
>>> runners = pyracing.Runner.get_runners_by_race(race)
Alternatively, a list of runners competing in a given race can be obtained by accessing the race's runners property as follows:
>>> runners = race.runners
The get_runners_by_race method will return a list of Runner objects. The Runner class itself is derived from Python's built-in dict type, so a runner's details can be accessed as follows:
>>> number = runners[index]['number']
To get the race in which a given runner competes, access the runner's race property as follows:
>>> race = runners[index].race
Runner objects also expose the following calculated values as properties that can be accessed using dot-notation:
Property | Description |
---|---|
runner.actual_weight | The weight carried by the runner plus the average weight of a racehorse (in kg) |
runner.age | The horse's official age as at the date of the race (calculated according to Australian standards) |
runner.carrying | The official listed weight for the runner less allowances (in kg) |
runner.current_performance | The horse's performance for the current race if available (None if not) |
runner.result | The final result achieved by this runner if available (None if not) |
runner.spell | The number of days since the horse's previous run (None if this is the horse's first run) |
runner.starting_price | The starting price for this runner if available (None if not) |
runner.up | The number of races run by the horse (including this one) since the last spell of 90 days or more |
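To make the spell and up definitions in the table above concrete, the following sketch derives both values from a list of prior run dates. The function names, their arguments and the sample dates are hypothetical; pyracing derives these values from its own performance records, and its exact handling of edge cases is not documented here.

```python
from datetime import date

SPELL_DAYS = 90  # threshold taken from the description of runner.up


def spell(run_dates, race_date):
    """Days since the previous run, or None if this is the first run."""
    previous = [d for d in run_dates if d < race_date]
    if not previous:
        return None
    return (race_date - max(previous)).days


def up(run_dates, race_date):
    """Runs, including this one, since the last spell of 90 days or more."""
    count = 1
    current = race_date
    # Walk backwards through prior runs until a long enough gap is found.
    for previous in sorted((d for d in run_dates if d < race_date), reverse=True):
        if (current - previous).days >= SPELL_DAYS:
            break
        count += 1
        current = previous
    return count


prior_runs = [date(2015, 9, 5), date(2015, 9, 26), date(2016, 1, 9), date(2016, 1, 23)]
race_date = date(2016, 2, 6)

days_since_last_run = spell(prior_runs, race_date)
up_number = up(prior_runs, race_date)
```

With these sample dates the horse last ran 14 days before the race, and the 105-day gap between 26 September 2015 and 9 January 2016 counts as a spell, so the current race is its third run of the preparation.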
The following properties (also accessible using dot-notation) return PerformanceList objects (see below) containing a filtered list of the horse's prior performances:
Property | Description |
---|---|
runner.at_distance | All prior performances at a distance within 100m of the current race |
runner.at_distance_on_track | All prior performances at a distance within 100m of the current race on the same track |
runner.career | All performances prior to the current race |
runner.firm | All prior performances on FIRM tracks |
runner.good | All prior performances on GOOD tracks |
runner.heavy | All prior performances on HEAVY tracks |
runner.on_track | All prior performances on the current track |
runner.on_up | All prior performances with the same UP number as the horse's current run |
runner.since_rest | All performances since the horse's last spell of 90 days or more |
runner.soft | All prior performances on SOFT tracks |
runner.synthetic | All prior performances on SYNTHETIC tracks |
runner.with_jockey | All prior performances for the horse with the same jockey |
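The filters in the table above can be thought of as simple predicates applied to a list of prior performances. The sketch below illustrates three of them using plain dictionaries; the function names, the `'track'`, `'distance'` and `'track_condition'` keys, the inclusive 100m tolerance and the sample data are all assumptions rather than pyracing's actual schema.

```python
def at_distance(performances, race_distance, tolerance=100):
    """Prior performances within `tolerance` metres of the current race distance."""
    return [p for p in performances if abs(p['distance'] - race_distance) <= tolerance]


def on_track(performances, track):
    """Prior performances on the given track."""
    return [p for p in performances if p['track'] == track]


def on_condition(performances, condition):
    """Prior performances on tracks rated with the given condition, e.g. 'GOOD'."""
    return [p for p in performances if p['track_condition'] == condition]


prior_performances = [
    {'track': 'Example Park', 'distance': 1200, 'track_condition': 'GOOD'},
    {'track': 'Example Park', 'distance': 1600, 'track_condition': 'SOFT'},
    {'track': 'Sample Downs', 'distance': 1250, 'track_condition': 'GOOD'},
]

# Current race (invented): 1300m at Example Park.
near_distance = at_distance(prior_performances, 1300)
near_distance_on_track = on_track(near_distance, 'Example Park')
good_track = on_condition(prior_performances, 'GOOD')
```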
The following properties (also accessible using dot-notation) return PerformanceList objects (see below) containing a filtered list of the jockey's prior performances:
Property | Description |
---|---|
runner.jockey_at_distance | All prior performances at a distance within 100m of the current race |
runner.jockey_at_distance_on_track | All prior performances at a distance within 100m of the current race on the same track |
runner.jockey_career | All performances prior to the current race |
runner.jockey_firm | All prior performances on FIRM tracks |
runner.jockey_good | All prior performances on GOOD tracks |
runner.jockey_heavy | All prior performances on HEAVY tracks |
runner.jockey_on_track | All prior performances on the current track |
runner.jockey_soft | All prior performances on SOFT tracks |
runner.jockey_synthetic | All prior performances on SYNTHETIC tracks |
The PerformanceList objects returned by the properties described above expose the following properties:
Property | Description |
---|---|
average_momentum | The average momentum per start in the performance list (None if no starts) |
average_prize_money | The average prize money earned per start in the performance list (None if no starts) |
average_starting_price | The average starting price per start in the performance list (None if no starts) |
fourths | The number of fourth placing performances included in the performance list |
fourth_pct | The number of fourths as a percentage of the number of starts (None if no starts) |
maximum_momentum | The maximum momentum achieved for any performance in the performance list |
minimum_momentum | The minimum momentum achieved for any performance in the performance list |
places | The number of placing (1st, 2nd and 3rd) performances included in the performance list |
place_pct | The number of places as a percentage of the number of starts (None if no starts) |
roi | The total starting price for wins less the number of starts as a percentage of the number of starts (None if no starts) |
seconds | The number of second placing performances included in the performance list |
second_pct | The number of seconds as a percentage of the number of starts (None if no starts) |
starts | The total number of starts included in the performance list |
thirds | The number of third placing performances included in the performance list |
third_pct | The number of thirds as a percentage of the number of starts (None if no starts) |
total_prize_money | The total prize money earned in the performance list |
wins | The number of winning performances included in the performance list |
win_pct | The number of wins as a percentage of the number of starts (None if no starts) |
An example of accessing these statistics is given below:
>>> good_wins = runner.good.wins
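The arithmetic behind some of these statistics can be sketched with plain dictionaries. The `summarise` function, the `'result'` and `'starting_price'` keys and the sample data below are hypothetical, and the percentages are returned as fractions because the table does not state whether pyracing scales them by 100.

```python
def summarise(performances):
    """Compute a few of the statistics described above for a list of performances."""
    starts = len(performances)
    wins = sum(1 for p in performances if p['result'] == 1)
    places = sum(1 for p in performances if p['result'] in (1, 2, 3))
    winning_prices = sum(p['starting_price'] for p in performances if p['result'] == 1)
    return {
        'starts': starts,
        'wins': wins,
        'places': places,
        # Ratios are undefined without starts, so None is returned instead.
        'win_pct': wins / starts if starts else None,
        'place_pct': places / starts if starts else None,
        'roi': (winning_prices - starts) / starts if starts else None,
    }


sample_performances = [
    {'result': 1, 'starting_price': 3.0},
    {'result': 3, 'starting_price': 6.0},
    {'result': 5, 'starting_price': 10.0},
    {'result': 1, 'starting_price': 5.0},
]

stats = summarise(sample_performances)
empty_stats = summarise([])
```

In this sample, two wins at starting prices of 3.0 and 5.0 return 8.0 units on four one-unit bets, giving an roi of (8.0 - 4) / 4 = 1.0.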
Runner objects also provide a calculate_expected_speed method that will return a tuple of minimum, maximum and average expected speeds for the runner based on the runner's actual weight and the minimum, maximum and average momentums for a specified performance list, as follows:
>>> runner.calculate_expected_speed('career')
(15.75, 17.25, 16.50)
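Since momentum is mass multiplied by velocity, one plausible reading of this calculation is that each momentum figure is divided by the runner's actual weight to give a speed in m/s. The sketch below illustrates that reading only; the `expected_speeds` function, its arguments and the sample figures are assumptions, and the source does not confirm that pyracing computes the values this way.

```python
def expected_speeds(actual_weight, minimum_momentum, maximum_momentum, average_momentum):
    """Return (minimum, maximum, average) expected speeds in m/s.

    Assumes speed = momentum / mass, with momentum in kg m/s and weight in kg.
    """
    return tuple(
        momentum / actual_weight
        for momentum in (minimum_momentum, maximum_momentum, average_momentum)
    )


# Invented figures: a combined horse and carried weight of 550 kg.
speeds = expected_speeds(550.0, 8800.0, 9350.0, 9075.0)
minimum_speed, maximum_speed, average_speed = speeds
```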
Horses
A horse represents the equine component of a given runner.
To get the horse for a given runner, call the Horse.get_horse_by_runner method as follows:
>>> horse = pyracing.Horse.get_horse_by_runner(runner)
Alternatively, the horse for a given runner can be obtained by accessing the runner's horse property as follows:
>>> horse = runner.horse
The get_horse_by_runner method will return a single Horse object. The Horse class itself is derived from Python's built-in dict type, so a horse's details can be accessed as follows:
>>> name = horse['name']
Jockeys
A jockey represents the human riding a runner.
To get the jockey for a given runner, call the Jockey.get_jockey_by_runner method as follows:
>>> jockey = pyracing.Jockey.get_jockey_by_runner(runner)
Alternatively, the jockey for a given runner can be obtained by accessing the runner's jockey property as follows:
>>> jockey = runner.jockey
The get_jockey_by_runner method will return a single Jockey object. The Jockey class itself is derived from Python's built-in dict type, so a jockey's details can be accessed as follows:
>>> name = jockey['name']
Trainers
A trainer represents the person or people responsible for training a horse.
To get the trainer for a given runner, call the Trainer.get_trainer_by_runner method as follows:
>>> trainer = pyracing.Trainer.get_trainer_by_runner(runner)
Alternatively, the trainer for a given runner can be obtained by accessing the runner's trainer property as follows:
>>> trainer = runner.trainer
The get_trainer_by_runner method will return a single Trainer object. The Trainer class itself is derived from Python's built-in dict type, so a trainer's details can be accessed as follows:
>>> name = trainer['name']
Performances
A performance represents the result of a completed run by a horse and jockey.
To get a list of performances for a given horse, call the Performance.get_performances_by_horse method as follows:
>>> performances = pyracing.Performance.get_performances_by_horse(horse)
Alternatively, a list of performances for a given horse can be obtained by accessing the horse's performances property as follows:
>>> performances = horse.performances
The get_performances_by_horse method will return a list of Performance objects. The Performance class itself is derived from Python's built-in dict type, so a performance's details can be accessed as follows:
>>> result = performances[index]['result']
Performance objects also expose the following calculated values as properties that can be accessed using dot-notation:
Property | Description |
---|---|
performance.actual_distance | The actual distance run by the horse in the winning time (in metres) |
performance.actual_weight | The weight carried by the horse plus the average weight of a racehorse (in kg) |
performance.momentum | The average momentum achieved by the horse (in kg m/s) |
performance.speed | The average speed run by the horse (in m/s) |
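The units in the table above suggest how these values relate: a speed in m/s is a distance divided by a time, and a momentum in kg m/s is a mass multiplied by a speed. The sketch below works through that relationship with invented numbers. The function names and arguments are hypothetical, and how pyracing derives the actual distance and actual weight themselves is not shown here.

```python
def average_speed(actual_distance, winning_time):
    """Average speed in m/s: metres covered divided by the winning time in seconds."""
    return actual_distance / winning_time


def average_momentum(actual_weight, speed):
    """Average momentum in kg m/s: mass in kg multiplied by speed in m/s."""
    return actual_weight * speed


# Invented figures: 1200m covered in a winning time of 75 seconds,
# with a combined horse and carried weight of 550 kg.
speed = average_speed(1200.0, 75.0)
momentum = average_momentum(550.0, speed)
```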
Batch Processing
The pyracing package includes a Processor class to facilitate the batch processing of ALL racing data for a specified date range.
To implement batch processing, extend the Processor class with your own custom sub-class and call its process_dates method as follows:
>>> custom_processor = CustomProcessor(threads=1, message_prefix='processing')
>>> custom_processor.process_dates(date_from, date_to)
Alternatively, to process ALL racing data for a single date instead, call the process_date method as follows:
>>> custom_processor.process_date(date)
The threads and message_prefix arguments to the Processor constructor are both optional.
The threads argument specifies the number of threads to use for processing entities. All threads are joined after a single date's data has been processed, just before the post_process_date method (described below) is called, if it is defined. The default value for threads is 1.
The message_prefix argument specifies a text string to be prepended to a description of each entity being processed in the messages logged by the processor. The default value for message_prefix is 'processing'.
Any combination of the following instance methods may be defined in a custom Processor class, with each being called at a specific time during the processing of entities:
Method | Calls | When |
---|---|---|
pre_process_date | pre_process_date(date) | BEFORE meets occurring on date are processed |
post_process_date | post_process_date(date) | AFTER meets occurring on date have been processed (and threads have been joined) |
pre_process_meet | pre_process_meet(meet) | BEFORE races occurring at meet are processed |
post_process_meet | post_process_meet(meet) | AFTER races occurring at meet have been processed |
pre_process_race | pre_process_race(race) | BEFORE runners competing in race are processed |
post_process_race | post_process_race(race) | AFTER runners competing in race have been processed |
pre_process_runner | pre_process_runner(runner) | BEFORE the runner's horse, jockey and trainer are processed |
post_process_runner | post_process_runner(runner) | AFTER the runner's horse, jockey and trainer have been processed |
pre_process_horse | pre_process_horse(horse) | BEFORE the horse's performances are processed |
post_process_horse | post_process_horse(horse) | AFTER the horse's performances have been processed |
process_jockey | process_jockey(jockey) | ONCE for each run by a jockey |
process_trainer | process_trainer(trainer) | ONCE for each run by a trainer |
process_performance | process_performance(performance) | ONCE for each performance by a horse |
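The order in which these methods are called can be pictured with a small stand-in. The `StandInProcessor` class below is hypothetical and is not pyracing's Processor: it is single-threaded, covers only dates, meets and races, and takes the meets as an explicit argument, whereas the real process_date method takes only a date. The point is to show how a sub-class defines only the methods it needs.

```python
class StandInProcessor:
    """Hypothetical stand-in mimicking the hook order for dates, meets and races."""

    def __init__(self, threads=1, message_prefix='processing'):
        self.threads = threads
        self.message_prefix = message_prefix

    def call_hook(self, name, entity):
        # Hooks are optional: only call the ones the sub-class defines.
        hook = getattr(self, name, None)
        if callable(hook):
            hook(entity)

    def process_date(self, date, meets):
        self.call_hook('pre_process_date', date)
        for meet in meets:
            self.call_hook('pre_process_meet', meet)
            for race in meet['races']:
                self.call_hook('pre_process_race', race)
                self.call_hook('post_process_race', race)
            self.call_hook('post_process_meet', meet)
        self.call_hook('post_process_date', date)


class CustomProcessor(StandInProcessor):
    """Example sub-class that records what it sees."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.log = []

    def pre_process_meet(self, meet):
        self.log.append('start ' + meet['track'])

    def post_process_race(self, race):
        self.log.append('race {}'.format(race['number']))

    def post_process_meet(self, meet):
        self.log.append('end ' + meet['track'])


sample_meets = [{'track': 'Example Park', 'races': [{'number': 1}, {'number': 2}]}]
custom_processor = CustomProcessor(threads=1, message_prefix='processing')
custom_processor.process_date('2016-02-01', sample_meets)
log = custom_processor.log
```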
Event Hooks
The pyracing package implements a publisher/subscriber style event model. To subscribe to an event, call the pyracing.add_subscriber method as follows:
>>> pyracing.add_subscriber('event_name', handler)
handler must be a function that conforms to the handler signature as specified in the following table:
Event Name | Calls | When |
---|---|---|
deleting_meet | handler(meet) | BEFORE meet is deleted from the database |
deleted_meet | handler(meet) | AFTER meet has been deleted from the database |
saving_meet | handler(meet) | BEFORE meet is saved to the database |
saved_meet | handler(meet) | AFTER meet has been saved to the database |
deleting_race | handler(race) | BEFORE race is deleted from the database |
deleted_race | handler(race) | AFTER race has been deleted from the database |
saving_race | handler(race) | BEFORE race is saved to the database |
saved_race | handler(race) | AFTER race has been saved to the database |
deleting_runner | handler(runner) | BEFORE runner is deleted from the database |
deleted_runner | handler(runner) | AFTER runner has been deleted from the database |
saving_runner | handler(runner) | BEFORE runner is saved to the database |
saved_runner | handler(runner) | AFTER runner has been saved to the database |
deleting_horse | handler(horse) | BEFORE horse is deleted from the database |
deleted_horse | handler(horse) | AFTER horse has been deleted from the database |
saving_horse | handler(horse) | BEFORE horse is saved to the database |
saved_horse | handler(horse) | AFTER horse has been saved to the database |
deleting_jockey | handler(jockey) | BEFORE jockey is deleted from the database |
deleted_jockey | handler(jockey) | AFTER jockey has been deleted from the database |
saving_jockey | handler(jockey) | BEFORE jockey is saved to the database |
saved_jockey | handler(jockey) | AFTER jockey has been saved to the database |
deleting_trainer | handler(trainer) | BEFORE trainer is deleted from the database |
deleted_trainer | handler(trainer) | AFTER trainer has been deleted from the database |
saving_trainer | handler(trainer) | BEFORE trainer is saved to the database |
saved_trainer | handler(trainer) | AFTER trainer has been saved to the database |
deleting_performance | handler(performance) | BEFORE performance is deleted from the database |
deleted_performance | handler(performance) | AFTER performance has been deleted from the database |
saving_performance | handler(performance) | BEFORE performance is saved to the database |
saved_performance | handler(performance) | AFTER performance has been saved to the database |
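For readers unfamiliar with the publisher/subscriber pattern, the sketch below shows the general mechanism with a minimal registry. Only the `add_subscriber(event_name, handler)` call shape comes from this document; the `subscribers` registry and the `publish` function are hypothetical and stand in for whatever pyracing does internally when an entity is saved or deleted.

```python
from collections import defaultdict

# Maps each event name to the handlers subscribed to it.
subscribers = defaultdict(list)


def add_subscriber(event_name, handler):
    """Register handler to be called whenever event_name is published."""
    subscribers[event_name].append(handler)


def publish(event_name, entity):
    """Hypothetical publisher: call every handler subscribed to event_name."""
    for handler in subscribers[event_name]:
        handler(entity)


saved_tracks = []


def on_saved_meet(meet):
    # Handler signature for the saved_meet event: handler(meet).
    saved_tracks.append(meet['track'])


add_subscriber('saved_meet', on_saved_meet)

sample_meet = {'track': 'Example Park'}  # invented sample data
publish('saving_meet', sample_meet)      # no subscribers, so nothing happens
publish('saved_meet', sample_meet)       # calls on_saved_meet(sample_meet)
```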
Testing
To run the included test suite, execute the following command from the root directory of the pyracing repository:
python setup.py test
The above command will ensure all test dependencies are installed in your current Python environment. For more concise output during subsequent test runs, the following command can be executed from the root directory of the pyracing repository instead:
nosetests
Alternatively, individual components of pyracing can be tested by executing any of the following commands from the root directory of the pyracing repository:
nosetests pyracing.test.meets
nosetests pyracing.test.races
nosetests pyracing.test.runners
nosetests pyracing.test.horses
nosetests pyracing.test.jockeys
nosetests pyracing.test.trainers
nosetests pyracing.test.performances
nosetests pyracing.test.performance_lists
nosetests pyracing.test.processor
Version History
- 0.4.0 (29 April 2016)
  - Interim release to implement jockey statistics
- 0.3.0 (28 April 2016)
  - Interim release to facilitate initial predictions
- 0.2.5 (27 April 2016)
  - Fix ZeroDivisionErrors
- 0.2.4 (27 April 2016)
  - Fix ValueErrors in PerformanceList
- 0.2.3 (26 April 2016)
  - Fix TypeError in Runner.calculate_expected_speed
- 0.2.2 (26 April 2016)
  - Fix memory leak in cached properties
- 0.2.1 (26 April 2016)
  - Fix TypeErrors in calculated properties
- 0.2.0 (26 April 2016)
  - Interim release to facilitate pre-seeding query data
- 0.1.1 (22 April 2016)
  - Fix issue with caught exceptions hanging Processor
- 0.1.0 (21 April 2016)
  - Interim release to facilitate database pre-population