# Benchmark Revamping
## Requirements
* The benchmark data retrieval should be separated from zipline
* The benchmark definition should enable the usage of different benchmarks \(Indexes, ETFs, different geographies\)
* No performance regression
## Solution
Making the benchmark optional is consistent with zipline being a specialised backtester \(it won't affect your need at Quantopian to see performance results in real time\). End users who want to analyse a strategy's performance can use Alphalens \(for example, to compare its returns with the returns of a specific benchmark instrument or factor\).
I can do the following:
1. By tomorrow evening, I can write a simple workaround showing end users how to download the data from IEX and add it to zipline without amending benchmark.py, by manually saving the returns in the zipline data root folder via simple pandas transformations. I'll push this change to the README once it is accepted.
2. By this weekend, I can add an option to zipline that asks end users to provide the benchmark closing prices as a JSON/CSV file \(in OHLCV format\), identified by its path. We can also give end users sample code showing how to download the data from IEX in the required format.
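A minimal sketch of the pandas side of that workaround, assuming the benchmark prices are already available as CSV text with `date` and `close` columns. The function name `closes_to_returns` and the sample prices are illustrative only; the download from IEX and the exact file name and layout that zipline's loader expects in the data root are deliberately left out.

```python
import io

import pandas as pd

# Hypothetical sample; in practice this would be the downloaded price file.
SAMPLE_CSV = """date,close
2020-01-02,100
2020-01-03,90
2020-01-06,120
"""


def closes_to_returns(csv_text):
    """Turn (date, close) rows into a UTC-indexed series of daily returns."""
    prices = pd.read_csv(
        io.StringIO(csv_text), parse_dates=["date"], index_col="date"
    )
    closes = prices["close"].sort_index()
    if closes.index.tz is None:
        closes = closes.tz_localize("UTC")
    # The first session has no previous close, so its return is dropped.
    return closes.pct_change().dropna()


returns = closes_to_returns(SAMPLE_CSV)
```

The resulting series could then be written out with `returns.to_csv(...)` to whatever location the loader reads from.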
### Backtesting with Benchmark Data in CSV/JSON
* Add a main command to use a custom benchmark file that contains at least \(datetime, close\)
```python
@main.command()
@click.option(
    '-bf',
    '--benchmarkfile',
    default=None,
    type=click.File('r'),
    help='The csv file that contains the benchmark closing prices',
)
```
* Also allow using a benchmark from the ingested data via its SID \(or equivalent\)
* If no benchmark is given, try to retrieve the default data already saved in data\_root.
* If the benchmarks are empty or forced to None, move to the next section
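The selection order described in the bullets above can be sketched as a small function. This is only an illustration: `choose_benchmark_source`, its parameters and the returned labels are hypothetical, and giving an explicit "no benchmark" request priority over the other options is an assumption, not something the list specifies.

```python
import warnings


def choose_benchmark_source(benchmark_file=None, benchmark_symbol=None,
                            no_benchmark=False):
    """Return a label for the benchmark source to use.

    Hypothetical helper; names and labels are illustrative only.
    """
    # Assumption: an explicit "no benchmark" request wins over everything.
    if no_benchmark:
        return "zero-returns"
    if benchmark_file is not None:
        return "file"
    if benchmark_symbol is not None:
        return "symbol"
    # Nothing was given: fall back to the data saved in the data root.
    warnings.warn(
        "No benchmark file or symbol given; falling back to the default "
        "benchmark data in the data root."
    )
    return "default-loader"
```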
### Backtesting without Benchmark
* Pass an all-zero dataframe so that zipline's core functionality does not need to change
* Raise a warning about the metrics that depend on the benchmark
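A minimal sketch of the zero-benchmark idea, assuming the benchmark returns are consumed as a pandas object indexed by trading session. A `Series` is used here for brevity, and the function name `zero_benchmark_returns` and the warning text are hypothetical.

```python
import warnings

import pandas as pd


def zero_benchmark_returns(sessions):
    """Build an all-zero returns series indexed by the trading sessions."""
    warnings.warn(
        "Using zero returns as a benchmark; metrics that need benchmark "
        "returns are not meaningful."
    )
    return pd.Series(0.0, index=sessions, name="benchmark_returns")


sessions = pd.DatetimeIndex(
    ["2020-01-02", "2020-01-03", "2020-01-06"], tz="UTC"
)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    zero_returns = zero_benchmark_returns(sessions)
```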
## Validation
### Configuration
Let us consider two instruments: **A** \(a\), the traded instrument, and **B** \(b\), used as the benchmark:
#### Data file for instrument A \(a.csv\)
```text
date,open,high,low,close,adj_close,volume
2020-01-02 00:00:00+00:00,100,100,100,100,100,10000
2020-01-03 00:00:00+00:00,120,120,120,120,120,12000
2020-01-06 00:00:00+00:00,100,100,100,100,100,10000
2020-01-07 00:00:00+00:00,160,160,160,160,160,16000
2020-01-08 00:00:00+00:00,180,180,180,180,180,18000
2020-01-09 00:00:00+00:00,200,200,200,200,200,20000
```
#### Data file for the benchmark B \(b.csv\)
```text
date,open,high,low,close,adj_close,volume
2020-01-02 00:00:00+00:00,100,100,100,100,100,10000
2020-01-03 00:00:00+00:00,90,90,90,90,90,9000
2020-01-06 00:00:00+00:00,120,120,120,120,120,10000
2020-01-07 00:00:00+00:00,140,140,140,140,140,14000
2020-01-08 00:00:00+00:00,160,160,160,160,160,16000
2020-01-09 00:00:00+00:00,180,180,180,180,180,18000
```
#### Ingestion Configuration
```python
import pandas as pd

from zipline.data.bundles import register
from zipline.data.bundles.csvdir import csvdir_equities

start_session2 = pd.Timestamp('2020-01-02', tz='utc')
end_session2 = pd.Timestamp('2020-01-09', tz='utc')

register(
    'csv-xpar-sample',
    csvdir_equities(
        ['daily'],
        '/Users/aennassiri/opensource/zipline',
    ),
    calendar_name='XPAR',
    start_session=start_session2,
    end_session=end_session2
)
```
#### Sample Algorithm
The algorithm orders 1000 shares of **A** at the beginning of the backtesting period:
```python
from zipline.api import order, symbol
from zipline.finance import commission, slippage


def initialize(context):
    context.stocks = symbol('a')
    context.has_ordered = False
    # Disable costs so that returns depend only on price moves.
    context.set_commission(commission.NoCommission())
    context.set_slippage(slippage.NoSlippage())


def handle_data(context, data):
    # Place a single order for 1000 shares on the first bar.
    if not context.has_ordered:
        order(context.stocks, 1000)
        context.has_ordered = True
```
### Tests
#### Run without any benchmark option
**Command:**
```bash
run -f TEST_FOLDER/test_benchmark3.py -b csv-xpar-sample -s 01/01/2020 -e 01/09/2020
```
**Result**
```bash
Warning: Neither a benchmark file nor a benchmark symbol is provided. Trying to use the default benchmark loader. To use zero as a benchmark, use the flag --no-benchmark
...
ValueError: Please set your IEX_API_KEY environment variable and retry.
Please note that this feature will be deprecated
```
**Comment**
If no benchmark option is given, we raise a warning and fall back to the default benchmark data loader. The loader checks whether the data already exists in the zipline data folder; if not, it tries to download it from IEX as before. I've changed the code so that users can supply their IEX API key through the IEX\_API\_KEY environment variable.
This option is kept so as not to break backward compatibility for users who place the benchmark data directly in the zipline data folder.
**TODO:**
I'm going to update the error message to:
* Advise the end user to prefer the explicit benchmark options provided
* Tell the end user that the benchmark data can be placed directly in the data folder
* Inform the end user that the data can be downloaded from IEX if the IEX\_API\_KEY environment variable is set, while warning that SPY will then be used as the benchmark
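As a rough illustration of the fallback logic and the improved message, here is a sketch under simplifying assumptions: the cache check is reduced to a file-existence test, and `default_benchmark_action` and its return labels are hypothetical names rather than zipline functions.

```python
import os


def default_benchmark_action(cache_path, environ=None):
    """Decide what the default loader would do (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    # Simplification: any existing cache file counts as usable data.
    if os.path.exists(cache_path):
        return "use-cache"
    if environ.get("IEX_API_KEY"):
        # The default loader would download SPY in this case.
        return "download-spy-from-iex"
    raise ValueError(
        "No benchmark data found. Prefer --benchmark-file, "
        "--benchmark-symbol or --no-benchmark. Alternatively, place the "
        "benchmark data in the zipline data folder, or set IEX_API_KEY "
        "to download SPY from IEX."
    )
```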
#### Run with benchmark file option
**Command:**
```bash
run -f TEST_FOLDER/test_benchmark3.py -b csv-xpar-sample -s 01/01/2020 -e 01/09/2020 --benchmark-file TEST_FOLDER/data/daily/b.csv --trading-calendar XPAR
```
**Result**
```bash
[2020-02-07 10:19:55.904112] INFO: zipline.finance.metrics.tracker: Simulated 6 trading days
first open: 2020-01-02 08:01:00+00:00
last close: 2020-01-09 16:30:00+00:00
algo_volatility algorithm_period_return alpha \
2020-01-02 16:30:00+00:00 NaN 0.000 NaN
2020-01-03 16:30:00+00:00 0.000000 0.000 0.000000
2020-01-06 16:30:00+00:00 0.018330 -0.002 -0.070705
2020-01-07 16:30:00+00:00 0.055083 0.004 0.268001
2020-01-08 16:30:00+00:00 0.048217 0.006 0.310527
2020-01-09 16:30:00+00:00 0.043427 0.008 0.299573
...
```
Order on trading day 2:
```python
[{'price': 120.0, 'amount': 1000, 'dt': Timestamp('2020-01-03 16:30:00+0000', tz='UTC'), 'sid': Equity(0 [A]), 'order_id': '8b3d018994cf43db960e2943b59f7ef0', 'commission': None}]
```
| Date \(benchmark file run\) | algo\_volatility | algorithm\_period\_return | alpha | benchmark\_period\_return | benchmark\_volatility | beta | capital\_used |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 2020-01-02 16:30:00+00:00 | | 0.0 | | 0.0 | | | 0.0 |
| 2020-01-03 16:30:00+00:00 | 0.0 | 0.0 | 0.0 | -0.09999999999999998 | 1.1224972160321822 | 0.0 | -120000.0 |
| 2020-01-06 16:30:00+00:00 | 0.018330302779823376 | -0.0020000000000000018 | -0.07070503597122309 | 0.19999999999999996 | 3.6018513757973594 | -0.004964028776978422 | 0.0 |
| 2020-01-07 16:30:00+00:00 | 0.05508274964812368 | 0.0040000000000000036 | 0.26800057257371374 | 0.40000000000000013 | 3.0243456592570017 | -0.0006048832358595375 | 0.0 |
| 2020-01-08 16:30:00+00:00 | 0.048217028714915476 | 0.006000000000000005 | 0.3105268451522164 | 0.6000000000000001 | 2.6367729194171097 | -0.0002895623813476541 | 0.0 |
| 2020-01-09 16:30:00+00:00 | 0.043427366958536835 | 0.008000000000000007 | 0.34104117177313514 | 0.8 | 2.3608034346685574 | -0.0001915086325657816 | 0.0 |
**Comment**
The benchmark data from the provided file is correctly loaded.
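The reported benchmark\_period\_return column can be reconciled by hand from B's closing prices. The sketch below assumes the cumulative return is measured against the first session's close; the variable names are illustrative.

```python
import pandas as pd

# Closing prices of benchmark B for the six sessions of the backtest.
b_close = pd.Series(
    [100.0, 90.0, 120.0, 140.0, 160.0, 180.0],
    index=pd.to_datetime(
        ["2020-01-02", "2020-01-03", "2020-01-06",
         "2020-01-07", "2020-01-08", "2020-01-09"]
    ),
)

# Cumulative return relative to the first session's close.
benchmark_period_return = b_close / b_close.iloc[0] - 1.0
```

This gives 0.0, -0.1, 0.2, 0.4, 0.6 and 0.8 up to floating-point rounding, in line with the table.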
#### Run with benchmark symbol option
**Command:**
```bash
run -f TEST_FOLDER/test_benchmark3.py -b csv-xpar-sample -s 01/01/2020 -e 01/09/2020 --benchmark-symbol b --trading-calendar XPAR
```
**Result**
```bash
[2020-02-07 10:28:00.235496] INFO: zipline.finance.metrics.tracker: Simulated 6 trading days
first open: 2020-01-02 08:01:00+00:00
last close: 2020-01-09 16:30:00+00:00
algo_volatility algorithm_period_return alpha \
2020-01-02 16:30:00+00:00 NaN 0.000 NaN
2020-01-03 16:30:00+00:00 0.000000 0.000 0.000000
2020-01-06 16:30:00+00:00 0.018330 -0.002 -0.070705
2020-01-07 16:30:00+00:00 0.055083 0.004 0.268001
2020-01-08 16:30:00+00:00 0.048217 0.006 0.310527
2020-01-09 16:30:00+00:00 0.043427 0.008 0.341041
```
Order on trading day 2:
```python
[{'amount': 1000, 'sid': Equity(0 [A]), 'dt': Timestamp('2020-01-03 16:30:00+0000', tz='UTC'), 'price': 120.0, 'order_id': '18d3e8ab70be4cf392b2f8e044e3680d', 'commission': None}]
```
| Date \(benchmark symbol run\) | algo\_volatility | algorithm\_period\_return | alpha | benchmark\_period\_return | benchmark\_volatility | beta | capital\_used |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| 2020-01-02 16:30:00+00:00 | | 0.0 | | 0.0 | | | 0.0 |
| 2020-01-03 16:30:00+00:00 | 0.0 | 0.0 | 0.0 | -0.09999999999999998 | 1.1224972160321822 | 0.0 | -120000.0 |
| 2020-01-06 16:30:00+00:00 | 0.018330302779823376 | -0.0020000000000000018 | -0.07070503597122309 | 0.19999999999999996 | 3.6018513757973594 | -0.004964028776978422 | 0.0 |
| 2020-01-07 16:30:00+00:00 | 0.05508274964812368 | 0.0040000000000000036 | 0.26800057257371374 | 0.40000000000000013 | 3.0243456592570017 | -0.0006048832358595375 | 0.0 |
| 2020-01-08 16:30:00+00:00 | 0.048217028714915476 | 0.006000000000000005 | 0.3105268451522164 | 0.6000000000000001 | 2.6367729194171097 | -0.0002895623813476541 | 0.0 |
| 2020-01-09 16:30:00+00:00 | 0.043427366958536835 | 0.008000000000000007 | 0.34104117177313514 | 0.8 | 2.3608034346685574 | -0.0001915086325657816 | 0.0 |
**Comment**
The benchmark data for the provided symbol is correctly loaded and matches the results from the test with benchmark\_file.
If the benchmark symbol is not found, a warning is raised and the default loader is used as a fallback:
```python
/Users/aennassiri/opensource/zipline/zipline/utils/run_algo.py:116: UserWarning: Symbol c as a benchmark not found in this bundle. Proceedig with default benchmark loader
"loader" % benchmark_symbol)
[2020-02-07 10:32:49.049269] INFO: Loader: Cache at /Users/aennassiri/.zipline/data/SPY_benchmark.csv does not have data from 1990-01-02 00:00:00+00:00 to 2020-02-07 00:00:00+00:00.
[2020-02-07 10:32:49.049469] INFO: Loader: Downloading benchmark data for 'SPY' from 1989-12-29 00:00:00+00:00 to 2020-02-07 00:00:00+00:00
```
**TODO:**
* Correct the warning message
* I ran the validation with the Paris calendar and also wanted to check how the system behaves when the trading calendar of the ingested data differs from the one used to run the algorithm. I need to review how the system booked an order at 21:30 even though the ingested data comes from a different calendar. This issue is independent of the benchmark validation
#### Run with --no-benchmark option
**Command:**
```bash
run -f TEST_FOLDER/test_benchmark3.py -b csv-xpar-sample -s 01/01/2020 -e 01/09/2020 --no-benchmark
```
**Result**
```bash
Warning: Using zero returns as a benchmark. The risk metrics that requires benchmark returns will not be calculated.
[2020-02-07 10:38:07.174387] INFO: zipline.finance.metrics.tracker: Simulated 6 trading days
first open: 2020-01-02 14:31:00+00:00
last close: 2020-01-09 21:00:00+00:00
algo_volatility algorithm_period_return alpha \
2020-01-02 21:00:00+00:00 NaN 0.000 None
2020-01-03 21:00:00+00:00 0.000000 0.000 None
2020-01-06 21:00:00+00:00 0.018330 -0.002 None
2020-01-07 21:00:00+00:00 0.055083 0.004 None
2020-01-08 21:00:00+00:00 0.048217 0.006 None
2020-01-09 21:00:00+00:00 0.043427 0.008 None
benchmark_period_return benchmark_volatility beta \
2020-01-02 21:00:00+00:00 0.0 None None
2020-01-03 21:00:00+00:00 0.0 [0.0] None
2020-01-06 21:00:00+00:00 0.0 [0.0] None
2020-01-07 21:00:00+00:00 0.0 [0.0] None
2020-01-08 21:00:00+00:00 0.0 [0.0] None
2020-01-09 21:00:00+00:00 0.0 [0.0] None
```
```python
[{'price': 120.0, 'order_id': 'b607afab7c674f22a303a0a483f00a31', 'amount': 1000, 'sid': Equity(0 [A]), 'commission': None, 'dt': Timestamp('2020-01-03 21:00:00+0000', tz='UTC')}]
```
**Comment**
* A warning states that the system is using zero returns as a benchmark.
* All results are the same across the last three runs, except for alpha \(None\), beta \(None\), benchmark\_returns \(zero\) and benchmark\_volatility \(zero\).
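The algorithm\_period\_return column does not depend on the benchmark, which is why it is identical across the runs. It can be reproduced with simple arithmetic from the reported fill \(1000 shares at 120.0\) and A's closing prices. The capital base of 10,000,000 below is an assumption: the commands above do not pass one, and this value is chosen because it reproduces the reported figures.

```python
CAPITAL_BASE = 10_000_000.0  # assumption: not stated in the runs above
SHARES = 1000
FILL_PRICE = 120.0  # from the reported order, filled on 2020-01-03

# Closing prices of A from the fill date onwards.
a_close = [120.0, 100.0, 160.0, 180.0, 200.0]

# With no commission or slippage, P&L is shares * (close - fill price).
algorithm_period_return = [
    SHARES * (close - FILL_PRICE) / CAPITAL_BASE for close in a_close
]
```

The values come out as 0.0, -0.002, 0.004, 0.006 and 0.008 up to floating-point rounding, matching the tables for 2020-01-03 through 2020-01-09.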
### Appendix
The comparison and reconciliation of the returns, volatility, alpha and beta can be found in this [sheet](https://docs.google.com/spreadsheets/d/1-Zl8fYPAH6k9dvhAUJ2iAZaKTTFm62Hybffp1t8cSfQ/edit?usp=sharing).