Numerai Python API
Automatically download and upload data for the Numerai machine learning competition.
This library is a Python client to the Numerai API. The interface is programmed
in Python and allows downloading the training data, uploading predictions, and
accessing user and submission information. Some parts of the code were taken
from numerflow by ChristianSch.
Visit his
wiki,
if you need further information on the reverse engineering process.
If you encounter a problem or have suggestions, feel free to open an issue.
Installation
- Obtain a copy of this API
-
If you do not plan on contributing to this repository, download a release.
- Navigate to releases.
- Download the latest version.
- Extract with
unzip
ortar
as necessary.
-
If you do plan on contributing, clone this repository instead.
cd
into the API directory (defaults tonumerapi
, but make sure not to go into the sub-directory also namednumerapi
).pip install -e .
Usage
See example.py
. You can run it as ./example.py
Documentation
Layout
Parameters and return values are given with Python types. Dictionary keys are
given in quotes; other names to the left of colons are for reference
convenience only. In particular, list
s of dict
s have names for the dict
s;
these names will not show up in the actual data, only the actual dict
data
itself.
login
Parameters
email
(str
, optional): email of user account- will prompt for this value if not supplied
password
(str
, optional): password of user account- will prompt for this value if not supplied
- prompting is recommended for security reasons
prompt_for_mfa
(bool
, optional): indication of whether to prompt for MFA code- only necessary if MFA is enabled for user account
Return Values
user_credentials
(dict
): credentials for logged-in user"username"
(str
)"access_token"
(str
)"refresh_token"
(str
)
download_current_dataset
Parameters
dest_path
(str
, optional, default:.
): destination folder for the datasetunzip
(bool
, optional, default:True
): indication of whether the training data should be unzipped
Return Values
success
(bool
): indication of whether the current dataset was successfully downloaded
get_all_competitions
Return Values
all_competitions
(list
): information about all competitionscompetition
(dict
)"_id"
(int
)"dataset_id"
(str
)"start_date"
(str (datetime)
)"end_date"
(str (datetime)
)"paid"
(bool
)"leaderboard
" (list
)submission
(dict
)"concordant"
(dict
)"pending"
(bool
)"value"
(bool
)
"earnings"
(dict
)"career"
(dict
)"nmr"
(str
)"usd"
(str
)
"competition"
(dict
)"nmr"
(str
)"usd"
(str
)
"logloss"
(dict
)"consistency"
(int
)"validation"
(float
)
"original"
(dict
)"pending"
(bool
)"value"
(bool
)
"submission_id"
(str
)"username"
(str
)
get_competition
Return Values
competition
(dict
): information about requested competition_id
(int
)"dataset_id"
(str
)"start_date"
(str (datetime)
)"end_date"
(str (datetime)
)"paid"
(bool
)"leaderboard"
(list
)submission
(dict
)"concordant"
(dict
)"pending"
(bool
)"value"
(bool
)
"earnings"
(dict
)"career"
(dict
)"nmr"
(str
)"usd"
(str
)
"competition"
(dict
)"nmr"
(str
)"usd"
(str
)
"logloss"
(dict
)"consistency"
: (int)
"validation": (float
)"original"
(dict
)"pending"
(bool
)"value"
(bool
)
"submission_id"
(str
)"username"
(str
)
get_earnings_per_round
Parameters
username
: user for which earnings are requested
Return Values
round_ids
(np.ndarray(int)
): IDs of each round for which there are earningsearnings
(np.ndarray(float)
): earnings for each round
get_scores_for_user
Parameters
username
: user for which scores are being requested
Return Values
validation_scores
(np.ndarray(float)
): logloss validation scoresconsistency_scoress
(np.ndarray(float)
): logloss consistency scoresround_ids
(np.ndarray(int
): IDs of the rounds for which there are scores
get_user
Parameters
username
:str
- name of requested user
Return Values
user
(dict
): information about the requested user"_id"
(str
)"username"
(str
)"assignedEthAddress"
(str
)"created"
(str (datetime)
)"earnings"
(float
)"followers"
(int
)"rewards"
(list
)reward
(dict
)"_id"
(int
)"amount"
(float
)"earned"
(float
)"nmr_earned"
(str
)"start_date"
(str (datetime)
)"end_date"
(str (datetime)
)
"submissions"
(dict
)"results"
(list
)result
(dict
)"_id"
(str
)"competition"
(dict
)"_id"
(str
)"start_date"
(str (datetime)
)"end_date"
(str (datetime)
)
"competition_id"
(int
)"created"
(str (datetime)
)"id"
(str
)"username"
(str
)
get_submission_for_round
Parameters
username
(str
): user for which submission is requestedround_id
(int
, optional): round for which submission is requested- if no
round_id
is supplied, the submission for the current round will be retrieved
- if no
Return Values
username
(str
): user for which submission is requestedsubmission_id
(str
): ID of submission for which data was foundlogloss_val
(float
): amount of logloss for given submissionlogloss_consistency
(float
): consistency of given submissioncareer_usd
(float
): amount of USD earned by given usercareer_nmr
(float
): amount of NMR earned by given userconcordant
(bool
ORdict
(see note)): whether given submission is concordant- for rounds before 64, this was only a boolean, but from 64 on, it is a dict which indicates whether concordance is still being computed
original
(bool
ORdict
(see note)): whether given submission is original- for rounds before 64, this was only a boolean, but from 64 on, it is a dict which indicates whether originality is still being computed
upload_predictions
Parameters
file_path
(str
): path to CSV of predictions- should already contain the file name (e.g.
"path/to/file/prediction.csv"
)
- should already contain the file name (e.g.
Return Values
success
: indicator of whether the upload succeeded
Notes
- Uploading a prediction shortly before a new dataset is released may result in
a
400 Bad Request
. If this happens, wait for the new dataset and attempt to upload again. - Uploading too many predictions in a certain amount of time will result in a
429 Too Many Requests
.