[CLI]: API update delay
liuyixin-louis opened this issue · comments
Describe the bug
I have some finished runs with certain config names, but when I used the wandb API to retrieve them, it did not return all of the runs up to date. I wonder whether there is a config option to sync the API with the latest state shown in the web app.
Additional Files
No response
Environment
WandB version:
OS:
Python version:
Versions of relevant libraries:
Additional Context
No response
Hi @liuyixin-louis! Thank you for writing in!
Could you please send me the workspace of the runs you are trying to pull up using the API? What field are you searching by? Could you also send a toy script showing how you use the API?
Hi, thanks for your reply. Yes, I can share it; I masked some fields for privacy, but the structure is the same. I found that the API does return the right results after waiting a while, roughly 10 minutes as I recall. The code mainly tries to compute the mean over runs that share a certain config.
import os
import numpy as np
import pickle as pkl

dataset_name = "XXX"
runs = api.runs("XXX", {"State": "finished", "config.dataset_name": dataset_name, "config.exp_name": "XXX"})

prompt2scoredict_dict = {}

def all_reduce_metrics(runs, value=False):
    for run in runs:
        # 'restult_artifact_name' is the key actually logged in run.summary
        if 'restult_artifact_name' not in run.summary:
            continue
        art_name = run.summary['restult_artifact_name']
        artifact = api.artifact('XXX' + art_name + ':latest')
        if not os.path.exists(artifact.file()):
            artifact.download()
        with open(artifact.file(), 'rb') as f:
            file = pkl.load(f)
        # 'propmt2score' is the key actually stored in the pickle
        prompt2scoredict_dict[run.config['instance_name']] = file['propmt2score']
    reducer = {}
    for instance in prompt2scoredict_dict:
        for prompt in prompt2scoredict_dict[instance]:
            for metric in prompt2scoredict_dict[instance][prompt]:
                if metric not in reducer:
                    reducer[metric] = []
                reducer[metric] += prompt2scoredict_dict[instance][prompt][metric]
    print(f'all reduce over {len(prompt2scoredict_dict)} instances')
    if value:
        return {k: np.mean(v) for k, v in reducer.items()}
    return reducer

metrics = [
    'M1',
    'M2',
]

def reduce_over_one_variable(runs, control='note', metric_subset=None):
    runs_note_unique_list = list(set(run.config[control] for run in runs))
    res_over_control = {k: {} for k in runs_note_unique_list}
    runs_over_control = {k: [] for k in runs_note_unique_list}
    for run in runs:
        runs_over_control[run.config[control]].append(run)
    for k in runs_note_unique_list:
        print(k)
        res_over_control[k] = all_reduce_metrics(runs_over_control[k], value=True)
    if metric_subset is not None:
        return {
            k: {metric: res_over_control[k][metric] for metric in metric_subset}
            for k in res_over_control
        }
    return res_over_control  # fall back to all metrics when no subset is given

reduce_over_one_variable(runs, control='note', metric_subset=metrics)
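A server-side indexing delay cannot be fixed from the client, but note that the wandb `Api` object also keeps a local cache of runs; the documented `api.flush()` call clears that cache so the next `api.runs(...)` fetches fresh data. Below is a minimal polling sketch; `wait_for_runs`, the `fetch_runs` callable, and the `expected` count are hypothetical names for illustration, not part of the original script.

```python
import time

def wait_for_runs(fetch_runs, expected, timeout=600, interval=30):
    """Poll until fetch_runs() yields at least `expected` runs or timeout.

    fetch_runs: a zero-argument callable returning a list of runs. With the
    real wandb API this would flush the Api cache and re-query, e.g.
    lambda: (api.flush(), list(api.runs(path, filters)))[1].
    """
    deadline = time.time() + timeout
    while True:
        runs = list(fetch_runs())
        # Stop as soon as enough runs are visible, or when time runs out.
        if len(runs) >= expected or time.time() >= deadline:
            return runs
        time.sleep(interval)
```

This only papers over the delay by retrying; if the backend has not indexed a run yet, no client-side option can surface it sooner.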
Thank you so much for sending it over @liuyixin-louis! That is very much appreciated and I will take a look.
How many runs per project do you currently have? Would you be able to share the workspace with me? It is strange that it takes a while to update. Are you running sweeps or regular wandb runs?
Also what version of wandb are you using?
@ArtsiomWB Thank you for your response. This project currently has 3647 runs, and yes, I can share the workspace (code and wandb logs); could you give me your wandb account or email? I was not using the sweeps feature; I think these are just regular wandb runs. My wandb version is 0.16.6.
Totally!
Could you please send it to artsiom.skarakhod@wandb.com
Hi Artsiom, I found that the problem has gone away these days. Thanks for helping!