wandb / wandb

🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.

Home Page: https://wandb.ai

[CLI]: API update delay

liuyixin-louis opened this issue · comments

Describe the bug

I have some finished runs with certain config names, but when I used the wandb API to retrieve them, it did not return all of the runs up to date. I wonder whether there is a config option to sync the API results with the latest state shown in the web app.
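For reference, the filter dict I pass looks roughly like this (the values are placeholders, not my real project names; the keys follow the MongoDB-style filters that `Api.runs` accepts):

```python
def build_run_filters(dataset_name, exp_name):
    # MongoDB-style filter dict for wandb's Api.runs; values are placeholders.
    return {
        "state": "finished",
        "config.dataset_name": dataset_name,
        "config.exp_name": exp_name,
    }

# Used as: api.runs("entity/project", build_run_filters("my-dataset", "my-exp"))
```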

Additional Files

No response

Environment

WandB version:

OS:

Python version:

Versions of relevant libraries:

Additional Context

No response

Hi @liuyixin-louis! Thank you for writing in!

Could you please send me the workspace of the runs you are trying to pull up using the API? What field are you searching by? Could you also send the toy script of you using the api?

Hi, thanks for your reply. Yes, I can share it; I masked some fields for privacy, but the structure is the same. I found that it does pull the right results after waiting for a while (~10 minutes, as I remember). The code mainly tries to compute the mean of some metrics over runs that share a certain config.

import os
import pickle as pkl

import numpy as np
import wandb

api = wandb.Api()

dataset_name = "XXX"
runs = api.runs("XXX", {"State": "finished", "config.dataset_name": dataset_name, "config.exp_name": "XXX"})

def all_reduce_metrics(runs, value=False):
    # Keys like 'restult_artifact_name' / 'propmt2score' match what the runs stored.
    # Local dict so that different groups of runs don't accumulate across calls.
    prompt2scoredict_dict = {}
    for run in runs:
        if 'restult_artifact_name' not in run.summary:
            continue
        art_name = run.summary['restult_artifact_name']
        artifact = api.artifact('XXX' + art_name + ":latest")
        if not os.path.exists(artifact.file()):
            artifact.download()
        with open(artifact.file(), 'rb') as f:
            file = pkl.load(f)
        prompt2scoredict_dict[run.config['instance_name']] = file['propmt2score']
    # Pool every (instance, prompt) value list per metric.
    reducer = {}
    for instance in prompt2scoredict_dict:
        for prompt in prompt2scoredict_dict[instance]:
            for metric, values in prompt2scoredict_dict[instance][prompt].items():
                reducer.setdefault(metric, []).extend(values)
    print(f'all reduce over {len(prompt2scoredict_dict)} instances')
    if value:
        return {k: np.mean(v) for k, v in reducer.items()}
    return reducer

metrics = ['M1', 'M2']

def reduce_over_one_variable(runs, control='note', metric_subset=None):
    # Group runs by the value of one config field, then reduce each group.
    control_values = list(set(run.config[control] for run in runs))
    runs_over_control = {k: [] for k in control_values}
    for run in runs:
        runs_over_control[run.config[control]].append(run)
    res_over_control = {}
    for k in control_values:
        print(k)
        res_over_control[k] = all_reduce_metrics(runs_over_control[k], value=True)
    if metric_subset is not None:
        return {
            k: {metric: res_over_control[k][metric] for metric in metric_subset}
            for k in res_over_control
        }
    return res_over_control

reduce_over_one_variable(runs, control='note', metric_subset=metrics)
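In the meantime I just rerun the script after a delay. A generic polling helper along these lines could paper over the lag (this is only a sketch around an arbitrary fetch callable, not a wandb API feature; the names and defaults are my own):

```python
import time

def wait_for_runs(fetch, expected_count, timeout=600, interval=30):
    """Call `fetch()` (e.g. lambda: list(api.runs(...))) repeatedly until it
    returns at least `expected_count` items or `timeout` seconds elapse.
    Returns whatever the last fetch produced."""
    deadline = time.monotonic() + timeout
    results = fetch()
    while len(results) < expected_count and time.monotonic() < deadline:
        time.sleep(interval)
        results = fetch()
    return results
```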

Thank you so much for sending it over @liuyixin-louis! That is very much appreciated and I will take a look.

How many runs per project do you currently have? Would you be able to share the workspace with me? It is strange that it takes a while to update. Are you running sweeps or regular wandb runs?

Also what version of wandb are you using?

@ArtsiomWB Thank you for your response. My current run count for this project is 3647, and yes, I can share the workspace (code and wandb logs). Can you give me your wandb account or email? I was not using the sweeps feature; I think these are just regular wandb runs. My wandb version is 0.16.6.

Totally!

Could you please send it to artsiom.skarakhod@wandb.com?

Hi Artsiom, I found that the problem has gone away over the last few days. Thanks for helping!