Issue running notebook
ShvetankPrakash opened this issue
After installing via pip install google-vizier and then cloning this repo to obtain the running_vizier.ipynb notebook, I ran the first cell and had the following error:
File ~/miniconda3/lib/python3.8/site-packages/vizier/_src/pyvizier/oss/metadata_util.py:6, in <module>
3 from typing import Tuple, Union, Optional, TypeVar, Type, Literal
5 from vizier._src.pyvizier.shared import trial
----> 6 from vizier.service import key_value_pb2
7 from vizier.service import study_pb2
8 from vizier.service import vizier_service_pb2
ImportError: cannot import name 'key_value_pb2' from 'vizier.service' (/home/.../miniconda3/lib/python3.8/site-packages/vizier/service/__init__.py)
Have other folks run into this when trying to run the notebook, and does anyone have a solution?
Hi @ShvetankPrakash, I suspect that this is an issue with the use of Miniconda to handle Python packages (I'm not sure if Miniconda supports Protocol Buffers).
Could you try the code by copy-pasting it into a regular .py file, preferably using a normal version of Python 3.9+?
@ShvetankPrakash As part of pip install google-vizier, it is supposed to execute install.sh to build the proto files locally and generate the pb2 files.
Can you please check the Python virtualenv you're running in to see if the pb2 files were generated?
They are likely missing (pb2 files are regular Python libraries).
If they are missing, can you execute install.sh in that virtualenv?
Best,
Sagi
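One way to perform the check suggested above is sketched below. It only asks Python whether the generated modules can be located in the active environment. The module names are taken from the traceback in the original report; the helper name missing_modules is made up for this example and is not part of Vizier.

```python
import importlib.util

# Module names taken from the traceback in the original report.
PB2_MODULES = [
    "vizier.service.key_value_pb2",
    "vizier.service.study_pb2",
    "vizier.service.vizier_service_pb2",
]


def missing_modules(names):
    """Return the subset of `names` that cannot be located in this environment.

    Hypothetical helper for illustration; not part of Vizier.
    """
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except ImportError:
            # Raised when a parent package is absent or fails to import.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing generated modules:", missing_modules(PB2_MODULES))
```

If the list printed is non-empty, the pb2 files were most likely not generated in that environment, which is the situation install.sh is meant to fix.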
Hi folks thanks for replying!
@sagipe running install.sh resolved the cannot import key_value_pb2 error in the original issue message.
However, now I am getting the following error when trying to run the first cell of imports in the notebook:
from vizier.service import clients
File "/home/sprakash/Documents/Repos/vizier/vizier/service/clients.py", line 9, in <module>
from vizier.service import pyvizier as vz
File "/home/sprakash/Documents/Repos/vizier/vizier/service/pyvizier/__init__.py", line 5, in <module>
from vizier._src.pyvizier.oss import metadata_util
File "/home/sprakash/Documents/Repos/vizier/vizier/_src/pyvizier/oss/metadata_util.py", line 119, in <module>
metadata: common.Metadata) -> list[key_value_pb2.KeyValue]:
TypeError: 'type' object is not subscriptable
@xingyousong I get this same error when I simply copy and paste the code into a .py file and execute it too.
Any thoughts y'all? Thanks again!
Ah ok - the next issue, I realize, is that the code uses list for type annotations (which is only supported in newer versions of Python 3) rather than List, sorry - can you try Python 3.10?
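For future readers, the TypeError above comes from subscripting the built-in list in an annotation. Built-in generics such as list[int] are accepted from Python 3.9 onward (PEP 585), while typing.List[int] also works on older versions. The sketch below shows the difference; the function names are invented for the example and it makes no claim about which Python version the rest of Vizier needs.

```python
import sys
from typing import List


def total(values: List[int]) -> int:
    """typing.List in an annotation also works on older Python 3 versions."""
    return sum(values)


def builtin_generics_supported() -> bool:
    """Return True if the built-in `list` can be subscripted (Python 3.9+)."""
    try:
        list[int]
    except TypeError:
        # Python 3.8 and older: 'type' object is not subscriptable
        return False
    return True


if __name__ == "__main__":
    print(sys.version_info[:2], builtin_generics_supported())
```

On Python 3.8, the second function returns False for the same reason the import of metadata_util.py failed.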
Alright looks like that worked :) Thanks @xingyousong!
The first cell with the imports still throws these warnings (but at least it runs to completion):
warning in stationary: failed to import cython module: falling back to numpy
warning in coregionalize: failed to import cython module: falling back to numpy
warning in choleskies: failed to import cython module: falling back to numpy
I am able to run the rest of the notebook successfully though it seems!
This is the output of the final two cells:
Does this look right to you @xingyousong @sagipe ?
Two things I learned from this issue that were not clear to me before, noted here for future reference in case it helps:
- Need to run ./install.sh
- Need to use Python 3.10
@ShvetankPrakash Yeah the output looks correct - Note that some of the cells in certain colabs aren't meant to be run (e.g. the ones involving class definitions).
I'll send a commit to change back to the older Python typing for now, to be consistent and avoid dict / list type annotations.
Closing for now.
One follow up question @xingyousong! I am trying to integrate your open source version of Vizier with CFU-Playground, a fully open source framework for designing TinyML accelerators. That framework currently uses Python 3.7. Is there a package version of your open source Vizier compatible with Python 3.7? Thanks!
@ShvetankPrakash One of the nice things about OSS Vizier is that it's using a server-client architecture.
This means that you can install OSS Vizier, run the server on your machine (from the terminal, or from a notebook runtime with Python 3.10+), and then in a separate Colab / Python script / IPython, run the client to connect to the server.
If you want to run everything from Colab, then you could:
- Run one Colab with Python 3.10+, and execute only the "Setting up the server" section
- Run another Colab with Python 3.7, where you skip the "Setting up the server", but run "Setting up the client" and all other related sections.
Alternatively, run the Vizier server on the command line using:
cd vizier/demos
python run_vizier_server.py
And then run the Colab with Python 3.7, where you skip the "Setting up the server", but run "Setting up the client" and all other related sections.
Best,
Sagi
@sagipe thank you so much for the great suggestion! This I think should work as a solution :)
One thing to mention: for now, I did have to manually add the changes from @xingyousong's latest PR in order for the client code to be compatible with 3.7.
When I made those changes I was able to run the Colab notebook using 3.7 for the client side code and the server code using 3.10!
Thanks again for the terrific help to you both!
Hi folks!
I was able to get Vizier integrated and working with CFU-Playground! :)
I had a follow up question that is related to this issue:
In the notebook running_vizier.ipynb, is there an example of how Vizier parallelizes the following code block when searching?
suggestions = study_client.suggest(count=5)
for suggestion in suggestions:
    x = suggestion.parameters['x']
    y = suggestion.parameters['y']
    print('Suggested Parameters (x,y):', x, y)
    final_measurement = vz.Measurement({'maximize_metric': evaluate(x, y)})
    suggestion.complete(final_measurement)
Would appreciate your insight on this @sagipe or @xingyousong , thank you so much!
Hi @ShvetankPrakash, when we say parallelization, we mean that the server can handle multiple clients working on the same study.
So for example, if your objective can be computed in a single thread, then multi-threading can be used where each thread uses a client, see here for an example: https://github.com/google/vizier/blob/main/vizier/service/performance_test.py#L43
Another case is if an entire machine needs to be used to compute a single objective (e.g. in settings where you're training a large neural network). In this case, you'd have to perform some real distributed networking to have all of the worker machines connect to the server machine.
Hi @xingyousong -
Appreciate the explanation, makes sense! I had one more follow up to make sure I understand that code block in my previous comment.
When the user requests 'count' number of suggestions, are all the suggestions determined at that moment or iteratively? As in, when we write:
suggestions = study_client.suggest(count=5)
for suggestion in suggestions:
    ...
    suggestion.complete(final_measurement)
Is each iteration through the for loop using a suggestion that takes into account the result of the previous suggestion? Just trying to understand how the feedback loop works for Vizier to explore the design space :) Thanks!
Are all the suggestions determined at that moment or iteratively?
This depends on the algorithm and how it supports batched suggestions.
Certain algorithms (e.g. CMA-ES or Policy Gradients) can naturally produce batched suggestions by simply sampling multiple times i.i.d. from the output distribution. Note, however, that this ignores pending suggestions already made and can, with bad luck, produce duplicate outputs.
Other algorithms, such as Bayesian Optimization, use hallucination: they take pending suggestions that have already been output into account by imagining that they led to poor objective values.
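The difference between the two batching styles can be seen in a toy example. The sketch below is not how Vizier's algorithms work internally: the search space is a small list of integers, both function names are invented for the example, and simply excluding pending points is only a crude stand-in for hallucination in Bayesian Optimization.

```python
import random


def iid_batch(search_space, count, rng):
    """Sample `count` points independently, ignoring earlier suggestions."""
    return [rng.choice(search_space) for _ in range(count)]


def pending_aware_batch(search_space, count, rng):
    """Sample one point at a time, excluding points that are still pending."""
    pending = []
    for _ in range(count):
        candidates = [p for p in search_space if p not in pending]
        pending.append(rng.choice(candidates))
    return pending


rng = random.Random(0)

# Five independent draws from three possible values must contain a repeat.
batch = iid_batch([0, 1, 2], 5, rng)

# Taking pending points into account avoids repeats within the batch.
unique_batch = pending_aware_batch(list(range(10)), 5, rng)

print("i.i.d. batch:", batch)
print("pending-aware batch:", unique_batch)
```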
@xingyousong I see! But that is all handled by the algo specified when instantiating the study_client? As in, as a user we do not need to feed the suggestion outputs back to the study_client ourselves (perhaps suggestion.complete(final_measurement) is what does that)? Just trying to see if, at the end of the for loop, there is anything I need to feed back in to get more suggestions.
We conveniently wrapped the StudyClient API so that it technically returns a TrialClient, which also has a link to the server: https://github.com/google/vizier/blob/main/vizier/service/clients.py#L146, so those suggestions are actually instances of TrialClient.
So suggestion.complete(...) is all you need to do, and under the hood, the feedback will be sent to the server.
Got it, thank you so much for your replies @xingyousong ! They really helped a lot :)
To make sure this is clear for future readers:
For code like:
suggestions = study_client.suggest(count=5)
for suggestion in suggestions:
    x = suggestion.parameters['x']
    y = suggestion.parameters['y']
    print('Suggested Parameters (x,y):', x, y)
    final_measurement = vz.Measurement({'maximize_metric': evaluate(x, y)})
    suggestion.complete(final_measurement)
Users typically request batches of trials (instead of one trial at a time) only when evaluating a suggestion is very quick, so that requesting a batch of N trials and evaluating them quickly (or in parallel) is faster than requesting one trial, evaluating it, and requesting another.
If you want to evaluate multiple trials in parallel and have the algorithm be aware of all currently completed trials when generating the next suggestion, then it's better to run multiple processes/threads, each calling study_client.suggest(count=1), evaluating the suggestion, and calling suggestion.complete(final_measurement).
That way, slow evaluations don't hold up faster ones, and the server always has the latest completed trials available.
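A rough sketch of that worker pattern follows. To keep it runnable without a Vizier server, FakeStudy and FakeTrial are hypothetical stand-ins that only mimic the suggest(count=...) / parameters / complete(...) shape used in this thread; they perform random search and are not Vizier classes. The evaluate function is likewise a made-up objective. With the real service, each thread would use its own client connected to the same study, as described earlier in the thread.

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeTrial:
    """Hypothetical stand-in for the trial object returned by suggest()."""

    def __init__(self, study, parameters):
        self._study = study
        self.parameters = parameters

    def complete(self, measurement):
        # Report the result back to the (fake) study.
        self._study.record(self.parameters, measurement)


class FakeStudy:
    """Hypothetical stand-in for a study client: thread-safe random search."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._lock = threading.Lock()
        self.completed = []

    def suggest(self, count=1):
        with self._lock:
            return [
                FakeTrial(self, {'x': self._rng.uniform(-1, 1),
                                 'y': self._rng.uniform(-1, 1)})
                for _ in range(count)
            ]

    def record(self, parameters, measurement):
        with self._lock:
            self.completed.append((parameters, measurement))


def evaluate(x, y):
    """Made-up objective to maximize; the best value is 0 at (0, 0)."""
    return -(x ** 2 + y ** 2)


def worker(study, num_trials):
    """Request one suggestion at a time, evaluate it, and report the result."""
    for _ in range(num_trials):
        (suggestion,) = study.suggest(count=1)
        x = suggestion.parameters['x']
        y = suggestion.parameters['y']
        suggestion.complete({'maximize_metric': evaluate(x, y)})


def run_parallel(num_workers=4, trials_per_worker=5):
    study = FakeStudy()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(worker, study, trials_per_worker)
                   for _ in range(num_workers)]
        for future in futures:
            future.result()  # Re-raise any exception from a worker.
    return study.completed


if __name__ == "__main__":
    print(len(run_parallel()), "trials completed")
```

Because each worker asks for a single suggestion only after completing its previous one, a slow evaluation in one thread never delays the others.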