Issue running notebook

Question

ShvetankPrakash opened this issue 2 years ago

After installing via pip install google-vizier and then cloning this repo to obtain the running_vizier.ipynb notebook, I ran the first cell and had the following error:

File ~/miniconda3/lib/python3.8/site-packages/vizier/_src/pyvizier/oss/metadata_util.py:6, in <module>
      3 from typing import Tuple, Union, Optional, TypeVar, Type, Literal
      5 from vizier._src.pyvizier.shared import trial
----> 6 from vizier.service import key_value_pb2
      7 from vizier.service import study_pb2
      8 from vizier.service import vizier_service_pb2

ImportError: cannot import name 'key_value_pb2' from 'vizier.service' (/home/.../miniconda3/lib/python3.8/site-packages/vizier/service/__init__.py)

Have other folks run into this when trying to run the notebook, and does anyone have a solution?

xingyousong · Answer 1 · Thu Aug 25 2022 22:28:33 GMT+0800 (China Standard Time)

Hi @ShvetankPrakash, I suspect that this is an issue with the use of MiniConda to handle Python packages (I'm not sure if MiniConda supports Protocol Buffers).

Could you try the code by copy-pasting it into a regular .py file, preferably using a normal installation of Python 3.9+?

Sagi Perel · Answer 2 · Thu Aug 25 2022 22:54:26 GMT+0800 (China Standard Time)

@ShvetankPrakash As part of pip install google-vizier, it is supposed to execute install.sh, to build the proto files locally and generate the pb2 files.
Can you please check the Python virtualenv you're running in, to see if the pb2 files were generated?
They likely are missing (pb2 files are regular Python libraries).

If not, can you execute install.sh in that virtualenv?

Best,
Sagi
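
One way to check whether the pb2 files were generated, sketched here with only the standard library: ask Python whether it can locate the three modules named in the traceback above. The helper name module_available is made up for this illustration. Note that importlib.util.find_spec imports the parent packages (vizier and vizier.service) while searching, so it should be run in the same virtualenv as the notebook.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the module `name` can be located in this environment."""
    try:
        # find_spec imports parent packages, then looks for the submodule.
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return False


# Module names taken from the traceback in the original report.
PB2_MODULES = [
    "vizier.service.key_value_pb2",
    "vizier.service.study_pb2",
    "vizier.service.vizier_service_pb2",
]

for name in PB2_MODULES:
    status = "found" if module_available(name) else "MISSING (try running install.sh)"
    print(f"{name}: {status}")
```

If any of the modules is reported as missing, running install.sh in that virtualenv, as suggested above, is the next step.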

Shvetank Prakash · Answer 3 · Fri Aug 26 2022 00:58:41 GMT+0800 (China Standard Time)

Hi folks thanks for replying!

@sagipe running install.sh resolved the cannot import key_value_pb2 error in the original issue message

However now I am getting the following error when trying to run the first cell of imports in the notebook

    from vizier.service import clients
  File "/home/sprakash/Documents/Repos/vizier/vizier/service/clients.py", line 9, in <module>
    from vizier.service import pyvizier as vz
  File "/home/sprakash/Documents/Repos/vizier/vizier/service/pyvizier/__init__.py", line 5, in <module>
    from vizier._src.pyvizier.oss import metadata_util
  File "/home/sprakash/Documents/Repos/vizier/vizier/_src/pyvizier/oss/metadata_util.py", line 119, in <module>
    metadata: common.Metadata) -> list[key_value_pb2.KeyValue]:
TypeError: 'type' object is not subscriptable

@xingyousong I get this same error when I simply try to copy and paste the code into a .py file and execute it too.

Any thoughts y'all? Thanks again!

xingyousong · Answer 4 · Fri Aug 26 2022 02:13:26 GMT+0800 (China Standard Time)

Ah ok - the next issue is that I realize the code mixes type annotations written with the built-in list (e.g. list[...], which is only subscriptable in Python 3.9+) and with typing.List, sorry - can you try Python 3.10?
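
For context, subscripting built-in types such as list[...] became valid at runtime in Python 3.9 (PEP 585); on 3.8 the annotation is evaluated when the function is defined and raises the TypeError shown above. A minimal sketch of the spelling that also works on older interpreters follows. The function metadata_to_pairs is a hypothetical stand-in, not code from Vizier.

```python
import sys
from typing import Dict, List, Tuple


def metadata_to_pairs(metadata: Dict[str, str]) -> List[Tuple[str, str]]:
    """Hypothetical helper: typing.List / typing.Dict work on Python 3.7 and 3.8."""
    return sorted(metadata.items())


# Built-in generics (list[int], dict[str, str]) are only subscriptable on 3.9+.
BUILTIN_GENERICS_OK = sys.version_info >= (3, 9)
print("list[...] annotations usable at runtime:", BUILTIN_GENERICS_OK)
```

Another option on Python 3.7+ is to put from __future__ import annotations at the top of a module, which postpones evaluation of annotations so that list[...] no longer fails at definition time.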

Shvetank Prakash · Answer 5 · Fri Aug 26 2022 02:31:24 GMT+0800 (China Standard Time)

Alright looks like that worked :) Thanks @xingyousong!

The first cell with the imports still emits these warnings (but at least it runs to completion).

warning in stationary: failed to import cython module: falling back to numpy
warning in coregionalize: failed to import cython module: falling back to numpy
warning in choleskies: failed to import cython module: falling back to numpy

It seems I am able to run the rest of the notebook successfully, though!
This is the output of the final two cells:

Does this look right to you @xingyousong @sagipe ?

Shvetank Prakash · Answer 6 · Fri Aug 26 2022 02:33:16 GMT+0800 (China Standard Time)

Two things I learned from this issue that were not clear to me before so I just want to note here for future reference if it helps out!

Need to run ./install.sh
Need to use Python 3.10

xingyousong · Answer 7 · Fri Aug 26 2022 02:42:17 GMT+0800 (China Standard Time)

@ShvetankPrakash Yeah the output looks correct - Note that some of the cells in certain colabs aren't meant to be run (e.g. the ones involving class definitions).

I'll send a commit to change back to the older Python typing (typing.Dict / typing.List) for now, to be consistent and avoid the built-in dict / list annotations.

Closing for now.

Shvetank Prakash · Answer 8 · Sat Aug 27 2022 01:26:53 GMT+0800 (China Standard Time)

One follow up question @xingyousong! I am trying to integrate your open source version of Vizier with CFU-Playground, a fully open source framework for designing TinyML accelerators. That framework currently uses Python 3.7. Is there a package version of your open source Vizier compatible with Python 3.7? Thanks!

Sagi Perel · Answer 9 · Sat Aug 27 2022 02:44:54 GMT+0800 (China Standard Time)

@ShvetankPrakash One of the nice things about OSS Vizier is that it's using a server-client architecture.

This means that you can install OSS Vizier, run the server on your machine (from the terminal, or from a notebook runtime with Python 3.10+), and then in a separate Colab / Python script / IPython, run the client to connect to the server.

If you want to run everything from Colab, then you could:

Run one Colab with Python 3.10+, and execute only the "Setting up the server" section
Run another Colab with Python 3.7, where you skip the "Setting up the server", but run "Setting up the client" and all other related sections.

Alternatively, run the Vizier server on the command line using:
cd vizier/demos
python run_vizier_server.py

And then run the Colab with Python 3.7, where you skip the "Setting up the server", but run "Setting up the client" and all other related sections.

Best,
Sagi

Shvetank Prakash · Answer 10 · Sat Aug 27 2022 03:43:37 GMT+0800 (China Standard Time)

@sagipe thank you so much for the great suggestion! This I think should work as a solution :)

One thing to mention: for now, I did have to manually apply the changes from @xingyousong's latest PR in order for the client code to be compatible with 3.7.

When I made those changes I was able to run the Colab notebook using 3.7 for the client side code and the server code using 3.10!

Thanks again for the terrific help to you both!

Shvetank Prakash · Answer 11 · Thu Oct 20 2022 03:26:58 GMT+0800 (China Standard Time)

Hi folks!

I was able to get Vizier integrated and working with CFU-Playground! :)

I had a follow up question that is related to this issue:

In the notebook running_vizier.ipynb, is there an example of how Vizier parallelizes the following code block when searching:

suggestions = study_client.suggest(count=5)
for suggestion in suggestions:
  x = suggestion.parameters['x']
  y = suggestion.parameters['y']
  print('Suggested Parameters (x,y):', x, y)
  final_measurement = vz.Measurement({'maximize_metric': evaluate(x, y)})
  suggestion.complete(final_measurement)

Would appreciate your insight on this @sagipe or @xingyousong , thank you so much!

xingyousong · Answer 12 · Thu Oct 20 2022 04:20:20 GMT+0800 (China Standard Time)

Hi @ShvetankPrakash, when we say parallelization, we mean that the server can handle multiple clients working on the same study.

So for example, if your objective can be computed in a single thread, then multi-threading can be used where each thread uses a client, see here for an example: https://github.com/google/vizier/blob/main/vizier/service/performance_test.py#L43

Another case is if an entire machine needs to be used to compute a single objective (e.g. in settings where you're training a large neural network). In this case, you'd have to perform some real distributed networking to have all of the worker machines connect to the server machine.

Shvetank Prakash · Answer 13 · Tue Dec 20 2022 15:17:37 GMT+0800 (China Standard Time)

Hi @xingyousong -

Appreciate the explanation, makes sense! I had one more follow up to make sure I understand that code block in my previous comment.

When the user requests 'count' number of suggestions, are all the suggestions determined at that moment or iteratively? As in when we write:

suggestions = study_client.suggest(count=5)
for suggestion in suggestions:
  ...
  suggestion.complete(final_measurement)

Is each iteration thru the for loop using a suggestion that takes into account the result of the previous suggestion? Just trying to understand how the feedback loop works for Vizier to explore the design space :) Thanks!

xingyousong · Answer 14 · Tue Dec 20 2022 21:27:51 GMT+0800 (China Standard Time)

Are all the suggestions determined at that moment or iteratively?
This depends on the algorithm and how it supports batched suggestions.

Certain algorithms (e.g. CMA-ES or Policy Gradients) naturally can produce batched suggestions by simply sampling multiple times IID from the output distribution. But note that this ignores pending suggestions already made and can unluckily lead to duplicate outputs.

Other algorithms such as Bayesian Optimization use hallucination and do take into consideration the intermediate pending suggestions (by imagining if they led to poor objective values) already outputted.
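
A toy illustration of the difference, using a tiny discrete search space. This is neither CMA-ES nor Bayesian optimization: the names CANDIDATES, suggest_iid and suggest_pending_aware are invented for the sketch, and the pending-aware variant merely skips points that are already pending, which is a much cruder mechanism than the hallucination described above.

```python
import random
from typing import List, Set

# Hypothetical search space: ten discrete candidate points.
CANDIDATES = list(range(10))


def suggest_iid(rng: random.Random, count: int) -> List[int]:
    """Sample each suggestion independently; duplicates are possible."""
    return [rng.choice(CANDIDATES) for _ in range(count)]


def suggest_pending_aware(rng: random.Random, count: int) -> List[int]:
    """Sample while tracking pending suggestions, so no point is repeated."""
    if count > len(CANDIDATES):
        raise ValueError("count exceeds the number of distinct candidates")
    pending: Set[int] = set()
    batch: List[int] = []
    while len(batch) < count:
        x = rng.choice(CANDIDATES)
        if x in pending:
            continue  # already suggested and not yet evaluated
        pending.add(x)
        batch.append(x)
    return batch


rng = random.Random(0)
print("IID batch:          ", suggest_iid(rng, 5))
print("Pending-aware batch:", suggest_pending_aware(rng, 5))
```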

Shvetank Prakash · Answer 15 · Wed Dec 21 2022 00:40:17 GMT+0800 (China Standard Time)

@xingyousong I see! But that is all handled by the algorithm specified when instantiating the study_client? As in, as a user we do not need to feed the suggestion results back to the study_client ourselves (perhaps suggestion.complete(final_measurement) is what does that)? Just trying to see whether, at the end of the for loop, there is anything I need to feed back in to get more suggestions.

xingyousong · Answer 16 · Wed Dec 21 2022 01:07:25 GMT+0800 (China Standard Time)

We conveniently wrapped the StudyClient API to return technically a TrialClient which also has a link to the server: https://github.com/google/vizier/blob/main/vizier/service/clients.py#L146, so those suggestions are actually instances of TrialClient.

So suggestion.complete(...) is all you need to do, and underneath the hood, the feedback will be sent to the server.
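
To make the feedback loop concrete, here is a sketch of several suggest/complete rounds. It deliberately does not import Vizier: StandInStudy and StandInTrial are hypothetical stand-ins that only mimic the suggest(count=...) and complete(...) calls quoted in this thread, a plain dict replaces vz.Measurement, and evaluate is an invented objective.

```python
class StandInTrial:
    """Hypothetical stand-in for a trial handle that reports back on complete()."""

    def __init__(self, parameters, completed):
        self.parameters = parameters
        self._completed = completed

    def complete(self, measurement):
        # In the real client this is where feedback would reach the server.
        self._completed.append((self.parameters, measurement))


class StandInStudy:
    """Hypothetical stand-in for a study client; suggests x = 0, 1, 2, ..."""

    def __init__(self):
        self.completed = []
        self._next_x = 0

    def suggest(self, count):
        trials = []
        for _ in range(count):
            params = {"x": float(self._next_x), "y": 1.0}
            trials.append(StandInTrial(params, self.completed))
            self._next_x += 1
        return trials


def evaluate(x, y):
    """Invented objective to maximize."""
    return -((x - 2.0) ** 2) - y ** 2


def run_rounds(study, rounds, batch_size):
    for _ in range(rounds):
        # Nothing is passed back in here: earlier complete() calls are the feedback.
        for suggestion in study.suggest(count=batch_size):
            x = suggestion.parameters["x"]
            y = suggestion.parameters["y"]
            suggestion.complete({"maximize_metric": evaluate(x, y)})


study = StandInStudy()
run_rounds(study, rounds=3, batch_size=2)
print("completed trials:", len(study.completed))
```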

Shvetank Prakash · Answer 17 · Wed Dec 21 2022 01:21:25 GMT+0800 (China Standard Time)

Got it, thank you so much for your replies @xingyousong ! They really helped a lot :)

Sagi Perel · Answer 18 · Mon Jan 02 2023 09:35:28 GMT+0800 (China Standard Time)

To make sure this is clear for future readers:

For code like:
suggestions = study_client.suggest(count=5)
for suggestion in suggestions:
  x = suggestion.parameters['x']
  y = suggestion.parameters['y']
  print('Suggested Parameters (x,y):', x, y)
  final_measurement = vz.Measurement({'maximize_metric': evaluate(x, y)})
  suggestion.complete(final_measurement)

Users typically request batches of trials (instead of one trial at a time) only when evaluating a suggestion is very quick, so that requesting a batch of N trials and evaluating them quickly (or in parallel) is faster than requesting one trial, evaluating it, and then requesting another one.

If you want to evaluate multiple trials in parallel and have the algorithm be aware of all currently completed trials when generating the next suggestion, then it's better to run multiple processes/threads, each asking for study_client.suggest(count=1), evaluating it and calling suggestion.complete(final_measurement).

That way, slow evaluations don't hold up faster ones, and the server always has the latest completed trials available.
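
The multi-worker pattern described above can be sketched with threads as follows. As an assumption made to keep the example self-contained, it does not talk to a Vizier server: StandInStudy and StandInTrial are hypothetical, thread-safe stand-ins for the server-backed clients, a plain dict replaces vz.Measurement, and evaluate is an invented objective. With real Vizier, each thread would use its own client connected to the same study.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StandInTrial:
    """Hypothetical stand-in for a trial handle."""

    def __init__(self, parameters, study):
        self.parameters = parameters
        self._study = study

    def complete(self, measurement):
        self._study.record(self.parameters, measurement)


class StandInStudy:
    """Hypothetical, thread-safe stand-in for the shared study on the server."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_x = 0
        self.completed = []

    def suggest(self, count):
        with self._lock:
            trials = []
            for _ in range(count):
                params = {"x": float(self._next_x), "y": 0.0}
                trials.append(StandInTrial(params, self))
                self._next_x += 1
            return trials

    def record(self, parameters, measurement):
        with self._lock:
            self.completed.append((parameters, measurement))


def evaluate(x, y):
    """Invented objective to maximize."""
    return -((x - 3.0) ** 2) - y ** 2


def worker(study, num_trials):
    """Each worker requests one suggestion at a time and reports it right away."""
    for _ in range(num_trials):
        (suggestion,) = study.suggest(count=1)
        x = suggestion.parameters["x"]
        y = suggestion.parameters["y"]
        suggestion.complete({"maximize_metric": evaluate(x, y)})


study = StandInStudy()
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(worker, study, 5) for _ in range(4)]
    for future in futures:
        future.result()  # re-raise any exception from a worker
print("completed trials:", len(study.completed))
```

Because every worker calls suggest(count=1) only after completing its previous trial, a slow evaluation in one worker never blocks the others.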