metadriverse / scenarionet

ScenarioNet: Scalable Traffic Scenario Management System for Autonomous Driving

sim problem in Google Colab and Mac

huyuening opened this issue · comments

Command:
!python -m scenarionet.sim -d /content/exp_converted/ --render 3D

Result:
Known pipe types:
glxGraphicsPipe
(1 aux display modules not yet loaded.)
:ShowBase(warning): Unable to open 'onscreen' window.
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/usr/local/lib/python3.10/dist-packages/scenarionet/sim.py", line 52, in
env.reset(seed=index if args.scenario_index is None else args.scenario_index)
File "/usr/local/lib/python3.10/dist-packages/metadrive/envs/base_env.py", line 467, in reset
self.lazy_init() # it only works the first time when reset() is called to avoid the error when render
File "/usr/local/lib/python3.10/dist-packages/metadrive/envs/base_env.py", line 353, in lazy_init
initialize_engine(self.config)
File "/usr/local/lib/python3.10/dist-packages/metadrive/engine/engine_utils.py", line 12, in initialize_engine
cls.singleton = cls(env_global_config)
File "/usr/local/lib/python3.10/dist-packages/metadrive/engine/base_engine.py", line 36, in init
EngineCore.init(self, global_config)
File "/usr/local/lib/python3.10/dist-packages/metadrive/engine/core/engine_core.py", line 171, in init
super(EngineCore, self).init(windowType=self.mode)
File "/usr/local/lib/python3.10/dist-packages/direct/showbase/ShowBase.py", line 341, in init
self.openDefaultWindow(startDirect = False, props=props)
File "/usr/local/lib/python3.10/dist-packages/direct/showbase/ShowBase.py", line 1026, in openDefaultWindow
self.openMainWindow(*args, **kw)
File "/usr/local/lib/python3.10/dist-packages/direct/showbase/ShowBase.py", line 1061, in openMainWindow
self.openWindow(*args, **kw)
File "/usr/local/lib/python3.10/dist-packages/direct/showbase/ShowBase.py", line 806, in openWindow
raise Exception('Could not open window.')
Exception: Could not open window.

Can this run on Colab?
Thanks.

The scenarionet/sim.py script requires a screen on your machine to render the 3D scenarios, while Colab machines don't have any video output devices and thus cannot run this script. But if your notebook is running locally on a machine with a screen, of course, you can launch this script.

But you can still visualize the scenarios in Colab :) A good workaround is to use the 2D pygame renderer, which doesn't need an X server, screen, and so on. Then you can save the frames to a GIF once you finish an episode and play that GIF.

An example is at https://colab.research.google.com/github/metadriverse/metadrive/blob/main/metadrive/examples/Basic_MetaDrive_Usages.ipynb. There is a section called Real-world Scenario Environment Visualization. All you need to pay attention to is using frame = env.render(mode="top_down", **extra_args) to save each frame, and then use

import pygame
from PIL import Image
from IPython.display import Image as IPyImage

# `frames` holds the pygame surfaces returned by env.render(mode="top_down", ...)
imgs = [pygame.surfarray.array3d(frame) for frame in frames]  # surface -> RGB array
imgs = [Image.fromarray(img) for img in imgs]                 # array -> PIL image
imgs[0].save("demo.gif", save_all=True, append_images=imgs[1:], duration=50, loop=0)
print("\nOpen gif...")
IPyImage(open("demo.gif", 'rb').read())

to generate the GIF.

Thanks for your feedback. I should make this more clear and will add a colab example to this repo soon.

When I followed the Colab example, it worked. Thank you very much!

New problem
In the ScenarioNet documentation, I only see the visualization. But I want to hijack a vehicle (e.g., the AV) and use an algorithm to control it. What can I do?

I think the ScenarioNet documentation doesn't show this process.

If you want to control the vehicle, just remove the config "agent_policy" from the dict. After that, the agent policy falls back to the default ExternalInputPolicy, which uses the input of env.step() to set the steering and throttle of the ego vehicle. The input is a two-dimensional vector [steering, throttle], and the values of both dims should be in the range [-1, 1]. Thus, for example, you can use env.step([0, 1]) to make the car move forward.
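
For example, a minimal control loop could look like the sketch below (the data_directory path, scenario count, and step count are just placeholders based on this thread; the rest mirrors the API used elsewhere in this conversation):

from metadrive.envs import ScenarioEnv

env = ScenarioEnv(dict(
    data_directory="/content/exp_converted",   # the converted dataset from above
    num_scenarios=3,
    # note: no "agent_policy" entry, so env.step() input drives the ego vehicle
))
obs, info = env.reset(seed=0)
for _ in range(100):
    # action = [steering, throttle], both in [-1, 1]; here: straight ahead, full throttle
    obs, reward, terminated, truncated, info = env.step([0.0, 1.0])
    if terminated or truncated:
        break
env.close()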

This looks like simple control, just throttle and steering, without perception and decision-making. I hope to use a trained autonomous driving algorithm. In your examples, can ScenarioNet with ROS or openpilot achieve this goal?

Well, it depends on how you build your autonomous driving system (ADS). Basically, an ADS is a mapping or function from image/lidar/IMU to steering/throttle. env.step() returns an observation which contains the image/lidar/IMU data that serves as the input of the ADS. Then your ADS should produce [steering, throttle], which is fed into the next env.step().

The pseudo-code is like:

my_ADS = ADS()
o, _ = env.reset()
for step in range(max_episode_len):
    action = my_ADS.compute_action(o)
    o, r, d, t, info = env.step(action)  # obs, reward, terminated, truncated, info
    if d or t:
        break

Therefore, the decision-making should happen inside my_ADS.compute_action(o). You can make it as complex as openpilot or as simple as an end-to-end RL policy. But even the complex openpilot controller still follows the decision-making procedure above, taking images as input and outputting steering/throttle.

Thanks for the answer. Is it possible to provide a simple end-to-end RL policy example in the documentation for easier understanding?

I cannot document too many details on training/designing at this time. Sorry about it.

But we do include an end-to-end driving policy in the simulator. The source code is at https://github.com/metadriverse/metadrive/blob/main/metadrive/examples/ppo_expert/numpy_expert.py
The policy is a 3-layer MLP trained with a huge amount of data. It takes 240 pseudo lidar points, IMU, and navigation info as input and outputs throttle and steering.

To experience this policy, just run: python -m metadrive.examples.drive_in_single_agent_env. The autopilot mode means the car is controlled by the end-to-end policy.

I get it.

In addition to Google Colab, my MacBook Air (Apple M1) has a similar problem.
What should I do? Thanks.

1. python -m scenarionet.sim -d /path/to/exp_converted --render 3D

:ShowBase(warning): Unable to open 'onscreen' window.
Traceback (most recent call last):
File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/Users/huyuening/mdsn/scenarionet/scenarionet/sim.py", line 52, in
env.reset(seed=index if args.scenario_index is None else args.scenario_index)
File "/Users/huyuening/mdsn/metadrive/metadrive/envs/base_env.py", line 557, in reset
self.lazy_init() # it only works the first time when reset() is called to avoid the error when render
File "/Users/huyuening/mdsn/metadrive/metadrive/envs/base_env.py", line 433, in lazy_init
initialize_engine(self.config)
File "/Users/huyuening/mdsn/metadrive/metadrive/engine/engine_utils.py", line 12, in initialize_engine
cls.singleton = cls(env_global_config)
File "/Users/huyuening/mdsn/metadrive/metadrive/engine/base_engine.py", line 55, in init
EngineCore.init(self, global_config)
File "/Users/huyuening/mdsn/metadrive/metadrive/engine/core/engine_core.py", line 183, in init
super(EngineCore, self).init(windowType=self.mode)
File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/site-packages/direct/showbase/ShowBase.py", line 341, in init
self.openDefaultWindow(startDirect = False, props=props)
File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/site-packages/direct/showbase/ShowBase.py", line 1026, in openDefaultWindow
self.openMainWindow(*args, **kw)
File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/site-packages/direct/showbase/ShowBase.py", line 1061, in openMainWindow
self.openWindow(*args, **kw)
File "/Users/huyuening/opt/anaconda3/envs/scenarionet/lib/python3.9/site-packages/direct/showbase/ShowBase.py", line 806, in openWindow
raise Exception('Could not open window.')
Exception: Could not open window.

2. python -m scenarionet.sim -d /path/to/exp_converted --render advanced

[!!!] RenderPipeline Sorry, your GPU does not support compute shaders! Make sure you have the latest drivers. If you already have, your gpu might be too old, or you might be using the open source drivers on linux.

Hi Yuening,

Sorry about that. It is actually a known issue that Macs with M-series chips cannot launch the 3D rendering service. A workaround is still to use the top-down renderer.

Quanyi

I cannot document too many details on training/designing at this time. Sorry about it.

But we do include an end-to-end driving policy in the simulator. The source code is at https://github.com/metadriverse/metadrive/blob/main/metadrive/examples/ppo_expert/numpy_expert.py The policy is a 3-layer MLP trained with a huge amount of data. It takes 240 pseudo lidar points, IMU, and navigation info as input and outputs throttle and steering.

To experience this policy, just run: python -m metadrive.examples.drive_in_single_agent_env. The autopilot mode means the car is controlled by the end-to-end policy.

Currently, ppo_expert is used as an example only in drive_in_single_agent_env on MetaDrive. However, my goal is to apply this policy to converted Waymo scenarios (from ScenarioNet) and control the ego car (the self-driving car) in each scenario. Due to my limited capacity, I cannot accomplish this by myself. Can you give me some help?

The ScenarioEnv is compatible with any reinforcement learning framework. I recommend setting the number of scenarios to 1 and using algorithms from stable-baselines3 to train your first policy in a single Waymo scene.
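
A minimal sketch of that suggestion (the dataset path, hyperparameters, and save name are placeholders):

from metadrive.envs import ScenarioEnv
from stable_baselines3 import PPO
from stable_baselines3.common.monitor import Monitor

# start from a single converted Waymo scenario
env = Monitor(ScenarioEnv(dict(data_directory="/path/to/exp_converted", num_scenarios=1)))

model = PPO("MlpPolicy", env, n_steps=4096, verbose=1)
model.learn(total_timesteps=1_000_000)
model.save("ppo_single_waymo_scenario")
env.close()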

If you are familiar with Ray, you can build your training script based on this:

https://github.com/metadriverse/scenarionet/blob/main/scenarionet_training/scripts/train_waymo.py

I tried a training demo based on stable-baselines3 from the MetaDrive documentation (https://metadrive-simulator.readthedocs.io/en/latest/training.html), and I ran into some problems.

import gymnasium as gym
import matplotlib.pyplot as plt
import os

from functools import partial
from IPython.display import clear_output
from IPython.display import Image
from metadrive.envs import MetaDriveEnv
from metadrive.envs import ScenarioEnv
from metadrive.utils import generate_gif
from stable_baselines3 import PPO
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.vec_env.subproc_vec_env import SubprocVecEnv

num_scenarios = 50000

def waymo_env(need_monitor=False):
    env = ScenarioEnv(
        dict(
            # manual_control=False,
            # reactive_traffic=False,
            # use_render=False,
            data_directory="/content/drive/MyDrive/exp_converted",
            num_scenarios=num_scenarios
        )
    )
    if need_monitor:
        env = Monitor(env)
    return env

# 8 subprocess to rollout
train_env=SubprocVecEnv([partial(waymo_env, True) for _ in range(8)])
# train_env=waymo_env()

model = PPO("MlpPolicy",
            train_env,
            n_steps=4096,
            verbose=1)
model.learn(total_timesteps=25_000 if os.getenv('TEST_DOC') else 300_000,
            log_interval=4)
# model.learn(total_timesteps=300_000, progress_bar=True)
train_env.close()
model.save("/content/drive/MyDrive/Autonomous_Driving_Algorithm_Waymo")

clear_output()
print("Training is finished! Generate gif ...")

# model.load("/content/drive/MyDrive/Autonomous_Driving_Algorithm_Waymo")

# evaluation
for seed in range(num_scenarios):
    try:
        total_reward = 0
        env=waymo_env()
        obs, _ = env.reset(seed=seed)
        for i in range(1000):
            action, _states = model.predict(obs, deterministic=True)
            obs, reward, done, _, info = env.step(action)
            total_reward += reward
            ret = env.render(mode="topdown",
                            screen_record=True,
                            window=False,
                            film_size=(1200, 1200)
                            # screen_size=(600, 600),
                            # camera_position=(50, 50)
                            )
            if done:
                print("episode_reward", total_reward)
                break

        env.top_down_renderer.generate_gif("scenario_{}.gif".format(seed))
    finally:
        env.close()
print("gif generation is finished ...")
  1. When I selected train_env=SubprocVecEnv([partial(waymo_env, True) for _ in range(8)])
EOFError                                  Traceback (most recent call last)
<ipython-input-16-f442555c378c> in <cell line: 39>()
     37             n_steps=4096,
     38             verbose=1)
---> 39 model.learn(total_timesteps=25_000 if os.getenv('TEST_DOC') else 300_000,
     40             log_interval=4)
     41 # model.learn(total_timesteps=300_000, progress_bar=True)

8 frames
/usr/lib/python3.10/multiprocessing/connection.py in _recv(self, size, read)
    381             if n == 0:
    382                 if remaining == size:
--> 383                     raise EOFError
    384                 else:
    385                     raise OSError("got end of file during message")

EOFError:
  2. When I selected train_env=waymo_env()
ValueError                                Traceback (most recent call last)
<ipython-input-15-6e9517b15ea7> in <cell line: 39>()
     37             n_steps=4096,
     38             verbose=1)
---> 39 model.learn(total_timesteps=25_000 if os.getenv('TEST_DOC') else 300_000,
     40             log_interval=4)
     41 # model.learn(total_timesteps=300_000, progress_bar=True)

17 frames
/usr/local/lib/python3.10/dist-packages/metadrive/utils/vertex.py in is_anticlockwise(points)
     64     n = len(points)
     65     for i in range(n):
---> 66         x1, y1 = points[i]
     67         x2, y2 = points[(i + 1) % n]  # The next point, wrapping around to the first
     68         sum += (x2 - x1) * (y2 + y1)

ValueError: too many values to unpack (expected 2)
  3. Did you use this training demo on the converted Waymo Open Motion Dataset? When I used a small number of scenarios as the training set, it worked, but the training/evaluation results were relatively poor: the target vehicle couldn't follow the reference trajectory. Could you give me some suggestions?

I have no idea about question 1. But question 2 seems to be a problem of MetaDrive; please set show_crosswalk and show_sidewalk to False to see if it is fixed. For the last problem, you have to increase the number of samples: 300_000 is not enough. Generally, 1 million is the minimum requirement. If you use PPO, the total number of steps should be increased to 10 million.
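
For example, in the ScenarioEnv config dict from your script, this amounts to two extra flags (the rest stays unchanged):

env = ScenarioEnv(
    dict(
        data_directory="/content/drive/MyDrive/exp_converted",
        num_scenarios=num_scenarios,
        show_crosswalk=False,   # workaround for the is_anticlockwise ValueError
        show_sidewalk=False,
    )
)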

Yes, after I set show_crosswalk=False and show_sidewalk=False, question 2 is solved.

I chose these training parameters:

num_scenarios = 70,000
total_timesteps = 10,000,000

[figure: PPO_training]
The figure shows the training process. Can I consider that the training has achieved good results after 4 million timesteps? Also, what do some key indicators in the log mean; for example, why can ep_len_mean and ep_rew_mean reach such large values?

Yeah, the reward is pretty high. You can visualize the scenario to see if it works well.

By the way, the bug in the second problem should be fixed already. Could you pull the latest MetaDrive and enable show_sidewalk and show_crosswalk to see if it still happens?

I think the default PPO algorithm design is not suitable for the Waymo dataset.

  1. The trained scenario sometimes doesn't show up completely.

[GIF: scenario_0]

  2. The Waymo scenarios total 20 seconds each, but the trained scenarios will exceed 20 seconds, which may cause the reward to keep increasing.

[GIF: scenario_1]

  3. ......

In your paper, the algorithm is only applied to the nuPlan and PG datasets. So, can you adjust the reward function and termination conditions to fit the Waymo dataset?

By the way, the bug in the second problem should be fixed already. Could you pull the latest MetaDrive and enable show_sidewalk and show_crosswalk to see if it still happens?

The bug seems to be solved.

For the first problem, there is a key map_region_size in the env config; assigning it a larger value such as 1024 may address this issue. Also, if you are using the top-down renderer, the clipping caused by film_size may result in this as well, so a larger film size may help too. Please refer to https://metadrive-simulator.readthedocs.io/en/latest/top_down_render.html for more details.

For the second problem, you can set horizon=300 or so to terminate the episode, so the environment steps and reward won't increase forever.

For the third problem, I believe the reward function and termination conditions can generalize to the Waymo dataset. The reasons you cannot get a good result could be the following (see the config sketch after this list):

  1. The traffic cannot react to the ego car, which results in unreasonable collisions. Turn on the reactive_traffic config to enable reactive traffic.
  2. The algorithm parameters may not be appropriate. Please refer to the settings here https://github.com/metadriverse/scenarionet/blob/main/scenarionet_training/scripts/train_waymo.py
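
Putting these suggestions together, a sketch of where those keys would go in the env config (1024 and 300 are just the example values from this reply; tune them for your scenarios):

env = ScenarioEnv(
    dict(
        data_directory="/content/drive/MyDrive/exp_converted",
        num_scenarios=num_scenarios,
        map_region_size=1024,    # build a larger map region so the whole scenario shows up
        horizon=300,             # cap episode length so steps/reward can't grow forever
        reactive_traffic=True,   # let surrounding traffic react to the ego car
    )
)
# For the top-down renderer, a larger film_size also reduces clipping, e.g.:
# ret = env.render(mode="topdown", screen_record=True, window=False, film_size=(2400, 2400))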

Thanks. I'll try later.

I also tried train_waymo.py for convenience, but I ran into problems with insufficient memory, so how could I reduce memory usage?

2024-02-14 03:26:27,873	ERROR trial_runner.py:567 -- Trial MultiWorkerPPO_GymEnvWrapper_306d0_00001: Error processing event.
Traceback (most recent call last):
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 515, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 488, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/worker.py", line 1428, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError: ray::MultiWorkerPPO.train() (pid=12716, ip=172.28.0.12)
  File "python/ray/_raylet.pyx", line 484, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 438, in ray._raylet.execute_task.function_executor
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 516, in train
    raise e
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 505, in train
    result = Trainable.train(self)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/trainable.py", line 336, in train
    result = self.step()
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/agents/trainer_template.py", line 134, in step
    res = next(self.train_exec_impl)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 756, in __next__
    return next(self.built_iterator)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 843, in apply_filter
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  [Previous line repeated 1 more time]
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 876, in apply_flatten
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 828, in add_wait_hooks
    item = next(it)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 783, in apply_foreach
    for item in it:
  [Previous line repeated 1 more time]
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/util/iter.py", line 471, in base_iterator
    yield ray.get(futures, timeout=timeout)
ray.exceptions.RayActorError: The actor died unexpectedly before finishing this task.
  1. The memory used by each MetaDrive instance depends on the number of scenarios, so reducing the number of scenarios in the training and evaluation environments will lead to less memory usage.
  2. As we use PPO and Ray for parallel data sampling and evaluation, each trial launches num_workers + evaluation_num_workers MetaDrive instances. Using fewer workers reduces memory as well.
  3. That script launches 5 experiments concurrently, which means there are 5 * (num_workers + evaluation_num_workers) MetaDrive instances in total on your system. The last option is to run fewer experiments, for example, just 1 (see the sketch below).
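
Expressed as the knobs above, an illustrative sketch (the exact variable names inside train_waymo.py may differ, and the numbers are only examples):

# 1. Each MetaDrive instance loads fewer scenarios -> less memory per instance.
env_config = dict(
    data_directory="/path/to/exp_converted",
    num_scenarios=500,
)

# 2. Fewer rollout/evaluation workers -> fewer MetaDrive instances per trial.
num_workers = 2
evaluation_num_workers = 1

# 3. Run a single experiment instead of 5 concurrent trials
#    (e.g., sweep over one seed only; the name num_seeds is illustrative).
num_seeds = 1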

I have some problems with the PPO algorithm training:

  1. It seems strange that no result appears in the first 21500 s, and then a result is output roughly every 250 s;
  2. An error is reported during the training process: ValueError: Summary file is not found at /content/drive/MyDrive/mdsn/scenarionet/dataset/waymo_test/dataset_summary.pkl!
/content/drive/MyDrive/mdsn/scenarionet
WARNING:tensorflow:From /usr/local/envs/scenarionet/lib/python3.8/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
Successfully initialize Ray!
Available resources:  {'memory': 552.0, 'CPU': 8.0, 'object_store_memory': 190.0, 'node:172.28.0.12': 1.0}
We are using this wandb key file:  /content/drive/MyDrive/mdsn/scenarionet/scenarionet_training/wandb_utils/wandb_api_key_file.txt
== Status ==
Memory usage on this node: 7.1/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------+--------+
| Trial name                               | status   | loc   |   seed |
|------------------------------------------+----------+-------+--------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  |       |      0 |
+------------------------------------------+----------+-------+--------+


wandb: Currently logged in as: deep-learning-for-av (use `wandb login --relogin` to force relogin)
wandb: wandb version 0.16.3 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.12.1
wandb: Syncing run TEST_aa36c_00000
wandb:  View project at https://wandb.ai/deep-learning-for-av/scenarionet
wandb:  View run at https://wandb.ai/deep-learning-for-av/scenarionet/runs/aa36c_00000
wandb: Run data is saved locally in /content/drive/MyDrive/mdsn/scenarionet/wandb/run-20240215_055421-aa36c_00000
wandb: Run `wandb offline` to turn off syncing.

== Status ==
Memory usage on this node: 44.1/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+-------+----------+-----------+------------+----------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |    ts |   reward |   success |   coverage |      out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+-------+----------+-----------+------------+----------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |      1 |          20426.5 | 52000 |  4.78439 |  0.980995 |  0.0533757 | 0.399108 |  0.0150164 |  11.7529 | 6.15673 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+-------+----------+-----------+------------+----------+------------+----------+---------+


== Status ==
Memory usage on this node: 44.2/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |   reward |   success |   coverage |      out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |      2 |          20694.8 | 104000 | -3.02698 |   0.45509 |   0.108865 | 0.137725 |   0.407186 |  303.275 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+


== Status ==
Memory usage on this node: 44.5/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+---------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |   reward |   success |   coverage |     out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+---------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |      3 |          21006.7 | 156000 | 0.437018 |  0.540404 |   0.113145 | 0.10101 |   0.358586 |  260.495 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+---------+------------+----------+---------+


== Status ==
Memory usage on this node: 44.6/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |    reward |   success |   coverage |      out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |      4 |          21267.8 | 208000 | -0.167582 |  0.471204 |   0.115123 | 0.157068 |   0.371728 |  281.602 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------+


== Status ==
Memory usage on this node: 44.8/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |   reward |   success |   coverage |      out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |      5 |          21518.1 | 260000 |  -1.1381 |  0.494845 |   0.115148 | 0.164948 |   0.340206 |  266.696 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+


== Status ==
Memory usage on this node: 44.8/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |   reward |   success |   coverage |      out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |      6 |            21793 | 312000 | -2.14413 |  0.494624 |   0.115176 | 0.129032 |   0.376344 |  281.011 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+


== Status ==
Memory usage on this node: 44.9/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |   reward |   success |   coverage |      out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |      7 |          22076.6 | 364000 | 0.174395 |  0.494792 |   0.115156 | 0.161458 |    0.34375 |  268.167 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+


== Status ==
Memory usage on this node: 44.9/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |   reward |   success |   coverage |      out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |      8 |          22358.1 | 416000 | -2.15311 |  0.469274 |    0.11516 | 0.145251 |   0.385475 |  291.536 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+


== Status ==
Memory usage on this node: 45.0/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |    reward |   success |   coverage |      out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |      9 |            22664 | 468000 | -0.265232 |  0.535211 |   0.115132 | 0.169014 |   0.295775 |  243.624 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------+


== Status ==
Memory usage on this node: 45.0/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+------------+-----------+------------+---------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |     reward |   success |   coverage |     out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+------------+-----------+------------+---------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |     10 |          22937.6 | 520000 | -0.0915441 |  0.450867 |   0.115205 | 0.17341 |   0.375723 |  304.231 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+------------+-----------+------------+---------+------------+----------+---------+


== Status ==
Memory usage on this node: 45.1/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |   reward |   success |   coverage |      out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |     11 |          23208.4 | 572000 | -2.13014 |  0.539604 |   0.115127 | 0.163366 |    0.29703 |  249.723 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+----------+------------+----------+---------+


== Status ==
Memory usage on this node: 45.1/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |    reward |   success |   coverage |      out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |     12 |          23482.6 | 624000 | -0.258691 |  0.455056 |   0.115216 | 0.174157 |   0.370787 |  298.253 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------+


== Status ==
Memory usage on this node: 45.2/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |    reward |   success |   coverage |      out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |     13 |            23745 | 676000 | -0.545498 |  0.527919 |   0.115121 | 0.147208 |   0.324873 |   257.33 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+-----------+-----------+------------+----------+------------+----------+---------+


== Status ==
Memory usage on this node: 45.2/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 5.8/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 RUNNING)
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+-------+------------+----------+---------+
| Trial name                               | status   | loc               |   seed |   iter |   total time (s) |     ts |   reward |   success |   coverage |   out |   max_step |   length |   level |
|------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+-------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | RUNNING  | 172.28.0.12:17417 |      0 |     14 |          24023.6 | 728000 | -1.42022 |   0.48913 |     0.1152 | 0.125 |    0.38587 |  292.761 |      13 |
+------------------------------------------+----------+-------------------+--------+--------+------------------+--------+----------+-----------+------------+-------+------------+----------+---------+


2024-02-15 12:39:54,398	ERROR trial_runner.py:567 -- Trial MultiWorkerPPO_GymEnvWrapper_aa36c_00000: Error processing event.
Traceback (most recent call last):
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/trial_runner.py", line 515, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/ray_trial_executor.py", line 488, in fetch_result
    result = ray.get(trial_future[0], timeout=DEFAULT_GET_TIMEOUT)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/worker.py", line 1428, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray::MultiWorkerPPO.train() (pid=17417, ip=172.28.0.12)
  File "python/ray/_raylet.pyx", line 484, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 438, in ray._raylet.execute_task.function_executor
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 531, in train
    evaluation_metrics = self._evaluate()
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/agents/trainer.py", line 747, in _evaluate
    ray.get([
ray.exceptions.RayTaskError(ValueError): ray::RolloutWorker.sample() (pid=17730, ip=172.28.0.12)
  File "python/ray/_raylet.pyx", line 484, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 438, in ray._raylet.execute_task.function_executor
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 579, in sample
    batches = [self.input_reader.next()]
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 93, in next
    batches = [self.get_data()]
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 209, in get_data
    item = next(self.rollout_provider)
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/evaluation/sampler.py", line 548, in _env_runner
    base_env.poll()
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/env/base_env.py", line 325, in poll
    self.new_obs = self.vector_env.vector_reset()
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/env/vector_env.py", line 133, in vector_reset
    return [e.reset() for e in self.envs]
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/rllib/env/vector_env.py", line 133, in <listcomp>
    return [e.reset() for e in self.envs]
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/envs/gym_wrapper.py", line 107, in reset
    obs, _ = self._inner.reset(**not_none_params)
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/envs/base_env.py", line 522, in reset
    self.lazy_init()  # it only works the first time when reset() is called to avoid the error when render
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/envs/base_env.py", line 411, in lazy_init
    self.setup_engine()
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/envs/scenario_env.py", line 120, in setup_engine
    self.engine.register_manager("data_manager", ScenarioDataManager())
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/manager/scenario_data_manager.py", line 36, in __init__
    self.summary_dict, self.summary_lookup, self.mapping = read_dataset_summary(self.directory)
  File "/content/drive/MyDrive/mdsn/metadrive/metadrive/scenario/utils.py", line 379, in read_dataset_summary
    raise ValueError(f"Summary file is not found at {summary_file}!")
ValueError: Summary file is not found at /content/drive/MyDrive/mdsn/scenarionet/dataset/waymo_test/dataset_summary.pkl!

wandb: Waiting for W&B process to finish, PID 17595
wandb: Program ended successfully.
wandb:                                                                                
wandb: Find user logs for this run at: /content/drive/MyDrive/mdsn/scenarionet/wandb/run-20240215_055421-aa36c_00000/logs/debug.log
wandb: Find internal logs for this run at: /content/drive/MyDrive/mdsn/scenarionet/wandb/run-20240215_055421-aa36c_00000/logs/debug-internal.log
wandb: Run summary:
wandb:                                          episode_reward_max 9.55808
wandb:                                          episode_reward_min -236.17933
wandb:                                         episode_reward_mean -1.42022
wandb:                                            episode_len_mean 292.76087
wandb:                                          episodes_this_iter 184
wandb:                                         num_healthy_workers 8
wandb:                                             timesteps_total 728000
wandb:                                              episodes_total 6716
wandb:                                          training_iteration 14
wandb:                                                   timestamp 1708000507
wandb:                                            time_this_iter_s 278.52195
wandb:                                                time_total_s 24023.57016
wandb:                                          time_since_restore 24023.57016
wandb:                                     timesteps_since_restore 0
wandb:                                    iterations_since_restore 14
wandb:                                                     success 0.48913
wandb:                                                         out 0.125
wandb:                                                    max_step 0.38587
wandb:                                                       level 13.0
wandb:                                                      length 292.76087
wandb:                                                    coverage 0.1152
wandb:                            custom_metrics/success_rate_mean 0.48913
wandb:                             custom_metrics/success_rate_min 0.0
wandb:                             custom_metrics/success_rate_max 1.0
wandb:                              custom_metrics/crash_rate_mean 0.0163
wandb:                               custom_metrics/crash_rate_min 0.0
wandb:                               custom_metrics/crash_rate_max 1.0
wandb:                        custom_metrics/out_of_road_rate_mean 0.125
wandb:                         custom_metrics/out_of_road_rate_min 0.0
wandb:                         custom_metrics/out_of_road_rate_max 1.0
wandb:                           custom_metrics/max_step_rate_mean 0.38587
wandb:                            custom_metrics/max_step_rate_min 0.0
wandb:                            custom_metrics/max_step_rate_max 1.0
wandb:                            custom_metrics/velocity_max_mean 1.45416
wandb:                             custom_metrics/velocity_max_min 0.00064
wandb:                             custom_metrics/velocity_max_max 8.61268
wandb:                           custom_metrics/velocity_mean_mean 0.29462
wandb:                            custom_metrics/velocity_mean_min 0.00064
wandb:                            custom_metrics/velocity_mean_max 4.31243
wandb:                            custom_metrics/velocity_min_mean 0.14046
wandb:                             custom_metrics/velocity_min_min 0.00049
wandb:                             custom_metrics/velocity_min_max 2.70274
wandb:                        custom_metrics/lateral_dist_min_mean -0.12593
wandb:                         custom_metrics/lateral_dist_min_min -2.00977
wandb:                         custom_metrics/lateral_dist_min_max 0.01272
wandb:                        custom_metrics/lateral_dist_max_mean 0.52085
wandb:                         custom_metrics/lateral_dist_max_min -0.01561
wandb:                         custom_metrics/lateral_dist_max_max 2.11605
wandb:                       custom_metrics/lateral_dist_mean_mean 0.16722
wandb:                        custom_metrics/lateral_dist_mean_min -1.08002
wandb:                        custom_metrics/lateral_dist_mean_max 1.39205
wandb:                            custom_metrics/steering_max_mean 0.58435
wandb:                             custom_metrics/steering_max_min -1.0
wandb:                             custom_metrics/steering_max_max 1.0
wandb:                           custom_metrics/steering_mean_mean -0.0227
wandb:                            custom_metrics/steering_mean_min -1.0
wandb:                            custom_metrics/steering_mean_max 1.0
wandb:                            custom_metrics/steering_min_mean -0.55695
wandb:                             custom_metrics/steering_min_min -1.0
wandb:                             custom_metrics/steering_min_max 1.0
wandb:                        custom_metrics/acceleration_min_mean -0.58033
wandb:                         custom_metrics/acceleration_min_min -1.0
wandb:                         custom_metrics/acceleration_min_max 1.0
wandb:                       custom_metrics/acceleration_mean_mean -0.00917
wandb:                        custom_metrics/acceleration_mean_min -1.0
wandb:                        custom_metrics/acceleration_mean_max 1.0
wandb:                        custom_metrics/acceleration_max_mean 0.56098
wandb:                         custom_metrics/acceleration_max_min -1.0
wandb:                         custom_metrics/acceleration_max_max 1.0
wandb:                         custom_metrics/step_reward_max_mean 0.1152
wandb:                          custom_metrics/step_reward_max_min 0.0
wandb:                          custom_metrics/step_reward_max_max 0.96045
wandb:                        custom_metrics/step_reward_mean_mean 0.00689
wandb:                         custom_metrics/step_reward_mean_min -0.46142
wandb:                         custom_metrics/step_reward_mean_max 0.96045
wandb:                         custom_metrics/step_reward_min_mean -0.23528
wandb:                          custom_metrics/step_reward_min_min -2.0
wandb:                          custom_metrics/step_reward_min_max 0.96045
wandb:                                    custom_metrics/cost_mean 2.27174
wandb:                                     custom_metrics/cost_min 0.0
wandb:                                     custom_metrics/cost_max 119.0
wandb:                       custom_metrics/num_crash_vehicle_mean 1.52174
wandb:                        custom_metrics/num_crash_vehicle_min 0.0
wandb:                        custom_metrics/num_crash_vehicle_max 119.0
wandb:                        custom_metrics/num_crash_object_mean 0.0
wandb:                         custom_metrics/num_crash_object_min 0.0
wandb:                         custom_metrics/num_crash_object_max 0.0
wandb:                         custom_metrics/num_crash_human_mean 0.625
wandb:                          custom_metrics/num_crash_human_min 0.0
wandb:                          custom_metrics/num_crash_human_max 24.0
wandb:                             custom_metrics/num_on_line_mean 0.0
wandb:                              custom_metrics/num_on_line_min 0.0
wandb:                              custom_metrics/num_on_line_max 0.0
wandb:                     custom_metrics/step_reward_lateral_mean -0.13162
wandb:                      custom_metrics/step_reward_lateral_min -0.69616
wandb:                      custom_metrics/step_reward_lateral_max 0.0
wandb:                     custom_metrics/step_reward_heading_mean -0.01812
wandb:                      custom_metrics/step_reward_heading_min -0.18567
wandb:                      custom_metrics/step_reward_heading_max 0.96814
wandb:               custom_metrics/step_reward_action_smooth_mean 0.0
wandb:                custom_metrics/step_reward_action_smooth_min 0.0
wandb:                custom_metrics/step_reward_action_smooth_max 0.0
wandb:                        custom_metrics/route_completion_mean 0.19836
wandb:                         custom_metrics/route_completion_min -0.0096
wandb:                         custom_metrics/route_completion_max 1.02506
wandb:                        custom_metrics/curriculum_level_mean 13.0
wandb:                         custom_metrics/curriculum_level_min 13
wandb:                         custom_metrics/curriculum_level_max 13
wandb:                          custom_metrics/scenario_index_mean 5420.34239
wandb:                           custom_metrics/scenario_index_min 5203
wandb:                           custom_metrics/scenario_index_max 5598
wandb:                            custom_metrics/track_length_mean 19.71836
wandb:                             custom_metrics/track_length_min 0.25524
wandb:                             custom_metrics/track_length_max 50.07087
wandb:                         custom_metrics/num_stored_maps_mean 50.0
wandb:                          custom_metrics/num_stored_maps_min 50
wandb:                          custom_metrics/num_stored_maps_max 50
wandb:                     custom_metrics/scenario_difficulty_mean 53.01857
wandb:                      custom_metrics/scenario_difficulty_min 39.41887
wandb:                      custom_metrics/scenario_difficulty_max 63.16812
wandb:                           custom_metrics/data_coverage_mean 0.1152
wandb:                            custom_metrics/data_coverage_min 0.1148
wandb:                            custom_metrics/data_coverage_max 0.116
wandb:                      custom_metrics/curriculum_success_mean 0.49522
wandb:                       custom_metrics/curriculum_success_min 0.38
wandb:                       custom_metrics/curriculum_success_max 0.56
wandb:             custom_metrics/curriculum_route_completion_mean 0.18768
wandb:              custom_metrics/curriculum_route_completion_min 0.15351
wandb:              custom_metrics/curriculum_route_completion_max 0.25077
wandb:                               sampler_perf/mean_env_wait_ms 233.91001
wandb:                     sampler_perf/mean_raw_obs_processing_ms 8.99504
wandb:                              sampler_perf/mean_inference_ms 2.92219
wandb:                      sampler_perf/mean_action_processing_ms 0.26235
wandb:                                       timers/sample_time_ms 234864.467
wandb:                                    timers/sample_throughput 221.404
wandb:                                         timers/load_time_ms 83.92
wandb:                                      timers/load_throughput 619634.743
wandb:                                        timers/learn_time_ms 40402.186
wandb:                                     timers/learn_throughput 1287.059
wandb:                                       timers/update_time_ms 6.98
wandb:                                      info/num_steps_sampled 728000
wandb:                                      info/num_steps_trained 728000
wandb:                                          config/num_workers 8
wandb:                                  config/num_envs_per_worker 1
wandb:                              config/rollout_fragment_length 500
wandb:                                             config/num_gpus 0
wandb:                                     config/train_batch_size 50000
wandb:                                                config/gamma 0.99
wandb:                                              config/horizon 600
wandb:                                         config/soft_horizon False
wandb:                                       config/no_done_at_end False
wandb:                                    config/normalize_actions False
wandb:                                         config/clip_actions True
wandb:                                                   config/lr 0.0001
wandb:                                              config/monitor False
wandb:                               config/ignore_worker_failures False
wandb:                                        config/log_sys_usage True
wandb:                                         config/fake_sampler False
wandb:                                        config/eager_tracing False
wandb:                                  config/no_eager_on_workers False
wandb:                                              config/explore True
wandb:                                  config/evaluation_interval 15
wandb:                              config/evaluation_num_episodes 1000
wandb:                                        config/in_evaluation False
wandb:                               config/evaluation_num_workers 8
wandb:                                         config/sample_async False
wandb:                             config/_use_trajectory_view_api False
wandb:                                  config/synchronize_filters True
wandb:                                config/compress_observations False
wandb:                              config/collect_metrics_timeout 180
wandb:                           config/metrics_smoothing_episodes 10
wandb:                                   config/remote_worker_envs False
wandb:                             config/remote_env_batch_wait_ms 0
wandb:                                      config/min_iter_time_s 0
wandb:                              config/timesteps_per_iteration 0
wandb:                                                 config/seed 0
wandb:                                  config/num_cpus_per_worker 0.3
wandb:                                  config/num_gpus_per_worker 0
wandb:                                  config/num_cpus_for_driver 1
wandb:                                               config/memory 0
wandb:                                  config/object_store_memory 0
wandb:                                    config/memory_per_worker 0
wandb:                       config/object_store_memory_per_worker 0
wandb:                                   config/postprocess_inputs False
wandb:                                  config/shuffle_buffer_size 0
wandb:                                 config/output_max_file_size 67108864
wandb:                               config/replay_sequence_length 1
wandb:                                           config/use_critic True
wandb:                                              config/use_gae True
wandb:                                               config/lambda 1.0
wandb:                                             config/kl_coeff 0.2
wandb:                                   config/sgd_minibatch_size 200
wandb:                                    config/shuffle_sequences True
wandb:                                         config/num_sgd_iter 20
wandb:                                      config/vf_share_layers False
wandb:                                        config/vf_loss_coeff 1.0
wandb:                                        config/entropy_coeff 0.0
wandb:                                           config/clip_param 0.3
wandb:                                        config/vf_clip_param 10.0
wandb:                                            config/kl_target 0.01
wandb:                                     config/simple_optimizer False
wandb:                                           config/_fake_gpus False
wandb:                                       perf/cpu_util_percent 77.02267
wandb:                                       perf/ram_util_percent 88.78766
wandb:                                   config/model/free_log_std False
wandb:                                config/model/no_final_linear False
wandb:                                config/model/vf_share_layers True
wandb:                                       config/model/use_lstm False
wandb:                                    config/model/max_seq_len 20
wandb:                                 config/model/lstm_cell_size 256
wandb:                    config/model/lstm_use_prev_action_reward False
wandb:                                    config/model/_time_major False
wandb:                                     config/model/framestack True
wandb:                                            config/model/dim 84
wandb:                                      config/model/grayscale False
wandb:                                      config/model/zero_mean True
wandb:                      config/env_config/start_scenario_index 0
wandb:                             config/env_config/num_scenarios 40000
wandb:                           config/env_config/sequential_seed True
wandb:                          config/env_config/curriculum_level 100
wandb:                       config/env_config/target_success_rate 0.8
wandb:                          config/env_config/reactive_traffic True
wandb:                        config/env_config/no_static_vehicles True
wandb:                                  config/env_config/no_light True
wandb:                     config/env_config/static_traffic_object True
wandb:                            config/env_config/driving_reward 1
wandb:                    config/env_config/steering_range_penalty 0
wandb:                           config/env_config/heading_penalty 1
wandb:                           config/env_config/lateral_penalty 1.0
wandb:                        config/env_config/no_negative_reward True
wandb:                      config/env_config/on_lane_line_penalty 0
wandb:                     config/env_config/crash_vehicle_penalty 2
wandb:                       config/env_config/crash_human_penalty 2
wandb:                       config/env_config/out_of_road_penalty 2
wandb:                          config/env_config/max_lateral_dist 2
wandb:         config/tf_session_args/intra_op_parallelism_threads 2
wandb:         config/tf_session_args/inter_op_parallelism_threads 2
wandb:                 config/tf_session_args/log_device_placement False
wandb:                 config/tf_session_args/allow_soft_placement True
wandb:   config/local_tf_session_args/intra_op_parallelism_threads 8
wandb:   config/local_tf_session_args/inter_op_parallelism_threads 8
wandb:                    info/learner/default_policy/cur_kl_coeff 0.2
wandb:                          info/learner/default_policy/cur_lr 0.0001
wandb:                      info/learner/default_policy/total_loss 18.57894
wandb:                     info/learner/default_policy/policy_loss -0.03165
wandb:                         info/learner/default_policy/vf_loss 18.60693
wandb:                info/learner/default_policy/vf_explained_var 0.702
wandb:                              info/learner/default_policy/kl 0.01829
wandb:                         info/learner/default_policy/entropy 3.11301
wandb:                   info/learner/default_policy/entropy_coeff 0.0
wandb:    config/evaluation_config/env_config/start_scenario_index 0
wandb:           config/evaluation_config/env_config/num_scenarios 1000
wandb:         config/evaluation_config/env_config/sequential_seed True
wandb:        config/evaluation_config/env_config/curriculum_level 1
wandb:             config/tf_session_args/gpu_options/allow_growth True
wandb:                     config/tf_session_args/device_count/CPU 1
wandb:                       config/logger_config/wandb/log_config True
wandb:   config/env_config/vehicle_config/side_detector/num_lasers 0
wandb:                                                    _runtime 24045
wandb:                                                  _timestamp 1708000507
wandb:                                                       _step 13
wandb: Run history:
wandb:                                          episode_reward_max ▁▁▁█▁▂▁▁▁█▁▁▁▁
wandb:                                          episode_reward_min █▃▇▇▇▁▇▆▇▃▅█▇▃
wandb:                                         episode_reward_mean █▁▄▄▃▂▄▂▃▄▂▃▃▂
wandb:                                            episode_len_mean ▁█▇▇▇▇▇█▇█▇█▇█
wandb:                                          episodes_this_iter █▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                         num_healthy_workers ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                             timesteps_total ▁▂▂▃▃▄▄▅▅▆▆▇▇█
wandb:                                              episodes_total ▁▁▂▃▃▄▄▅▅▆▆▇▇█
wandb:                                          training_iteration ▁▂▂▃▃▄▄▅▅▆▆▇▇█
wandb:                                                   timestamp ▁▂▂▃▃▄▄▅▅▆▆▇▇█
wandb:                                            time_this_iter_s █▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                                time_total_s ▁▂▂▃▃▄▄▅▅▆▆▇▇█
wandb:                                          time_since_restore ▁▂▂▃▃▄▄▅▅▆▆▇▇█
wandb:                                     timesteps_since_restore ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                    iterations_since_restore ▁▂▂▃▃▄▄▅▅▆▆▇▇█
wandb:                                                     success █▁▂▁▂▂▂▁▂▁▂▁▂▂
wandb:                                                         out █▂▁▂▃▂▂▂▃▃▂▃▂▂
wandb:                                                    max_step ▁█▇▇▇▇▇█▆▇▆▇▇█
wandb:                                                       level ▁█████████████
wandb:                                                      length ▁█▇▇▇▇▇█▇█▇█▇█
wandb:                                                    coverage ▁▇████████████
wandb:                            custom_metrics/success_rate_mean █▁▂▁▂▂▂▁▂▁▂▁▂▂
wandb:                             custom_metrics/success_rate_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                             custom_metrics/success_rate_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                              custom_metrics/crash_rate_mean ▁▇▃█▆▆▄█▇▃▅▆▅▄
wandb:                               custom_metrics/crash_rate_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                               custom_metrics/crash_rate_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                        custom_metrics/out_of_road_rate_mean █▂▁▂▃▂▂▂▃▃▂▃▂▂
wandb:                         custom_metrics/out_of_road_rate_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                         custom_metrics/out_of_road_rate_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                           custom_metrics/max_step_rate_mean ▁█▇▇▇▇▇█▆▇▆▇▇█
wandb:                            custom_metrics/max_step_rate_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                            custom_metrics/max_step_rate_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                            custom_metrics/velocity_max_mean ▁█▆█▇▇▆▇▇▇▇▇▇▆
wandb:                             custom_metrics/velocity_max_min ▁▄▁▅▂▃▃▄▁█▃▅▂▁
wandb:                             custom_metrics/velocity_max_max ▇▄▄▆▁▆▂▅█▆▄▄▄▄
wandb:                           custom_metrics/velocity_mean_mean ▁█▅█▇▇▆▇▇▇█▇▇▇
wandb:                            custom_metrics/velocity_mean_min ▁▄▁▅▂▃▃▄▁█▃▅▂▁
wandb:                            custom_metrics/velocity_mean_max ▃▃▁▂█▂▅▃▆▂▂▂▁▇
wandb:                            custom_metrics/velocity_min_mean ▄▇▁▆▇▅▂▄▅▅█▅▇▅
wandb:                             custom_metrics/velocity_min_min ▇▅▃█▅▄▇▃▆▁▆▆▄▇
wandb:                             custom_metrics/velocity_min_max ▅▂▁▁█▂▁▂▂▁▂▂▂▆
wandb:                        custom_metrics/lateral_dist_min_mean █▂▄▁▁▃▅▃▅▅▅▅▅▆
wandb:                         custom_metrics/lateral_dist_min_min ▆▂▇▁▆▃▄▃▇█▅▄▇█
wandb:                         custom_metrics/lateral_dist_min_max ▂▇▁▆▂▅▄█▅█▃▅▃▄
wandb:                        custom_metrics/lateral_dist_max_mean ▁▄▄▄▄▅▆▅▇█▇█▆▇
wandb:                         custom_metrics/lateral_dist_max_min ▄█▂▃█▅▄▆▄▅█▅▇▁
wandb:                         custom_metrics/lateral_dist_max_max ▁▇▂▃▂▇▁▁▇▄█▃▄▃
wandb:                       custom_metrics/lateral_dist_mean_mean ▃▃▃▁▂▄▆▄▇█▆▇▆▇
wandb:                        custom_metrics/lateral_dist_mean_min ▇▃▄▄▂▃▃▃▂▅▄▁▃█
wandb:                        custom_metrics/lateral_dist_mean_max ▆▃▅▆▄▄▅▁█▆▂▁▇▄
wandb:                            custom_metrics/steering_max_mean ▁█▇█▇▇▇▆▇█▇█▇█
wandb:                             custom_metrics/steering_max_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                             custom_metrics/steering_max_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                           custom_metrics/steering_mean_mean ▇▆▆█▇▅▅▁▇▅▄▅▄▅
wandb:                            custom_metrics/steering_mean_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                            custom_metrics/steering_mean_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                            custom_metrics/steering_min_mean █▂▃▂▂▂▂▁▃▂▃▂▃▂
wandb:                             custom_metrics/steering_min_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                             custom_metrics/steering_min_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                        custom_metrics/acceleration_min_mean █▂▂▂▂▁▁▁▂▁▃▂▂▂
wandb:                         custom_metrics/acceleration_min_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                         custom_metrics/acceleration_min_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                       custom_metrics/acceleration_mean_mean ▅▆▃▆█▃▁▅▆▆▇▆▅▅
wandb:                        custom_metrics/acceleration_mean_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                        custom_metrics/acceleration_mean_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                        custom_metrics/acceleration_max_mean ▁█▆▇█▇▆▇▇█▇█▇▇
wandb:                         custom_metrics/acceleration_max_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                         custom_metrics/acceleration_max_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                         custom_metrics/step_reward_max_mean █▆▁▅▂▄▃▃▂▃▄▃▃▂
wandb:                          custom_metrics/step_reward_max_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                          custom_metrics/step_reward_max_max █▇▅▇▁▇▂▇▃▇▅▇▂▇
wandb:                        custom_metrics/step_reward_mean_mean █▁▂▁▁▂▂▁▁▂▁▂▁▂
wandb:                         custom_metrics/step_reward_mean_min ▅▄▆▄▆▂█▃▄▆▁▇▂▆
wandb:                         custom_metrics/step_reward_mean_max ██▁█▂█▁█▁█▁█▁█
wandb:                         custom_metrics/step_reward_min_mean █▁▃▂▂▃▂▁▃▃▃▃▂▂
wandb:                          custom_metrics/step_reward_min_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                          custom_metrics/step_reward_min_max ██▁█▁█▁█▁█▁█▁█
wandb:                                    custom_metrics/cost_mean ▁█▄▆▆▇▄▇▅▆▇▆▅▆
wandb:                                     custom_metrics/cost_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                     custom_metrics/cost_max ▁▆▂▂▂█▁▃▂▆▄▅▂▅
wandb:                       custom_metrics/num_crash_vehicle_mean ▁▇▃▅▄█▃▇▃▅▆▅▄▆
wandb:                        custom_metrics/num_crash_vehicle_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                        custom_metrics/num_crash_vehicle_max ▂▅▂▂▃█▁▅▂▇▅▇▁▇
wandb:                        custom_metrics/num_crash_object_mean ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                         custom_metrics/num_crash_object_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                         custom_metrics/num_crash_object_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                         custom_metrics/num_crash_human_mean ▁█▅▇▇▅▆▆▆▅▇▅▇▄
wandb:                          custom_metrics/num_crash_human_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                          custom_metrics/num_crash_human_max ▂▄▆▆▄▆▅▄█▄▇▂▆▁
wandb:                             custom_metrics/num_on_line_mean ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                              custom_metrics/num_on_line_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                              custom_metrics/num_on_line_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                     custom_metrics/step_reward_lateral_mean █▃▃▃▂▂▂▂▂▁▃▂▂▃
wandb:                      custom_metrics/step_reward_lateral_min ▅▇▇▅▃▆▆▅▁▅█▁▃█
wandb:                      custom_metrics/step_reward_lateral_max █▁████████████
wandb:                     custom_metrics/step_reward_heading_mean ▁█▇█▇█▇█▇█▇█▇█
wandb:                      custom_metrics/step_reward_heading_min ▁▇▁█▁▆▁▇▁▇▁█▁█
wandb:                      custom_metrics/step_reward_heading_max ██▁█▁█▁█▁█▁█▂█
wandb:               custom_metrics/step_reward_action_smooth_mean ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                custom_metrics/step_reward_action_smooth_min ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                custom_metrics/step_reward_action_smooth_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                        custom_metrics/route_completion_mean ▁█▇███▇█▇█████
wandb:                         custom_metrics/route_completion_min ▁█████████████
wandb:                         custom_metrics/route_completion_max █▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                        custom_metrics/curriculum_level_mean ▁█████████████
wandb:                         custom_metrics/curriculum_level_min ▁█████████████
wandb:                         custom_metrics/curriculum_level_max ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                          custom_metrics/scenario_index_mean ▁█████████████
wandb:                           custom_metrics/scenario_index_min ▁█████████████
wandb:                           custom_metrics/scenario_index_max ▁█████████████
wandb:                            custom_metrics/track_length_mean ▁█▆█▇▇▇▇▆▇▆█▆▇
wandb:                             custom_metrics/track_length_min ▁▇██▇█▇█▇█▇█▇█
wandb:                             custom_metrics/track_length_max ▁█▄▇█▇█▄▇█▇█▇▄
wandb:                         custom_metrics/num_stored_maps_mean ▁▁▆███████████
wandb:                          custom_metrics/num_stored_maps_min ▁▁▄▇██████████
wandb:                          custom_metrics/num_stored_maps_max █▁████████████
wandb:                     custom_metrics/scenario_difficulty_mean ▁█▇█▇█▇█▇█▇█▇█
wandb:                      custom_metrics/scenario_difficulty_min ▁█████████████
wandb:                      custom_metrics/scenario_difficulty_max ▁█████████████
wandb:                           custom_metrics/data_coverage_mean ▁▇████████████
wandb:                            custom_metrics/data_coverage_min ▁▇████████████
wandb:                            custom_metrics/data_coverage_max ▁▅████████████
wandb:                      custom_metrics/curriculum_success_mean ▆▁▆███████████
wandb:                       custom_metrics/curriculum_success_min ▁▁▂▆██████████
wandb:                       custom_metrics/curriculum_success_max █▁▄▄▄▄▄▄▄▄▄▄▄▄
wandb:             custom_metrics/curriculum_route_completion_mean ▁▄▆███████████
wandb:              custom_metrics/curriculum_route_completion_min ▁█████████████
wandb:              custom_metrics/curriculum_route_completion_max █▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                               sampler_perf/mean_env_wait_ms █▄▃▂▂▂▂▁▁▁▁▁▁▁
wandb:                     sampler_perf/mean_raw_obs_processing_ms █▄▃▂▂▂▂▁▁▁▁▁▁▁
wandb:                              sampler_perf/mean_inference_ms █▃▃▂▁▁▁▁▂▂▂▂▂▂
wandb:                      sampler_perf/mean_action_processing_ms █▁▃▁▂▂▂▂▃▃▃▂▃▃
wandb:                                       timers/sample_time_ms █▄▃▃▂▂▂▂▂▂▁▁▁▁
wandb:                                    timers/sample_throughput ▁▁▁▁▁▁▁▂▂▂████
wandb:                                         timers/load_time_ms █▄▃▂▂▂▂▂▂▁▁▁▁▁
wandb:                                      timers/load_throughput ▁▃▄▅▆▆▆▆▆▇███▇
wandb:                                        timers/learn_time_ms █▂▃▂▁▁▁▂▃▃▃▄▄▅
wandb:                                     timers/learn_throughput ▁▇▆▇███▆▆▆▆▅▅▄
wandb:                                       timers/update_time_ms █▄▃▂▂▁▁▁▁▂▁▁▁▂
wandb:                                      info/num_steps_sampled ▁▂▂▃▃▄▄▅▅▆▆▇▇█
wandb:                                      info/num_steps_trained ▁▂▂▃▃▄▄▅▅▆▆▇▇█
wandb:                                          config/num_workers ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                  config/num_envs_per_worker ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                              config/rollout_fragment_length ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                             config/num_gpus ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                     config/train_batch_size ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                                config/gamma ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                              config/horizon ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                         config/soft_horizon ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                       config/no_done_at_end ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                    config/normalize_actions ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                         config/clip_actions ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                                   config/lr ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                              config/monitor ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                               config/ignore_worker_failures ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                        config/log_sys_usage ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                         config/fake_sampler ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                        config/eager_tracing ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                  config/no_eager_on_workers ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                              config/explore ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                  config/evaluation_interval ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                              config/evaluation_num_episodes ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                        config/in_evaluation ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                               config/evaluation_num_workers ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                         config/sample_async ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                             config/_use_trajectory_view_api ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                  config/synchronize_filters ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                config/compress_observations ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                              config/collect_metrics_timeout ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                           config/metrics_smoothing_episodes ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                   config/remote_worker_envs ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                             config/remote_env_batch_wait_ms ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                      config/min_iter_time_s ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                              config/timesteps_per_iteration ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                                 config/seed ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                  config/num_cpus_per_worker ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                  config/num_gpus_per_worker ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                  config/num_cpus_for_driver ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                               config/memory ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                  config/object_store_memory ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                    config/memory_per_worker ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                       config/object_store_memory_per_worker ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                   config/postprocess_inputs ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                  config/shuffle_buffer_size ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                 config/output_max_file_size ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                               config/replay_sequence_length ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                           config/use_critic ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                              config/use_gae ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                               config/lambda ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                             config/kl_coeff ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                   config/sgd_minibatch_size ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                    config/shuffle_sequences ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                         config/num_sgd_iter ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                      config/vf_share_layers ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                        config/vf_loss_coeff ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                        config/entropy_coeff ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                           config/clip_param ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                        config/vf_clip_param ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                            config/kl_target ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                     config/simple_optimizer ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                           config/_fake_gpus ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                       perf/cpu_util_percent ▁█▇▇█▇▇▇▇▇▇▇██
wandb:                                       perf/ram_util_percent █▁▂▂▃▃▃▃▄▄▄▄▄▅
wandb:                                   config/model/free_log_std ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                config/model/no_final_linear ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                config/model/vf_share_layers ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                       config/model/use_lstm ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                    config/model/max_seq_len ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                 config/model/lstm_cell_size ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                    config/model/lstm_use_prev_action_reward ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                    config/model/_time_major ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                     config/model/framestack ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                            config/model/dim ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                      config/model/grayscale ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                      config/model/zero_mean ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                      config/env_config/start_scenario_index ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                             config/env_config/num_scenarios ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                           config/env_config/sequential_seed ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                          config/env_config/curriculum_level ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                       config/env_config/target_success_rate ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                          config/env_config/reactive_traffic ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                        config/env_config/no_static_vehicles ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                  config/env_config/no_light ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                     config/env_config/static_traffic_object ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                            config/env_config/driving_reward ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                    config/env_config/steering_range_penalty ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                           config/env_config/heading_penalty ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                           config/env_config/lateral_penalty ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                        config/env_config/no_negative_reward ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                      config/env_config/on_lane_line_penalty ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                     config/env_config/crash_vehicle_penalty ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                       config/env_config/crash_human_penalty ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                       config/env_config/out_of_road_penalty ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                          config/env_config/max_lateral_dist ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:         config/tf_session_args/intra_op_parallelism_threads ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:         config/tf_session_args/inter_op_parallelism_threads ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                 config/tf_session_args/log_device_placement ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                 config/tf_session_args/allow_soft_placement ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:   config/local_tf_session_args/intra_op_parallelism_threads ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:   config/local_tf_session_args/inter_op_parallelism_threads ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                    info/learner/default_policy/cur_kl_coeff ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                          info/learner/default_policy/cur_lr ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                      info/learner/default_policy/total_loss ▂▆▃▃▂█▁▃▃▅▄▁▂▄
wandb:                     info/learner/default_policy/policy_loss █▇▇▆▅▇▅▃▃▄▃▁▁▁
wandb:                         info/learner/default_policy/vf_loss ▂▆▃▃▂█▁▃▃▅▄▁▂▄
wandb:                info/learner/default_policy/vf_explained_var ▂▁▅▄▄▃▆▆▆▅▅█▇▆
wandb:                              info/learner/default_policy/kl ▂▄▁▂▁▃▁▅▂▅▇▅▅█
wandb:                         info/learner/default_policy/entropy ▁▄▃▄▄▅▄▄▅▆█▇██
wandb:                   info/learner/default_policy/entropy_coeff ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:    config/evaluation_config/env_config/start_scenario_index ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:           config/evaluation_config/env_config/num_scenarios ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:         config/evaluation_config/env_config/sequential_seed ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:        config/evaluation_config/env_config/curriculum_level ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:             config/tf_session_args/gpu_options/allow_growth ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                     config/tf_session_args/device_count/CPU ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                       config/logger_config/wandb/log_config ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:   config/env_config/vehicle_config/side_detector/num_lasers ▁▁▁▁▁▁▁▁▁▁▁▁▁▁
wandb:                                                    _runtime ▁▂▂▃▃▄▄▅▅▆▆▇▇█
wandb:                                                  _timestamp ▁▂▂▃▃▄▄▅▅▆▆▇▇█
wandb:                                                       _step ▁▂▂▃▃▄▄▅▅▆▆▇▇█
wandb: 
wandb: Synced 5 W&B file(s), 0 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: 
wandb: Synced TEST_aa36c_00000: https://wandb.ai/deep-learning-for-av/scenarionet/runs/aa36c_00000
== Status ==
Memory usage on this node: 2.6/51.0 GiB
Using FIFO scheduling algorithm.
Resources requested: 0.0/8 CPUs, 0/0 GPUs, 0.0/26.95 GiB heap, 0.0/9.28 GiB objects
Result logdir: /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST
Number of trials: 1 (1 ERROR)
+------------------------------------------+----------+-------+--------+--------+------------------+--------+----------+-----------+------------+-------+------------+----------+---------+
| Trial name                               | status   | loc   |   seed |   iter |   total time (s) |     ts |   reward |   success |   coverage |   out |   max_step |   length |   level |
|------------------------------------------+----------+-------+--------+--------+------------------+--------+----------+-----------+------------+-------+------------+----------+---------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 | ERROR    |       |      0 |     14 |          24023.6 | 728000 | -1.42022 |   0.48913 |     0.1152 | 0.125 |    0.38587 |  292.761 |      13 |
+------------------------------------------+----------+-------+--------+--------+------------------+--------+----------+-----------+------------+-------+------------+----------+---------+
Number of errored trials: 1
+------------------------------------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| Trial name                               |   # failures | error file                                                                                                                              |
|------------------------------------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------|
| MultiWorkerPPO_GymEnvWrapper_aa36c_00000 |            1 | /content/drive/MyDrive/mdsn/scenarionet/experiment/TEST/MultiWorkerPPO_GymEnvWrapper_aa36c_00000_0_seed=0_2024-02-15_05-54-21/error.txt |
+------------------------------------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------+

Traceback (most recent call last):
  File "scenarionet_training/scripts/train_waymo.py", line 83, in <module>
    train(
  File "/content/drive/MyDrive/mdsn/scenarionet/scenarionet_training/train_utils/utils.py", line 166, in train
    analysis = tune.run(
  File "/usr/local/envs/scenarionet/lib/python3.8/site-packages/ray/tune/tune.py", line 427, in run
    raise TuneError("Trials did not complete", incomplete_trials)
ray.tune.error.TuneError: ('Trials did not complete', [MultiWorkerPPO_GymEnvWrapper_aa36c_00000])

I solved problem 1 (I had forgotten to add a test set), but problem 2 still remains.

Training has been running for nearly a day, but the success rate, reward, and scenario length don't seem to change significantly. Is this reasonable?
Screenshot 2024-02-22 21 49 03
Screenshot 2024-02-22 21 50 31
Screenshot 2024-02-22 21 52 21
Screenshot 2024-02-22 21 55 21

That's weird. It should at least show some improvement. It seems you are running PPO.
Please:

  1. Try training on a single scenario with a long ego-car trajectory. Some Waymo scenarios have a pretty short trajectory with only 5-10 m of displacement, which is too easy; I suspect the scenario you are training on has a very short trajectory, so the agent can reach some success rate without moving forward. You can verify this by driving the car yourself with the keyboard controller, setting show_navi_mark=True and manual_control=True and following the trajectory. You should also find that no rear collisions happen, since reactive_traffic should be set to True (see the config sketch after this list).
  2. Setting no_traffic=True removes the influence of the surrounding vehicles, reducing the task to simple trajectory following. If training works in this setting, we may need to investigate whether something is wrong with the traffic.
  3. How many workers are you using? Your sample efficiency is really low... You can find more statistics about the sampling time and the evaluation time. I guess your evaluation takes too much time: since you are testing on one scenario, just set evaluation_num_episodes=1 and evaluation_num_workers=1. If sampling takes most of the time instead, you should increase the number of workers.
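
For reference, here is a minimal sketch of such a single-scenario debugging setup. It assumes MetaDrive's ScenarioEnv and that it accepts the keys as written (in particular, show_navi_mark may live under vehicle_config, and data_directory must point at your converted dataset), so treat it as a starting point rather than the exact config used by the training script:

# Sketch only: manually inspect one scenario with the keyboard controller.
# Assumptions: MetaDrive's ScenarioEnv is importable and accepts these keys;
# adjust data_directory to your own converted dataset.
from metadrive.envs.scenario_env import ScenarioEnv

env = ScenarioEnv(dict(
    data_directory="/content/exp_converted",  # hypothetical path
    num_scenarios=1,                          # focus on a single scenario
    start_scenario_index=0,
    reactive_traffic=True,       # traffic reacts to the ego car, so no rear collisions expected
    # no_traffic=True,           # uncomment for pure trajectory following (suggestion 2)
    manual_control=True,         # drive with the keyboard yourself
    use_render=True,             # needs a screen; use the top-down renderer otherwise
    vehicle_config=dict(show_navi_mark=True),
))

obs = env.reset()                # may return (obs, info) depending on the gym API version
for _ in range(1000):
    obs, reward, done, info = env.step([0.0, 0.0])  # action is ignored under manual control
    if done:
        break
env.close()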

Recently, I was able to switch to a machine with 256 GB of RAM (running Windows). However, this presents some new problems. How could I solve them?

python train_waymo.py --num-gpus 0
F0320 22:30:23.563987 21692  7620 raylet_client.cc:108]  Check failed: _s.ok() [RayletClient] Unable to register worker with raylet.: IOError: Ray cookie mismatch for received message. Received cookie: 68681728
*** Check failure stack trace: ***
    @   00007FFE9538174B  public: void __cdecl google::LogMessage::Flush(void) __ptr64
    @   00007FFE953804E2  public: __cdecl google::LogMessage::~LogMessage(void) __ptr64
    @   00007FFE953494F8  public: virtual __cdecl google::NullStreamFatal::~NullStreamFatal(void) __ptr64
    @   00007FFE951A231C  PyInit__raylet
    @   00007FFE9512BFE2  PyInit__raylet
    @   00007FFE95141EEC  PyInit__raylet
    @   00007FFE9512F36F  PyInit__raylet
    @   00007FFE95154937  PyInit__raylet
    @   00007FFE950C84C1  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950EAAB7  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950541C2  (unknown)
    @   00007FFEBEFB2892  _Py_CheckFunctionResult
    @   00007FFEBEFB4C8B  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEF7A9BF  PyEval_EvalCodeEx
    @   00007FFEBEF7A91D  PyEval_EvalCode
F0320 22:30:46.976624 36936 21108 raylet_client.cc:54] Could not connect to socket tcp://127.0.0.1:64535
*** Check failure stack trace: ***
    @   00007FFE9538174B  public: void __cdecl google::LogMessage::Flush(void) __ptr64
    @   00007FFE953804E2  public: __cdecl google::LogMessage::~LogMessage(void) __ptr64
    @   00007FFE953494F8  public: virtual __cdecl google::NullStreamFatal::~NullStreamFatal(void) __ptr64
    @   00007FFE951A2ACB  PyInit__raylet
    @   00007FFE951A160C  PyInit__raylet
    @   00007FFE9512BFE2  PyInit__raylet
    @   00007FFE95141EEC  PyInit__raylet
    @   00007FFE9512F36F  PyInit__raylet
    @   00007FFE95154937  PyInit__raylet
    @   00007FFE950C84C1  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950EAAB7  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950541C2  (unknown)
    @   00007FFEBEFB2892  _Py_CheckFunctionResult
    @   00007FFEBEFB4C8B  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEF7A9BF  PyEval_EvalCodeEx
F0320 22:30:47.143532 34916 15208 raylet_client.cc:54] Could not connect to socket tcp://127.0.0.1:63679
*** Check failure stack trace: ***
    @   00007FFE9538174B  public: void __cdecl google::LogMessage::Flush(void) __ptr64
    @   00007FFE953804E2  public: __cdecl google::LogMessage::~LogMessage(void) __ptr64
    @   00007FFE953494F8  public: virtual __cdecl google::NullStreamFatal::~NullStreamFatal(void) __ptr64
    @   00007FFE951A2ACB  PyInit__raylet
    @   00007FFE951A160C  PyInit__raylet
    @   00007FFE9512BFE2  PyInit__raylet
    @   00007FFE95141EEC  PyInit__raylet
    @   00007FFE9512F36F  PyInit__raylet
    @   00007FFE95154937  PyInit__raylet
    @   00007FFE950C84C1  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950EAAB7  public: virtual void __cdecl google::LogSink::WaitTillSent(void) __ptr64
    @   00007FFE950541C2  (unknown)
    @   00007FFEBEFB2892  _Py_CheckFunctionResult
    @   00007FFEBEFB4C8B  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB0F1F  _PyFunction_Vectorcall
    @   00007FFEBEF89E36  PyVectorcall_Call
    @   00007FFEBEF89CC7  PySequence_GetItem
    @   00007FFEBEFB5B05  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEFB2D16  _Py_CheckFunctionResult
    @   00007FFEBEFB50C2  _PyEval_EvalFrameDefault
    @   00007FFEBEFAFFB8  _PyEval_EvalCodeWithName
    @   00007FFEBEF7A9BF  PyEval_EvalCodeEx

Sorry, I have no idea. It is something raised by Ray. How many workers are you using? Does this still persist if you only use one worker?
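
If it helps to isolate this, here is a hedged sketch of the kind of config change the suggestion above implies. The key names are taken from the config dump in the W&B log earlier in this thread; where exactly they are set depends on how train_waymo.py builds its trainer config, so treat the snippet as illustrative only:

# Sketch only: shrink the run to a single rollout/evaluation worker to see
# whether the raylet registration failure still occurs. These keys match the
# config printed in the log above; apply them wherever your script builds
# the RLlib config.
single_worker_overrides = dict(
    num_workers=1,
    evaluation_num_workers=1,
    evaluation_num_episodes=1,
    num_cpus_per_worker=1,
)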

ERROR syncer.py:63 -- Log sync requires rsync to be installed.

Is this error related to the lack of rsync on Windows?

Not sure. You can search for related issues in Ray's GitHub issue tracker.