Demonstration on MultigoalIntersection
In this notebook, we demonstrate how to set up a multigoal intersection environment in which you can access the relevant stats (e.g. route completion, reward, success rate) for all four possible goals (right turn, left turn, move forward, U-turn) simultaneously.
We show how to build the environment, in which we have successfully trained a SAC expert that achieves a 99% success rate, and how to read those stats from the info dict returned at each step.
Note: We pretrain the SAC expert with use_multigoal_intersection=False and then finetune it with use_multigoal_intersection=True.
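The two stages in the note above differ only in the use_multigoal_intersection flag. The following is a minimal sketch of how both configs might be derived from one shared base. `BASE_CONFIG` and `make_stage_config` are hypothetical names introduced here for illustration (they are not part of MetaDrive), and the base values are copied from the env_config used in this notebook. The actual training setup may have differed.

```python
import copy

# Shared settings, mirroring the env_config used in this notebook.
BASE_CONFIG = dict(
    use_render=False,
    manual_control=False,
    horizon=500,
    traffic_density=0.06,
    out_of_route_done=False,
    num_scenarios=1000,
    start_seed=100,
    accident_prob=0.8,
    crash_vehicle_done=False,
    crash_object_done=False,
)


def make_stage_config(stage):
    """Return a config dict for the 'pretrain' or 'finetune' stage (hypothetical helper)."""
    if stage not in ("pretrain", "finetune"):
        raise ValueError("stage must be 'pretrain' or 'finetune', got {!r}".format(stage))
    config = copy.deepcopy(BASE_CONFIG)
    # Pretraining uses the original PG scenarios; finetuning uses the multigoal intersection.
    config["use_multigoal_intersection"] = (stage == "finetune")
    return config


pretrain_config = make_stage_config("pretrain")
finetune_config = make_stage_config("finetune")
```

Either dict can then be passed to the wrapped environment class in the same way as env_config.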
import numpy as np
from metadrive.envs.gym_wrapper import create_gym_wrapper
from metadrive.envs.multigoal_intersection import MultiGoalIntersectionEnv
import mediapy as media
render = False
num_scenarios = 1000
start_seed = 100
env_config = dict(
    use_render=render,
    manual_control=False,
    horizon=500,  # to speed up training
    traffic_density=0.06,
    use_multigoal_intersection=True,  # Set to False to use the same observation with the original PG scenarios.
    out_of_route_done=False,
    num_scenarios=num_scenarios,
    start_seed=start_seed,
    accident_prob=0.8,
    crash_vehicle_done=False,
    crash_object_done=False,
)
wrapped = create_gym_wrapper(MultiGoalIntersectionEnv)
env = wrapped(env_config)
[INFO] Environment: MultiGoalIntersectionEnv
[INFO] MetaDrive version: 0.4.3
[INFO] Sensors: [lidar: Lidar(), side_detector: SideDetector(), lane_line_detector: LaneLineDetector()]
[INFO] Render Mode: none
[INFO] Horizon (Max steps per agent): 500
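frames = []
try:
    env.reset()
    while True:
        # Zero steering, full acceleration: drive straight ahead.
        action = [0, 1]
        o, r, d, i = env.step(action)
        # Collect top-down frames so the episode can be replayed as a video.
        frame = env.render(mode="topdown")
        frames.append(frame)
        if d:
            break
finally:
    env.close()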
[INFO] Assets version: 0.4.3
[INFO] Known Pipes: glxGraphicsPipe
[INFO] Start Scenario Index: 100, Num Scenarios : 1000
[WARNING] env.vehicle will be deprecated soon. Use env.agent instead (base_env.py:737)
[INFO] Episode ended! Scenario Index: 542 Reason: arrive_dest.
print("Output at final step:")
i = {k: i[k] for k in sorted(i.keys())}
for k, v in i.items():
    if isinstance(v, str):
        s = v
    elif np.iterable(v):
        continue
    else:
        s = "{:.3f}".format(v)
    print("\t{}: {}".format(k, s))
Output at final step:
acceleration: 1.000
arrive_dest: 1.000
arrive_dest/goals/default: 1.000
arrive_dest/goals/go_straight: 1.000
arrive_dest/goals/left_turn: 0.000
arrive_dest/goals/right_turn: 0.000
arrive_dest/goals/u_turn: 0.000
cost: 0.000
crash: 0.000
crash_building: 0.000
crash_human: 0.000
crash_object: 0.000
crash_sidewalk: 0.000
crash_vehicle: 0.000
current_goal: go_straight
env_seed: 542.000
episode_energy: 6.565
episode_length: 85.000
episode_reward: 122.793
max_step: 0.000
navigation_command: forward
navigation_forward: 1.000
navigation_left: 0.000
navigation_right: 0.000
out_of_road: 0.000
overtake_vehicle_num: 0.000
policy: EnvInputPolicy
reward/default_reward: 12.332
reward/goals/default: 12.332
reward/goals/go_straight: 12.332
reward/goals/left_turn: -10.000
reward/goals/right_turn: -10.000
reward/goals/u_turn: -10.000
route_completion: 0.969
route_completion/goals/default: 0.969
route_completion/goals/go_straight: 0.969
route_completion/goals/left_turn: 0.621
route_completion/goals/right_turn: 0.644
route_completion/goals/u_turn: 0.557
steering: 0.000
step_energy: 0.162
velocity: 22.291
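The per-goal entries in the output above follow the pattern `<metric>/goals/<goal>`. The sketch below regroups them into one small dict per goal, which may be easier to log or compare. `split_goal_stats` is a hypothetical helper, not a MetaDrive function, and `sample_info` simply copies a few values from the printed output so that the block runs on its own.

```python
from collections import defaultdict


def split_goal_stats(info):
    """Group keys of the form '<metric>/goals/<goal>' into {goal: {metric: value}}."""
    per_goal = defaultdict(dict)
    for key, value in info.items():
        metric, sep, goal = key.partition("/goals/")
        if sep:  # keys without '/goals/' (e.g. 'velocity') are skipped
            per_goal[goal][metric] = value
    return dict(per_goal)


# A few entries copied from the output printed above.
sample_info = {
    "arrive_dest/goals/go_straight": 1.0,
    "arrive_dest/goals/left_turn": 0.0,
    "reward/default_reward": 12.332,
    "reward/goals/go_straight": 12.332,
    "reward/goals/left_turn": -10.0,
    "route_completion/goals/go_straight": 0.969,
    "route_completion/goals/left_turn": 0.621,
    "velocity": 22.291,
}

per_goal = split_goal_stats(sample_info)
for goal, stats in sorted(per_goal.items()):
    print(goal, stats)
```

In the notebook itself, the same function could be applied directly to the info dict `i` returned by env.step.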
media.show_video(frames)
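A single episode only yields one arrival flag per goal. To estimate a success rate, the final-step info dicts of many episodes can be averaged. The following is a minimal sketch under two assumptions: each final info dict carries `current_goal` and `arrive_dest/goals/<goal>` as in the output above, and an episode counts as a success when the flag for its current goal is set. `success_rate_by_goal` is a hypothetical helper, and the `episodes` list is made-up illustrative data, not a result from the trained expert.

```python
from collections import defaultdict


def success_rate_by_goal(final_infos):
    """Average the arrival flag of each episode's current goal, grouped by goal."""
    counts = defaultdict(int)
    successes = defaultdict(float)
    for info in final_infos:
        goal = info["current_goal"]
        counts[goal] += 1
        successes[goal] += float(info["arrive_dest/goals/" + goal])
    return {goal: successes[goal] / counts[goal] for goal in counts}


# Illustrative final-step info dicts (not real results).
episodes = [
    {"current_goal": "go_straight", "arrive_dest/goals/go_straight": 1.0},
    {"current_goal": "go_straight", "arrive_dest/goals/go_straight": 0.0},
    {"current_goal": "left_turn", "arrive_dest/goals/left_turn": 1.0},
]

rates = success_rate_by_goal(episodes)
print(rates)
```

In practice the list would be filled by appending the last info dict `i` of each episode inside the rollout loop.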