Observation

[Figure: observation_demo.png, overview of the sensory inputs provided by MetaDrive]

MetaDrive provides various kinds of sensory input, as illustrated in the figure above. For low-level sensors, RGB cameras, depth cameras and Lidar can be placed anywhere in the scene, with adjustable parameters such as the field of view and the number of lasers. Meanwhile, high-level scene information, including road information and the states of nearby vehicles such as velocity and heading, can also be provided as the observation.

Note that MetaDrive aims to provide an efficient platform for benchmarking RL research; we therefore prioritize simulation efficiency over photorealistic rendering.

On this page, we describe the observation forms available in the current MetaDrive version and discuss how to implement new forms of observation for your own tasks.

Existing Observation

State Vector

MetaDrive provides a state vector containing the information necessary for navigation tasks. We use this state vector in almost all existing RL experiments, such as the Generalization, MARL and Safe RL experiments.

The state vector consists of three parts:

  1. Ego State: the current states of the ego vehicle, such as steering, heading, velocity and relative distance to the road boundaries, implemented in the vehicle_state function of the StateObservation class. Please find the detailed meaning of each state dimension in the code.

  2. Navigation: the navigation information that guides the vehicle toward the destination. Concretely, MetaDrive first computes the route from the spawn point to the destination of the ego vehicle. Then a set of checkpoints is scattered along the whole route at fixed intervals. The relative distance and direction to the next checkpoint and the one after it are given as the navigation information. This part is implemented in the _get_info_for_checkpoint function of the Navigation class.

  3. Surrounding: the surrounding information is encoded as a vector of Lidar-like cloud points. The data is generated by the Lidar class. We typically use 240 lasers (single-agent) and 70 lasers (multi-agent) to scan the neighboring area within a radius of 50 meters.

The above information is concatenated into a single state vector by the LidarStateObservation class and fed to the RL agents.
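As a quick check, the following sketch prints the shape of this vector. It assumes the default single-agent MetaDriveEnv, whose default observation is the LidarStateObservation described above.

from metadrive import MetaDriveEnv

# Minimal sketch: inspect the flat state vector built by LidarStateObservation
# (the default observation of MetaDriveEnv).
env = MetaDriveEnv()
obs = env.reset()
# Ego state + navigation + Lidar cloud points concatenated into one vector;
# the exact length depends on the number of lasers and other config values.
print("Observation shape:", obs.shape)
env.close()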

Top-down Semantic Maps

[Figure: top_down_obs.png, semantic meaning of each channel of the top-down observation]

MetaDrive also supports top-down semantic maps. We provide a handy example illustrating the use of the top-down observation in top_down_metadrive.py. You can run this demo via:

python -m metadrive.examples.top_down_metadrive

The following is a minimal script using the top-down observation:

from metadrive import TopDownMetaDrive

env = TopDownMetaDrive()
o = env.reset()
for i in range(1, 100000):
    # Action [0, 1]: zero steering, full throttle.
    o, r, d, info = env.step([0, 1])
    # Render the top-down view in a pygame window.
    env.render(mode="top_down")
    if d:
        env.reset()
env.close()

TopDownMetaDrive is a wrapper class of MetaDriveEnv that replaces the observation with the output of the pygame top-down renderer. The native observation in this setting is a numpy array with shape [84, 84, 5], where all entries fall into [0, 1]. The above figure shows the semantic meaning of each channel.
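A minimal sketch, assuming the gym-style API used in the script above, that verifies the shape and value range of this observation:

import numpy as np
from metadrive import TopDownMetaDrive

# Minimal sketch: check the shape and value range of the top-down observation.
env = TopDownMetaDrive()
o = env.reset()
assert o.shape == (84, 84, 5)                 # five semantic channels
assert 0.0 <= o.min() and o.max() <= 1.0      # entries normalized to [0, 1]
# Each channel is a bird's-eye-view map; see the figure above for the
# exact semantics of every channel.
print("Per-channel occupancy:", [float(np.mean(o[..., c] > 0)) for c in range(5)])
env.close()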

Use First-view Images in Training

[Figures: rgb_obs.png (RGB camera) and depth_obs.jpg (depth camera)]

MetaDrive supports visuomotor tasks by turning on rendering during training. The above figure shows images captured by the RGB camera (left) and the depth camera (right). In this section, we discuss how to utilize such observations on a headless machine, such as a computing node in a cluster or another remote server. Before using this feature in your project, please make sure offscreen rendering works on your machine. The setup tutorial is at Install MetaDrive with headless rendering.

Now we can set up the vision-based observation in MetaDrive:

  • Step 1. Set config["image_observation"] = True to tell MetaDrive to maintain an image buffer in memory even when no pop-up window exists.

  • Step 2. Set config["vehicle_config"]["image_source"] to "rgb_camera" or "depth_camera" according to your needs.

  • Step 3. The image size (width and height) is determined by the camera parameters. The default setting is (84, 84), following the image size in Atari. You can customize the size by configuring config["vehicle_config"]["rgb_camera"]. For example, config["vehicle_config"]["rgb_camera"] = (200, 88) means the image has 200 pixels in width and 88 pixels in height. A combined configuration sketch is shown after this list.
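Putting the three steps together, a minimal configuration sketch might look as follows. The config keys match the steps above; the surrounding script is only an illustration.

from metadrive import MetaDriveEnv

# Minimal sketch combining Step 1-3: enable the image observation, select the
# RGB camera as the image source, and customize the image resolution.
config = {
    "image_observation": True,            # Step 1: keep an image buffer in memory
    "vehicle_config": {
        "image_source": "rgb_camera",     # Step 2: "rgb_camera" or "depth_camera"
        "rgb_camera": (200, 88),          # Step 3: (width, height) in pixels
    },
}
env = MetaDriveEnv(config)
obs = env.reset()
# The observation becomes a dict, e.g. {"image": ..., "state": ...}.
print({k: v.shape for k, v in obs.items()})
env.close()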

Here is a demo script using the RGB camera as observation:

python -m metadrive.examples.drive_in_single_agent_env --observation rgb_camera

The script should print a message:

The observation is a dict with numpy arrays as values:  {'image': (84, 84, 3), 'state': (21,)}
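A minimal sketch of consuming this dict observation in a rollout loop, assuming the env configured in the sketch above and the same gym-style API as the earlier examples:

# Minimal sketch: roll out a few steps with the image observation enabled.
o = env.reset()
for _ in range(10):
    o, r, d, info = env.step([0, 1])          # [steering, acceleration]
    image, state = o["image"], o["state"]     # camera frame and low-dimensional state vector
    if d:
        o = env.reset()
env.close()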

The image rendering consumes memory on the first GPU of your machine (if any). Please be careful when using this feature.

If you feel the visual data collection is slow, why not try our advanced offscreen renderer: Install MetaDrive with advanced offscreen rendering. After verifying your installation, set config["image_on_cuda"] = True to get 10x faster data collection!
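The change is a single config flag, sketched below on top of the config from the Step 1-3 example, assuming the advanced offscreen rendering setup from the linked installation page:

# Sketch: keep rendered images on the GPU to avoid the CPU round-trip.
# Requires the advanced offscreen rendering installation linked above.
config["image_on_cuda"] = True
env = MetaDriveEnv(config)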