LIBERO Evaluation

LIBERO is a tabletop robotic manipulation benchmark with 4 task suites (Spatial, Object, Goal, Long Horizon), totaling 40 tasks. It tests VLA models on spatial understanding, object recognition, goal reasoning, and long-horizon manipulation using a Franka robotic arm.

This document provides instructions for reproducing our experimental results with LIBERO. The evaluation process consists of two main parts:

  1. Setting up the LIBERO environment and dependencies.
  2. Running the evaluation by launching services in both starVLA and LIBERO environments.

We have verified that this workflow runs successfully on both NVIDIA A100 and RTX 4090 GPUs.


We provide a collection of pretrained checkpoints on Hugging Face to make community evaluation easier: 🤗 StarVLA/bench-libero. Their corresponding results on LIBERO are summarized in the table below.

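| Model | Steps | Epochs | Spatial | Object | Goal | Long | Avg |
|---|---|---|---|---|---|---|---|
| $\pi_0$+FAST | - | - | 96.4 | 96.8 | 88.6 | 60.2 | 85.5 |
| OpenVLA-OFT | 175K | 223 | 97.6 | 98.4 | 97.9 | 94.5 | 97.1 |
| $\pi_0$ | - | - | 96.8 | 98.8 | 95.8 | 85.2 | 94.1 |
| GR00T-N1.5 | 20K | 203 | 92.0 | 92.0 | 86.0 | 76.0 | 86.5 |
| Qwen2.5-VL-FAST | 30K | 9.54 | 97.3 | 97.2 | 96.1 | 90.2 | 95.2 |
| Qwen2.5-VL-OFT | 30K | 9.54 | 97.4 | 98.0 | 96.8 | 92.0 | 96.1 |
| Qwen2.5-VL-GR00T | 30K | 9.54 | 97.8 | 98.2 | 94.6 | 90.8 | 95.4 |
| Qwen3-VL-FAST | 30K | 9.54 | 97.3 | 97.4 | 96.3 | 90.6 | 95.4 |
| Qwen3-VL-OFT | 30K | 9.54 | 97.8 | 98.6 | 96.2 | 93.8 | 96.6 |
| Qwen3-VL-GR00T | 30K | 9.54 | 97.8 | 98.8 | 97.4 | 92.0 | 96.5 |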

We train one policy for all 4 suites. All scores are averaged over 500 trials for each task suite (10 tasks × 50 episodes).
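To make the scoring arithmetic concrete, here is a minimal sketch that computes a per-suite success rate over 10 tasks × 50 episodes and an average as the unweighted mean of the four suite scores. The helper names are illustrative and not part of starVLA, and the success count of 489 is a made-up example. The suite scores are copied from the Qwen3-VL-OFT row, and treating Avg as a plain mean is an assumption that happens to match that row.

```python
from statistics import mean
from typing import Mapping

TASKS_PER_SUITE = 10
EPISODES_PER_TASK = 50


def suite_success_rate(num_successes: int) -> float:
    """Success rate (%) for one suite over 10 tasks x 50 episodes = 500 trials."""
    trials = TASKS_PER_SUITE * EPISODES_PER_TASK
    return 100.0 * num_successes / trials


def average_score(suite_scores: Mapping[str, float]) -> float:
    """Unweighted mean of the per-suite success rates, rounded to one decimal."""
    return round(mean(suite_scores.values()), 1)


# Suite scores copied from the Qwen3-VL-OFT row of the results table.
qwen3_vl_oft = {"Spatial": 97.8, "Object": 98.6, "Goal": 96.2, "Long": 93.8}
print(average_score(qwen3_vl_oft))

# Hypothetical count: 489 successful episodes out of 500.
print(suite_success_rate(489))
```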


To set up the environment, please first follow the official LIBERO repository to install the base LIBERO environment.

⚠️ Common issue: LIBERO defaults to Python 3.8, but the language changed substantially between 3.8 and 3.10. In our testing, using Python 3.10 avoids many issues.

Afterwards, inside the LIBERO environment, install the following dependencies:

```bash
pip install tyro matplotlib mediapy websockets msgpack
pip install numpy==1.24.4  # Downgrade numpy for compatibility with the simulation environment
```
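After installing, it can help to confirm that the packages are actually visible from the LIBERO environment's interpreter. The following is a small sketch using only the standard library; the function names are illustrative and not part of LIBERO or starVLA.

```python
import importlib.util
from importlib import metadata

# Import names of the packages installed above.
REQUIRED = ["tyro", "matplotlib", "mediapy", "websockets", "msgpack", "numpy"]


def missing_packages(names):
    """Return the subset of `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("missing:", missing_packages(REQUIRED))
    # The install step above pins numpy to 1.24.4.
    print("numpy version:", installed_version("numpy"))
```

Run it with the Python interpreter of the LIBERO environment; an empty "missing" list and a numpy version of 1.24.4 indicate that the install step took effect.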

Run the evaluation from the starVLA repository root using two separate terminals, one for each environment.

  • starVLA environment: runs the inference server.
  • LIBERO environment: runs the simulation.

Step 1. Start the server (starVLA environment)

In the first terminal, activate the starVLA conda environment and run:

```bash
bash examples/LIBERO/eval_files/run_policy_server.sh
```

⚠️ Note: Make sure the checkpoint path in examples/LIBERO/eval_files/run_policy_server.sh is set correctly.
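Because the simulation connects to the server from a second terminal, it can be useful to check that the server is accepting connections before launching the simulation. The sketch below assumes the policy server listens on a TCP port. The host and port are not specified in this document, so they are passed on the command line and must be taken from your own server configuration. The script name in the usage line is hypothetical.

```python
import socket
import sys


def server_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Usage (hypothetical script name): python check_server.py <host> <port>
if __name__ == "__main__" and len(sys.argv) == 3:
    host, port = sys.argv[1], int(sys.argv[2])
    state = "reachable" if server_is_listening(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

This only confirms that something is listening on the port; it does not verify that the checkpoint loaded correctly.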


Step 2. Start the simulation (LIBERO environment)

In the second terminal, activate the LIBERO conda environment and run:

```bash
bash examples/LIBERO/eval_files/eval_libero.sh
```

⚠️ Note: Make sure you correctly set the following variables in eval_libero.sh:

| Variable | Meaning | Example |
|---|---|---|
| `LIBERO_HOME` | Path to your LIBERO repo clone | `/path/to/LIBERO` |
| `LIBERO_Python` | Python path from the LIBERO conda env | `$(which python)` (inside LIBERO env) |
| `your_ckpt` | StarVLA checkpoint path | `./results/Checkpoints/.../steps_30000_pytorch_model.pt` |
| `unnorm_key` | Robot type name for loading unnormalization stats | `franka` (LIBERO uses Franka arm) |

unnorm_key selects the normalization statistics (min/max, etc.) saved during training. These statistics are used to convert the model's normalized outputs back to actual robot action values.
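The unnormalization step can be sketched as follows, assuming plain min/max normalization that maps each action dimension to [-1, 1]. This is an illustration only: the layout of the `stats` dictionary and its numbers are made up, and starVLA's actual scheme may differ (for example, by using quantiles or by handling some dimensions separately).

```python
from typing import List, Sequence


def unnormalize(action: Sequence[float], lo: Sequence[float], hi: Sequence[float]) -> List[float]:
    """Map each normalized value in [-1, 1] back to the range [lo_i, hi_i]."""
    if not (len(action) == len(lo) == len(hi)):
        raise ValueError("action, lo and hi must have the same length")
    out = []
    for a, mn, mx in zip(action, lo, hi):
        a = max(-1.0, min(1.0, a))  # clip outputs that overshoot the normalized range
        out.append(0.5 * (a + 1.0) * (mx - mn) + mn)
    return out


# Hypothetical statistics table keyed by unnorm_key; the numbers are made up.
stats = {"franka": {"min": [-0.5, -0.5, 0.0], "max": [0.5, 0.5, 1.0]}}
unnorm_key = "franka"

normalized = [0.0, 1.0, -1.0]
print(unnormalize(normalized, stats[unnorm_key]["min"], stats[unnorm_key]["max"]))
```

With an unnorm_key that does not match the training statistics, the lookup would fail or the outputs would be scaled to the wrong range, which is why the key has to match the one used during training.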

Each evaluation result is also saved with a video for visualization, as shown below:

[Example video]


Download the datasets to the playground/Datasets/LEROBOT_LIBERO_DATA directory.

Then place a copy of modality.json in each subset's meta directory, i.e. at $LEROBOT_LIBERO_DATA/subset/meta/modality.json.

You can prepare both quickly by running:

```bash
# Set DEST to the directory where you want to store the data
export DEST=/path/to/your/data/directory
bash examples/LIBERO/data_preparation.sh
```
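If you prepare the data by hand instead, the modality.json placement described above can be sketched as below. This is an illustrative helper, not part of starVLA. It assumes every subset is a direct subdirectory of the data root that already contains a meta directory, and the location of the source modality.json is left to you.

```python
import shutil
from pathlib import Path
from typing import List


def place_modality_file(modality_json: Path, data_root: Path) -> List[Path]:
    """Copy modality.json into <data_root>/<subset>/meta/ for each subset.

    Subdirectories without an existing meta/ directory are skipped.
    """
    written = []
    for subset in sorted(p for p in data_root.iterdir() if p.is_dir()):
        meta_dir = subset / "meta"
        if not meta_dir.is_dir():
            continue
        target = meta_dir / "modality.json"
        shutil.copyfile(modality_json, target)
        written.append(target)
    return written


# Example call (paths are placeholders for your own setup):
# place_modality_file(Path("/path/to/modality.json"),
#                     Path("playground/Datasets/LEROBOT_LIBERO_DATA"))
```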

Most of the required training files are organized under examples/LIBERO/train_files/.

Run the following command to start training:

```bash
bash examples/LIBERO/train_files/run_libero_train.sh
```

⚠️ Note: Make sure the path in examples/LIBERO/train_files/run_libero_train.sh is set correctly.