# Model Zoo
## Available Modified Models

| Model | Description | Link |
|---|---|---|
| Qwen2.5-VL-3B-Action | Qwen2.5-VL with its vocabulary extended by FAST tokens (special tokens for discretizing continuous actions) | Hugging Face |
| Qwen3-VL-4B-Action | Qwen3-VL with the same FAST-token vocabulary extension | Hugging Face |
| pi-fast | pi-fast action tokenizer weights | Hugging Face |
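The core idea behind the action-token vocabulary extension is mapping continuous action values onto a fixed set of discrete token ids. A minimal uniform-binning sketch of that idea follows; it is illustrative only (the actual FAST tokenizer compresses action chunks before tokenizing, and the bin count and function names here are hypothetical):

```python
import numpy as np

def discretize_actions(actions, low=-1.0, high=1.0, n_bins=256):
    """Map continuous action values in [low, high] to integer bin ids in [0, n_bins - 1]."""
    clipped = np.clip(actions, low, high)
    return np.round((clipped - low) / (high - low) * (n_bins - 1)).astype(int)

def undiscretize_actions(bins, low=-1.0, high=1.0, n_bins=256):
    """Invert discretize_actions up to quantization error (returns bin centers)."""
    return low + bins / (n_bins - 1) * (high - low)

action = np.array([0.25, -0.8, 0.0])    # e.g. normalized end-effector deltas
token_ids = discretize_actions(action)  # ids that index the added vocabulary slots
recovered = undiscretize_actions(token_ids)
```

Each bin id would then be offset into the new special-token range of the extended vocabulary, so the VLM can emit actions as ordinary next-token predictions; the round trip recovers the action to within half a bin width.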
## Finetuning Checkpoints

### SimplerEnv / Bridge

Bridge is a WidowX tabletop manipulation dataset; Fractal is Google’s RT-1 robot manipulation dataset.
| Model | Framework | Base VLM | Description | WidowX | Link |
|---|---|---|---|---|---|
| Qwen2.5-FAST-Bridge-RT-1 | QwenFast | Qwen2.5-VL-3B | Bridge + Fractal | 58.6 | HF |
| Qwen2.5-OFT-Bridge-RT-1 | QwenOFT | Qwen2.5-VL-3B | Bridge + Fractal | 41.8 | HF |
| Qwen2.5-PI-Bridge-RT-1 | QwenPI | Qwen2.5-VL-3B | Bridge + Fractal | 62.5 | HF |
| Qwen2.5-GR00T-Bridge-RT-1 | QwenGR00T | Qwen2.5-VL-3B | Bridge + Fractal | 63.6 | HF |
| Qwen-GR00T-Bridge | QwenGR00T | Qwen2.5-VL-3B | Bridge only | 71.4 | HF |
| Qwen3VL-OFT-Bridge-RT-1 | QwenOFT | Qwen3-VL-4B | Bridge + Fractal | 42.7 | HF |
| Qwen3VL-GR00T-Bridge-RT-1 | QwenGR00T | Qwen3-VL-4B | Bridge + Fractal | 65.3 | HF |
| Florence-GR00T-Bridge-RT-1 | QwenGR00T | Florence-2 | Bridge + Fractal (small model) | - | HF |
WidowX column: success rate (%) on WidowX robot tasks in SimplerEnv; higher is better.
### LIBERO

LIBERO has 4 task suites (Spatial, Object, Goal, Long Horizon) with 40 tasks total. All checkpoints are trained jointly on all 4 suites. See the LIBERO evaluation docs.
| Model | Framework | Base VLM | Link |
|---|---|---|---|
| Qwen2.5-VL-FAST-LIBERO-4in1 | QwenFast | Qwen2.5-VL-3B | HF |
| Qwen2.5-VL-OFT-LIBERO-4in1 | QwenOFT | Qwen2.5-VL-3B | HF |
| Qwen2.5-VL-GR00T-LIBERO-4in1 | QwenGR00T | Qwen2.5-VL-3B | HF |
| Qwen3-VL-OFT-LIBERO-4in1 | QwenOFT | Qwen3-VL-4B | HF |
| Qwen3-VL-PI-LIBERO-4in1 | QwenPI | Qwen3-VL-4B | HF |
### RoboCasa

RoboCasa GR1 tabletop benchmark with 24 pick-and-place tasks. See the RoboCasa evaluation docs.
| Model | Framework | Base VLM | Link |
|---|---|---|---|
| Qwen3-VL-GR00T-Robocasa-gr1 | QwenGR00T | Qwen3-VL-4B | HF |
| Qwen3-VL-OFT-Robocasa | QwenOFT | Qwen3-VL-4B | HF |
### RoboTwin

RoboTwin 2.0 dual-arm manipulation benchmark with 50 tasks. See the RoboTwin evaluation docs.
| Model | Framework | Base VLM | Link |
|---|---|---|---|
| Qwen3-VL-OFT-Robotwin2-All | QwenOFT | Qwen3-VL-4B | HF |
| Qwen3-VL-OFT-Robotwin2 | QwenOFT | Qwen3-VL-4B | HF |
### BEHAVIOR-1K

BEHAVIOR-1K household task benchmark using the R1Pro humanoid robot. See the BEHAVIOR evaluation docs.
| Model | Description | Link |
|---|---|---|
| BEHAVIOR-QwenDual-taskall | Jointly trained on all 50 tasks | HF |
| BEHAVIOR-QwenDual-task1 | Single-task training | HF |
| BEHAVIOR-QwenDual-task6-40k | 6-task joint training | HF |
| BEHAVIOR-rgp-seg | Segmentation observation experiment | HF |
## Datasets

### Training Datasets

| Dataset | Description | Link |
|---|---|---|
| LLaVA-OneVision-COCO | Image-text dataset for VLM co-training (ShareGPT4V-COCO subset) | HF |
| RoboTwin-Clean | RoboTwin 2.0 clean demonstrations (50 per task) | HF |
| RoboTwin-Randomized | RoboTwin 2.0 randomized demonstrations (500 per task) | HF |
| RoboTwin-Randomized-targz | Same as above, tar.gz packed format (for bulk download) | HF |
### BEHAVIOR Data

| Dataset | Description | Link |
|---|---|---|
| BEHAVIOR-1K | BEHAVIOR-1K benchmark simulation configs | HF |
| BEHAVIOR-1K-datasets | BEHAVIOR-1K training datasets | HF |
| BEHAVIOR-1K-datasets-assets | BEHAVIOR-1K scene and object assets | HF |
| BEHAVIOR-1K-VISUALIZATION-DEMO | BEHAVIOR-1K visualization demos | HF |
| behavior-1k-task0 | Single-task training data sample | HF |
## How to Use a Checkpoint

Download a checkpoint and run the policy server:

```shell
# Download (requires huggingface_hub)
huggingface-cli download StarVLA/Qwen3VL-GR00T-Bridge-RT-1 \
    --local-dir ./results/Checkpoints/Qwen3VL-GR00T-Bridge-RT-1

# Start the policy server.
# steps_XXXXX is the training step count; replace it with the actual filename
# from your download (e.g. steps_50000_pytorch_model.pt; run `ls` to see the exact name).
python deployment/model_server/server_policy.py \
    --ckpt_path ./results/Checkpoints/Qwen3VL-GR00T-Bridge-RT-1/checkpoints/steps_XXXXX_pytorch_model.pt \
    --port 5694 \
    --use_bf16
```

Then follow the evaluation guide for the benchmark you want to test on (e.g. SimplerEnv, LIBERO, RoboCasa, RoboTwin, BEHAVIOR).
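The CLI download can also be scripted from Python with `huggingface_hub.snapshot_download`, which fetches every file in a repo. The helper name below is our own; the repo id and local path match the CLI example above:

```python
from huggingface_hub import snapshot_download

def download_checkpoint(repo_id: str, local_dir: str) -> str:
    """Download a full checkpoint repo from the Hugging Face Hub.

    Returns the local directory containing the downloaded files.
    """
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

For example, `download_checkpoint("StarVLA/Qwen3VL-GR00T-Bridge-RT-1", "./results/Checkpoints/Qwen3VL-GR00T-Bridge-RT-1")` mirrors the `huggingface-cli download` invocation and is convenient when downloads need to happen inside a training or evaluation script.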