Open
Labels: question (requests for clarification or additional information), robots (issues concerning robots HW interfaces)
Description
I have my SO101 arms connected to my computer, and I'm running the asynchronous inference server on a cloud GPU (an RTX 4090).
When I start running Pi0.5, the model loads and the SO101 makes its first move, going to its middle position, but after that no further actions are executed, even though the server logs show new observations arriving and new action chunks being generated.
The robot moves to this position and doesn't move any further:
I have one wrist camera and one top-down view camera. Here is my client command:
python3 -m lerobot.async_inference.robot_client \
--server_address=ip:port \
--robot.type=so101_follower \
--robot.port=/dev/ttyACM0 \
--robot.id=arm \
--robot.cameras="{ base_0_rgb: {type: opencv, index_or_path: \"/dev/video2\", width: 640, height: 480, fps: 30}, left_wrist_0_rgb: {type: opencv, index_or_path: \"/dev/video0\", width: 640, height: 480, fps: 30}}" \
--policy_device=cuda \
--aggregate_fn_name=weighted_average \
--debug_visualize_queue_size=True \
--task="Pick up the orange and place it on the plate" \
--policy_type=pi05 \
--pretrained_name_or_path=lerobot/pi05_base \
--actions_per_chunk=50 \
--chunk_size_threshold=0.0
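For what it's worth, both camera streams can be checked independently of lerobot with a minimal OpenCV sketch like this (device paths and resolution copied from the command above; just a sanity check, not part of my setup):

# Sanity check: both camera devices from the client command should
# deliver 640x480 frames (plain OpenCV, independent of lerobot).
import cv2

for path in ("/dev/video0", "/dev/video2"):
    cap = cv2.VideoCapture(path)
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
    cap.set(cv2.CAP_PROP_FPS, 30)
    ok, frame = cap.read()
    print(path, "OK" if ok else "FAILED", frame.shape if ok else "")
    cap.release()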
Here are my server logs:
(lerobot) root@eff66f201198:/workspace/arm-x64# ./robot.sh runpod async-server
INFO 2025-11-01 20:17:34 y_server.py:421 {'fps': 30,
'host': '0.0.0.0',
'inference_latency': 0.03333333333333333,
'obs_queue_timeout': 2,
'port': 8080}
INFO 2025-11-01 20:17:34 y_server.py:431 PolicyServer started on 0.0.0.0:8080
INFO 2025-11-01 20:18:03 y_server.py:112 Client ipv4:129.97.131.28:23025 connected and ready
INFO 2025-11-01 20:18:03 y_server.py:138 Receiving policy instructions from ipv4:129.97.131.28:23025 | Policy type: pi05 | Pretrained name or path: lerobot/pi05_base | Actions per chunk: 50 | Device: cuda
The PI05 model is a direct port of the OpenPI implementation.
This implementation follows the original OpenPI structure for compatibility.
Original implementation: https://github.com/Physical-Intelligence/openpi
INFO 2025-11-01 20:18:03 ils/utils.py:43 Cuda backend detected, using cuda.
WARNING 2025-11-01 20:18:03 /policies.py:82 Device 'mps' is not available. Switching to 'cuda'.
INFO 2025-11-01 20:18:03 ils/utils.py:43 Cuda backend detected, using cuda.
WARNING 2025-11-01 20:18:03 /policies.py:82 Device 'mps' is not available. Switching to 'cuda'.
Loading model from: lerobot/pi05_base
✓ Loaded state dict from model.safetensors
WARNING 2025-11-01 20:19:08 ng_pi05.py:1023 Vision embedding key might need handling: paligemma_with_expert.paligemma.model.vision_tower.vision_model.embeddings.patch_embedding.bias
WARNING 2025-11-01 20:19:08 ng_pi05.py:1023 Vision embedding key might need handling: paligemma_with_expert.paligemma.model.vision_tower.vision_model.embeddings.patch_embedding.weight
Remapped: action_in_proj.bias -> model.action_in_proj.bias
Remapped: action_in_proj.weight -> model.action_in_proj.weight
Remapped: action_out_proj.bias -> model.action_out_proj.bias
Remapped: action_out_proj.weight -> model.action_out_proj.weight
Remapped: paligemma_with_expert.gemma_expert.lm_head.weight -> model.paligemma_with_expert.gemma_expert.lm_head.weight
Remapped: paligemma_with_expert.gemma_expert.model.layers.0.input_layernorm.dense.bias -> model.paligemma_with_expert.gemma_expert.model.layers.0.input_layernorm.dense.bias
Remapped: paligemma_with_expert.gemma_expert.model.layers.0.input_layernorm.dense.weight -> model.paligemma_with_expert.gemma_expert.model.layers.0.input_layernorm.dense.weight
Remapped: paligemma_with_expert.gemma_expert.model.layers.0.mlp.down_proj.weight -> model.paligemma_with_expert.gemma_expert.model.layers.0.mlp.down_proj.weight
Remapped: paligemma_with_expert.gemma_expert.model.layers.0.mlp.gate_proj.weight -> model.paligemma_with_expert.gemma_expert.model.layers.0.mlp.gate_proj.weight
Remapped: paligemma_with_expert.gemma_expert.model.layers.0.mlp.up_proj.weight -> model.paligemma_with_expert.gemma_expert.model.layers.0.mlp.up_proj.weight
Remapped 812 state dict keys
Warning: Could not remap state dict keys: Error(s) in loading state_dict for PI05Policy:
Missing key(s) in state_dict: "model.paligemma_with_expert.paligemma.model.language_model.embed_tokens.weight".
INFO 2025-11-01 20:19:43 y_server.py:171 Time taken to put policy on cuda: 99.9787 seconds
INFO 2025-11-01 20:19:43 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:20:02 y_server.py:226 Running inference for observation #0 (must_go: True)
INFO 2025-11-01 20:20:03 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:20:04 y_server.py:362 Preprocessing and inference took 1.3530s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:20:04 y_server.py:392 Observation 0 | Total time: 1459.15ms
INFO 2025-11-01 20:20:04 y_server.py:244 Action chunk #0 generated | Total time: 1466.28ms
INFO 2025-11-01 20:20:20 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:20:38 y_server.py:226 Running inference for observation #49 (must_go: True)
INFO 2025-11-01 20:20:38 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:20:38 y_server.py:362 Preprocessing and inference took 0.3568s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:20:38 y_server.py:392 Observation 49 | Total time: 374.91ms
INFO 2025-11-01 20:20:38 y_server.py:244 Action chunk #49 generated | Total time: 380.87ms
INFO 2025-11-01 20:21:00 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:21:21 y_server.py:226 Running inference for observation #98 (must_go: True)
INFO 2025-11-01 20:21:21 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:21:22 y_server.py:362 Preprocessing and inference took 0.3511s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:21:22 y_server.py:392 Observation 98 | Total time: 365.69ms
INFO 2025-11-01 20:21:22 y_server.py:244 Action chunk #98 generated | Total time: 371.43ms
INFO 2025-11-01 20:21:45 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:22:06 y_server.py:226 Running inference for observation #147 (must_go: True)
INFO 2025-11-01 20:22:06 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:22:06 y_server.py:362 Preprocessing and inference took 0.3569s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:22:06 y_server.py:392 Observation 147 | Total time: 374.18ms
INFO 2025-11-01 20:22:06 y_server.py:244 Action chunk #147 generated | Total time: 379.72ms
INFO 2025-11-01 20:22:28 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:22:48 y_server.py:226 Running inference for observation #196 (must_go: True)
INFO 2025-11-01 20:22:48 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:22:49 y_server.py:362 Preprocessing and inference took 0.3505s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:22:49 y_server.py:392 Observation 196 | Total time: 367.10ms
INFO 2025-11-01 20:22:49 y_server.py:244 Action chunk #196 generated | Total time: 375.31ms
INFO 2025-11-01 20:23:08 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:23:28 y_server.py:226 Running inference for observation #245 (must_go: True)
INFO 2025-11-01 20:23:28 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:23:28 y_server.py:362 Preprocessing and inference took 0.3389s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:23:28 y_server.py:392 Observation 245 | Total time: 358.84ms
INFO 2025-11-01 20:23:28 y_server.py:244 Action chunk #245 generated | Total time: 364.29ms
INFO 2025-11-01 20:23:49 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:24:12 y_server.py:226 Running inference for observation #294 (must_go: True)
INFO 2025-11-01 20:24:12 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:24:12 y_server.py:362 Preprocessing and inference took 0.3514s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:24:12 y_server.py:392 Observation 294 | Total time: 370.10ms
INFO 2025-11-01 20:24:12 y_server.py:244 Action chunk #294 generated | Total time: 376.08ms
INFO 2025-11-01 20:24:34 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:24:52 y_server.py:226 Running inference for observation #343 (must_go: True)
INFO 2025-11-01 20:24:52 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:24:53 y_server.py:362 Preprocessing and inference took 0.3596s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:24:53 y_server.py:392 Observation 343 | Total time: 377.54ms
INFO 2025-11-01 20:24:53 y_server.py:244 Action chunk #343 generated | Total time: 384.55ms
INFO 2025-11-01 20:25:13 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:25:35 y_server.py:226 Running inference for observation #392 (must_go: True)
INFO 2025-11-01 20:25:35 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:25:35 y_server.py:362 Preprocessing and inference took 0.3451s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:25:35 y_server.py:392 Observation 392 | Total time: 360.65ms
INFO 2025-11-01 20:25:35 y_server.py:244 Action chunk #392 generated | Total time: 366.04ms
INFO 2025-11-01 20:25:56 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:26:15 y_server.py:226 Running inference for observation #441 (must_go: True)
INFO 2025-11-01 20:26:15 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:26:16 y_server.py:362 Preprocessing and inference took 0.3382s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:26:16 y_server.py:392 Observation 441 | Total time: 358.57ms
INFO 2025-11-01 20:26:16 y_server.py:244 Action chunk #441 generated | Total time: 364.56ms
INFO 2025-11-01 20:26:36 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:26:55 y_server.py:226 Running inference for observation #490 (must_go: True)
INFO 2025-11-01 20:26:55 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:26:55 y_server.py:362 Preprocessing and inference took 0.3517s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:26:55 y_server.py:392 Observation 490 | Total time: 373.93ms
INFO 2025-11-01 20:26:55 y_server.py:244 Action chunk #490 generated | Total time: 379.56ms
INFO 2025-11-01 20:27:20 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:27:39 y_server.py:226 Running inference for observation #539 (must_go: True)
INFO 2025-11-01 20:27:39 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:27:39 y_server.py:362 Preprocessing and inference took 0.3280s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:27:39 y_server.py:392 Observation 539 | Total time: 349.91ms
INFO 2025-11-01 20:27:39 y_server.py:244 Action chunk #539 generated | Total time: 355.96ms
INFO 2025-11-01 20:27:59 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:28:23 y_server.py:226 Running inference for observation #588 (must_go: True)
INFO 2025-11-01 20:28:23 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:28:23 y_server.py:362 Preprocessing and inference took 0.3386s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:28:23 y_server.py:392 Observation 588 | Total time: 355.34ms
INFO 2025-11-01 20:28:23 y_server.py:244 Action chunk #588 generated | Total time: 360.75ms
INFO 2025-11-01 20:28:46 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:29:04 y_server.py:226 Running inference for observation #637 (must_go: True)
INFO 2025-11-01 20:29:04 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:29:04 y_server.py:362 Preprocessing and inference took 0.3545s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:29:04 y_server.py:392 Observation 637 | Total time: 370.41ms
INFO 2025-11-01 20:29:04 y_server.py:244 Action chunk #637 generated | Total time: 375.78ms
INFO 2025-11-01 20:29:26 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:29:48 y_server.py:226 Running inference for observation #686 (must_go: True)
INFO 2025-11-01 20:29:48 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
INFO 2025-11-01 20:29:49 y_server.py:362 Preprocessing and inference took 0.3514s, action shape: torch.Size([1, 50, 32])
INFO 2025-11-01 20:29:49 y_server.py:392 Observation 686 | Total time: 369.69ms
INFO 2025-11-01 20:29:49 y_server.py:244 Action chunk #686 generated | Total time: 375.07ms
INFO 2025-11-01 20:30:12 ort/utils.py:74 <Logger policy_server (NOTSET)> Starting receiver
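Note the cadence: the server runs inference only every ~49 observations, each flagged must_go: True, which is what I'd expect if the client's local action queue fully drains before it requests a new chunk. Here is a rough paraphrase of how I understand the chunk_size_threshold gating (illustrative only, not the actual lerobot client code; all names below are placeholders):

# Rough paraphrase of the chunk_size_threshold gating as I understand it.
# Illustrative only: `action_queue` and `send_observation` are placeholder
# names, not lerobot internals.
ACTIONS_PER_CHUNK = 50
CHUNK_SIZE_THRESHOLD = 0.0

def maybe_request_chunk(action_queue, observation, send_observation):
    # With threshold 0.0, a new observation is only sent once the local
    # queue has fully drained, matching the ~49-observation cadence in
    # the logs above.
    if len(action_queue) / ACTIONS_PER_CHUNK <= CHUNK_SIZE_THRESHOLD:
        send_observation(observation, must_go=True)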
With chunk_size_threshold=0.0 (I ran into the same issue with chunk_size_threshold=0.5), my action queue size looks like this:
