$\color{orange}{\textbf{{[CVPR 2026]}}}$ PALM: Progress-Aware Policy Learning via Affordance Reasoning for Long-Horizon Robotic Manipulation
- 🤖 Unified VLA framework: PALM couples affordance reasoning with subtask progress estimation.
- 🎯 Structured affordance reasoning: PALM predicts future interaction cues for object relevance, contact, placement, and motion.
- 🏆 Strong long-horizon results: PALM achieves state-of-the-art performance on CALVIN ABC-D and LIBERO-LONG, with robust real-world generalization.

Create and activate conda environment:

    conda create -n palm python=3.10 -y
    conda activate palm

Install PyTorch:

    pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1+cu117 --extra-index-url https://download.pytorch.org/whl/cu117

Install dependencies:

    pip install git+https://github.com/openai/CLIP.git
    pip install -r requirements.txt

We provide step-by-step guidance for running PALM in simulations and real-world experiments. Follow the specific instructions for a seamless setup.
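
Before moving on, it can help to confirm that the pinned PyTorch packages from the installation commands actually ended up in the `palm` environment. The following is a minimal sketch, not part of the PALM codebase: the helper name `check_pins` is hypothetical, and it assumes the installed package metadata reports the same version strings as the pins (including the `+cu117` suffix).

```python
from importlib import metadata

# Version pins copied from the install command in this README.
EXPECTED = {
    "torch": "1.13.1+cu117",
    "torchvision": "0.14.1+cu117",
    "torchaudio": "0.13.1+cu117",
}


def check_pins(expected=EXPECTED, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch.

    `get_version` is injectable so the check can be exercised without
    the packages installed (hypothetical helper, for illustration only).
    """
    mismatches = {}
    for name, want in expected.items():
        try:
            have = get_version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed in this environment
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    problems = check_pins()
    for name, (want, have) in problems.items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
    if not problems:
        print("All pinned packages match.")
```

Run the script inside the activated `palm` environment. An empty result means the three pins match; any listed package likely needs to be reinstalled with the command above.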
For users aiming to train PALM from scratch or fine-tune it, we provide comprehensive instructions for environment setup, downstream task data preparation, training, and deployment.
This section details the pre-training process of PALM in real-world experiments, including environment setup, dataset preparation, and training procedures. Downstream task processing and fine-tuning are covered in Real-World (Quick Training w & w/o pre-training).
Relevant checkpoints are available on Google Drive.
If you find the project helpful for your research, please consider citing our paper:
@inproceedings{liu2026palm,
title={PALM: Progress-Aware Policy Learning via Affordance Reasoning for Long-Horizon Robotic Manipulation},
author={Liu, Yuanzhe and Zhu, Jingyuan and Mo, Yuchen and Li, Gen and Cao, Xu and Jin, Jin and Shen, Yifan and Li, Zhengyuan and Yu, Tianjiao and Yuan, Wenzhen and Ding, Fangqiang and Lourentzou, Ismini},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2026}
}
