Technical Report
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments
Qwen-VLA Team
Xuhong Huang is a core contributor, responsible for data processing, real-world benchmark design, and real-robot deployment.

arXiv 2026
FineVLA: Fine-Grained Instruction Alignment for Steerable Vision-Language-Action Policies
Xintong Hu*, Xuhong Huang*, Jinyu Zhang, Yutong Yao, Yuchong Sun, Qiuyue Wang, Mingsheng Li, Sicheng Xie, Yitao Liu, Junhao Chen, Yixuan Chen, Yingming Zheng, Shuai Bai, Tao Yu
* Equal Contribution

ICCV 2025
Reverse Convolution and Its Application to Image Restoration
Xuhong Huang*, Shiqi Liu*, Kai Zhang, Ying Tai, Jian Yang, Hui Zeng, Lei Zhang
* Equal Contribution