News

Oct 03, 2025 New preprint: Efficient Multi-turn RL for GUI Agents via Decoupled Training and Adaptive Data Curation (DART-GUI).
If you find this research helpful, consider starring the GitHub repo and upvoting the HF paper page.
Sep 26, 2025 Our paper “Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning” has been accepted by NeurIPS 2025! 🎉
Apr 19, 2025 Released the dataset, model, and training & evaluation code for TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials.
Feb 07, 2025 Multi-modal Agent Tuning (MAT): a framework for auto-generating multimodal tool-usage trajectories (20K MM-Traj), boosting the tool use of MiniCPM & Qwen-VL by 20%. This work has been accepted by ICLR 2025!
Aug 02, 2024 Introducing 🔥FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models. Check out Here for more details! 🔥FIRE has been accepted by NeurIPS 2024!