Bofei Zhang (张博飞)

Email: zhangbofei5675[at]outlook[dot]com

Experience

Career

Education

  • 2018/9-2020/6; Master's in Data Science @ New York University
  • 2013/8-2018/5; Bachelor's in Biomedical Engineering @ The Ohio State University

News

Apr 19, 2025 Released the dataset, model, and training and evaluation code for TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials
Feb 07, 2025 Multi-modal Agent Tuning (MAT): A framework for auto-generating multimodal tool-usage trajectories (20K MM-Traj), boosting MiniCPM & Qwen-VL tool use by 20%. This work was accepted to ICLR 2025!
Aug 02, 2024 Introducing 🔥FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models. 🔥FIRE was accepted to NeurIPS 2024!

Latest Posts

Selected Publications

* Equal contribution, ✉ Corresponding author

  1. Chain-of-Focus: Adaptive Visual Search and Zooming for Multimodal Reasoning via RL
    Xintong Zhang*, Zhi Gao*, Bofei Zhang, and 8 more authors
    Preprint, 2025
  2. Iterative Tool Usage Exploration for Multimodal Agents via Step-wise Preference Tuning
    Pengxiang Li*, Zhi Gao*, Bofei Zhang, and 8 more authors
    Preprint, 2025
  3. TongUI: Building Generalized GUI Agents by Learning from Multimodal Web Tutorials
    Bofei Zhang*, Zirui Shang*, Zhi Gao*, and 7 more authors
    Preprint, 2025
  4. Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage
    Zhi Gao*, Bofei Zhang*, Pengxiang Li*, and 7 more authors
    International Conference on Learning Representations (ICLR), 2025. Spotlight (Top 5%)
  5. FIRE: A Dataset for Feedback Integration and Refinement Evaluation of Multimodal Models
    Pengxiang Li*, Zhi Gao*, Bofei Zhang*, and 6 more authors
    Neural Information Processing Systems: Datasets and Benchmarks (NeurIPS D&B), 2024