|
My research focuses on how to build AI agents that reshape human-AI interaction and how to deploy them safely at scale.
|
|
Agent
Generative Interfaces for Language Models
Jiaqi Chen*,
Yanzhe Zhang*,
Yutong Zhang,
Yijia Shao,
Diyi Yang
Preprint [code][website]
|
|
Agent
Risk
Searching for Privacy Risks in LLM Agents via Simulation
Yanzhe Zhang,
Diyi Yang
ICLR, 2026 [code]
|
Evaluation
AutoMetrics: Approximate Human Judgements with Automatically Generated Evaluators
Michael J. Ryan,
Yanzhe Zhang,
Amol Salunkhe,
Yi Chu,
Di Xu,
Diyi Yang
ICLR, 2026 [code]
|
|
Agent
Real-Time Reasoning Agents in Evolving Environments
Yule Wen*,
Yixin Ye*,
Yanzhe Zhang,
Diyi Yang,
Hao Zhu
ICLR, 2026 [code]
|
|
Agent
Evaluation
Computer Agent Arena: Toward Human-Centric Evaluation and Analysis of Computer-Use Agents
Bowen Wang,
Xinyuan Wang,
Jiaqi Deng,
Tianbao Xie,
Ryan Li,
Yanzhe Zhang,
Junli Wang,
Dunjie Lu,
Zicheng Gong,
Gavin Li,
Toh Jing Hua,
Wei-Lin Chiang,
Ion Stoica,
Diyi Yang,
Yu Su,
Yi Zhang,
Zhiguo Wang,
Victor Zhong,
Tao Yu
ICLR, 2026 [code]
|
|
Agent
Training
SWE-smith: Scaling Data for Software Engineering Agents
John Yang,
Kilian Lieret,
Carlos E. Jimenez,
Alexander Wettig,
Kabir Khandpur,
Yanzhe Zhang,
Binyuan Hui,
Ofir Press,
Ludwig Schmidt,
Diyi Yang
NeurIPS Datasets & Benchmarks, 2025 [website]
|
|
Agent
Risk
Attacking Vision-Language Computer Agents via Pop-ups
Yanzhe Zhang,
Tao Yu,
Diyi Yang
ACL, 2025 [code]
|
|
Training
Distilling an End-to-End Voice Assistant from Speech Recognition Data
Will Held,
Yanzhe Zhang,
Ella Li,
Weiyan Shi,
Michael Ryan,
Diyi Yang
ACL, 2025 [website][training code][eval code]
|
|
Agent
Evaluation
Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping
Ryan Li,
Yanzhe Zhang,
Diyi Yang
NAACL, 2025 [website][code]
|
|
Agent
Evaluation
Design2Code: How Far Are We From Automating Front-End Engineering?
Chenglei Si*,
Yanzhe Zhang*,
Ryan Li,
Zhengyuan Yang,
Ruibo Liu,
Diyi Yang
NAACL, 2025 [website][code][data]
|
|
Agent
Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Evaluation
Zijun Liu,
Yanzhe Zhang,
Peng Li,
Yang Liu,
Diyi Yang
COLM, 2024 [code]
|
|
Risk
Auditing Gender Presentation Differences in Text-to-Image Models
Yanzhe Zhang,
Lu Jiang,
Greg Turk,
Diyi Yang
EAAMO, 2024 [website][code][data]
|
|
Training
TRINS: Towards Multimodal Language Models that Can Read
Ruiyi Zhang,
Yanzhe Zhang,
Jian Chen,
Yufan Zhou,
Jiuxiang Gu,
Changyou Chen,
Tong Sun
CVPR, 2024
|
|
Training
Enhanced Visual Instruction Tuning for Text-rich Image Understanding
Yanzhe Zhang,
Ruiyi Zhang,
Jiuxiang Gu,
Yufan Zhou,
Nedim Lipka,
Diyi Yang,
Tong Sun
NeurIPS Workshop on Instruction Tuning and Instruction Following, 2023 [website][code][data]
|
|
Training
Risk
Robustness of Demonstration-based Learning Under Limited Data Scenario
Hongxin Zhang,
Yanzhe Zhang,
Ruiyi Zhang,
Diyi Yang
EMNLP, 2022 [code]
|
|
Training
Continual Sequence Generation with Adaptive Compositional Modules
Yanzhe Zhang,
Xuezhi Wang,
Diyi Yang
ACL, 2022 [code]
|
|
Training
Continual Learning for Text Classification with Information Disentanglement Based Regularization
Yufan Huang*,
Yanzhe Zhang*,
Jiaao Chen,
Xuezhi Wang,
Diyi Yang
NAACL, 2021 [code]
|
|
Service
Reviewer: ARR, ACL, NAACL, EMNLP, EACL, COLM, CoLLAs, ICLR, ICML.
|
|