Ax Ruijiang Gao, Steven Chong Xiao 3/18/2026

Nonstandard Errors in AI Agents

Study of reproducibility in AI coding agents, showing that agent-to-agent variation produces nonstandard errors in empirical results.

Ax Yongyuan Liang, Shijie Zhou, Yu Gu, Hao Tan, Gang Wu, Franck Dernoncourt, Jihyung Kil, Ryan A. Rossi, Ruiyi Zhang 3/18/2026

Anticipatory Planning for Multimodal AI Agents

Two-stage RL framework that trains multimodal agents for anticipatory reasoning and long-term planning in multi-step tasks.

Ax Rui Ge, Yichao Fu, Yuyang Qian, Junda Su, Yiming Zhao, Peng Zhao, Hao Zhang 3/18/2026

Internalizing Agency from Reflective Experience

Method for training LLM agents to leverage rich environment feedback through reflective experience and post-training, improving long-horizon planning.

Ax Kyle Dumont, Nicholas Herbert, Hayder Tirmazi, Shrikanth Upadhayaya 3/18/2026

DRCY: Agentic Hardware Design Reviews

AI agent system for hardware design reviews using LLMs to verify semantic correctness of component connections against datasheets.

Ax Yulin Peng, Haowen Hou, Xinxin Zhu, Ying Tiffany He, F. Richard Yu 3/18/2026

SEMAG: Self-Evolutionary Multi-Agent Code Generation

Framework that decomposes programming tasks into planning, coding, and debugging stages with adaptive workflow selection.

Ax Mateusz Dziemian, Maxwell Lin, Xiaohan Fu, Micha Nowak, Nick Winter, Eliot Jones, Andy Zou, Lama Ahmad, Kamalika Chaudhuri, Sahana Chennabasappa, Xander Davies, Lauren Deason, Benjamin L. Edelman, Tanner Emek, Ivan Evtimov, Jim Gust, Maia Hamin, Kat He, Klaudia Krawiecka, Riccardo Patana, Neil Perry, Troy Peterson, Xiangyu Qi, Javier Rando, Zifan Wang, Zihan Wang, Spencer Whitman, Eric Winsor, Arman Zharmagambetov, Matt Fredrikson, Zico Kolter 3/18/2026

How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition

Analysis of a large-scale public competition revealing LLM agents' vulnerability to indirect prompt injection, in which adversarial instructions are planted in external content sources.

Ax MiroMind Team, S. Bai, L. Bing, L. Lei, R. Li, X. Li, X. Lin, E. Min, L. Su, B. Wang, L. Wang, L. Wang, S. Wang, X. Wang, Y. Zhang, Z. Zhang, G. Chen, L. Chen, Z. Cheng, Y. Deng, Z. Huang, D. Ng, J. Ni, Q. Ren, X. Tang, B. L. Wang, H. Wang, N. Wang, C. Wei, Q. Wu, J. Xia, Y. Xiao, H. Xu, X. Xu, C. Xue, Z. Yang, Z. Yang, F. Ye, H. Ye, J. Yu, C. Zhang, W. Zhang, H. Zhao, P. Zhu 3/18/2026

MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

Research agents with enhanced verification and multi-step reasoning, using structured planning and contextual reasoning for long-horizon tasks.

Ax Yihao Zhang, Zeming Wei, Xiaokun Luan, Chengcan Wu, Zhixin Zhang, Jiangrong Wu, Haolin Wu, Huanran Chen, Jun Sun, Meng Sun 3/18/2026

ClawWorm: Self-Propagating Attacks Across LLM Agent Ecosystems

First documented self-propagating attack across LLM agent ecosystems, demonstrating security vulnerabilities in the OpenClaw platform with 40,000+ active instances.