Ax Min Sun (F. Hoffmann-La Roche AG, Roche Pharma Research and Early Development), Federica Storti (F. Hoffmann-La Roche AG, Roche Pharma Research and Early Development), Valentina Martino (F. Hoffmann-La Roche AG, Roche Pharma Research and Early Development), Miguel Gonzalez-Andrades (F. Hoffmann-La Roche AG, Roche Pharma Research and Early Development), Tony Kam-Thong (F. Hoffmann-La Roche AG, Roche Pharma Research and Early Development) 4/8/2026

Algebraic Structure Discovery for Real World Combinatorial Optimisation Problems: A General Framework from Abstract Algebra to Quotient Space Learning

Framework that identifies algebraic structures in combinatorial optimization problems and constructs quotient spaces to reduce the search space and improve solution quality.

Ax Andrew Sellergren, Chufan Gao, Fereshteh Mahvar, Timo Kohlberger, Fayaz Jamil, Madeleine Traverse, Alberto Tono, Bashir Sadjad, Lin Yang, Charles Lau, Liron Yatziv, Tiffany Chen, Bram Sterling, Kenneth Philbrick, Richa Tiwari, Yun Liu, Madhuram Jajoo, Chandrashekar Sankarapu, Swapnil Vispute, Harshad Purandare, Abhishek Bijay Mishra, Sam Schmidgall, Tao Tu, Anil Palepu, Chunjong Park, Tim Strother, Rahul Thapa, Yong Cheng, Preeti Singh, Kat Black, Yossi Matias, Katherine Chou, Avinatan Hassidim, Kavi Goel, Joelle Barral, Tris Warkentin, Shravya Shetty, Dale Webster, Sunny Virmani, David F. Steiner, Can Kirmizibayrak, Daniel Golden 4/8/2026

MedGemma 1.5 Technical Report

MedGemma 1.5 4B model expands medical capabilities with high-dimensional imaging (CT/MRI/histopathology), anatomical localization, and improved document understanding.

Ax Xiangyi Li, Kyoung Whan Choe, Yimin Liu, Xiaokun Chen, Chujun Tao, Bingran You, Wenbo Chen, Zonglin Di, Jiankai Sun, Shenghan Zheng, Jiajun Bao, Yuanli Wang, Weixiang Yan, Yiyuan Li, Han-chung Lee 4/8/2026

ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces

ClawsBench benchmark evaluates LLM agents on realistic productivity tasks (email, scheduling, documents) in simulated multi-service environments with stateful workflows.

Ax Eliza Berman, Bella Chang, Daniel B. Neill, Emily Black 4/8/2026

Attribution Bias in Large Language Models

AttriBench: Demographically balanced benchmark for measuring attribution bias in LLMs when they attribute quotes to their original authors.

Ax Hangoo Kang, Tarun Suresh, Jon Saad-Falcon, Azalia Mirhoseini 4/8/2026

TRACE: Capability-Targeted Agentic Training

TRACE: Framework for targeted training of LLM agents on capability gaps identified in specific environments and task distributions.

Ax Md Atik Ahamed, Mihir Parmar, Palash Goyal, Yiwen Song, Long T. Le, Qiang Cheng, Chun-Liang Li, Hamid Palangi, Jinsung Yoon, Tomas Pfister 4/8/2026

TFRBench: A Reasoning Benchmark for Evaluating Forecasting Systems

TFRBench: Benchmark for evaluating reasoning capabilities of time-series forecasting systems beyond numerical accuracy metrics.

Ax Myeongsoo Kim, Joe Hsu, Dingmin Wang, Shweta Garg, Varun Kumar, Murali Krishna Ramanathan 4/8/2026

CODESTRUCT: Code Agents over Structured Action Spaces

CODESTRUCT: LLM-based code agents using structured AST action spaces instead of text matching for reliable code editing and repository interaction.

Ax Yi Nian, Aojie Yuan, Haiyue Zhang, Jiate Li, Yue Zhao 4/8/2026

Auditable Agents

Auditable Agents: Framework establishing definitions of accountability, auditability, and auditing for LLM agents with external effects, addressing post-deployment answerability.

Ax Yushuo Zheng, Huiyu Duan, Zicheng Zhang, Yucheng Zhu, Xiongkuo Min, Guangtao Zhai 4/8/2026

Market-Bench: Benchmarking Large Language Models on Economic and Trade Competition

Market-Bench: Comprehensive benchmark evaluating LLM capabilities in economically relevant tasks via a configurable multi-agent supply chain model with LLM retailer agents.

Ax Chenghao Li, Jun Liu, Songbo Zhang, Huadong Jian, Hao Ni, Lik-Hang Lee, Sung-Ho Bae, Guoqing Wang, Yang Yang, Chaoning Zhang 4/8/2026

Experience Transfer for Multimodal LLM Agents in Minecraft Game

Echo memory framework for multimodal LLM agents enabling transfer of reusable knowledge across Minecraft tasks by decomposing experience into five interpretable dimensions.

Ax Florent Capelli, YooJung Choi, Stefan Mengel, Martín Muñoz, Guy Van den Broeck 4/8/2026

A canonical generalization of OBDD

Introduces Tree Decision Diagrams, which generalize OBDDs for Boolean function representation, with improved succinctness and tractable operations such as model counting and conditioning.