Ax Md Atik Ahamed, Mihir Parmar, Palash Goyal, Yiwen Song, Long T. Le, Qiang Cheng, Chun-Liang Li, Hamid Palangi, Jinsung Yoon, Tomas Pfister 4/8/2026

TFRBench: A Reasoning Benchmark for Evaluating Forecasting Systems

TFRBench: Benchmark for evaluating reasoning capabilities of time-series forecasting systems beyond numerical accuracy metrics.

Ax Myeongsoo Kim, Joe Hsu, Dingmin Wang, Shweta Garg, Varun Kumar, Murali Krishna Ramanathan 4/8/2026

CODESTRUCT: Code Agents over Structured Action Spaces

CODESTRUCT: LLM-based code agents using structured AST action spaces instead of text matching for reliable code editing and repository interaction.

Ax Yi Nian, Aojie Yuan, Haiyue Zhang, Jiate Li, Yue Zhao 4/8/2026

Auditable Agents

Auditable Agents framework establishing accountability, auditability, and auditing definitions for LLM agents with external effects, addressing post-deployment answer-ability.

Ax Yushuo Zheng (Affiliation 1, Affiliation 2), Huiyu Duan (Affiliation 1), Zicheng Zhang (Affiliation 1, Affiliation 2), Yucheng Zhu (Affiliation 1), Xiongkuo Min (Affiliation 1), Guangtao Zhai (Affiliation 1, Affiliation 2) 4/8/2026

Market-Bench: Benchmarking Large Language Models on Economic and Trade Competition

Market-Bench comprehensive benchmark evaluating LLM capabilities in economically-relevant tasks via configurable multi-agent supply chain model with LLM retailer agents.

Ax Chenghao Li, Jun Liu, Songbo Zhang, Huadong Jian, Hao Ni, Lik-Hang Lee, Sung-Ho Bae, Guoqing Wang, Yang Yang, Chaoning Zhang 4/8/2026

Experience Transfer for Multimodal LLM Agents in Minecraft Game

Echo memory framework for multimodal LLM agents enabling transfer of reusable knowledge across Minecraft tasks by decomposing experience into five interpretable dimensions.

Ax Florent Capelli, YooJung Choi, Stefan Mengel, Mart\'in Mu\~noz, Guy Van den Broeck 4/8/2026

A canonical generalization of OBDD

Introduces Tree Decision Diagrams generalizing OBDD for Boolean function representation with improved succinctness and tractable operations like model counting and conditioning.

Ax Maria Nesterova, Mikhail Kolosov, Anton Andreychuk, Egor Cherepanov, Oleg Bulichev, Alexey Kovalev, Konstantin Yakovlev, Aleksandr Panov, Alexey Skrynnik 4/8/2026

MARL-GPT: Foundation Model for Multi-Agent Reinforcement Learning

Foundation model enabling single GPT-based agent to perform across diverse multi-agent reinforcement learning tasks and environments.

Ax Eranga Bandara, Ross Gore, Sachin Shetty, Piumi Siyambalapitiya, Sachini Rajapakse, Isurunima Kularathna, Pramoda Karunarathna, Ravi Mukkamala, Peter Foytik, Safdar H. Bouk, Abdul Rahman, Xueping Liang, Amin Hass, Tharaka Hewa, Ng Wee Keong, Kasun De Zoysa, Aruna Withanage, Nilaan Loganathan 4/8/2026

Flowr -- Scaling Up Retail Supply Chain Operations Through Agentic AI in Large Scale Supermarket Chains

AI agents for retail supply chain operations, automating demand forecasting, procurement, and inventory replenishment in supermarket chains.