Ax Chenxi Wang, Zhuoyun Yu, Xin Xie, Wuguannan Yao, Runnan Fang, Shuofei Qiao, Kexin Cao, Guozhou Zheng, Xiang Qi, Peng Zhang, Shumin Deng 4/7/2026

SkillX: Automatically Constructing Skill Knowledge Bases for Agents

The SkillX framework automatically constructs reusable skill knowledge bases for LLM agents, enabling efficient learning and generalization across tasks.

Ax LM-Provers, Yuxiao Qu, Amrith Setlur, Jasper Dekoninck, Edward Beeching, Jia Li, Ian Wu, Lewis Tunstall, Aviral Kumar 4/7/2026

QED-Nano: Teaching a Tiny Model to Prove Hard Theorems

QED-Nano trains small neural networks to prove mathematical theorems, enabling reproducible and efficient theorem-proving without large models.

Ax Yikun Ban, Yuchen Yan, Arindam Banerjee, Jingrui He 4/7/2026

Neural Exploitation and Exploration of Contextual Bandits

Applies neural networks to contextual multi-armed bandits, comparing epsilon-greedy, Thompson Sampling, and UCB techniques for handling the exploration-exploitation trade-off.

Ax Ricardo Gama, Ricardo Cunha, Daniel Fuertes, Carlos R. del-Blanco, Hugo L. Fernandes 4/7/2026

Multi-Agent Environments for Vehicle Routing Problems

Open-source RL framework for vehicle routing problems, extending reinforcement learning to discrete optimization in operations research.

Ax Fengqing Jiang, Fengbo Ma, Zhangchen Xu, Yuetai Li, Zixin Rao, Bhaskar Ramasubramanian, Luyao Niu, Bo Li, Xianyan Chen, Zhen Xiang, Radha Poovendran 4/7/2026

SoSBench: Benchmarking Safety Alignment on Six Scientific Domains

SoSBench benchmarks safety alignment of LLMs across six scientific domains with sophisticated risks beyond basic misuse scenarios.

Ax Patrick Vossler, Fan Xia, Yifan Mai, Adarsh Subbaswamy, Jean Feng 4/7/2026

LLMs Judging LLMs: A Simplex Perspective

Studies the use of LLMs as judges of LLM outputs, addressing epistemic uncertainty in judge quality beyond sampling variability.