Ax Minhyuk Seo, Seongwon Cho, Minjae Lee, Diganta Misra, Hyeonbeom Choi, Seon Joo Kim, Jonghyun Choi 4/1/2026

GenOL: Generating Diverse Examples for Name-only Online Learning

GenOL framework for online learning from concept names alone (the name-only setup), enabling real-time adaptation to data distribution shifts in continual learning scenarios.

Ax Zehua Pei, Ying Zhang, Hui-Ling Zhen, Tao Yuan, Xianzhi Yu, Zhenhua Dong, Sinno Jialin Pan, Mingxuan Yuan, Bei Yu 4/1/2026

PreMoE: Proactive Inference for Efficient Mixture-of-Experts

Training-free framework that compiles sparse Mixture-of-Experts variants for deployment using a predicted expert utility metric.

Ax Ivan Y. Tyukin, Bogdan Grechuk, Evgeny M. Mirkes, Alexander N. Gorban 4/1/2026

When fractional quasi p-norms concentrate

Theoretical analysis of concentration properties for fractional quasi p-norms in high-dimensional spaces.

Ax Johannes Exenberger, Sascha Ranftl, Robert Peharz 4/1/2026

Deep Polynomial Chaos Expansion

Classical polynomial chaos expansion technique for surrogate modeling and uncertainty quantification in physical simulation.

Ax Da Chang, Yongxiang Liu, Ganzhao Yuan 4/1/2026

On the Convergence of Muon and Beyond

Theoretical convergence analysis of the Muon optimizer for matrix-structured parameters in neural network training.

Ax Ahmed A. Elhag, Arun Raja, Alex Morehead, Samuel M. Blau, Hongtao Zhao, Christian Tyrchan, Eva Nittinger, Garrett M. Morris, Michael M. Bronstein 4/1/2026

Learning Inter-Atomic Potentials without Explicit Equivariance

Transformer-based inter-atomic potential model for molecular simulations without explicit equivariance constraints.

Ax Mingzhi Chen, Taiming Lu, Jiachen Zhu, Mingjie Sun, Zhuang Liu 4/1/2026

Stronger Normalization-Free Transformers

Research on normalization-free transformer architectures using Dynamic Tanh as an alternative to standard normalization layers.

Ax Yuze Wang, Yujia Tong, Xuan Liu, Junhao Dong 4/1/2026

Sparsity-Aware Unlearning for Large Language Models

Addresses machine unlearning for sparse LLMs, removing memorized sensitive information while preserving the benefits of sparsification for efficient deployment.

Ax Jie Xiao, Meng Chen, Qingnan Ren, Jingwei Song, Jiaqi Huang, Yangshen Deng, Chris Tong, Wanyi Chen, Suli Wang, Ziqian Bi, Shuo Lu, Yiqun Duan, Xu Wang, Rymon Yu, Ween Yang, Lynn Ai, Eric Yang, Bill Shi 4/1/2026

ECHO-2: A Large-Scale Distributed Rollout Framework for Cost-Efficient Reinforcement Learning

ECHO-2 is a distributed framework for LLM post-training via reinforcement learning that optimizes the cost-efficiency of rollout generation across distributed resources.

Ax Amin Oji, Paul Fieguth 4/1/2026

Joint Embedding Variational Bayes

VJE introduces a reconstruction-free latent-variable framework for self-supervised learning using a symmetric conditional ELBO on paired embeddings.

Ax Luca Ghafourpour, Sinho Chewi, Alessio Figalli, Aram-Alexandre Pooladian 4/1/2026

Variational inference via radial transport

Proposes the radVI algorithm for variational inference, which optimizes radial profiles to better approximate high-dimensional distributions beyond standard Gaussian surrogates.