Isolater - Feed

Ax Connor Douglas, Joel Persson, Foster Provost 5/15/2026

Logging Policy Design for Off-Policy Evaluation

Studies optimal logging policy design to minimize off-policy evaluation error for treatment policy assessment.

Ax Lanxin Xiang, Liang Shi, Youhui Ye, Boyu Jiang, Dawei Zhou, Feng Guo 5/15/2026

RoSHAP: A Distributional Framework and Robust Metric for Stable Feature Attribution

RoSHAP distributional framework for stable and robust feature attribution analysis in ML models accounting for stochastic variation.

Ax Ryan Wei Heng Quek, Sanghyuk Lee, Alfred Wei Lun Leong, Arun Verma, Alok Prakash, Nancy F. Chen, Bryan Kian Hsiang Low, Daniela Rus, Armando Solar-Lezama 5/15/2026

MeMo: Memory as a Model

MeMo framework encodes new knowledge into a dedicated memory model while keeping the LLM frozen for efficient knowledge updates.

Ax Ruijia Niu, Dongxia Wu, Rose Yu, Yi-An Ma 5/15/2026

Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs

Functional-level uncertainty quantification method for calibrated fine-tuning of LLMs using parameter-efficient adapters.

Ax Alexis Bose, Jonathan Ethier, Paul Guinand 5/15/2026

Conformal Prediction for Multimodal Regression

Extends conformal prediction to multimodal regression using internal neural network features from images and text.

Ax Zhiliang Chen, Gregory Kang Ruey Lau, Chuan-Sheng Foo, Bryan Kian Hsiang Low 5/15/2026

DUET: Optimizing Training Data Mixtures via Feedback from Unseen Evaluation Tasks

DUET method optimizes LLM training data mixtures by learning from feedback on unseen evaluation tasks without access to task data.

Ax Zhiqiang He, Zhi Liu 5/15/2026

Silent Neuron Theory and Plasticity Preservation for Deep Reinforcement Learning in Adaptive Video Streaming

Silent neuron theory and plasticity preservation for deep RL in adaptive video streaming with network heterogeneity.

Ax Michael Theologitis, Vasilis Samoladas, Antonios Deligiannakis 5/15/2026

Communication-Efficient Federated Fine-Tuning

Communication-efficient federated fine-tuning of language models with parameter compression for distributed learning scenarios.

Ax Eli Chien, Wei-Ning Chen, Pan Li 5/15/2026

Privacy Amplification in Differentially Private Zeroth-Order Optimization with Hidden States

Privacy amplification analysis for zeroth-order optimization in differentially private fine-tuning of large language models.

Ax Yunpeng Qing, Yixiao Chi, Shuo Chen, Shunyu Liu, Kexuan Zhou, Sixu Lin, Litao Liu, Changqing Zou 5/15/2026

BiTrajDiff: Bidirectional Trajectory Generation with Diffusion Models for Offline Reinforcement Learning

BiTrajDiff uses bidirectional diffusion for trajectory generation in offline RL, addressing distribution bias through data augmentation.

Ax Kaiwen Chen, Xin Tan, Minchen Yu, Jingzong Li, Hong Xu 5/15/2026

ReasonCache: Accelerating Large Reasoning Model Serving through KV Cache Sharing

ReasonCache system accelerating large reasoning model serving through KV cache sharing among concurrent requests.

Ax Xinting Huang, Michael Hahn 5/15/2026

Decomposing Representation Space into Interpretable Subspaces with Unsupervised Learning

Unsupervised learning approach for decomposing neural model representation spaces into interpretable subspaces.

Ax Joon-Hyun Park, Mujin Cheon, Jeongsu Wi, Dong-Yeun Koh 5/15/2026

BOOST: A Data-Driven Framework for the Automated Joint Selection of Kernel and Acquisition Functions in Bayesian Optimization

BOOST framework for automated joint selection of kernel and acquisition functions in Bayesian optimization.

Ax Kun Feng, Shaocheng Lan, Yuchen Fang, Wenchao He, Sihan Lu, Shuqi Gu, Lintao Ma, Xingyu Lu, Kan Ren 5/15/2026

Kairos: Toward Adaptive and Parameter-Efficient Time Series Foundation Models

Kairos framework for adaptive time series foundation models addressing temporal heterogeneity with parameter efficiency.

Ax Alex Chen, Renato Geh, Aditya Grover, Guy Van den Broeck, Daniel Israel 5/15/2026

The Pitfalls of KV Cache Compression

Study identifying pitfalls in KV cache compression for LLMs in realistic multi-instruction scenarios with practical implications.

Ax Yihong Wu, Liheng Ma, Lei Ding, Muzhi Li, Xinyu Wang, Kejia Chen, Zhan Su, Zhanguang Zhang, Chenyang Huang, Yingxue Zhang, Mark Coates, Jian-Yun Nie 5/15/2026

It Takes Two: Your GRPO Is Secretly DPO

Analysis showing that the GRPO reinforcement learning algorithm for LLM post-training is equivalent to DPO with group-level baselines.

Ax Ning Yang, Hengyu Zhong, Haijun Zhang, Randall Berry 5/15/2026

Vision-LLMs for Spatiotemporal Traffic Forecasting

Vision-LLM approach for spatiotemporal traffic forecasting combining visual understanding of grid-based traffic data with language model capabilities.

Ax Donghyeok Shin, Yeongmin Kim, Suhyeon Jo, Byeonghu Na, Il-Chul Moon 5/15/2026

AMiD: Knowledge Distillation for LLMs with α-mixture Assistant Distribution

AMiD knowledge distillation method for LLMs using alpha-mixture assistant distribution to address capacity gaps and training instability in student-teacher alignment.

Ax Yilang Zhang, Xiaodong Yang, Yiwei Cai, Georgios B. Giannakis 5/15/2026

ScaLoRA: Optimally Scaled Low-Rank Adaptation for Efficient High-Rank Fine-Tuning

ScaLoRA progressively accumulates high-rank weight updates from low-rank factors for more effective and faster LLM fine-tuning than standard LoRA.

Ax Zhichao Wang 5/15/2026

GIFT: Group-Relative Implicit Fine-Tuning Integrates GRPO with DPO and UNA

GIFT combines GRPO group sampling, DPO-style implicit rewards, and UNA advantage standardization for on-policy LLM fine-tuning with improved efficiency.

Ax Ildus Sadrtdinov, Ekaterina Lobacheva, Ivan Klimov, Mikhail Burtsev, Mikhail I. Katsnelson, Dmitry Vetrov 5/15/2026

Can Stationary Distributions of Scale-Invariant Neural Networks Be Described by the Thermodynamics of an Ideal Gas?

Develops a thermodynamic framework describing stationary distributions of SGD with weight decay for scale-invariant neural networks.

Ax Kairong Luo, Zhenbo Sun, Haodong Wen, Xinyu Shi, Jiarui Cui, Chenyi Dang, Kaifeng Lyu, Wenguang Chen 5/15/2026

How Learning Rate Decay Wastes Your Best Data in Curriculum-Based LLM Pretraining

Analyzes how learning rate decay reduces effective use of high-quality data in curriculum-based LLM pretraining, proposing improved curriculum strategies.

Ax Robert Joseph George, Carson Eisenach, Udaya Ghai, Dominique Perrault-Joncas, Anima Anandkumar, Dean Foster 5/15/2026

BRIDGE: Building Representations In Domain Guided Program Synthesis

BRIDGE framework for structured prompting of LLMs to generate code with formal verification in proof assistants like Lean, handling multiple coupled domains.

Ax John C. Hill, Tyler LaBonte, Xinchen Zhang, Vidya Muthukumar 5/15/2026

On the Unreasonable Effectiveness of Last-layer Retraining

Investigates last-layer retraining to mitigate spurious correlations and improve minority group performance in neural networks trained with ERM.

Ax Nathan P. Lawrence, Ali Mesbah 5/15/2026

Why Goal-Conditioned Reinforcement Learning Works: Relation to Dual Control

Analyzes goal-conditioned reinforcement learning through an optimal control framework, deriving optimality gaps between goal-conditioned and dense reward objectives.

Ax Behrooz Tahmasebi, Melanie Weber 5/15/2026

Achieving Approximate Symmetry Is Exponentially Easier than Exact Symmetry

Theoretical analysis showing approximate symmetry is exponentially easier to enforce than exact symmetry in ML models, with implications for inductive biases.

Ax Dung Anh Hoang, Cuong Pham, Cuong Nguyen, Trung Le, Jianfei Cai, Thanh-Toan Do 5/15/2026

Rethinking Output Alignment For 1-bit Post-Training Quantization of Large Language Models

Rethinking output alignment for 1-bit post-training quantization of LLMs to enable efficient deployment on resource-constrained devices.

Ax Yuanning Cui, Zequn Sun, Wei Hu, Kexuan Xin, Zhangjie Fu 5/15/2026

Breaking the Reasoning Horizon in Entity Alignment Foundation Models

Addresses entity alignment in knowledge graphs using graph foundation models to capture long-range dependencies across sparse KG structures.

Ax Minghao Yang, Ren Togo, Guang Li, Takahiro Ogawa, Miki Haseyama 5/15/2026

L2R: Low-Rank and Lipschitz-Controlled Routing for Mixture-of-Experts

Proposes L2R, a routing mechanism for Mixture-of-Experts models using low-rank projections and Lipschitz control to improve expert specialization and routing discriminability.

Ax Tianqi Zhao, Guanyang Wang, Yan Shuo Tan, Qiong Zhang 5/15/2026

TabClustPFN: A Prior-Fitted Network for Tabular Data Clustering

TabClustPFN extends the prior-fitted network paradigm to unsupervised tabular clustering, amortizing Bayesian inference for heterogeneous tabular data.

Ax Shuangqi Li, Hieu Le, Jingyi Xu, Mathieu Salzmann 5/15/2026

LoRIF: Low-Rank Influence Functions for Scalable Training Data Attribution

LoRIF: Low-rank approximation for training data attribution using influence functions, scaling to large datasets while maintaining attribution quality.

Ax Jinju Park, Seokho Kang 5/15/2026

PaAno: Patch-Based Representation Learning for Time-Series Anomaly Detection

PaAno: Patch-based representation learning for time-series anomaly detection that is computationally efficient compared to large foundation models.

Ax Dionisia Naddeo, Jonas Linkerhägner, Nicola Toschi, Geri Skenderi, Veronica Lachi 5/15/2026

Hyperbolic Graph Neural Networks Under the Microscope: The Role of Geometry-Task Alignment

Analysis of Hyperbolic Graph Neural Networks proposing geometry-task alignment criterion for effective hierarchical representation learning on tree-like graphs.

Ax Qihao Wen, Jiahao Wang, Yang Nan, Pengfei He, Ravi Tandon, Han Xu 5/15/2026

Embedding Perturbation may Better Reflect Intermediate-Step Uncertainty in LLM Reasoning

Embedding perturbation technique for uncertainty quantification in LLM reasoning tasks, measuring confidence in intermediate reasoning steps.

Ax Wenze Lin, Zhen Yang, Xitai Jiang, Xiaoteng Ma, Gao Huang 5/15/2026

Boosting LLM Reasoning via Human-Inspired Reward Shaping

Human-inspired reward shaping for LLM reasoning via reinforcement learning, separating exploration and consolidation phases for improved performance.

Ax Yinan Huang, Hans Hao-Hsun Hsu, Junran Wang, Bo Dai, Pan Li 5/15/2026

Accelerated Sequential Flow Matching: A Bayesian Filtering Perspective

Accelerated Sequential Flow Matching for real-time probabilistic inference on streaming observations, using a Bayesian filtering perspective on diffusion models.

Ax Jinzong Dong, Wei Huang, Jianshu Zhang, Zhuo Chen, Xinzhe Yuan, Qinying Gu, Zhaohui Jiang, Nanyang Ye 5/15/2026

Proximal Action Replacement for Behavior Cloning Actor-Critic in Offline Reinforcement Learning

Proximal Action Replacement method for offline reinforcement learning combining actor-critic and behavior cloning to mitigate suboptimal dataset actions.

Ax Jingkun Liu, Yisong Yue, Max Welling, Yue Song 5/15/2026

Krause Synchronization Transformers

Krause Attention mechanism addressing representation collapse and attention sink phenomena in transformers through principled bounded-confidence dynamics.

Ax Wenqian Chen, Yucheng Fu, Michael Penwarden, Pratanu Roy, Panos Stinis 5/15/2026

ArGEnT: Arbitrary Geometry-encoded Transformer for Operator Learning

ArGEnT transformer for operator learning on systems with complex, varying geometries for scientific machine learning applications like design optimization.

Ax Gabriel Franco, Lucas M. Tassis, Azalea Rohr, Mark Crovella 5/15/2026

Finding Interpretable Prompt-Specific Circuits in Language Models

ACC++ improves a circuit-tracing method for mechanistic interpretability by identifying attention head signals in language models via low-dimensional subspaces.

Ax Aggelos Semoglou, John Pavlopoulos 5/15/2026

CAKE: Confidence in Assignments via K-partition Ensembles

CAKE method for assessing confidence in individual clustering assignments via k-partition ensembles, addressing instability in k-means algorithms.

Ax Pascal Jr Tikeng Notsawo, Guillaume Dumas, Guillaume Rabusseau 5/15/2026

Grokking Finite-Dimensional Algebra

Study of grokking phenomenon (sudden generalization) in neural networks learning finite-dimensional algebra operations, extending prior work on group operations.

Ax Ruijie Zhang, Yequan Zhao, Ziyue Liu, Zhengyang Wang, Yupeng Su, Liyan Tan, Zheng Zhang 5/15/2026

MUON+: Towards More Effective Muon via One Additional Normalization Step for LLM Pre-training

MUON+ improves the Muon optimizer for LLM pre-training by addressing norm imbalance issues in polar iterations through an additional normalization step.

Ax Tiantong Wang, Xinyu Yan, Tiantong Wu, Yurong Hao, Pengjun Xie, Wei Yang Bryan Lim 5/15/2026

MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models

MPU framework for privacy-preserving machine unlearning in LLMs using perturbed copies without sharing parameters or forget sets.

Ax Abdulrahman Alswaidan, Jeffrey D. Varner 5/15/2026

Stochastic Attention via Langevin Dynamics on the Modern Hopfield Energy

Stochastic attention via Langevin dynamics on the modern Hopfield energy enables temperature-controlled retrieval and generation.

Ax Théo Vincent, Kevin Gerhardt, Yogesh Tripathi, Habib Maraqten, Adam White, Martha White, Jan Peters, Carlo D'Eramo 5/15/2026

Gradient Iterated Temporal-Difference Learning

Gradient Iterated Temporal-Difference Learning addresses divergence issues in TD learning with semi-gradient updates.

Ax Alliot Nagle, Jakhongir Saydaliev, Dhia Garbaya, Michael Gastpar, Ashok Vardhan Makkuva, Hyeji Kim 5/15/2026

TERMINATOR: Learning Optimal Exit Points for Early Stopping in Chain-of-Thought Reasoning

TERMINATOR learns optimal early stopping points for chain-of-thought reasoning to reduce overthinking compute waste.

Ax Vishnu Teja Kunde, Fatemeh Doudi, Mahdi Farahbakhsh, Dileep Kalathil, Krishna Narayanan, Jean-Francois Chamberland 5/15/2026

Reinforcement Learning for Diffusion LLMs with Entropy-Guided Step Selection and Stepwise Advantages

RL framework for diffusion language models using entropy-guided step selection and stepwise advantages for training.

Ax Mayank Mishra, Shawn Tan, Ion Stoica, Joseph Gonzalez, Tri Dao 5/15/2026

M²RNN: Non-Linear RNNs with Matrix-Valued States for Scalable Language Modeling

M²RNN proposes non-linear RNNs with matrix-valued states for language modeling with greater expressive power than Transformers.

Ax Anish Saha, Konstantin Shmakov 5/15/2026

A Foundation Model for Instruction-Conditioned In-Context Time Series Tasks

iAmTime foundation model for time series tasks using explicit instruction-conditioned in-context learning with demonstrations.