Isolater - Feed

Ax Weilin Wan, Jingtao Han, Weizhong Zhang, Cheng Jin 3/24/2026

Holistic Scaling Laws for Optimal Mixture-of-Experts Architecture Optimization

Scaling laws for Mixture-of-Experts architecture design balancing global interactions and MoE-specific variables in LLMs.

Ax Xinyu Lu, Kaiqi Zhang, Jinglin Yang, Boxi Cao, Yaojie Lu, Hongyu Lin, Min He, Xianpei Han, Le Sun 3/24/2026

P^2O: Joint Policy and Prompt Optimization

Joint optimization of RL policies and LLM prompts for improving reasoning with verifiable rewards on hard samples.

Ax Nikolas Stavrou, Siamak Mehrkanoon 3/24/2026

SmaAT-QMix-UNet: A Parameter-Efficient Vector-Quantized UNet for Precipitation Nowcasting

Parameter-efficient vector-quantized UNet variant for weather precipitation nowcasting with reduced computational requirements.

Ax Ziyang Zhang, Zheshun Wu, Jie Liu, Luca Mottola 3/24/2026

SparseDVFS: Sparse-Aware DVFS for Energy-Efficient Edge Inference

Energy optimization technique for edge device inference using fine-grained, sparsity-aware DVFS.

Ax Juan Sebastian Rojas, Chi-Guhn Lee 3/24/2026

Deep Reinforcement Learning and The Tale of Two Temporal Difference Errors

Analysis of temporal difference error interpretations in deep reinforcement learning and impact on critic loss formulation.

Ax Xixi Wu, Qianguo Sun, Ruiyang Zhang, Chao Song, Junlong Wu, Yiyan Qi, Hong Cheng 3/24/2026

Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe

Systematic empirical study on scaling RL for autonomous LLM agents with long-horizon tool orchestration, using the TravelPlanner benchmark.

Ax Ehimare Okoyomon, Christoph Goebel 3/24/2026

BOOST-RPF: Boosted Sequential Trees for Radial Power Flow

Gradient-boosted decision trees method for power flow analysis in distribution systems using sequential path-based learning.

Ax Dilina Rajapakse, Juan C. Rosero, Ivana Dusparic 3/24/2026

TREX: Trajectory Explanations for Multi-Objective Reinforcement Learning

Framework for explaining trajectories in multi-objective reinforcement learning agents handling conflicting objectives.

Ax Cristian Pérez-Corral, Alberto Fernández-Hernández, Jose I. Mestre, Manuel F. Dolz, Enrique S. Quintana-Ortí 3/24/2026

λ-GELU: Learning Gating Hardness for Controlled ReLU-ization in Deep Networks

Learning-based approach to parameterize GELU activation functions for converting smooth networks to piecewise-linear ReLU equivalents.

Ax Paolo Toccaceli 3/24/2026

CRPS-Optimal Binning for Conformal Regression

Non-parametric conformal regression method that optimizes binning with the CRPS metric for conditional distribution estimation.

Ax Xinyan Wang, Xiaogeng Liu, Chaowei Xiao 3/24/2026

ROM: Real-time Overthinking Mitigation via Streaming Detection and Intervention

Method to reduce overthinking in Large Reasoning Models by detecting and stopping redundant reasoning steps, lowering latency and compute costs.

Ax Peter Pak, Amir Barati Farimani 3/24/2026

AdditiveLLM2: A Multi-modal Large Language Model for Additive Manufacturing

AdditiveLLM2, a domain-adapted multimodal LLM for additive manufacturing, based on Gemma 3 and instruction-tuned on a domain corpus.

Ax Tianxiang Xu, Xiaoyan Zhu, Xin Lai, Sizhe Dang, Xin Lian, Hangyu Cheng, Jiayin Wang 3/24/2026

Do Papers Match Code? A Benchmark and Framework for Paper-Code Consistency Detection in Bioinformatics Software

Framework and benchmark for detecting inconsistencies between research papers and their implementations in bioinformatics software.

Ax Julius Kobialka, Emanuel Sommer, Chris Kolb, Juntae Kwon, Daniel Dold, David Rügamer 3/24/2026

On the Interplay of Priors and Overparametrization in Bayesian Neural Network Posteriors

Analysis of how overparametrization and priors interact in Bayesian neural network posteriors and their effects on inference.

Ax Valentin Petrov 3/24/2026

On the Failure of Topic-Matched Contrast Baselines in Multi-Directional Refusal Abliteration

Study on why topic-matched contrast baselines fail in directional refusal abliteration for removing safety behaviors from LLMs.

Ax Aurora Esteban, Amelia Zafra, Sebastián Ventura 3/24/2026

MIHT: A Hoeffding Tree for Time Series Classification using Multiple Instance Learning

MIHT algorithm for time series classification using multi-instance learning on variable-length and high-dimensional temporal data.

Ax Kexin Huang, Haoming Meng, Junkang Wu, Jinda Lu, Chiyu Ma, Ziqian Chen, Xue Wang, Bolin Ding, Jiancan Wu, Xiang Wang, Xiangnan He, Guoyin Wang, Jingren Zhou 3/24/2026

On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation

Analysis of reinforcement learning with verifiable rewards for LLM reasoning, focusing on direction rather than magnitude of weight updates.

Ax Shreeram Murali, Cristian R. Rojas, Dominik Baumann 3/24/2026

Computationally lightweight classifiers with frequentist bounds on predictions

Computationally efficient classifier with frequentist uncertainty bounds suitable for safety-critical applications.

Ax Alois Bachmann 3/24/2026

dynActivation: A Trainable Activation Family for Adaptive Nonlinearity

Trainable activation function family (dynActivation) providing adaptive nonlinearity for vision and language modeling tasks.

Ax Abolfazl Hashemi 3/24/2026

RAMPAGE: RAndomized Mid-Point for debiAsed Gradient Extrapolation

RAMPAGE algorithm addressing discretization bias in extragradient methods for variational inequalities with variance reduction.

Ax Moritz Gögl, Christopher Yau 3/24/2026

Multimodal Survival Analysis with Locally Deployable Large Language Models

Multimodal survival analysis combining clinical text, tabular data, and genomics using locally deployable lightweight LLMs for privacy-constrained settings.

Ax Dharshan Kumaran, Nathaniel Daw, Simon Osindero, Petar Velickovic, Viorica Patraucean 3/24/2026

Causal Evidence that Language Models use Confidence to Drive Behavior

Causal investigation of whether LLMs use internal confidence estimates to regulate behavior, based on experiments with an abstention paradigm.

Ax Yurong Chen, Zhiyi Huang, Michael I. Jordan, Haipeng Luo 3/24/2026

Calibeating Made Simple

Theoretical framework reducing calibeating of forecasts to online learning techniques, with results for general proper losses.

Ax Oscar Novo, Oscar Bastidas-Jossa, Alberto Calvo, Antonio Peris, Carlos Kuchkovsky 3/24/2026

Revisiting Quantum Code Generation: Where Should Domain Knowledge Live?

Study on where to incorporate domain knowledge in LLM-based code generation for quantum software development while keeping the resulting system maintainable.

Ax Kangqi Ni, Wenyue Hua, Xiaoxiang Shi, Jiang Guo, Shiyu Chang, Tianlong Chen 3/24/2026

Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs

Chimera serving system for multi-agent LLM workflows optimizing latency and performance on heterogeneous model deployments.

Ax Kexian Tang, Jiani Wang, Shaowen Wang, Kaifeng Lyu 3/24/2026

SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection

SPA baseline method using prompt engineering to generate synthetic data for knowledge injection into LLMs in specialized domains.

Ax Qilin Wang 3/24/2026

Noise Titration: Exact Distributional Benchmarking for Probabilistic Time Series Forecasting

Benchmarking methodology for probabilistic time series forecasting using noise titration to test model robustness to non-stationarity.

Ax Changxiao Cai, Gen Li 3/24/2026

Confidence-Based Decoding is Provably Efficient for Diffusion Language Models

Decoding strategy analysis for diffusion language models showing confidence-based decoding is provably efficient for parallel token generation.

Ax Zakaria Mhammedi, James Cohan 3/24/2026

Decoupling Exploration and Policy Optimization: Uncertainty Guided Tree Search for Hard Exploration

Reinforcement learning approach decoupling exploration and policy optimization using uncertainty-guided tree search for autonomous agent exploration.

Ax Alexandra Zelenin, Alexandra Zhuravlyova 3/24/2026

Scaling DoRA: High-Rank Adaptation via Factored Norms and Fused Kernels

DoRA scaling improvements using factored norms and fused kernels to reduce memory overhead in weight-decomposed low-rank adaptation for LLMs.

Ax Zizhe Zhang, Yicong Wang, Zhiquan Zhang, Tianyu Li, Nadia Figueroa 3/24/2026

Viability-Preserving Passive Torque Control

Passive torque control for robotic manipulators using viability theory for collision avoidance.

Ax Yunbei Zhang, Yingqiang Ge, Weijie Xu, Yuhui Xu, Jihun Hamm, Chandan K. Reddy 3/24/2026

Visual Exclusivity Attacks: Automatic Multimodal Red Teaming via Agentic Planning

Introduces visual exclusivity attacks for multimodal models where harm emerges through visual content reasoning, exploited via agentic planning for red teaming.

Ax Jiayun Wu, Peixu Hou, Shan Qu, Peng Zhang, Ning Gu, Tun Lu 3/24/2026

Fast-Slow Thinking RM: Efficient Integration of Scalar and Generative Reward Models

Proposes fast-slow thinking reward models combining scalar and generative reward models for efficient RLHF alignment with improved accuracy over scalar-only approaches.

Ax Jiaqi Yuan, Jialu Wang, Zihan Wang, Qingyun Sun, Ruijie Wang, Jianxin Li 3/24/2026

AgenticGEO: A Self-Evolving Agentic System for Generative Engine Optimization

Presents AgenticGEO, a self-evolving agentic system for generative engine optimization that dynamically adapts content strategies to improve visibility in LLM-based search.

Ax Hongduan Tian, Xiao Feng, Ziyuan Zhao, Xiangyu Zhu, Rolan Yan, Bo Han 3/24/2026

Multi-Agent Debate with Memory Masking

Proposes multi-agent debate with memory masking for LLM reasoning, where multiple agents debate solutions across rounds with selective memory management.

Ax Michael Hersche, Nicolas Menet, Ronan Tanios, Abbas Rahimi 3/24/2026

Locally Coherent Parallel Decoding in Diffusion Language Models

Introduces locally coherent parallel decoding for diffusion language models to capture token dependencies while achieving sub-linear generation latency.

Ax Kenan Hasanaliyev, Silas Alberti, Jenny Hamer, Dheeraj Rajagopal, Kevin Robinson, Jasper Snoek, Victor Veitch, Alexander Nicholas D'Amour 3/24/2026

Expected Reward Prediction, with Applications to Model Routing

Investigates predicting expected reward scores from reward models to route prompts to suitable LLMs before generation, enabling intelligent model selection.

Ax Samuel Cestola, Tianxiang Xia, Zheng Weiyan, Zheng Pengfei, Diego Didona 3/24/2026

An experimental study of KV cache reuse strategies in chunk-level caching systems

Studies KV cache reuse strategies in chunk-level caching for retrieval-augmented generation, analyzing accuracy improvements when precomputing caches for retrieved text chunks.

Ax Lorenzo Noci, Gregor Bachmann, Seyed-Mohsen Moosavi-Dezfooli, Moin Nabi 3/24/2026

Thinking into the Future: Latent Lookahead Training for Transformers

Proposes latent lookahead training for transformers to enable multiple token exploration per step, addressing limitations of standard next-token prediction in autoregressive language models.

Ax Kushal Khemani 3/24/2026

Inference Energy and Latency in AI-Mediated Education: A Learning-per-Watt Analysis of Edge and Cloud Models

Compares latency and energy costs of edge vs cloud inference for AI tutoring using quantized Phi-3 models, analyzing learning-per-watt efficiency.

Ax Ryan Cory-Wright, Jean Pauphilet 3/24/2026

Compact Lifted Relaxations for Low-Rank Optimization

Convex relaxations for rank-constrained quadratic optimization without spectral structure requirements using lifted semidefinite programming.

Ax Ahmed Abouelazm, Jonas Michel, Daniel Bogdoll, Philip Schörner, J. Marius Zöllner 3/24/2026

Beyond Scalar Rewards: Distributional Reinforcement Learning with Preordered Objectives for Safe and Reliable Autonomous Driving

Preordered Multi-Objective MDP for autonomous driving balancing safety, efficiency, and comfort via distributional reinforcement learning with safety constraints.

Ax Chen Xiong, Ziwen Wang, Deqi Wang, Cheng Wang, Yiyang Chen, He Zhang, Chao Gou 3/24/2026