Isolater - Feed

Ax Egor Denisov, Svetlana Glazyrina, Maksim Kryzhanovskiy, Roman Ischenko 3/26/2026

Smooth Gate Functions for Soft Advantage Policy Optimization

Smooth gate functions for stabilizing GRPO LLM training. Replaces hard clipping with sigmoid-based gating to improve optimization stability in reasoning tasks.

Ax Xiang Li, Yuheng Zhang, Nan Jiang 3/26/2026

Beyond State-Wise Mirror Descent: Offline Policy Optimization with Parametric Policies

Theoretical analysis of offline reinforcement learning with general function approximation and parametric policies, extending beyond finite action spaces.

Ax Andrew Chin, Dongkwan Kim, Yu-Fu Fu, Fabian Fleischer, Youngjoon Kim, HyungSeok Han, Cen Zhang, Brian Junekyu Lee, Hanqing Zhao, Taesoo Kim 3/26/2026

OSS-CRS: Liberating AIxCC Cyber Reasoning Systems for Real-World Open-Source Security

Open-source framework for deploying DARPA AIxCC cyber reasoning systems locally. Makes competition CRSs usable outside original infrastructure with improved accessibility.

Ax Anupam Purwar, Aditya Choudhary 3/26/2026

MM-tau-p$^2$: Persona-Adaptive Prompting for Robust Multi-Modal Agent Evaluation in Dual-Control Settings

Evaluation framework for persona-adaptive LLM-powered agents in multi-modal settings, addressing user-aware behavior in customer experience management.

Ax Edward Y. Chang 3/26/2026

Exploring Collatz Dynamics with Human-LLM Collaboration

Mathematical analysis of Collatz conjecture dynamics using modular arithmetic and combinatorial methods. Pure mathematics research unrelated to AI/ML.

Ax Siddharth Srikanth, Freddie Liang, Ya-Chuan Hsu, Varun Bhatt, Shihan Zhao, Henry Chen, Bryon Tjanaka, Minjune Hwang, Akanksha Saran, Daniel Seita, Aaquib Tabrez, Stefanos Nikolaidis 3/26/2026

Red-Teaming Vision-Language-Action Models via Quality Diversity Prompt Generation for Robust Robot Policies

Red-teaming Vision-Language-Action models through quality diversity prompt generation to improve robot policy robustness.

Ax Zekun Wu, Adriano Koshiyama, Sahan Bulathwela, Maria Perez-Ortiz 3/26/2026

AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Agents

AgentDrift: reveals safety risks in LLM agent recommendations when tools are corrupted, hidden by standard ranking metrics.

Ax Haoan Feng, Sri Harsha Musunuri, Guan-Ming Su 3/26/2026

Geometry-Guided Camera Motion Understanding in VideoLLMs

Framework for improving VideoLLM understanding of camera motion through benchmarking, diagnosis, and explicit geometry injection.

Ax Seokmin Lee, Yunghee Lee, Byeonghyun Pak, Byeongju Woo 3/26/2026

Pixel-level Scene Understanding in One Token: Visual States Need What-is-Where Composition

Visual state representations for robotic agents using what-is-where composition for dynamic scene understanding.

Ax Eman M. AbouNassar, Amr Elshall, Sameh Abdulah 3/26/2026

FedPBS: Proximal-Balanced Scaling Federated Learning Model for Robust Personalized Training for Non-IID Data

FedPBS: federated learning algorithm for personalized training on non-IID data with improved robustness.

Ax Mikoto Kudo, Takumi Tanabe, Akifumi Wachi, Youhei Akimoto 3/26/2026

Sample-Efficient Hypergradient Estimation for Decentralized Bi-Level Reinforcement Learning

Sample-efficient hypergradient estimation for decentralized bi-level reinforcement learning in strategic decision-making.

Ax Yeounoh Chung, Rushabh Desai, Jian He, Yu Xiao, Thibaud Hottelier, Yves-Laurent Kom Samo, Pushkar Kadilkar, Xianshun Chen, Sam Idicula, Fatma Özcan, Alon Halevy, Yannis Papakonstantinou 3/26/2026

100x Cost & Latency Reduction: Performance Analysis of AI Query Approximation using Lightweight Proxy Models

Proxy models reduce cost and latency of AI queries in SQL databases by 100x through approximation techniques.

Ax Md. Asraful Haque, Aasar Mehdi, Maaz Mahboob, Tamkeen Fatima 3/26/2026

Mitigating LLM Hallucinations through Domain-Grounded Tiered Retrieval

Domain-grounded tiered retrieval architecture to reduce LLM hallucinations through retrieval-based verification.

Ax Sam Ganzfried 3/26/2026

Evolutionarily Stable Stackelberg Equilibrium

Evolutionarily Stable Stackelberg Equilibrium: game theory solution concept for asymmetric leader-follower games.

Ax Mohamed Youssef, Mayar Elfares, Anna-Maria Meer, Matteo Bortoletto, Andreas Bulling 3/26/2026

Ontology-Guided Diffusion for Zero-Shot Visual Sim2Real Transfer

Ontology-Guided Diffusion for zero-shot sim2real transfer using neuro-symbolic approach to bridge simulation-reality gap.

Ax Marcelo Fernandez (TraslaIA) 3/26/2026

Agent Control Protocol: Admission Control for Agent Actions

Agent Control Protocol: formal specification for admission control governance of autonomous agents with cryptographic identity and policy compliance.

Ax Lucas Maes, Quentin Le Lidec, Damien Scieur, Yann LeCun, Randall Balestriero 3/26/2026

LeWorldModel: Stable End-to-End Joint-Embedding Predictive Architecture from Pixels

LeWorldModel: joint-embedding predictive architecture trained stably end-to-end from pixels.

Ax Ravish Gupta (BigCommerce), Saket Kumar (University at Buffalo, The State University of New York, Buffalo, NY, USA), Shreeya Sharma (Microsoft), Maulik Dang (Amazon), Abhishek Aggarwal (Amazon) 3/26/2026

An Agentic Multi-Agent Architecture for Cybersecurity Risk Management

Multi-agent AI system with six specialized agents for automated NIST CSF-aligned cybersecurity risk assessments for small organizations.

Ax Xinyue Liu, Niloofar Mireshghallah, Jane C. Ginsburg, Tuhin Chakrabarty 3/26/2026

Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models

Study showing finetuning bypasses LLM safety mechanisms and triggers verbatim recall of copyrighted training data.

Ax Trung V. Phan, Thomas Bauschert 3/26/2026

DeepXplain: XAI-Guided Autonomous Defense Against Multi-Stage APT Campaigns

Explainable DRL framework for autonomous APT defense using provenance-based graphs and stage-aware modeling.

Ax Shuai Wang, Yinan Yu, Earl Barr, Dhasarathy Parthasarathy 3/26/2026

LLM-Powered Workflow Optimization for Multidisciplinary Software Development: An Automotive Industry Case Study

LLM-based workflow system for multidisciplinary software development coordinating domain experts and developers in automotive.

Ax Hyoseok Park, Yeonsang Park 3/26/2026

PRISM: Breaking the O(n) Memory Wall in Long-Context LLM Inference via O(1) Photonic Block Selection

PRISM photonic accelerator approach reducing KV cache memory bandwidth from O(n) to O(1) for long-context LLM inference.

Ax Woosung Koh, Jeyoung Jeon, Youngjin Song, Yujin Cheon, Soowon Oh, Jaehyeong Choi, Se-Young Yun 3/26/2026

mSFT: Addressing Dataset Mixtures Overfitting Heterogeneously in Multi-task SFT

mSFT algorithm for optimizing heterogeneous multi-task SFT data mixtures by dynamically adjusting compute per sub-dataset.

Ax Yuze Qin, Qingyong Li, Zhiqing Guo, Wen Wang, Yan Liu, Yangli-ao Geng 3/26/2026

Extending Precipitation Nowcasting Horizons via Spectral Fusion of Radar Observations and Foundation Model Priors

Weather prediction combining radar observations with foundation model priors for extended nowcasting horizons.

Ax Junhyeok Rui Cha, Woohyun Cha, Jaeyong Shin, Donghyeon Kim, Jaeheung Park 3/26/2026

Sim-to-Real of Humanoid Locomotion Policies via Joint Torque Space Perturbation Injection

Sim-to-real transfer for humanoid robot control using state-dependent joint torque perturbations instead of domain randomization.

Ax Davide Bucciarelli, Evelyn Turri, Lorenzo Baraldi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara 3/26/2026

Tiny Inference-Time Scaling with Latent Verifiers

Inference-time scaling with lightweight latent verifiers instead of MLLMs to reduce computational cost in verification.

Ax Javier Ferrando, Enrique Lopez-Cuena, Pablo Agustin Martin-Torres, Daniel Hinjos, Anna Arias-Duart, Dario Garcia-Gasulla 3/26/2026

Language Models Can Explain Visual Features via Steering

Method using causal interventions and Vision-Language Models to explain sparse autoencoder features in vision models.

Ax Reza Habibi, Darian Lee, Magy Seif El-Nasr 3/26/2026

Beyond Accuracy: Introducing a Symbolic-Mechanistic Approach to Interpretable Evaluation

Interpretable evaluation combining symbolic rules with mechanistic interpretability to detect memorization vs genuine generalization.

Ax Haoyu Wang, Yuxin Chen, Liang Luo, Buyun Zhang, Ellie Dingqiao Wen, Pan Li 3/26/2026

Implicit Turn-Wise Policy Optimization for Proactive User-LLM Interaction

ITPO framework for optimizing multi-turn human-LLM interactions via RL despite sparse rewards and user stochasticity.

Ax Tuan-Anh Vu, Sébastien Destercke, Frédéric Pichon 3/26/2026

Upper Entropy for 2-Monotone Lower Probabilities

Theoretical analysis of upper entropy computation for credal sets and uncertainty quantification. Pure mathematics focus.

Ax Seungju Han, Konwoo Kim, Chanwoo Park, Benjamin Newman, Suhas Kotha, Jaehun Jung, James Zou, Yejin Choi 3/26/2026

Synthetic Mixed Training: Scaling Parametric Knowledge Acquisition Beyond RAG

Training method combining synthetic QA and document generation to improve LLM knowledge beyond RAG performance ceiling.

Ax Chenglin Li, Guangchun Ruan, Hua Geng 3/26/2026

Safe Reinforcement Learning with Preference-based Constraint Inference

Safe reinforcement learning framework inferring constraints from user preferences with minimal expert demonstrations.

Ax Jiehao Wu, Zixiao Huang, Wenhao Li, Chuyun Shen, Junjie Sheng, Xiangfeng Wang 3/26/2026

AscendOptimizer: Episodic Agent for Ascend NPU Operator Optimization

RL agent optimizing operator kernels on Huawei Ascend NPUs. Addresses knowledge gap in alternative hardware ecosystem.

Ax Stefania Stan, Marzio Lunghi, Vito Vargetto, Claudio Ricci, Rolands Repetto, Brayden Leo, Shao-Hong Gan 3/26/2026

Causal Reconstruction of Sentiment Signals from Sparse News Data

Causal signal reconstruction approach for converting sparse news sentiment into reliable time series for financial/tech analysis.

Ax Zhiyuan Chen, Yuxuan Zhong, Fan Wang, Bo Yu, Pengtao Shao, Shaoshan Liu, Ning Ding 3/26/2026

StateLinFormer: Stateful Training Enhancing Long-term Memory in Navigation

StateLinFormer model using linear attention for navigation agents with long-term memory. Addresses context window limitations in Transformers.

Ax Gaspard Abel, Eloi Campagne, Mohamed Benloughmari, Argyris Kalogeratos 3/26/2026

Dual-Criterion Curriculum Learning: Application to Temporal Data

Research on curriculum learning with dual criteria for temporal data. Proposes improved difficulty-based training scheduling.

Ax Tao Liu, Jiguang Lv, Dapeng Man, Weiye Xi, Yaole Li, Feiyu Zhao, Kuiming Wang, Yingchao Bian, Chen Xu, Wu Yang 3/26/2026

PoiCGAN: A Targeted Poisoning Based on Feature-Label Joint Perturbation in Federated Learning

PoiCGAN: Poisoning attack method against federated learning systems using feature-label joint perturbation.

Ax Meriem Bouzouad, Yuan-Hao Chang, Jalil Boukhobza 3/26/2026

APreQEL: Adaptive Mixed Precision Quantization For Edge LLMs

APreQEL: Adaptive mixed precision quantization technique for deploying large language models on edge devices with reduced memory and computational requirements.

Ax Long Zhang, Dai-jun Lin, Wei-neng Chen 3/26/2026

The Geometric Price of Discrete Logic: Context-driven Manifold Dynamics of Number Representations

Research on how LLMs form discrete decision boundaries within continuous semantic spaces through context-driven topological distortion of number representations.

Ax Yuqing Zhou, Ze Tao, Fujun Liu 3/26/2026

Residual Attention Physics-Informed Neural Networks for Robust Multiphysics Simulation of Steady-State Electrothermal Energy Systems

Physics-informed neural networks using residual attention for steady-state electrothermal multiphysics simulation in energy systems.

Ax Wei Sun, Ting Wang, Xinran Tian, Wanshun Lan, Xuhan Feng, Haoyue Li, Fangxin Wang 3/26/2026