Isolater - Feed

Ax Caglar Yildirim 3/18/2026

Differential Harm Propensity in Personalized LLM Agents: The Curious Case of Mental Health Disclosure

Investigation of how user personalization and mental health disclosure affect harmful behavior in tool-using LLM agents.

Ax Min Zeng, Shuang Zhou, Zaifu Zhan, Rui Zhang 3/18/2026

MedCL-Bench: Benchmarking stability-efficiency trade-offs and scaling in biomedical continual learning

Benchmark for evaluating continual learning in biomedical NLP across task-diverse datasets with robustness and efficiency metrics.

Ax Ruijiang Gao, Steven Chong Xiao 3/18/2026

Nonstandard Errors in AI Agents

Study of reproducibility in AI coding agents, showing agent-to-agent variation produces nonstandard errors in empirical results.

Ax Yongyuan Liang, Shijie Zhou, Yu Gu, Hao Tan, Gang Wu, Franck Dernoncourt, Jihyung Kil, Ryan A. Rossi, Ruiyi Zhang 3/18/2026

Anticipatory Planning for Multimodal AI Agents

Two-stage RL framework training multimodal agents for anticipatory reasoning and long-term planning in multi-step tasks.

Ax Swata Marik, Swayamjit Saha, Garga Chatterjee 3/18/2026

Beyond Accuracy: Evaluating Forecasting Models by Multi-Echelon Inventory Cost

Pipeline integrating forecasting models and ML regressors with inventory optimization, evaluated on the M5 Walmart dataset.

Ax Yi Chen, Daiwei Chen, Sukrut Madhav Chikodikar, Caitlyn Heqi Yin, Ramya Korlakai Vinayak 3/18/2026

Is Conformal Factuality for RAG-based LLMs Robust? Novel Metrics and Systematic Insights

Evaluation of conformal factuality as a reliability guarantee for RAG-based LLMs with novel metrics and robustness analysis.

Ax Zhitao Zeng, Mengya Xu, Jian Jiang, Pengfei Guo, Yunqiu Xu, Zhu Zhuo, Chang Han Low, Yufan He, Dong Yang, Chenxi Lin, Yiming Gu, Jiaxin Guo, Yutong Ban, Daguang Xu, Qi Dou, Yueming Jin 3/18/2026

SurgΣ: A Spectrum of Large-Scale Multimodal Data and Foundation Models for Surgical Intelligence

Large-scale multimodal surgical dataset and foundation models for cross-procedure generalization in surgical AI tasks.

Ax Maksim Eren, Eric Michalak, Brian Cook, Johnny Seales Jr 3/18/2026

Prompt Programming for Cultural Bias and Alignment of Large Language Models

Study of cultural bias in LLMs and prompt-based methods to improve cultural alignment for policy and decision-making tasks.

Ax Karthik Ragunath Ananda Kumar, Subrahmanyam Arunachalam 3/18/2026

Learning to Present: Inverse Specification Rewards for Agentic Slide Generation

RL environment where LLM agents learn to generate professional presentations through research, planning, and tool use with multi-component reward system.

Ax Rui Ge, Yichao Fu, Yuyang Qian, Junda Su, Yiming Zhao, Peng Zhao, Hao Zhang 3/18/2026

Internalizing Agency from Reflective Experience

Method for training LLM agents to leverage rich environment feedback through reflective experience and post-training, improving long-horizon planning.

Ax Tianyu Xie, Jinfa Huang, Yuexiao Ma, Rongfang Luo, Yan Yang, Wang Chen, Yuhui Zeng, Ruize Fang, Yixuan Zou, Xiawu Zheng, Jiebo Luo, Rongrong Ji 3/18/2026

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

Benchmark evaluating audio-visual social interactivity capabilities of omni-modal LLMs in dynamic dialogue settings.

Ax Chenyu Ge 3/18/2026

SAC-NeRF: Adaptive Ray Sampling for Neural Radiance Fields via Soft Actor-Critic Reinforcement Learning

RL framework using Soft Actor-Critic to learn adaptive ray sampling policies for efficient neural radiance field rendering.

Ax Suyash Mishra, Srikanth Patil, Satyanarayan Pati, Sagar Sahu, Baddu Narendra 3/18/2026

Finder: A Multimodal AI-Powered Search Framework for Pharmaceutical Data Retrieval

Multimodal AI search framework combining vector search, hybrid retrieval, and reasoning for pharmaceutical data across text, images, audio, and video.

Ax Yu Li, Yuchen Zheng, Giles Hamilton-Fletcher, Marco Mezzavilla, Yao Wang, Sundeep Rangan, Maurizio Porfiri, Zhou Yu, John-Ross Rizzo 3/18/2026

Exploring the Use of VLMs for Navigation Assistance for People with Blindness and Low Vision

Evaluation of VLMs (GPT-4V, Gemini, Claude, LLaVA) for navigation assistance tasks for people with vision impairments.

Ax Guangchen Lan 3/18/2026

Alternating Reinforcement Learning with Contextual Rubric Rewards

Framework extending RLHF using multi-dimensional rubric-based rewards instead of scalar signals for RL training.

Ax Zeyu Zhang, Xiangxiang Dai, Ziyi Han, Xutong Liu, John C. S. Lui 3/18/2026

Steering Frozen LLMs: Adaptive Social Alignment via Online Prompt Routing

Inference-time governance approach for LLMs using adaptive prompt routing to enable social alignment without retraining.

Ax Yue Chang, Guangsen Lin, Jyun Jie Chuang, Shunqi Liu, Xinkui Li, Yaozheng Li 3/18/2026

A federated learning framework with knowledge graph and temporal transformer for early sepsis prediction in multi-center ICUs

Federated learning framework integrating knowledge graphs and temporal transformers for early sepsis prediction in multi-center ICUs.

Ax Keivan Alizadeh, Parshin Shojaee, Minsik Cho, Mehrdad Farajtabar 3/18/2026

Recursive Language Models Meet Uncertainty: The Surprising Effectiveness of Self-Reflective Program Search for Long Context

Study on recursive language models with self-reflective program search for long-context handling, addressing information extraction challenges.

Ax Ruixi Lin 3/18/2026

Discovering the Hidden Role of Gini Index In Prompt-based Classification

Analysis of the role of the Gini Index in prompt-based classification for detecting and optimizing class accuracy disparities in long-tailed datasets.

Ax Liu Hung Ming 3/18/2026

Beyond Reward Suppression: Reshaping Steganographic Communication Protocols in MARL via Dynamic Representational Circuit Breaking

Defense mechanism against steganographic collusion in multi-agent reinforcement learning using dynamic representational circuit breaking.

Ax Peiyu Yang, Naveed Akhtar, Jiantong Jiang, Ajmal Mian 3/18/2026

Attribution-Guided Model Rectification of Unreliable Neural Network Behaviors

Model rectification framework using attribution-guided rank-one editing to fix unreliable neural network behaviors on corrupted samples.

Ax Lansiaux Edouard, Leman Margaux 3/18/2026

OrthoAI v2: From Single-Agent Segmentation to Dual-Agent Treatment Planning for Clear Aligners

Open-source pipeline extending single-agent AI orthodontic treatment planning to a dual-agent framework with improved tooth segmentation and landmarks.

Ax Alexis Kirke 3/18/2026

Quantum Amplitude Estimation for Catastrophe Insurance Tail-Risk Pricing: Empirical Convergence and NISQ Noise Analysis

Application of quantum amplitude estimation to catastrophe insurance tail-risk pricing with convergence analysis and NISQ noise effects.

Ax Kyle Dumont, Nicholas Herbert, Hayder Tirmazi, Shrikanth Upadhayaya 3/18/2026

DRCY: Agentic Hardware Design Reviews

AI agent system for hardware design reviews using LLMs to verify semantic correctness of component connections against datasheets.

Ax Alexandre Cristovão Maiorano 3/18/2026

Automated Self-Testing as a Quality Gate: Evidence-Driven Release Management for LLM Applications

Framework for LLM application release management using automated self-testing with evidence-based quality gates across five dimensions.

Ax Yongzhong Xu 3/18/2026

Spectral Edge Dynamics of Training Trajectories: Signal–Noise Geometry Across Scales

Analysis of transformer training dynamics using Spectral Edge Dynamics to measure coherent optimization directions versus stochastic noise.

Ax Lingyun Zhang, Yu Xie, Ping Chen 3/18/2026

IdentityGuard: Context-Aware Restriction and Provenance for Personalized Synthesis

Context-aware safety framework for personalized text-to-image models that prevents misuse without broad concept erasure.

Ax Pengcheng Li, Jie Zhang, Tianwei Zhang, Han Qiu, Zhang Kejun, Weiming Zhang, Nenghai Yu, Wenbo Zhou 3/18/2026

State-Dependent Safety Failures in Multi-Turn Language Model Interaction

Analysis of multi-turn safety failures in LLMs through a state-space perspective, showing that structured contextual evolution enables jailbreaks.

Ax Bingzhou Li, Tao Huang 3/18/2026

DASH: Dynamic Audio-Driven Semantic Chunking for Efficient Omnimodal Token Compression

Token compression method for omnimodal LLMs using dynamic audio-driven semantic chunking to reduce inference costs for audio-visual processing.

Ax Yubo Hou, Mohamed Ragab, Yucheng Wang, Min Wu, Abdulla Alseiari, Chee-Keong Kwoh, Xiaoli Li, Zhenghua Chen 3/18/2026

Evidential Domain Adaptation for Remaining Useful Life Prediction with Incomplete Degradation

Domain adaptation approach for remaining useful life prediction using evidential learning under incomplete degradation trajectories.

Ax Weihao Zhang, Yitong Zhou, Huanyu Qu, Hongyi Li 3/18/2026

Loosely-Structured Software: Engineering Context, Structure, and Evolution Entropy in Runtime-Rewired Multi-Agent Systems

Study on engineering challenges in LLM-based multi-agent systems, addressing context pressure, coordination errors, and system drift at scale.

Ax Ruyi Zhang, Heng Gao, Songlei Jian, Yusong Tan, Haifang Zhou 3/18/2026

BadLLM-TG: A Backdoor Defender powered by LLM Trigger Generator

Defense framework against backdoor attacks in LLMs using trigger generation and inversion to locate and mitigate malicious triggers.

Ax Mengyao Zhou, Zhiheng Zhou, Xiao Han, Xingqin Qi, Guanghui Wang, Guiying Yan 3/18/2026

Tackling Over-smoothing on Hypergraphs: A Ricci Flow-guided Neural Diffusion Approach

Study on over-smoothing in hypergraph neural networks using Ricci flow theory to improve message passing and layer depth handling.

Ax Lars Krupp, Daniel Geißler, Francisco M. Calatrava-Nicolas, Vishal Banwari, Paul Lukowicz, Jakob Karolus 3/18/2026

This Is Taking Too Long -- Investigating Time as a Proxy for Energy Consumption of LLMs

Research on using inference time as a proxy to estimate LLM energy consumption, addressing opacity in API-based model access and environmental impact.

Ax Yulin Peng, Haowen Hou, Xinxin Zhu, Ying Tiffany He, F. Richard Yu 3/18/2026

SEMAG: Self-Evolutionary Multi-Agent Code Generation

Self-evolutionary multi-agent code generation framework that decomposes programming tasks into planning, coding, and debugging stages with adaptive workflow selection.

Ax Ye Wang, Zixuan Wu, Lifeng Shen, Jiang Xie, Xiaoling Wang, Hong Yu, Guoyin Wang 3/18/2026

Mastering the Minority: An Uncertainty-guided Multi-Expert Framework for Challenging-tailed Sequence Learning

Uncertainty-guided multi-expert framework for imbalanced sequence learning addressing poor expert specialization and prediction conflicts in long-tailed data.

Ax AI Scientists, Xinyi Lin, Danqing Yin, Ying Guo 3/18/2026

LLM-Driven Discovery of High-Entropy Catalysts via Retrieval-Augmented Generation

Retrieval-augmented generation framework using GPT-4 to accelerate CO2 reduction catalyst discovery by exploring chemical spaces and interpreting results.

Ax Artem Sakhno, Ivan Sergeev, Alexey Shestov, Omar Zoloev, Elizaveta Kovtun, Gleb Gusev, Andrey Savchenko, Maksim Makarenko 3/18/2026

Embedding-Aware Feature Discovery: Bridging Latent Representations and Interpretable Features in Event Sequences

Method bridging learned embeddings and handcrafted features in event sequences for financial systems, addressing interpretability and latency constraints in production ML.

Ax Mateusz Dziemian, Maxwell Lin, Xiaohan Fu, Micha Nowak, Nick Winter, Eliot Jones, Andy Zou, Lama Ahmad, Kamalika Chaudhuri, Sahana Chennabasappa, Xander Davies, Lauren Deason, Benjamin L. Edelman, Tanner Emek, Ivan Evtimov, Jim Gust, Maia Hamin, Kat He, Klaudia Krawiecka, Riccardo Patana, Neil Perry, Troy Peterson, Xiangyu Qi, Javier Rando, Zifan Wang, Zihan Wang, Spencer Whitman, Eric Winsor, Arman Zharmagambetov, Matt Fredrikson, Zico Kolter 3/18/2026

How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition

Large-scale competition analysis revealing LLM agents' vulnerability to indirect prompt injection attacks through adversarial instructions in external content sources.

Ax H. Sinan Bank, Daniel R. Herber 3/18/2026

A Framework and Prototype for a Navigable Map of Datasets in Engineering Design and Systems Engineering

Framework and prototype for a navigable dataset map in engineering design and systems engineering to improve data accessibility and research reproducibility.

Ax Lit Sin Tan, Junzhe Chen, Xiaolong Fu, Lichen Ma, Junshi Huang, Jianzhong Shi, Yan Li, Lijie Wen 3/18/2026

Meta-TTRL: A Metacognitive Framework for Self-Improving Test-Time Reinforcement Learning in Unified Multimodal Models

Metacognitive test-time reinforcement learning framework for unified multimodal models, enabling knowledge accumulation across similar prompts in text-to-image generation.

Ax MiroMind Team, S. Bai, L. Bing, L. Lei, R. Li, X. Li, X. Lin, E. Min, L. Su, B. Wang, L. Wang, L. Wang, S. Wang, X. Wang, Y. Zhang, Z. Zhang, G. Chen, L. Chen, Z. Cheng, Y. Deng, Z. Huang, D. Ng, J. Ni, Q. Ren, X. Tang, B. L. Wang, H. Wang, N. Wang, C. Wei, Q. Wu, J. Xia, Y. Xiao, H. Xu, X. Xu, C. Xue, Z. Yang, Z. Yang, F. Ye, H. Ye, J. Yu, C. Zhang, W. Zhang, H. Zhao, P. Zhu 3/18/2026