Isolater - Feed

Ax Abhinaba Basu, Pavan Chakraborty 3/20/2026

When Names Change Verdicts: Intervention Consistency Reveals Systematic Bias in LLM Decision-Making

ICE-Guard framework detects spurious feature reliance in LLMs for high-stakes decisions through intervention consistency testing on demographic, authority, and framing biases.

Ax Andrew Choi, Xinjie Wang, Zhizhong Su, Wei Xu 3/20/2026

Scaling Sim-to-Real Reinforcement Learning for Robot VLAs with Generative 3D Worlds

Method for scaling vision-language-action robot learning using generative 3D worlds to address sim-to-real gap.

Ax Haonan Ping, Jian Jiang, Cheng Yuan, Qizhen Sun, Lv Wu, Yutong Ban 3/20/2026

SCISSR: Scribble-Conditioned Interactive Surgical Segmentation and Refinement

SCISSR: Scribble-based interactive framework for surgical scene segmentation using SAM-style prompting.

Ax Xiang Chen, Fangfang Yang, Chunlei Meng, Chengyin Hu, Ang Li, Yiwei Wei, Jiahuan Long, Jiujiang Guo 3/20/2026

CoDA: Exploring Chain-of-Distribution Attacks and Post-Hoc Token-Space Repair for Medical Vision-Language Models

CoDA explores adversarial attacks on medical vision-language models and proposes token-space repair methods.

Ax Dan Ben-Ami, Gabriele Serussi, Kobi Cohen, Chaim Baskin 3/20/2026

HiMu: Hierarchical Multimodal Frame Selection for Long Video Question Answering

HiMu hierarchical frame selection method for long video question answering with vision-language models.

Ax Hoang T. H. Cao, Hai D. V. Trinh, Tho Quan, Lan V. Truong 3/20/2026

Transformers Learn Robust In-Context Regression under Distributional Uncertainty

Study showing Transformers learn robust in-context regression under distributional uncertainty without restrictive assumptions.

Ax Shenggui Li, Chao Wang, Yikai Zhu, Yubo Wang, Fan Yin, Shuai Shi, Yefei Chen, Xiaomin Dong, Qiaoling Chen, Jin Pan, Ji Li, Laixin Xie, Yineng Zhang, Lei Yu, Yonggang Wen, Ivor Tsang, Tianwei Zhang 3/20/2026

SpecForge: A Flexible and Efficient Open-Source Training Framework for Speculative Decoding

SpecForge: Open-source production framework for training draft models used in speculative decoding to reduce LLM inference latency.

Ax Abhinaba Basu, Pavan Chakraborty 3/20/2026

ICE: Intervention-Consistent Explanation Evaluation with Statistical Grounding for LLMs

ICE framework evaluates LLM explanation faithfulness using statistical intervention testing with randomization baselines.

Ax Xuan Liu, Xiaobin Chang 3/20/2026

Elastic Weight Consolidation Done Right for Continual Learning

Systematic analysis and improvements to Elastic Weight Consolidation for continual learning to better estimate weight importance.

Ax Ye Kyaw Thu, Thazin Myint Oo, Thepchai Supnithi 3/20/2026

myMNIST: Benchmark of PETNN, KAN, and Classical Deep Learning Models for Burmese Handwritten Digit Recognition

Benchmark comparing PETNN, KAN, and classical deep learning models on myMNIST Burmese handwritten digit recognition dataset.

Ax Xin Li, Shiming Yu, Leming Shen, Jianing Zhang, Yuanqing Zheng, Yaxiong Xie 3/20/2026

AutORAN: LLM-driven Natural Language Programming for Agile xApp Development

AutORAN uses LLMs for natural language programming to simplify xApp development in Open Radio Access Networks.

Ax Xiaoyin Chen, Canwen Xu, Yite Wang, Boyi Liu, Zhewei Yao, Yuxiong He 3/20/2026

Learning to Self-Evolve

LSE framework trains LLMs to self-improve during inference by iteratively refining context based on problem feedback.

Ax Bin Cao, Sipeng Zheng, Hao Luo, Boyuan Li, Jing Liu, Zongqing Lu 3/20/2026

OpenT2M: No-frill Motion Generation with Open-source, Large-scale, High-quality Data

OpenT2M: Million-scale open-source dataset with 2800+ hours of motion data for text-to-motion generation in animation and robotics.

Ax Shuqi Xiao, Maani Ghaffari, Chengzhong Xu, Hui Kong 3/20/2026

REST: Receding Horizon Explorative Steiner Tree for Zero-Shot Object-Goal Navigation

REST algorithm for zero-shot object-goal navigation using receding horizon planning and Steiner trees for generating subgoal candidates in unknown environments.

Ax Ján Mikulec, Jakub Breier, Xiaolu Hou 3/20/2026

Beyond TVLA: Anderson-Darling Leakage Assessment for Neural Network Side-Channel Leakage Detection

Anderson-Darling leakage assessment method for detecting side-channel leakage in neural networks, improving on TVLA's mean-based approach.

Ax Pius Horn, Janis Keuper 3/20/2026

Benchmarking PDF Parsers on Table Extraction with LLM-based Semantic Evaluation

Benchmarking framework for PDF table extraction using LLM-based semantic evaluation on synthetically generated PDFs with LaTeX ground truth.

Ax Jingguo Qu, Xinyang Han, Yao Pu, Man-Lik Chui, Simon Takadiyi Gunda, Ziman Chen, Jing Qin, Ann Dorothy King, Winnie Chiu-Wing Chu, Jing Cai, Michael Tin-Cheung Ying 3/20/2026

Multiscale Switch for Semi-Supervised and Contrastive Learning in Medical Ultrasound Image Segmentation

SSL framework for medical ultrasound image segmentation using contrastive learning with multiscale switching to handle limited labeled data and imaging artifacts.

Ax Eduardo Di Santi 3/20/2026

Cognitive Amplification vs Cognitive Delegation in Human-AI Systems: A Metric Framework

Mathematical framework distinguishing cognitive amplification from cognitive delegation in human-AI systems for measuring AI impact on human reasoning.

Ax Zhicong Lu, Zichuan Lin, Wei Jia, Changyuan Tian, Deheng Ye, Peiguang Li, Li Jin, Nayu Liu, Guangluan Xu, Wei Feng 3/20/2026

HISR: Hindsight Information Modulated Segmental Process Rewards For Multi-turn Agentic Reinforcement Learning

HISR framework improving multi-turn agentic reinforcement learning through hindsight information modulation and segmental process rewards for complex long-horizon tasks.

Ax Mohamed Youssef, Mayar Elfares, Anna-Maria Meer, Matteo Bortoletto, Andreas Bulling 3/20/2026

Ontology-Guided Diffusion for Zero-Shot Visual Sim2Real Transfer

Neuro-symbolic sim2real image translation framework using structured ontology-guided diffusion for zero-shot domain transfer without labeled real data.

Ax Hao Wang, Licheng Pan, Zhichao Chen, Chunyuan Zheng, Zhixuan Chu, Xiaoxi Li, Yuan Lu, Xinggao Liu, Haoxuan Li, Zhouchen Lin 3/20/2026

CausalRM: Causal-Theoretic Reward Modeling for RLHF from Observational User Feedbacks

CausalRM method for learning reward models from observational user feedback (clicks, upvotes) as scalable alternative to controlled RLHF annotation.

Ax Dimitris Mitropoulos, Nikolaos Alexopoulos, Georgios Alexopoulos, Diomidis Spinellis 3/20/2026

Measuring and Exploiting Confirmation Bias in LLM-Assisted Security Code Review

Study measuring confirmation bias in LLM-based security code review systems and its exploitability in software supply-chain attacks.

Ax Isabel Rio-Torto, Jaime S. Cardoso, Luís F. Teixeira 3/20/2026

WeNLEX: Weakly Supervised Natural Language Explanations for Multilabel Chest X-ray Classification

Weakly supervised method for generating natural language explanations in chest X-ray classification without explicit explanation annotations.

Ax Gabriele Carrino, Andrea Sassella, Nicolo Brunello, Federico Toschi, Mark James Carman 3/20/2026

Are complicated loss functions necessary for teaching LLMs to reason?

Ablation study of Group Relative Policy Optimization components for LLM reasoning training, questioning necessity of complex loss functions.

Ax Haochen Zhao, Shaoyang Cui 3/20/2026

ClawTrap: A MITM-Based Red-Teaming Framework for Real-World OpenClaw Security Evaluation

ClawTrap MITM-based red-teaming framework for evaluating security robustness of autonomous web agents like OpenClaw against network-layer threats.

Ax Channe Chwa, Xinle Wu, Yao Lu 3/20/2026

Automatic Configuration of LLM Post-Training Pipelines

AutoPipe framework for automated configuration of LLM post-training pipelines combining supervised fine-tuning and reinforcement learning under budget constraints.

Ax Jiatong Xia, Zicheng Duan, Anton van den Hengel, Lingqiao Liu 3/20/2026

Points-to-3D: Structure-Aware 3D Generation with Point Cloud Priors

Diffusion-based 3D generation framework leveraging point cloud priors as geometric constraints for improved structure-aware object synthesis.

Ax KT Tech innovation Group 3/20/2026

Mi:dm K 2.5 Pro

32B parameter Korean-language LLM optimized for enterprise reasoning, long-context understanding, and agentic workflows with domain-specific capabilities.

Ax Zikang Ding, Junhao Li, Suling Wu, Junchi Yao, Hongbo Liu, Lijie Hu 3/20/2026

Functional Subspace Watermarking for Large Language Models

Watermarking method for LLM ownership protection using functional subspaces, robust against fine-tuning, quantization, and knowledge distillation.

Ax Yuchen Li, Amanmeet Garg, Shalini Chaudhuri, Rui Zhao, Garin Kessler 3/20/2026

Perceptio: Perception Enhanced Vision Language Models via Spatial Token Generation

Vision-language model enhanced with explicit spatial token generation for improved 2D/3D spatial reasoning and fine-grained grounding.

Ax Tudor-Dan Mihoc, Manuela-Andreea Petrescu, Emilia-Loredana Pop 3/20/2026

Student views in AI Ethics and Social Impact

Survey of 230 computer science students on ethical implications and societal impacts of AI from a gender perspective.

Ax Marcelo Fernandez (TraslaIA) 3/20/2026

Agent Control Protocol: Admission Control for Agent Actions

Formal specification for cryptographic admission control governing autonomous agent actions in institutional B2B environments, validating identity and policy compliance.

Ax Bishoy Galoaa, Shayda Moezzi, Xiangyu Bai, Sarah Ostadabbas 3/20/2026

Motion-o: Trajectory-Grounded Video Reasoning

Video reasoning model using trajectory and motion information for improved spatio-temporal inference in video understanding tasks.

Ax Nelson Navajas Fernández, Jeffrey T. Hancock, Maurice Jakesch 3/20/2026

Through the Looking-Glass: AI-Mediated Video Communication Reduces Interpersonal Trust and Confidence in Judgments

Study on how AI-mediated video communication affects trust and credibility detection. Social impact of AI, limited technical content.

Ax Carlos Rafael Catalan, Patricia Nicole Monderin, Lheane Marie Dizon, Gap Estrella, Raymund John Sarmimento, Marie Antoinette Patalagsa 3/20/2026

Evaluating LLM-Generated Lessons from the Language Learning Students' Perspective: A Short Case Study on Duolingo

Case study evaluating LLM-generated lessons in Duolingo for language learning. LLM application assessment with limited technical depth.

Ax Youngwan Lee, Soojin Jang, Yoorhim Cho, Seunghwan Lee, Yong-Ju Lee, Sung Ju Hwang 3/20/2026

MultihopSpatial: Multi-hop Compositional Spatial Reasoning Benchmark for Vision-Language Model

MultihopSpatial benchmark for multi-hop spatial reasoning in Vision-Language agents. Evaluation dataset for VLA agents.

Ax Min Hun Lee 3/20/2026

From Accuracy to Readiness: Metrics and Benchmarks for Human-AI Decision-Making

Framework proposing readiness metrics for human-AI decision-making teams beyond accuracy. Evaluation methodology for AI collaboration.

Ax Yitong Li, Igor Yakushev, Dennis M. Hedderich, Christian Wachinger 3/20/2026

Translating MRI to PET through Conditional Diffusion Models with Enhanced Pathology Awareness

Conditional diffusion models translating MRI to PET for medical imaging. ML for healthcare, not AI/agent related.

Ax Yifan Sui, Han Zhao, Rui Ma, Zhiyuan He, Hao Wang, Jianxun Li, Yuqing Yang 3/20/2026

Act While Thinking: Accelerating LLM Agents via Pattern-Aware Speculative Tool Execution

PASTE: Pattern-Aware Speculative Tool Execution to reduce latency in LLM agent tool loops. Optimization for agentic workflows.

Ax Vedant Pandya 3/20/2026

Progressive Training for Explainable Citation-Grounded Dialogue: Reducing Hallucination to Zero in English-Hindi LLMs

XKD-Dial: four-stage training pipeline for citation-grounded dialogue reducing hallucination in English-Hindi LLMs. LLM application addressing hallucination.

Ax Shiliang Zhang, Sabita Maharjan 3/20/2026

Security, privacy, and agentic AI in a regulatory view: From definitions and distinctions to provisions and reflections

arXiv paper examining regulatory frameworks for agentic AI security and privacy. Policy analysis of AI agent governance.

Ax A. A. Saoulis, T.-S. Pham, A. M. G. Ferreira 3/20/2026

Improving moment tensor solutions under Earth structure uncertainty with simulation-based inference

Simulation-based inference for moment tensor inversions in seismology. ML method applied to geophysics, not AI/agent focused.

Ax Chenxi Han, Shilu He, Yi Cheng, Linqi Ye, Houde Liu 3/20/2026

PRIOR: Perceptive Learning for Humanoid Locomotion with Reference Gait Priors

PRIOR framework for humanoid locomotion with natural gaits using Isaac Lab. ML for robotics, not core AI agent/LLM focus.

Ax An Luo, Jin Du, Xun Xian, Robert Specht, Fangqiao Tian, Ganghua Wang, Xuan Bi, Charles Fleming, Ashish Kundu, Jayanth Srinivasa, Mingyi Hong, Rui Zhang, Tianxi Li, Galin Jones, Jie Ding 3/20/2026

AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science

Benchmark evaluating AI agent performance on domain-specific data science tasks against human expert baselines across multiple domains.

Ax Hangeol Chang, Changsun Lee, Seungjoon Rho, Junho Yeo, Jong Chul Ye 3/20/2026

Hypothesis-Conditioned Query Rewriting for Decision-Useful Retrieval

RAG method using hypothesis-conditioned query rewriting to retrieve decision-relevant evidence for choice tasks beyond topical relevance.

Ax Enrico Bottazzi, Pia Park 3/20/2026

Security awareness in LLM agents: the NDAI zone case

Framework enabling LLM agents to recognize trusted execution environments for secure IP disclosure negotiations.

Ax Gagan Bhatia, Ahmad Muhammad Isa, Maxime Peyrard, Wei Zhao 3/20/2026

What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time?

Multilingual temporal reasoning benchmark with 15K examples across 5 languages testing LLM capabilities on date arithmetic and temporal relations.

Ax Quentin Guimard, Federico Bartsch, Simone Caldarella, Rahaf Aljundi, Elisa Ricci, Massimiliano Mancini 3/20/2026

SEM: Sparse Embedding Modulation for Post-Hoc Debiasing of Vision-Language Models

Post-hoc debiasing method for vision-language models like CLIP using sparse embedding modulation to separate bias from semantic information.

Ax Yikai Zheng, Xin Ding, Yifan Yang, Shiqi Jiang, Hao Wu, Qianxi Zhang, Weijun Wang, Ting Cao, Yunxin Liu 3/20/2026

Em-Garde: A Propose-Match Framework for Proactive Streaming Video Understanding

Streaming video understanding framework that decouples semantic understanding from perception for proactive query handling.

Ax Qiawen Ella Liu, Raja Marjieh, Jian-Qiao Zhu, Adele E. Goldberg, Thomas L. Griffiths 3/20/2026

Parallelograms Strike Back: LLMs Generate Better Analogies than People

Study comparing LLM-generated analogies to human-produced ones using geometric parallelogram model of analogical relations.