Isolater - Feed

Ax Angeliki Dimitriou, Nikolaos Chaidos, Maria Lymperaiou, Giorgos Filandrianos, Giorgos Stamou 5/6/2026

U-CECE: A Universal Multi-Resolution Framework for Conceptual Counterfactual Explanations

Proposes U-CECE, a model-agnostic framework for concept-based counterfactual explanations balancing expressivity and efficiency in AI model interpretability.

Ax Benjamin Maltbie, Shivam Raval 5/6/2026

Intersectional Sycophancy: How Perceived User Demographics Shape False Validation in Large Language Models

Studies how LLMs exhibit sycophantic behavior conditionally based on perceived user demographics across 128 personas in multi-turn conversations.

Ax Wangjie Gan, Miao Pan, Linbo Xi, Wenqi Zhang, Jintao Chen, Jianwei Yin, Xuhong Zhang 5/6/2026

GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification

arXiv paper on GFT, a training method that unifies supervised fine-tuning and reinforcement learning for LLMs via group advantages and dynamic coefficient rectification.

Ax Ha Thanh Nguyen, Wachara Fungwacharakorn, Sabine Wehnert, May Myo Zin, Yuntao Kong, Jieying Xue, Michał Araszkiewicz, Randy Goebel, Ken Satoh 5/6/2026

GDPR Auto-Formalization with AI Agents and Human Verification

arXiv paper on automatic GDPR formalization using a multi-agent LLM workflow with role-specialized components and human-in-the-loop verification.

Ax Ke Xu, Yuhao Wang, Yu Wang 5/6/2026

From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench

arXiv paper introducing ProVoice-Bench, the first evaluation framework for proactive voice agents, with four novel tasks that go beyond reactive paradigms.

Ax Tianbao Zhang 5/6/2026

Harness as an Asset: Enforcing Determinism via the Convergent AI Agent Framework (CAAF)

arXiv paper on the Convergent AI Agent Framework (CAAF), which moves agentic workflows from open-loop to closed-loop control for safety-critical engineering applications.

Ax Ifdita Hasan Orney, Jubayer Ibn Hamid, Shreya S Ramanujam, Shirley Wu, Hengyuan Hu, Noah Goodman, Dorsa Sadigh, Chelsea Finn 5/6/2026

Poly-EPO: Training Exploratory Reasoning Models

arXiv paper on Poly-EPO, a framework for post-training language models that encourages optimistic exploration and balances the exploration-exploitation trade-off.

Ax Tao Zhang, Kaixian Qu, Zhibin Li, Jiajun Wu, Marco Hutter, Manling Li, Fan Shi 5/6/2026

Using large language models for embodied planning introduces systematic safety risks

arXiv paper evaluating the safety risks of LLMs as robotic planners using the DESPITE benchmark, with 12,279 tasks spanning physical and normative dangers.

Ax Kevin Murphy 5/6/2026

Agentic Forecasting using Sequential Bayesian Updating of Linguistic Beliefs

arXiv paper on the Bayesian Linguistic Forecaster, an agentic system using linguistic belief states and iterative tool use for state-of-the-art binary forecasting.

Ax Haebin Seong, Li Yin, Haoran Zhang, Zhan Shi 5/6/2026

The Last Harness You'll Ever Build

arXiv paper on a universal harness framework that lets AI agents navigate complex domain-specific workflows without painstaking task-specific engineering.

Ax Andrew Shin 5/6/2026

AI-Gram: When Visual Agents Interact in a Social Network

arXiv paper on AI-Gram, a live social network platform where autonomous LLM agents generate and respond to visual content and form persistent relationships.

Ax Aofan Liu, Jingxiang Meng 5/6/2026

Self-Correction as Feedback Control: Error Dynamics, Stability Thresholds, and Prompt Interventions in LLMs

arXiv paper modeling self-correction in agentic LLM systems as feedback control, analyzing error dynamics and stability thresholds via Markov models.

Ax Alexander Bering 5/6/2026

ZenBrain: A Neuroscience-Inspired 7-Layer Memory Architecture for Autonomous AI Systems

arXiv paper on ZenBrain, a 7-layer memory architecture for autonomous AI systems that achieves high accuracy with 106x lower token costs than long-context baselines.

Ax Maximiliano Armesto, Christophe Kolb 5/6/2026

Toward a Science of Intent: Closure Gaps and Delegation Envelopes for Open-World AI Agents

arXiv paper proposing an intent compilation framework that transforms partially specified human purpose into inspectable AI agent specifications for open-world deployment.

Ax Sidi Chang, Peiying Zhu, Yuxiao Chen 5/6/2026

ValueBlindBench: Agreement-Gated Stress Testing of LLM-Judged Investment Rationales Before Returns Are Observable

arXiv paper on ValueBlindBench, a stress-testing framework for LLM-judged investment rationales before returns are observable, addressing evaluation under delayed ground truth.

Ax Shuzheng Si, Haozhe Zhao, Yu Lei, Qingyi Wang, Dingwei Chen, Zhitong Wang, Zhenhailong Wang, Kangyang Luo, Zheng Wang, Gang Chen, Fanchao Qi, Minjia Zhang, Maosong Sun 5/6/2026

From Context to Skills: Can Language Models Learn from Context Skillfully?

arXiv paper on context learning in language models via inference-time skill augmentation, for reasoning over complex contexts that exceed parametric knowledge.

Ax Christian Internò, Elena Raponi, Markus Olhofer, Ali Raza, Thomas Bäck, Niki van Stein, Yaochu Jin, Barbara Hammer 5/6/2026

Pruning Federated Models through Loss Landscape Analysis and Client Agreement Scoring

AutoFLIP framework for federated learning model pruning using loss landscape analysis and client agreement scoring.

Ax Zhensu Sun, Haotian Zhu, Bowen Xu, Xiaoning Du, Li Li, David Lo 5/6/2026

Towards Agentic Runtime Healing

Paper on using LLMs for automated runtime healing in self-healing systems, replacing predefined rules with adaptive error recovery.

Ax Gleb Bazhenov, Oleg Platonov, Liudmila Prokhorenkova 5/6/2026

GraphLand: Evaluating Graph Machine Learning Models on Diverse Industrial Data

GraphLand benchmark for evaluating graph neural networks on diverse industrial datasets beyond academic citation networks.

Ax Samuel J. Bell, Skyler Wang 5/6/2026

The Pragmatic Frames of Spurious Correlations in Machine Learning: Interpreting How and Why They Matter

Analysis of spurious correlations in ML models, examining how unintended patterns affect performance, fairness, and robustness.

Ax Mingchao Liu, Yu Sun, Ruixiao Sun, Xin Dong, Xiang Shen, Hongwei Wang, Hongyu Xiong, Yang Song 5/6/2026

IPS: In-Prompt Process Supervision for Short Video Content Moderation

IPS framework integrating process supervision into MLLMs for improved short video content moderation via sequential reasoning.

Ax Tairan Fu, Javier Conde, Gonzalo Martínez, María Grandury, Pedro Reviriego 5/6/2026

Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident, Especially When They are Wrong

Study of how reasoning affects LLM confidence on multiple choice questions, showing that reasoning raises self-confidence, especially when the answer is wrong.

Ax Priyanto Hidayatullah, Nurjannah Syakrani, Muhammad Rizqi Sholahuddin, Trisna Gelar, Refdinal Tubagus 5/6/2026

YOLOv8 to YOLO11: A Comprehensive Architecture In-depth Comparative Review

Comparative review of YOLO object detection architectures from YOLOv8 to YOLO11, analyzing architecture evolution.

Ax Xuan Shen, Yizhou Wang, Yufa Zhou, Xiangxi Shi, Pu Zhao, Yanzhi Wang, Jiuxiang Gu 5/6/2026

Efficient Reasoning with Hidden Thinking

Research on Heima framework that compresses chain-of-thought reasoning in MLLMs into abstract thinking tokens for efficiency.

Ax Peihan Li, Zijian An, Shams Abrar, Lifeng Zhou 5/6/2026

Large Language Models for Multi-Robot Systems: A Survey

Survey of LLM integration into multi-robot systems, covering communication, task allocation, planning, and human-robot interaction.

Ax Susung Hong, Ira Kemelmacher-Shlizerman, Brian Curless, Steven M. Seitz 5/6/2026

MusicInfuser: Making Video Diffusion Listen and Dance

Research paper on aligning pre-trained video diffusion models to generate dance videos synchronized with music input.

Ax Alexandra Bazarova, Aleksandr Yugay, Andrey Shulga, Alina Ermilova, Andrei Volodichev, Konstantin Polev, Julia Belikova, Rauf Parchiev, Dmitry Simakov, Maxim Savchenko, Andrey Savchenko, Serguei Barannikov, Alexey Zaytsev 5/6/2026

Hallucination Detection in LLMs with Topological Divergence on Attention Graphs

TOHA detector for identifying LLM hallucinations in RAG systems by analyzing topological divergence patterns in attention graph structures.

Ax Jiongli Zhu, Yue Wang, Bailu Ding, Philip A. Bernstein, Vivek Narasayya, Surajit Chaudhuri 5/6/2026

MINT: Multi-Vector Search Index Tuning

MINT framework for tuning index selection strategies in multi-vector databases to optimize performance across multiple feature dimensions.

Ax Mihai Nadas, Laura Diosan, Andrei Piscoran, Andreea Tomescu 5/6/2026

TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models

TF1-EN-3M: Open dataset of 3 million synthetic English moral fables generated by small language models for training open-source LLMs.

Ax Ren Zhuang 5/6/2026

Adaptive GoGI-Skip: Coupling Goal-Gradient Importance with Dynamic Uncertainty for Efficient Reasoning

Adaptive GoGI-Skip framework coupling goal-gradient importance with dynamic skipping to reduce LLM inference latency while preserving reasoning accuracy.

Ax Yukun Zhang, Qi Dong, Mengkang Li 5/6/2026

Latent Trajectory Dynamics in Large Language Models: A Manifold Evolution Framework with Empirical Validation

Dynamical Manifold Evolution Theory, a framework that models LLM token generation as a controlled dynamical system evolving on low-dimensional semantic manifolds.

Ax Chen Xiong, Zihao Wang, Rui Zhu, Tsung-Yi Ho, Pin-Yu Chen, Jingwei Xiong, Haixu Tang 5/6/2026

Hey, That's My Data! Token-Only Dataset Inference in Large Language Models

CatShift framework for inferring LLM training datasets using only token predictions, enabling copyright/privacy analysis without internal model access.

Ax Zheda Mai, Arpita Chowdhury, Zihe Wang, Sooyoung Jeon, Lemeng Wang, Jiacheng Hou, Jihyung Kil, Wei-Lun Chao 5/6/2026

AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models

Systematic benchmark (AVA-Bench) for evaluating vision foundation models on atomic visual abilities independent of LLM pairing or instruction tuning bias.

Ax Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer, Reduan Achtibat, Patrick Kahardipraja, Thomas Wiegand, Wojciech Samek, Alexander Binder, Sebastian Lapuschkin 5/6/2026

Attribution-Guided Pruning for Insight and Control: Circuit Discovery and Targeted Correction in Small-scale LLMs

Mechanistic interpretability method using attribution-guided pruning to discover and correct specific behavior circuits in small-scale LLMs.

Ax Kateryna Lutsai, Pavel Straňák 5/6/2026

Page image classification for content-specific data processing

Automated classification system for historical document page images to categorize diverse content types including text, graphics, and layouts.

Ax Ailiang Lin, Zhuoyun Li, Yusong Wang, Kotaro Funakoshi, Manabu Okumura 5/6/2026

Causal2Vec: Improving Decoder-only LLMs as Embedding Models through a Contextual Token

Causal2Vec improves decoder-only LLMs as embedding models using contextual tokens, preserving unidirectional attention while overcoming causal attention representation limitations.

Ax Sung-Hyun Kim, Geum-Hwan Hwang, In-Chang Baek, Seo-Young Lee, Kyung-Joong Kim 5/6/2026

Multi-Objective Instruction-Aware Representation Learning in Procedural Content Generation RL

Instruction-aware representation learning for procedural content generation in RL, improving controllability through better leverage of natural language instructions.

Ax Bokeng Zheng, Jianqiang Zhong, Jiayi Liu, Lei Xue, Xu Chen, Xiaoxi Zhang 5/6/2026

Decentralized Rank Scheduling for Energy-Constrained Multi-Task Federated Fine-Tuning in Edge-Assisted IoV Networks

Decentralized federated fine-tuning approach for foundation models in IoV edge networks under energy constraints with heterogeneous task demands.

Ax Seonglae Cho, Zekun Wu, Adriano Koshiyama 5/6/2026

CorrSteer: Generation-Time LLM Steering via Correlated Sparse Autoencoder Features

CorrSteer steers LLM generation at inference time by selecting interpretable sparse autoencoder features correlated with token correctness, without requiring contrastive datasets.

Ax Weihang Su, Anzhe Xie, Qingyao Ai, Jianming Long, Xuanyi Chen, Jiaxin Mao, Ziyi Ye, Yiqun Liu 5/6/2026

SurGE: A Benchmark and Evaluation Framework for Scientific Survey Generation

SurGE benchmark and evaluation framework for automated scientific survey generation using LLMs, addressing standardization gaps in literature synthesis automation.

Ax Jiaqi Chen, Yanzhe Zhang, Yutong Zhang, Yijia Shao, Diyi Yang 5/6/2026

Generative Interfaces for Language Models

Proposes a generative interfaces paradigm that moves LLM interactions beyond the linear request-response format, for more efficient multi-turn, information-dense, and exploratory tasks.

Ax Sanjeeevan Selvaganapathy, Mehwish Nasim 5/6/2026

Confident, Calibrated, or Complicit: Safety Alignment and Ideological Bias in LLM Hate Speech Detection

Study evaluates safety-aligned versus uncensored LLMs on hate speech detection, revealing trade-offs between model censoring and detection performance across political personas.

Ax Shanglin Wu, Lihui Liu, Jinho D. Choi, Kai Shu 5/6/2026

Improving Factuality in LLMs via Inference-Time Knowledge Graph Construction

Method improves LLM factuality by constructing knowledge graphs at inference time instead of using unstructured text retrieval, enhancing reasoning and reducing irrelevant information influence.

Ax Kai R. Larsen, Sen Yan, Roland M. Mueller, Lan Sang, Mikko Rönkkö, Ravi Starzl, Donald Edmondson 5/6/2026

ALIGNS: Unlocking nomological networks in psychological measurement through a large language model

ALIGNS: LLM-based approach for building nomological networks in psychological measurement to establish construct validity.

Ax Yifan Liu, Yaokun Liu, Zelin Li, Zhenrui Yue, Gyuseok Lee, Ruichen Yao, Yang Zhang, Dong Wang 5/6/2026

Learning Decomposed Contextual Token Representations from Pretrained and Collaborative Signals for Generative Recommendation

Method for generative recommenders that learns decomposed contextual token representations by combining pretrained and collaborative signals.

Ax Danilo Francati, Yevin Nikhel Goonatilake, Shubham Pawar, Daniele Venturi, Giuseppe Ateniese 5/6/2026