Isolater - Feed

Ax Maximiliano Armesto, Christophe Kolb 5/6/2026

Toward a Science of Intent: Closure Gaps and Delegation Envelopes for Open-World AI Agents

arXiv paper proposing an intent compilation framework for transforming partially specified human purpose into inspectable AI agent specifications for open-world deployment.

Ax Sidi Chang, Peiying Zhu, Yuxiao Chen 5/6/2026

ValueBlindBench: Agreement-Gated Stress Testing of LLM-Judged Investment Rationales Before Returns Are Observable

arXiv paper on ValueBlindBench, a stress-testing framework for LLM-judged investment rationales before returns are observable; addresses delayed-ground-truth evaluation.

Ax Shuzheng Si, Haozhe Zhao, Yu Lei, Qingyi Wang, Dingwei Chen, Zhitong Wang, Zhenhailong Wang, Kangyang Luo, Zheng Wang, Gang Chen, Fanchao Qi, Minjia Zhang, Maosong Sun 5/6/2026

From Context to Skills: Can Language Models Learn from Context Skillfully?

arXiv paper on context learning in language models via inference-time skill augmentation for reasoning over complex contexts that exceed parametric knowledge.

Ax Christian Internò, Elena Raponi, Markus Olhofer, Ali Raza, Thomas Bäck, Niki van Stein, Yaochu Jin, Barbara Hammer 5/6/2026

Pruning Federated Models through Loss Landscape Analysis and Client Agreement Scoring

AutoFLIP framework for federated learning model pruning using loss landscape analysis and client agreement scoring.

Ax Zhensu Sun, Haotian Zhu, Bowen Xu, Xiaoning Du, Li Li, David Lo 5/6/2026

Towards Agentic Runtime Healing

Paper on using LLMs for automated runtime healing in self-healing systems, replacing predefined rules with adaptive error recovery.

Ax Gleb Bazhenov, Oleg Platonov, Liudmila Prokhorenkova 5/6/2026

GraphLand: Evaluating Graph Machine Learning Models on Diverse Industrial Data

GraphLand benchmark for evaluating graph neural networks on diverse industrial datasets beyond academic citation networks.

Ax Samuel J. Bell, Skyler Wang 5/6/2026

The Pragmatic Frames of Spurious Correlations in Machine Learning: Interpreting How and Why They Matter

Analysis of spurious correlations in ML models, examining how unintended patterns affect performance, fairness, and robustness.

Ax Mingchao Liu, Yu Sun, Ruixiao Sun, Xin Dong, Xiang Shen, Hongwei Wang, Hongyu Xiong, Yang Song 5/6/2026

IPS: In-Prompt Process Supervision for Short Video Content Moderation

IPS framework integrating process supervision into MLLMs for improved short video content moderation via sequential reasoning.

Ax Tairan Fu, Javier Conde, Gonzalo Martínez, María Grandury, Pedro Reviriego 5/6/2026

Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident, Especially When They are Wrong

Study on how reasoning approaches affect LLM confidence in multiple choice questions, showing overconfidence with reasoning.

Ax Priyanto Hidayatullah, Nurjannah Syakrani, Muhammad Rizqi Sholahuddin, Trisna Gelar, Refdinal Tubagus 5/6/2026

YOLOv8 to YOLO11: A Comprehensive Architecture In-depth Comparative Review

Comparative review of YOLO object detection architectures from YOLOv8 to YOLO11, analyzing architecture evolution.

Ax Xuan Shen, Yizhou Wang, Yufa Zhou, Xiangxi Shi, Pu Zhao, Yanzhi Wang, Jiuxiang Gu 5/6/2026

Efficient Reasoning with Hidden Thinking

Research on Heima framework that compresses chain-of-thought reasoning in MLLMs into abstract thinking tokens for efficiency.

Ax Peihan Li, Zijian An, Shams Abrar, Lifeng Zhou 5/6/2026

Large Language Models for Multi-Robot Systems: A Survey

Survey of LLM integration into multi-robot systems, covering communication, task allocation, planning, and human-robot interaction.

Ax Susung Hong, Ira Kemelmacher-Shlizerman, Brian Curless, Steven M. Seitz 5/6/2026

MusicInfuser: Making Video Diffusion Listen and Dance

Research paper on aligning pre-trained video diffusion models to generate dance videos synchronized with music input.

Ax Alexandra Bazarova, Aleksandr Yugay, Andrey Shulga, Alina Ermilova, Andrei Volodichev, Konstantin Polev, Julia Belikova, Rauf Parchiev, Dmitry Simakov, Maxim Savchenko, Andrey Savchenko, Serguei Barannikov, Alexey Zaytsev 5/6/2026

Hallucination Detection in LLMs with Topological Divergence on Attention Graphs

TOHA detector for identifying LLM hallucinations in RAG systems by analyzing topological divergence patterns in attention graph structures.

Ax Jiongli Zhu, Yue Wang, Bailu Ding, Philip A. Bernstein, Vivek Narasayya, Surajit Chaudhuri 5/6/2026

MINT: Multi-Vector Search Index Tuning

MINT framework for tuning index selection strategies in multi-vector databases to optimize performance across multiple feature dimensions.

Ax Mihai Nadas, Laura Diosan, Andrei Piscoran, Andreea Tomescu 5/6/2026

TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models

TF1-EN-3M: Open dataset of 3 million synthetic English moral fables generated by small language models for training open-source LLMs.

Ax Ren Zhuang 5/6/2026

Adaptive GoGI-Skip: Coupling Goal-Gradient Importance with Dynamic Uncertainty for Efficient Reasoning

Adaptive GoGI-Skip framework coupling goal-gradient importance with dynamic skipping to reduce LLM inference latency while preserving reasoning accuracy.

Ax Yukun Zhang, Qi Dong, Mengkang Li 5/6/2026

Latent Trajectory Dynamics in Large Language Models: A Manifold Evolution Framework with Empirical Validation

Dynamical Manifold Evolution Theory framework modeling LLM token generation as controlled dynamical system evolution on low-dimensional semantic manifolds.

Ax Chen Xiong, Zihao Wang, Rui Zhu, Tsung-Yi Ho, Pin-Yu Chen, Jingwei Xiong, Haixu Tang 5/6/2026

Hey, That's My Data! Token-Only Dataset Inference in Large Language Models

CatShift framework for inferring LLM training datasets using only token predictions, enabling copyright/privacy analysis without internal model access.

Ax Zheda Mai, Arpita Chowdhury, Zihe Wang, Sooyoung Jeon, Lemeng Wang, Jiacheng Hou, Jihyung Kil, Wei-Lun Chao 5/6/2026

AVA-Bench: Atomic Visual Ability Benchmark for Vision Foundation Models

Systematic benchmark (AVA-Bench) for evaluating vision foundation models on atomic visual abilities independent of LLM pairing or instruction tuning bias.

Ax Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer, Reduan Achtibat, Patrick Kahardipraja, Thomas Wiegand, Wojciech Samek, Alexander Binder, Sebastian Lapuschkin 5/6/2026

Attribution-Guided Pruning for Insight and Control: Circuit Discovery and Targeted Correction in Small-scale LLMs

Mechanistic interpretability method using attribution-guided pruning to discover and correct specific behavior circuits in small-scale LLMs.

Ax Kateryna Lutsai, Pavel Straňák 5/6/2026

Page image classification for content-specific data processing

Automated classification system for historical document page images to categorize diverse content types including text, graphics, and layouts.

Ax Ailiang Lin, Zhuoyun Li, Yusong Wang, Kotaro Funakoshi, Manabu Okumura 5/6/2026

Causal2Vec: Improving Decoder-only LLMs as Embedding Models through a Contextual Token

Causal2Vec improves decoder-only LLMs as embedding models using a contextual token, preserving unidirectional attention while overcoming the representation limitations of causal attention.

Ax Sung-Hyun Kim, Geum-Hwan Hwang, In-Chang Baek, Seo-Young Lee, Kyung-Joong Kim 5/6/2026

Multi-Objective Instruction-Aware Representation Learning in Procedural Content Generation RL

Instruction-aware representation learning for procedural content generation in RL, improving controllability through better leverage of natural language instructions.

Ax Bokeng Zheng, Jianqiang Zhong, Jiayi Liu, Lei Xue, Xu Chen, Xiaoxi Zhang 5/6/2026

Decentralized Rank Scheduling for Energy-Constrained Multi-Task Federated Fine-Tuning in Edge-Assisted IoV Networks

Decentralized federated fine-tuning approach for foundation models in IoV edge networks under energy constraints with heterogeneous task demands.

Ax Seonglae Cho, Zekun Wu, Adriano Koshiyama 5/6/2026

CorrSteer: Generation-Time LLM Steering via Correlated Sparse Autoencoder Features

CorrSteer steers LLM generation at inference time by selecting interpretable sparse autoencoder features correlated with token correctness, without requiring contrastive datasets.

Ax Weihang Su, Anzhe Xie, Qingyao Ai, Jianming Long, Xuanyi Chen, Jiaxin Mao, Ziyi Ye, Yiqun Liu 5/6/2026

SurGE: A Benchmark and Evaluation Framework for Scientific Survey Generation

SurGE benchmark and evaluation framework for automated scientific survey generation using LLMs, addressing standardization gaps in literature synthesis automation.

Ax Jiaqi Chen, Yanzhe Zhang, Yutong Zhang, Yijia Shao, Diyi Yang 5/6/2026

Generative Interfaces for Language Models

Proposes generative interfaces paradigm to move LLM interactions beyond linear request-response format for more efficient multi-turn, information-dense, and exploratory tasks.

Ax Sanjeeevan Selvaganapathy, Mehwish Nasim 5/6/2026

Confident, Calibrated, or Complicit: Safety Alignment and Ideological Bias in LLM Hate Speech Detection

Study evaluates safety-aligned versus uncensored LLMs on hate speech detection, revealing trade-offs between model censoring and detection performance across political personas.

Ax Shanglin Wu, Lihui Liu, Jinho D. Choi, Kai Shu 5/6/2026

Improving Factuality in LLMs via Inference-Time Knowledge Graph Construction

Method improves LLM factuality by constructing knowledge graphs at inference time instead of using unstructured text retrieval, enhancing reasoning and reducing irrelevant information influence.

Ax Kai R. Larsen, Sen Yan, Roland M. Mueller, Lan Sang, Mikko Rönkkö, Ravi Starzl, Donald Edmondson 5/6/2026

ALIGNS: Unlocking nomological networks in psychological measurement through a large language model

ALIGNS: LLM-based approach for building nomological networks in psychological measurement to establish construct validity.

Ax Yifan Liu, Yaokun Liu, Zelin Li, Zhenrui Yue, Gyuseok Lee, Ruichen Yao, Yang Zhang, Dong Wang 5/6/2026

Learning Decomposed Contextual Token Representations from Pretrained and Collaborative Signals for Generative Recommendation

Method for generative recommenders that learns decomposed contextual token representations by combining pretrained and collaborative signals.

Ax Danilo Francati, Yevin Nikhel Goonatilake, Shubham Pawar, Daniele Venturi, Giuseppe Ateniese 5/6/2026