Analysis of Optimality of Large Language Models on Planning Problems
Analyzes frontier LLMs on classic AI planning problems, examining whether models reason optimally or rely on heuristic strategies in the Blocksworld domain.
Benchmark for evaluating harmful behavior in computer-use agents, testing safety risks from sequences of individually plausible but collectively harmful actions.
Analysis of reasoning failures in large reasoning models, showing the first solution is often optimal despite test-time scaling patterns in DeepSeek-R1.
Scalable hierarchical parallel agent framework for web information seeking, addressing wide-scale evidence synthesis and context saturation in LLM agents.
Benchmark evaluating multimodal LLM agents with tool integration capabilities including visual expansion and web search through agentic reasoning.
AI system automatically formalizes a 500+ page graduate-level algebraic combinatorics textbook in Lean, achieving 130K lines of formal code.
Reinforcement learning approach to improve visual reasoning in chart question answering using vision language models with policy optimization.
Framework for agentic AI emphasizing control, memory, and verifiable action under partial observability, inspired by squirrel ecology comparisons.
Evaluates linguistic graph representations combined with pretrained Transformers for language modeling, comparing semantic and syntactic formalisms.
Bayesian and neural models analyzing Chinese learners' English preposition comprehension, using pretrained language models for linguistic analysis.
Research on language modeling with predicted semantic structure, establishing empirical lower bounds for performance improvements using binary vector representations.
Reinforcement learning approach using process rewards to provide intermediate feedback for multi-step mathematical reasoning in LLMs.
Study of LLM-generated text compression using domain-adapted LoRA and arithmetic coding, characterizing lossless and lossy compression frontiers.
Framework for scaling GUI agents using synthetic environmental dynamics and self-supervised learning from ground-truth interaction feedback.
Benchmark for evaluating LLMs and embeddings on drug discovery tasks including hypothesis generation and candidate prioritization.
Offline preference-based RL method improving query efficiency by addressing exploration and preference ranking within existing datasets.
Neural architecture performing discrete symbolic constraint reasoning while maintaining differentiability for planning and feasibility checking.
Study using contrastive prompt tuning to optimize LLMs for generating energy-efficient code supporting Green Software Development.
Framework for zero-shot transfer between RL agents using interpretable discrete concepts validated through causal intervention.
Dynamic UAV deployment system for vehicular networks using Q-learning with action masking to enhance reliability in urban environments.
Framework using LLMs as judges to evaluate safety of model responses for users with psychosis, addressing clinical validation gaps in mental health.
ML pipeline using ensemble learning to detect internet routing instability from traceroute latency data without control plane information.
Conformer-based model for decoding speech information from high-density EEG using dual-pathway architecture with ERP and broadband features.
Analysis of agent communication protocols for LLM systems organized into communication, syntactic, and semantic layers with systematic evaluation of 18 protocols.
Survey of AI and ML applications in 6G networks covering high data rates, low latency, and emerging applications like autonomous systems.
Synthetic data pipeline for reasoning in long-document visual understanding that generates thinking traces for improved LLM performance on enterprise documents.
Framework addressing underspecified natural language requests for cloud infrastructure code generation using LLMs with multi-level disambiguation.
Audio-visual navigation system for autonomous agents to localize and navigate toward vocalizing targets in 3D environments.
Deep learning framework for predicting wireless channel characteristics in vehicular 6G communications using visual feature fusion.
Privacy-preserving group emotion recognition model using variational encoder-multi-decoder architecture without per-person feature extraction.
Approach using LLMs to detect and repair errors in MPI code for high-performance computing and distributed training frameworks.
LumiVideo, an agentic system mimicking professional video colorists' workflows with interpretable iterative control for automated color grading.
Research on deep generative models (diffusion, flow matching) for high-dimensional distributions on constrained submanifolds in physics data.
Self-Directed Task Identification framework enabling models to autonomously identify target variables in zero-shot learning without pre-training.
Framework using Mixture-of-Gaussians trajectory prediction for diverse multi-agent play generation in team sports.
Survey of deep learning approaches for diabetic retinopathy detection addressing dataset limitations and geographic diversity issues.
Research investigating whether frontier reasoning models are necessary for mathematical proof verification versus smaller LLM judges.
NLP research on skeleton-based coherence modeling for narrative generation and detection of incoherent story structures.
Empirical evaluation of LLMs as behavioral simulators for predicting intervention effects across 11 climate-psychology interventions with 59,508 participants.
Research studying geometric structure of layer-wise updates in deep language models across Transformer and state-space architectures.
VERTIGO system for cinematic camera trajectory generation with visual preference optimization for realistic shot composition.
Hierarchical Interpretable Label-Free Concept Bottleneck Model enabling interpretability at multiple abstraction levels, unlike existing single-level CBMs.
Diffusion-based foundation model generating synthetic satellite imagery for wildfire detection without task-specific retraining.
Transformer-based framework using Vision Transformer for predicting fluid flows in energy systems, applied to gas injection phenomena.
Zero-shot malware family classification using weighted hierarchical ensembles of LLMs, avoiding the need for labeled datasets and handcrafted features.
Image Prompt Packaging method to reduce token costs in multimodal LLMs by embedding structured text into images, benchmarked across frontier models.
Vision-language model for lumbar spinal stenosis diagnosis from MRI with adaptive loss function for class imbalance handling.
Study of social meaning in LLMs, introducing calibration metrics and pragmatic prompting strategies to improve quantitative approximation of human reasoning.
Unified framework for deriving sparse Bayesian learning algorithms using neural networks and majorizer learning.
System for private long-term memory in personal AI using trusted hardware and oblivious RAM to hide data access patterns from providers.