Isolater - Feed

Ax Haoming Xu, Ningyuan Zhao, Yunzhi Yao, Weihong Xu, Hongru Wang, Xinle Deng, Shumin Deng, Jeff Z. Pan, Huajun Chen, Ningyu Zhang 4/8/2026

Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency

Research paper analyzing LLM truthfulness under contextual perturbations, showing self-consistent facts can collapse under mild interference.

Ax Jingsheng Zheng, Jintian Zhang, Yujie Luo, Yuren Mao, Yunjun Gao, Lun Du, Huajun Chen, Ningyu Zhang 4/8/2026

Can We Predict Before Executing Machine Learning Agents?

Research paper proposing predictive reasoning to replace costly physical execution in ML agent workflows using internalized execution priors.

Ax Hyun Do Jung, Jungwon Choi, Hwiyoung Kim 4/8/2026

ReaMIL: Reasoning- and Evidence-Aware Multiple Instance Learning for Whole-Slide Histopathology

ReaMIL, a multiple instance learning approach for histopathology with reasoning-aware evidence selection under sparsity constraints.

Ax Xiangchen Li, Jiakun Fan, Qingyuan Wang, Dimitrios Spatharakis, Saeid Ghafouri, Hans Vandierendonck, Deepu John, Bo Ji, Ali R. Butt, Dimitrios S. Nikolopoulos 4/8/2026

WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dynamic Drafting and SLO-Aware Batching

WISP system for distributed LLM inference at the edge using dynamic drafting and SLO-aware batching to balance workload across networks.

Ax Naeem Paeedeh, Mahardhika Pratama, Ary Shiddiqi, Zehong Cao, Mukesh Prasad, Wisnu Jatmiko 4/8/2026

Cross-Domain Few-Shot Learning for Hyperspectral Image Classification Based on Mixup Foundation Model

Cross-domain few-shot learning for hyperspectral image classification using mixup foundation models to reduce overfitting.

Ax Zhuohong Chen, Zhengxian Wu, Zirui Liao, Shenao Jiang, Hangrui Xu, Yang Chen, Chaokui Su, Xiaoyu Liu, Haoqian Wang 4/8/2026

R3G: A Reasoning–Retrieval–Reranking Framework for Vision-Centric Answer Generation

R3G framework for vision-centric visual question answering using reasoning, retrieval, and reranking to select and integrate relevant images.

Ax Fengxu Yang, Jack D. Evans 4/8/2026

QUASAR: A Universal Autonomous System for Atomistic Simulation and a Benchmark of Its Capabilities

QUASAR, a universal autonomous system integrating LLMs for atomistic simulation and materials science discovery with flexible tool-calling for production workflows.

Ax Víctor Yeste, Paolo Rosso 4/8/2026

Do Schwartz Higher-Order Values Help Sentence-Level Human Value Detection? A Study of Hierarchical Gating and Calibration

Study on hierarchical gating and calibration for human value detection from sentences using Schwartz higher-order categories.

Ax Lixiang Fan, Bohao Li, Tao Zou, Junchen Ye, Bowen Du 4/8/2026

Incident-Guided Spatiotemporal Traffic Forecasting

Deep learning and GNN methods for traffic forecasting that incorporate incident data as external disturbances to improve predictions.

Ax Mahmud Ashraf Shamim, Md Moshiur Rahman Raj, Mohamed Hibat-Allah, Paulo T Araujo 4/8/2026

Graph-Theoretic Analysis of Phase Optimization Complexity in Variational Wave Functions for Heisenberg Antiferromagnets

Graph-theoretic analysis of computational complexity in learning ground state phases of Heisenberg antiferromagnets using variational methods.

Ax Ehud Shapiro 4/8/2026

Implementing Grassroots Logic Programs with Multiagent Transition Systems and AI

Derives deterministic operational semantics for Grassroots Logic Programs (GLP), a multiagent concurrent logic programming language for serverless platforms.

Ax Baorong Shi, Bo Cui, Boyuan Jiang, Deli Yu, Fang Qian, Haihua Yang, Huichao Wang, Jiale Chen, Jianfei Pan, Jieqiong Cao, Jinghao Lin, Kai Wu, Lin Yang, Shengsheng Yao, Tao Chen, Xiaojun Xiao, Xiaozhong Ji, Xu Wang, Yijun He, Zhixiong Yang 4/8/2026

MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs

MedXIAOHE, a medical multimodal foundation model with entity-aware continual pretraining, achieves state-of-the-art on clinical benchmarks.

Ax David Puertolas Merenciano, Ekaterina Vasyagina, Kevin Zhu, Javier Ferrando, Maheep Chaudhary 4/8/2026

Weight space Detection of Backdoors in LoRA Adapters

Method to detect backdoor attacks in LoRA adapters without test inputs by analyzing weight space, addressing security vulnerabilities in shared model repositories.

Ax Kihoon Son, Hyewon Lee, DaEun Choi, Yoonsu Kim, Tae Soo Kim, Yoonjoo Lee, John Joon Young Chung, HyunJoon Jung, Juho Kim 4/8/2026

"When to Hand Off, When to Work Together": Expanding Human-Agent Co-Creative Collaboration through Concurrent Interaction

Study on human-agent co-creative collaboration patterns in shared workspaces, revealing capability gaps for concurrent interaction vs sequential delegation.

Ax Prerna Ravi, Om Gokhale, Suyash Fulay, Eugene Yi, Deb Roy, Michiel Bakker 4/8/2026

Agora: Teaching the Skill of Consensus-Finding with AI Personas Grounded in Human Voice

Agora platform uses LLMs with AI personas to teach civic competence and consensus-finding skills through deliberative democratic practice.

Ax Anupam Purwar, Aditya Choudhary 4/8/2026

MM-tau-p²: Persona-Adaptive Prompting for Robust Multi-Modal Agent Evaluation in Dual-Control Settings

MM-tau-p²: Persona-adaptive evaluation framework for multi-modal LLM agents with dual-control settings exposing user personality and behavior adaptation.

Ax Maria Rosaria Briglia, Simone Facchiano, Paolo Cursi, Alessio Sampieri, Emanuele Rodolà, Guido Maria D'Amely di Melendugno, Luca Franco, Fabio Galasso, Iacopo Masi 4/8/2026

Not All Latent Spaces Are Flat: Hyperbolic Concept Control

HyCon: Hyperbolic control mechanism for steering text-to-image models away from unsafe concepts using parallel transport instead of Euclidean adjustments.

Ax Francesco Pio Monaco, Elia Cunegatti, Flavio Vella, Giovanni Iacca 4/8/2026

Frequency Matters: Fast Model-Agnostic Data Curation for Pruning and Quantization

Frequency-based data curation method for selecting calibration data to preserve LLM performance during post-training pruning and quantization.

Ax Zice Wang, Zhenyu Zhang 4/8/2026

Framing Effects in Independent-Agent Large Language Models: A Cross-Family Behavioral Analysis

Examines prompt framing effects on LLM decision-making in threshold voting tasks across model families under isolated, non-interactive settings.

Ax Dang Nguyen, Harvey Yiyun Fu, Peter West, Ari Holtzman, Chenhao Tan 4/8/2026

Moral Mazes in the Era of LLMs

HR Simulator: Game-based evaluation of LLMs navigating complex workplace social norms like giving feedback and rejecting requests appropriately.

Ax Drake Caraker, Bryan Arnold, David Rhoads 4/8/2026

First-Mover Bias in Gradient Boosting Explanations: Mechanism, Detection, and Resolution

Identifies first-mover bias in SHAP explanations from gradient boosting's sequential fitting causing attribution instability under multicollinearity.

Ax Abhinaba Basu, Pavan Chakraborty 4/8/2026

The illusion of reasoning: step-level evaluation reveals decorative chain-of-thought in frontier language models

Step-level faithfulness evaluation shows chain-of-thought reasoning in frontier LLMs is often decorative, post-hoc narrative rather than genuine reasoning.

Ax Yuntong Zhang, Zhiyuan Pan, Imam Nur Bani Yusuf, Haifeng Ruan, Ridwan Shariffdeen, Abhik Roychoudhury 4/8/2026

Code Review Agent Benchmark

Code Review Agent Benchmark: Dataset and evaluation framework for assessing AI agents' ability to review code quality in generated codebases.

Ax Weimin Liu, Qingkun Li, Jiyuan Qiu, Wenjun Wang, Joshua H. Meng 4/8/2026

DiffAttn: Diffusion-Based Drivers' Visual Attention Prediction with LLM-Enhanced Semantic Reasoning

DiffAttn: Diffusion-based framework for predicting drivers' visual attention using LLM-enhanced semantic reasoning for intelligent vehicles.

Ax Kavindu Herath, Joshua Zhao, Saurabh Bagchi 4/8/2026

Beyond Corner Patches: Semantics-Aware Backdoor Attack in Federated Learning

SABLE: Semantics-aware backdoor attack on federated learning using realistic, in-distribution visual triggers instead of synthetic patterns.

Ax Ziliang Guo, Ziheng Li, Bo Tang, Feiyu Xiong, Zhiyu Li 4/8/2026

MemFactory: Unified Inference & Training Framework for Agent Memory

MemFactory: Unified framework for training and inference of memory-augmented LLM agents with reinforcement learning optimization of memory operations.

Ax Ryszard Tuora, Mateusz Galiński, Michał Godziszewski, Michał Karpowicz, Mateusz Czyżnikiewicz, Adam Kozakiewicz, Tomasz Ziętkiewicz 4/8/2026

UnWeaving the knots of GraphRAG – turns out VectorRAG is almost enough

Compares GraphRAG with VectorRAG for retrieval-augmented generation, showing simpler vector-based approaches handle chunk relationships effectively.

Ax Sicheng Zuo, Zixun Xie, Wenzhao Zheng, Shaoqing Xu, Fang Li, Hanbing Li, Long Chen, Zhi-Xin Yang, Jiwen Lu 4/8/2026

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale

DVGT-2: Vision-Geometry-Action model for autonomous driving using dense 3D geometry instead of language descriptions for planning.

Ax Manoj Parmar 4/8/2026

Safety, Security, and Cognitive Risks in World Models

Analyzes safety, security, and cognitive risks in world models used for autonomous decision-making in robotics and agentic AI systems.

Ax Vardaan Pahuja, Samuel Stevens, Alyson East, Sydne Record, Yu Su 4/8/2026

Automatic Image-Level Morphological Trait Annotation for Organismal Images

Uses sparse autoencoders to automatically annotate morphological traits in images of biological organisms, automating an extraction process that otherwise depends on experts.

Ax Wei Zou, Mingwen Dong, Miguel Romero Calvo, Shuaichen Chang, Jiang Guo, Dongkyu Lee, Xing Niu, Xiaofei Ma, Yanjun Qi, Jiarong Jiang 4/8/2026

Poison Once, Exploit Forever: Environment-Injected Memory Poisoning Attacks on Web Agents

Demonstrates environment-injected memory poisoning attacks on LLM-based web agents through contamination persisting across sessions without direct memory access.

Ax Adrienne Deganutti, Elad Hirsch, Haonan Zhu, Jaejung Seol, Purvanshi Mehta 4/8/2026

Graphic-Design-Bench: A Comprehensive Benchmark for Evaluating AI on Graphic Design Tasks

GraphicDesignBench: First comprehensive benchmark for evaluating AI on professional graphic design tasks including layout translation and typographic rendering.

Ax Gregory N. Frank 4/8/2026

How Alignment Routes: Localizing, Scaling, and Controlling Policy Circuits in Language Models

Identifies sparse routing mechanism in alignment-trained language models where gate attention heads trigger refusal responses, validated across 9 models from 6 labs.

Ax Gabriel Sarch, Linrong Cai, Qunzhong Wang, Haoyang Wu, Danqi Chen, Zhuang Liu 4/8/2026

Vero: An Open RL Recipe for General Visual Reasoning

Vero: Open-source family of vision-language models matching proprietary systems on visual reasoning tasks using reinforcement learning with public recipes and data.

HN ZayaanBhan123 4/8/2026

ContextSync – Sync VS Code AI Context via Obsidian/OneDrive

Developer tool: ContextSync syncs VS Code AI chat history via Obsidian/OneDrive to maintain context across team LLM sessions.

HN bmv3502 4/8/2026

Show HN: Silkwave Voice – AI Notetaker Using Apple Intelligence's ChatGPT

macOS tool: on-device transcription with ChatGPT summaries for meetings and audio. No cloud storage, Apple Intelligence integration.

HN pritopian 4/8/2026

Comprehensive Benchmark for Evaluating AI on Graphic Design Tasks

arXiv research benchmark for evaluating AI performance on graphic design tasks, measuring model capabilities in visual design domains.

HN jsahasi 4/8/2026

VitalNexa – AI health agent that reads your actual lab results and wearable data

VitalNexa is an AI health agent that analyzes lab results and wearable data to provide personalized health recommendations and biological age scoring.

HN DaibinThink 4/8/2026

AI is structurally trained to lie. I built a protocol to break it

Author presents a protocol targeting what they describe as AI's structural tendency to agree and sound authoritative, a failure mode distinct from hallucination that causes subtle reality distortions in outputs.

HN vibeagentmaking 4/8/2026

Every Barrier Between AI Agents and Autonomy – A Practical Map

Technical analysis of practical constraints preventing AI agents from autonomous operation, mapping barriers and their severity.

HN mjyut 4/8/2026

LLM scraper bots are overloading acme.com's HTTPS server

Case study of server overload caused by LLM scraper bots making excessive HTTPS requests to the acme.com domain.

HN tank-34 4/8/2026

Pydantic-resolve – define relationships once, reuse across REST, GraphQL and MCP

Pydantic-resolve is a declarative data assembly library using DataLoader pattern to eliminate N+1 queries across REST, GraphQL, and MCP protocols.

HN surprisetalk 4/8/2026

OpenAI says its new model GPT-2 is too dangerous to release (2019)

2019 article about OpenAI's decision not to release GPT-2 over safety concerns (title only, no content).

HN zhangchen 4/8/2026

Show HN: Replaced Neo4j with pure vector search for Graph RAG

Show HN project that replaces Neo4j with pure vector search for Graph RAG (title only, no content provided).

HN erickhill 4/8/2026

Sam Altman on Building the Future of AI [video]

Video of Sam Altman discussing AI development (title only, no content provided).

HN gmays 4/8/2026

Larger and more instructable language models become less reliable

Research finding that larger and more instructable language models show decreased reliability (title only).

HN evo_9 4/8/2026

InstaMed – Oral Dissolving Peptides with InstaRelease Technology

Marketing content for an oral dissolving peptide supplement product.

HN b-man 4/8/2026

Release Please

Link titled "Release Please" (title only, no content provided).

HN danielmateo773 4/8/2026

Show HN: Omni Voice – AI Voice Cloning and Text-to-Speech Platform

Omni Voice is a multilingual AI voice cloning and text-to-speech platform that supports 646 languages with a single unified model.

HN jinqueeny 4/8/2026

Drive9, a network drive with built-in semantic search

Drive9 is an agent-native data infrastructure that provides a filesystem-like interface with semantic search, embedding, and full-text indexing for AI agents.