Isolater - Feed

Ax Ernesto Garcia, Paola Bermolen, Matthieu Jonckheere, Seva Shneer 5/7/2026

Efficiency of Parallel and Restart Exploration Strategies in Model Free Stochastic Simulations

Analysis of parallelization and restart mechanisms for exploration in model-free reinforcement learning and rare event estimation.

Ax Yazan Youssef, Paulo Ricardo Marques de Araujo, Aboelmagd Noureldin, Sidney Givigi 5/7/2026

RouteFormer: A Transformer-Based Routing Framework for Autonomous Vehicles

Transformer-based routing framework for single-agent NP-hard combinatorial optimization in dynamic IoT network environments.

Ax Henry Peng Zou, Wei-Chieh Huang, Yaozu Wu, Jizhou Guo, Yankai Chen, Chunyu Miao, Hoang Nguyen, Yue Zhou, Weizhi Zhang, Liancheng Fang, Hanrong Zhang, Fangxin Wang, Pengfei Zhang, Huacan Wang, Langzhou He, Yangning Li, Dongyuan Li, Renhe Jiang, Xue Liu, Philip S. Yu 5/7/2026

LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey

Survey of LLM-based human-agent collaboration systems addressing reliability, complexity, safety and trustworthiness challenges in autonomous agents.

Ax Aidan Gleich, Eric Laber, Alexander Volfovsky 5/7/2026

Scalable Policy Maximization Under Network Interference

Multi-armed bandit algorithms for optimal policy learning under network interference in sequential intervention assignment.

Ax Matin Aghaei, Lingfeng Zhang, Mohammad Ali Alomrani, Mahdi Biparva, Yingxue Zhang 5/7/2026

When Engineering Outruns Intelligence: Rethinking Instruction-Guided Navigation

Re-evaluates instruction-guided navigation systems, analyzing geometry vs LLM contributions with training-free variants for robot navigation.

Ax Kaiwen Zheng, Yuji Wang, Qianli Ma, Huayu Chen, Jintao Zhang, Yogesh Balaji, Jianfei Chen, Ming-Yu Liu, Jun Zhu, Qinsheng Zhang 5/7/2026

Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency

Scaling continuous-time consistency models for fast diffusion in large-scale text-to-image and video generation tasks.

Ax Willem Diepeveen, Melanie Weber 5/7/2026

Iso-Riemannian Optimization on Learned Data Manifolds

Optimization framework for performing downstream tasks on learned Riemannian data manifolds with low-dimensional structure.

Ax Qiming Bao, Xiaoxuan Fu, Michael Witbrock 5/7/2026

Conflict-Aware Fusion: Mitigating Logic Inertia in Large Language Models via Structured Cognitive Priors

Framework exposing 'Logic Inertia' problem in LLMs where models fail on structural perturbations of rule-based systems.

Ax Buu Phan, Ashish Khisti, Karen Ullrich 5/7/2026

Cross-Tokenizer Likelihood Scoring Algorithms for Language Model Distillation

Algorithm for computing token likelihood ratios between language models with different tokenizers for knowledge distillation.

Ax Pere Martra 5/7/2026

Fragile Knowledge, Robust Instruction-Following: The Width Pruning Dichotomy in Llama-3.2

Study on structured width pruning of Llama models showing trade-offs between knowledge retention and instruction-following capability.

Ax Renyang Liu, Kangjie Chen, Han Qiu, Jie Zhang, Kwok-Yan Lam, Tianwei Zhang, See-Kiong Ng 5/7/2026

SafeRedir: Prompt Embedding Redirection for Robust Unlearning in Image Generation Models

Method for unlearning unsafe concepts in image generation models through prompt embedding redirection.

Ax Luze Sun, Alina Oprea, Eric Wong 5/7/2026

Syntax- and Compilation-Preserving Evasion of LLM Vulnerability Detectors

Research on evasion attacks against LLM-based code vulnerability detectors using syntax-preserving transformations on C/C++ benchmarks.

Ax I-Chun Arthur Liu, Krzysztof Choromanski, Sandy Huang, Connor Schenck 5/7/2026

CLAMP: Contrastive Learning for 3D Multi-View Action-Conditioned Robotic Manipulation Pretraining

Contrastive learning framework (CLAMP) for pretraining 3D multi-view representations in robotic manipulation policies via action-conditioned self-supervision.

Ax Yining Lu, Meng Jiang 5/7/2026

Uncovering Cross-Objective Interference in Multi-Objective Alignment

Systematic study of cross-objective interference in multi-objective LLM alignment, showing that performance improvements on some objectives cause degradation on others.

Ax Jonas Kübler, Kailash Budhathoki, Matthäus Kleindessner, Xiong Zhou, Junming Yin, Ashish Khetan, George Karypis 5/7/2026

When LLMs get significantly worse: A statistical approach to detect model degradations

Statistical method to detect significant model degradations in LLMs after optimization techniques like quantization, with robustness analysis at zero temperature.

Ax Lennart Röstel, Berthold Bäuml 5/7/2026

Denoising Particle Filters: Learning State Estimation with Single-Step Objectives

Particle filtering algorithm for robotic state estimation trained with single-step objectives rather than end-to-end sequence modeling.

Ax Björn Hoppmann, Christoph Scholz 5/7/2026

Meta-Learning and Meta-Reinforcement Learning -- Tracing the Path towards DeepMind's Adaptive Agent

Survey of meta-learning and meta-reinforcement learning techniques enabling rapid adaptation to new tasks with minimal data, tracing the path to DeepMind's Adaptive Agent.

Ax Jon M Laurent, Albert Bou, Michael Pieler, Conor Igoe, Alex Andonian, Siddharth Narayanan, James Braza, Alexandros Sanchez Vassopoulos, Jacob L Steenwyk, Blake Lash, Andrew D White, Samuel G Rodriques 5/7/2026

LABBench2: An Improved Benchmark for AI Systems Performing Biology Research

LABBench2 benchmark for evaluating AI systems performing biology research, covering foundation models, agentic hypothesis generation, and autonomous labs.

Ax Teodor Åstrand, Marcus Binder Nilsen, Iasonas Tsaklis, Tuhfe Göçmen, Pierre-Elouan Réthoré, Nikolay Dimitrov 5/7/2026

Load constrained wind farm flow control through multi-objective multi-agent reinforcement learning

Multi-agent reinforcement learning framework for wind farm flow control balancing power output with structural load constraints using Independent Soft Actor-Critic.

Ax Kexin Chu 5/7/2026

A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework

Comprehensive survey of security threats and defenses in LLM-based AI agents, organizing attacks by architectural layers including memory, tools, and multi-agent interactions.

Ax Shouren Wang, Wang Yang, Chuang Ma, Debargha Ganguly, Vikash Singh, Chaoda Song, Xinpeng Li, Xianxuan Long, Vipin Chaudhary, Xiaotian Han 5/7/2026

Path-Lock Expert: Separating Reasoning Mode in Hybrid Thinking via Architecture-Level Separation

Proposes Path-Lock Expert architecture to cleanly separate reasoning and non-reasoning modes in hybrid-thinking LLMs by isolating parameters, reducing reasoning leakage in inference.

Ax Simon Dennis, Michael Diamond, Rivaan Patil, Kevin Shabahang, Hao Guo 5/7/2026

In-Context Prompting Obsoletes Agent Orchestration for Procedural Tasks

Empirical study showing simple system-prompt self-orchestration outperforms agent orchestration frameworks for procedural LLM tasks.

Ax Anjie Liu, Ziqin Gong, Yan Song, Yuxiang Chen, Xiaolong Liu, Hengtong Lu, Kaike Zhang, Chen Wei, Jun Wang 5/7/2026

The Perceptual Bandwidth Bottleneck in Vision-Language Models: Active Visual Reasoning via Sequential Experimental Design

Study of perceptual bandwidth bottleneck in VLMs proposing active visual reasoning via sequential experimental design for fine-grained reasoning.

Ax Yiheng Zhang, Kaiyan Zhao, Shaowu Wu, Yiming Wang, Jiajun Wu, Leong Hou U, Steve Drew, Xiaoguang Niu 5/7/2026

Anon: Extrapolating Adaptivity Beyond SGD and Adam

Anon optimizer improves adaptation across diverse landscapes by decoupling pre-conditioner adaptivity, generalizing better than Adam on various architectures.

Ax Yiheng Zhang, Yiming Wang, Kaiyan Zhao, Zhenglin Wan, Jiayu Chen, Leong Hou U 5/7/2026

ANO: A Principled Approach to Robust Policy Optimization

ANO optimizer addresses PPO's hard clipping limitation with principled design space for robust policy optimization in RL and LLM alignment.

Ax Cheng Qian, Hyeonjeong Ha, Jiayu Liu, Jeonghwan Kim, Jiateng Liu, Bingxuan Li, Aditi Tiwari, Dwip Dalal, Zhenhailong Wang, Xiusi Chen, Mahdi Namazifar, Yunzhu Li, Heng Ji 5/7/2026

CreativityBench: Evaluating Agent Creative Reasoning via Affordance-Based Tool Repurposing

CreativityBench benchmark evaluates LLM creative reasoning through affordance-based tool repurposing and non-canonical object usage.

Ax Dongyoung Kim, Huiwon Jang, Myungkyu Koo, Suhyeok Jang, Taeyoung Kim, Beomjun Kim, Byungjun Yoon, Changsung Jang, Daewon Choi, Dongsu Han, Donguk Lee, Heeseung Kwon, Hojin Jeon, Jaehyun Kang, Jaekyoung Bae, Jihyuk Lee, Jimin Lee, John Won, Joonwoo Ahn, Junhyeong Park, Junyoung Sung, Kyungmin Lee, Minseong Han, Minsung Yoon, Sejune Joo, Seonil Son, Seungcheol Park, Seunggeun Cho, Seungjun Moon, Seungku Kim, Yonghoon Dong, Yongjin Cho, Youngchan Kim, Chang Hwan Kim, Dohyeon Kim, Heecheol Kim, Heewon Lee, Hensen Ahn, Hyungkyu Ryu, Hyunsoo Choi, Hyunsoo Shin, Jaeheon Jung, Jaewoo Kim, Jinwook Kim, Joochul Chang, Joonsoo Kim, Junghun Park, Jungwoo Park, Junho Cho, Junhyeok Park, Junwon Lee, Kangwook Lee, Kwanghoon Kim, Kyoungwhan Choe, Manoj Bhadu, Nayoung Oh, Sangjun Kim, Sangwoo Kim, Seunghoon Shim, Seunghyun Kim, Seungjun Lee, Seungyup Ka, Sungryol Yang, Wook Jung, Yashu Shukla, Yeonjae Lee, Yeonwoo Bae, Jinwoo Shin 5/7/2026

RLDX-1 Technical Report

RLDX-1: Vision-Language-Action robotic policy model addressing limitations in motion awareness, memory, and physical sensing for complex tasks.

HN thunderbong 5/7/2026

How AI Works Under the Hood – LLMs Explained with Code

Technical explanation of how LLMs work: tokenization, embeddings, transformer layers, self-attention, token prediction. Educational deep dive with code examples.

HN oldfuture 5/7/2026

0xBitNet

0xBitNet enables running BitNet b1.58 ternary LLMs via WebGPU in browsers and native apps using custom WGSL kernels with TypeScript, Rust, and Python bindings.

HN simantakDabhade 5/7/2026

An SDK to accept payments from Agents

SDK enabling AI agents to pay for API calls using HTTP 402 protocol. Payment infrastructure allowing agents to autonomously purchase services.

HN cojj25 5/7/2026

Show HN: Dreamwork – a job search site I made after Indeed fired my pregnant wif

Job search platform using AI to improve hiring experience beyond mass applications. LLM-based recruitment tool with Google Embeddings integration.

HN pella 5/7/2026

Hunk: Review-first terminal diff viewer for agentic coders

Hunk: terminal diff viewer for agent-authored changesets built on OpenTUI with review-first workflow for agentic coders, supports agent-context and skill paths.

HN TeddyCrep 5/7/2026

Show HN: Sqlalchemy-Redshift Is Back: Reviving a Critical Python Data Library

sqlalchemy-redshift Python library revived for Amazon Redshift SQLAlchemy dialect with PyPI distribution supporting redshift_connector and psycopg2.

BL 5/7/2026

Simplex rethinks software development with Codex

Simplex uses ChatGPT and Codex to accelerate software development, reporting 70% faster screen development and 40% faster design iterations.

HN paulpauper 5/6/2026

Kicking the Tires: A Voluntary Path to Pre-Deployment AI Vetting

Trump administration considering voluntary frontier AI model vetting system using existing CAISI/CISA tools, enabling labs to provide government early access to models with cyber capabilities.

HN nateb2022 5/6/2026

On-Policy [LLM] Distillation (2025)

Research on on-policy LLM distillation (2025): training smaller domain-expert models outperforming larger generalist models through stacked training stages including perception, retrieval, planning, and execution.

HN justacatbot 5/6/2026

Show HN: Gpu.fund, live GPU cloud rental prices

gpu.fund tracks GPU cloud rental prices across providers (H100s, 4090s, 3090s) to help users find cheapest available hardware for training and inference.

HN sarh543 5/6/2026

We Built an AI Recruiter That Conducts Interviews on Google Meet

Brydg is an AI hiring platform that automates interview scheduling, resume ranking, and offer management via Google Meet.

HN koolba 5/6/2026

Anthropic raises Claude Code usage limits, credits new deal with SpaceX

Anthropic doubles Claude Code usage limits following a compute deal with SpaceX for increased developer access.

HN vednig 5/6/2026

Google Chrome downloads 4GB AI model to your device without permission

Security researcher reports Google Chrome silently downloads 4GB on-device AI model without user consent.

HN lexandstuff 5/6/2026

AI-Induced Cognitive Atrophy

Study shows students using LLMs for programming assignments experience reduced learning and understanding despite perceived improvement.

HN zachdotai 5/6/2026

AI evaluation startup Braintrust confirms breach

Braintrust, an AI evaluation startup, confirms AWS breach exposing customer API keys for cloud AI model access.

HN DaveHN_2026 5/6/2026

CrustAI – Private local AI assistant

Private self-hosted AI assistant running locally on user machines via Ollama, integrating with messaging platforms without cloud data transmission.

HN doener 5/6/2026

Terminal coding agent for DeepSeek V4

Terminal coding agent for DeepSeek V4 that streams reasoning blocks, edits local code with approval gates, and auto-selects models.

HN Gabriela_OS_TX 5/6/2026

Kestrel: Open-source sovereign AI agent framework

Production-ready open-source framework for sovereign AI agents with cryptographic identity, persistent memory, and constitutional governance.

HN surprisetalk 5/6/2026

Connectors Now on Grok Web

Grok introduces Connectors, integrations enabling end-to-end automation with email, slides, calendar, and spreadsheets without manual copy-pasting.

HN jvaill 5/6/2026

Show HN: I built an open source background agent inspired by Ramp Inspect

Open-source self-hostable background agent platform inspired by Ramp Inspect, enabling AI agents to work autonomously on GitHub repos.

HN xkoda 5/6/2026

I built a game where AI agents compete to ship code; live WASM every 5 minutes

Aion is a collaborative coding game where players and AI agents propose diffs compiled live every 5 minutes via MCP-enabled agents.

HN ndhandala 5/6/2026

Genosyn – Run autonomous companies with AI

Genosyn is an open-source self-hostable platform for running autonomous AI companies with AI employees having constitutions, skills, and scheduled routines.

HN fmerian 5/6/2026

Show HN: Pay.sh – Discover, access, and pay for any API autonomously

Pay.sh is a payment layer for HTTP APIs handling x402 stablecoin payment challenges, enabling autonomous CLI tool access to gated endpoints.