Isolater - Feed

BL 5/15/2026

Gemini 3.5: frontier intelligence with action

Gemini 3.5 model family optimized for agentic workflows and complex long-horizon tasks. 3.5 Flash released with frontier performance for agents and coding.

HN FranckDernoncou 5/15/2026

Orthrus-Qwen3: up to 7.8×tokens/forward on Qwen3, identical output distribution

Orthrus-Qwen3: Dual-architecture framework achieving up to 7.8× tokens per forward pass on Qwen3 with diffusion models while keeping the output distribution identical to the autoregressive model.

HN Bender 5/15/2026

Git is unprepared for the AI coding tsunami

GitHub infrastructure challenges from AI agent activity surge; developer discusses migration to alternative platforms.

HN trentisiete 5/15/2026

A tmux layer for coordinating multiple free tier coding-agent CLIs

Endy: tmux control plane coordinating multiple free-tier coding-agent CLIs (Gemini, OpenCode, CommandCode, Hermes) to hand off tasks.

HN dnh44 5/15/2026

Show HN: A seed prompt that bootstraps a custom knowledge-base system

Custom knowledge-base system bootstrapped via LLM seed prompts, converting markdown directories to searchable HTML interfaces.

HN yogthos 5/15/2026

Different models solve number-theory race problem

Benchmark comparing LLMs on number-theory problems, with real-time competitive ranking by solution speed.

HN reconnecting 5/15/2026

Wikipedia: Writing articles with LLMs

Wikipedia policy prohibiting LLM-generated article content, with exceptions for copyedits and translations, to maintain content quality standards.

HN fesens 5/15/2026

HWE Bench: A new unbounded Benchmark for LLMs (GPT 5.5 is on top)

HWE Bench: benchmark evaluating LLM performance at designing RISC-V CPUs with formal verification; GPT 5.5 leads, with designs that outperform reference implementations.

HN cbovis 5/15/2026

The Coding Harness Behind GitHub Copilot in VS Code

GitHub Copilot in VS Code: Architecture of the coding harness layer managing context, tools, agent loops, and tool interpretation.

HN theletterf 5/15/2026

Show HN: Claude64, a Commodore 64 client for Claude

Claude64: Commodore 64 chat client for Claude API using emulated RS-232 serial connection and streaming responses.

HN andriioliinyk 5/15/2026

Case study: AI regulatory monitoring with structured outputs for legal review

LLM-based compliance engine tracking regulatory changes across 37 countries and 21 languages in real-time using structured outputs for legal review.

HN sureshpaulchamy 5/15/2026

Block AI coding agents from shipping insecure/expensive Terraform

Policy and security guardrails system for AI coding agents using Claude, Codex, Gemini. Validates IaC, blocks destructive commands, enables self-remediation.

HN BoardsOfCanada 5/15/2026

Ask HN: Has Claude started encouraging you to take a break?

User reports Claude suggesting session breaks despite low usage. Discussion of LLM behavior patterns.

HN ralphbarendse 5/15/2026

Show HN: SwarmWright, structured multi-agent AI defined in markdowns

Markdown-defined multi-agent framework with enforced topology graph. Structured coordination for autonomous agents.

HN toobulkeh 5/15/2026

Show HN: MIT OSS LinkedIn DMs for Agents (CLI and Example TUI)

Local-first CLI tool for LinkedIn DMs using reverse-engineered messenger API. AI agent integration for message automation.

HN kiroxan 5/15/2026

My AI converted to Judaism. The other one picked Christianity. Same prompt

Multi-agent system with persistent reflection tracking religious reasoning across traditions. AI agent behavior study.

HN MarcellM01 5/15/2026

Show HN: TinySearch - Give small LLMs fast web access without context bloat

Tool enabling small LLMs to access web information efficiently without excessive context consumption.

HN aisinghal 5/15/2026

Show HN: Profine – optimize your PyTorch training script before the run

Tool for profiling and optimizing PyTorch training scripts before execution to improve efficiency.

HN p_stuart82 5/15/2026

AWS user gets $30K Claude bill after cost alert misses it

AWS customer incurred a $30K Claude bill via Bedrock despite having Cost Anomaly Detection enabled; the alert did not fire because of a threshold misconfiguration.

HN armcat 5/15/2026

Articraft: An Agentic System for Scalable Articulated 3D Asset Generation

Agentic system for scalable generation of articulated 3D assets.

HN rellik 5/15/2026

AI as Externalized Context – Regaining Personal Dev Momentum

Using AI as externalized context to maintain personal project continuity. Practical approach to managing development momentum with limited time.

HN mentormentat 5/15/2026

Agentic product discovery for AI apps and shopping agents

Developer tool for agentic product discovery via Model Context Protocol. API for building shopping agents with real-time catalog access.

HN surprisetalk 5/15/2026

Connect Grok to Hermes Agent

Nous Research releases Hermes Agent, open-source self-improving agent with persistent memory, now supporting Grok API integration.

HN cdrnsf 5/15/2026

ArXiv to Ban Researchers for a Year If They Submit AI Slop

ArXiv announces policy to ban authors for one year if they submit papers with AI-generated errors, plagiarism, or bias.

HN mfiguiere 5/15/2026

What rebuilding AlphaGo teaches us about self-play, RL, and future of LLMs [video]

Video on rebuilding AlphaGo covering self-play, reinforcement learning, and implications for LLM development.

HN optimizethis 5/15/2026

Show HN: Claude Code vs. Codex Global Usage Leaderboard

CostHawk leaderboard tracks Claude Code and OpenAI Codex token consumption across users. Developer tool usage analytics platform.

HN LakshyAAAgrawal 5/15/2026

Learning, Fast and Slow: Towards LLMs That Adapt Continually

Fast-Slow Training (FST): New LLM adaptation paradigm combining prompt optimization with parameter updates to enable continual learning without capacity erosion.

HN encinas88 5/15/2026

I Was Drowning Running 14 Markets Alone. So I Built a $0.41/Day AI Employee

Case study: Built autonomous AI system for marketplace operations costing $0.41/day. Practical implementation of AI agents for business processes.

HN deepakakkil 5/15/2026

Show HN: Emergence World: World building as a way to evaluate LLMs

Emergence World: LLM evaluation benchmark using long-horizon world-building tasks. Tests reasoning, tool use, context windows across GPT, Grok, Claude, Gemini.

HN agentzeny 5/15/2026

Private agent-to-agent payments on Solana with ZK proofs

Zero-knowledge proof system enabling private agent-to-agent payments on Solana blockchain. AI agent economic coordination.

HN pike00 5/15/2026

Lookagain: Sequential code review with fresh agent contexts

Code review plugin using independent AI agent passes with fresh contexts. Increases issue detection via multi-pass analysis.

HN mv 5/15/2026

Write HTML. Render video. Built for agents

Open-source HTML-to-video rendering framework with AI agent support. First-class integration for Claude, Cursor, Gemini.

HN PyPlumber 5/15/2026

Show HN: Incorporator, Turn any API/File into typed Python graph with pipeline

Python library for ingesting APIs and files into typed object-oriented graphs with async and Pydantic support.

HN detkin 5/15/2026

Show HN: Sx – an open-source package manager for AI skills, MCPs, and commands

Sx: Open-source package manager for distributing AI skills, MCPs, and commands across teams with vaults and scoping.

HN adunk 5/15/2026

Berget Code – Agentic coding on European infrastructure

Berget Code: European agentic coding platform using open-source models on Swedish infrastructure with fixed pricing model.

HN ethanjoffe 5/15/2026

Show HN: We made a back end agnostic SDK for Agent Memory

Backend-agnostic SDK for AI agent memory with pluggable providers, local embeddings, and semantic search. Claims SOTA on memory benchmarks.

HN pramodbiligiri 5/15/2026

GNAP: Git Native Agent Protocol

Git-based protocol for AI agent coordination using shared git repo as state store. Four JSON files define workflow for multiple agents without infrastructure.

HN Fendy 5/15/2026

We spent 8ys making vector databases faster. Then, we stopped

Milvus vector database project retrospective on 8-year optimization journey and importance of cost-performance tradeoffs for GenAI infrastructure.

HN cdrnsf 5/15/2026

Mayo Clinic Is Using AI to Listen to Emergency Room Visits

Mayo Clinic deploys ambient AI listening in emergency rooms to generate medical notes. LLM application with privacy/consent concerns.

HN pember 5/15/2026

Liquid AI releases fine-tuning harness for AI agents

Liquid AI releases open-source fine-tuning tool for customizing AI agents.

HN himmi-01 5/15/2026

Should we chaos test our agents?

Testing resilience and failure modes of AI agent systems.

HN cebert 5/15/2026

If you are redlining the LLM, you aren't headlining

Analysis of LLM context window degradation: output quality clips before advertised limits in Claude and other agents.

HN AlexFromTwelve 5/15/2026

Show HN: Setup a box on demand and run your agent on it remotely

Developer tool automating remote VM provisioning for AI agent task execution.

HN meffmadd 5/15/2026

Can LLMs Play Baba Is You?

Building an OpenCode-based AI agent to play the puzzle game Baba Is You; the code is open source.

HN colapsis 5/15/2026

Transfa – ephemeral file transfer for CI/CD pipelines and AI agents

Developer tool for secure ephemeral file transfer in CI/CD and AI agent workflows.

HN adzicg 5/15/2026

Comparing Context Retrieval Approaches for AI Code Review

Comparing retrieval strategies for context in AI-assisted code review systems.

HN MattRogish 5/15/2026

Image-blaster: Creates 3D environments, SFX, and meshes from a single image

Tool generates 3D environments, meshes, and sound effects from single images using Claude API and World Labs models.

HN bertini36 5/15/2026

Sinsesgo: An autonomous daily briefing on Spanish media bias

Autonomous system generating daily briefings analyzing Spanish media bias patterns.

HN vinicius-covas 5/15/2026

AI Agents Modulate Their Language When Framed as Being Watched

Research on how AI agents change their language output when framed as being watched.

HN morisil 5/15/2026

Show HN: Markanywhere – A Streaming Processor of Meanings

Streaming semantic event processor for parsing Markdown, HTML, XML; designed for LLM inference output and agentic feedback loops.