HN mihau 4/2/2026

LLM SQL Benchmark

Benchmark dataset evaluating LLM performance on SQL query generation and execution tasks.

HN theturtletalks 4/2/2026

Show HN: Agentic Commerce Marketplace Chat

Open-source alternative to proprietary agentic commerce marketplaces, with a flexible adapter system supporting multiple backends.

HN gmays 4/2/2026

A Mirror Test for LLMs

Self-awareness evaluation framework for large language models, based on mirror test methodology.

HN mihau 4/2/2026

Microsoft.ai

Microsoft announcement of MAI-Voice-1 and MAI-Image-2, its voice and image generation models.