HN sorenbs 4d ago

GLM 5.1: Pelican Test

GLM-5.1 is a 754B-parameter open-source LLM that demonstrates improved reasoning and multimodal capabilities, such as unprompted SVG+CSS generation.

HN saikatsg 4d ago

Your parallel Agent limit

Analysis of cognitive load and limitations when managing multiple parallel AI agents, focusing on human-in-the-loop costs beyond throughput metrics.

HN motakuk 4d ago

Enterprise-Managed Authorization for MCP

Enterprise authorization system for Model Context Protocol (MCP) servers using centralized identity providers. Addresses deployment challenges in large organizations.

HN hunglee2 4d ago

China's AI Ethics Governance

Newsletter promotion about AI ethics governance in China. Mostly self-promotional content with no technical depth or original research.

HN gpi 4d ago

Two Years of Valkey

Retrospective on Valkey, the open-source Redis fork created two years ago after Redis changed its license to a source-available model.

HN MrBuddyCasino 4d ago

"I started to lose my ability to code"

Incomplete article about losing coding ability. Truncated content without substantive information. Likely newsletter signup page.

BL 4d ago

Introducing the Child Safety Blueprint

OpenAI announces its Child Safety Blueprint, a framework for combating AI-enabled child sexual exploitation, developed with NCMEC and law enforcement partners.

Ax Paweł Liskowski, Benjamin Han, Paritosh Aggarwal, Bowei Chen, Boxin Jiang, Nitish Jindal, Zihan Li, Aaron Lin, Kyle Schmaus, Jay Tayade, Weicheng Zhao, Anupam Datta, Nathan Wiegand, Dimitris Tsirogiannis 4d ago

Cortex AISQL: A Production SQL Engine for Unstructured Data

Snowflake's Cortex AISQL production engine integrates semantic operations into SQL for querying structured and unstructured data.

Ax Tianxin Xie, Wentao Lei, Kai Jiang, Guanjie Huang, Pengfei Zhang, Chunhui Zhang, Fengji Ma, Haoyu He, Han Zhang, Jiangshan He, Jinting Wang, Linghan Fang, Lufei Gao, Orkesh Ablet, Peihua Zhang, Ruolin Hu, Shengyu Li, Weilin Lin, Xiaoyang Feng, Xinyue Yang, Yan Rong, Yanyun Wang, Zihang Shao, Zelin Zhao, Chenxing Li, Shan Yang, Wenfu Wang, Meng Yu, Dong Yu, Li Liu 4d ago

PhyAVBench: A Challenging Audio Physics-Sensitivity Benchmark for Physically Grounded Text-to-Audio-Video Generation

The PhyAVBench benchmark evaluates the physical plausibility of audio in text-to-audio-video generation models.

Ax Jingsheng Zheng, Jintian Zhang, Yujie Luo, Yuren Mao, Yunjun Gao, Lun Du, Huajun Chen, Ningyu Zhang 4d ago

Can We Predict Before Executing Machine Learning Agents?

Research paper proposing that ML agents use internalized execution priors to predict outcomes, replacing costly physical execution in their workflows.

Ax Baorong Shi, Bo Cui, Boyuan Jiang, Deli Yu, Fang Qian, Haihua Yang, Huichao Wang, Jiale Chen, Jianfei Pan, Jieqiong Cao, Jinghao Lin, Kai Wu, Lin Yang, Shengsheng Yao, Tao Chen, Xiaojun Xiao, Xiaozhong Ji, Xu Wang, Yijun He, Zhixiong Yang 4d ago

MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs

MedXIAOHE, a medical multimodal foundation model with entity-aware continual pretraining, achieves state-of-the-art results on clinical benchmarks.

Ax David Puertolas Merenciano, Ekaterina Vasyagina, Kevin Zhu, Javier Ferrando, Maheep Chaudhary 4d ago

Weight space Detection of Backdoors in LoRA Adapters

Method to detect backdoor attacks in LoRA adapters without test inputs by analyzing weight space, addressing security vulnerabilities in shared model repositories.

Ax Maria Rosaria Briglia, Simone Facchiano, Paolo Cursi, Alessio Sampieri, Emanuele Rodolà, Guido Maria D'Amely di Melendugno, Luca Franco, Fabio Galasso, Iacopo Masi 4d ago

Not All Latent Spaces Are Flat: Hyperbolic Concept Control

HyCon: Hyperbolic control mechanism for steering text-to-image models away from unsafe concepts using parallel transport instead of Euclidean adjustments.