Ax Baorong Shi, Bo Cui, Boyuan Jiang, Deli Yu, Fang Qian, Haihua Yang, Huichao Wang, Jiale Chen, Jianfei Pan, Jieqiong Cao, Jinghao Lin, Kai Wu, Lin Yang, Shengsheng Yao, Tao Chen, Xiaojun Xiao, Xiaozhong Ji, Xu Wang, Yijun He, Zhixiong Yang 27d ago

MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs

MedXIAOHE, a medical multimodal foundation model with entity-aware continual pretraining, achieves state-of-the-art results on clinical benchmarks.

Ax David Puertolas Merenciano, Ekaterina Vasyagina, Kevin Zhu, Javier Ferrando, Maheep Chaudhary 27d ago

Weight space Detection of Backdoors in LoRA Adapters

Method to detect backdoor attacks in LoRA adapters without test inputs by analyzing weight space, addressing security vulnerabilities in shared model repositories.

Ax Maria Rosaria Briglia, Simone Facchiano, Paolo Cursi, Alessio Sampieri, Emanuele Rodolà, Guido Maria D'Amely di Melendugno, Luca Franco, Fabio Galasso, Iacopo Masi 27d ago

Not All Latent Spaces Are Flat: Hyperbolic Concept Control

HyCon: Hyperbolic control mechanism for steering text-to-image models away from unsafe concepts using parallel transport instead of Euclidean adjustments.

Ax Dang Nguyen, Harvey Yiyun Fu, Peter West, Ari Holtzman, Chenhao Tan 27d ago

Moral Mazes in the Era of LLMs

HR Simulator: Game-based evaluation of LLMs navigating complex workplace social norms like giving feedback and rejecting requests appropriately.

Ax Yuntong Zhang, Zhiyuan Pan, Imam Nur Bani Yusuf, Haifeng Ruan, Ridwan Shariffdeen, Abhik Roychoudhury 27d ago

Code Review Agent Benchmark

Code Review Agent Benchmark: Dataset and evaluation framework for assessing AI agents' ability to review code quality in generated codebases.

Ax Ryszard Tuora, Mateusz Galiński, Michał Godziszewski, Michał Karpowicz, Mateusz Czyżnikiewicz, Adam Kozakiewicz, Tomasz Ziętkiewicz 27d ago

UnWeaving the knots of GraphRAG -- turns out VectorRAG is almost enough

Compares GraphRAG with VectorRAG for retrieval-augmented generation, showing simpler vector-based approaches handle chunk relationships effectively.

Ax Gabriel Sarch, Linrong Cai, Qunzhong Wang, Haoyang Wu, Danqi Chen, Zhuang Liu 27d ago

Vero: An Open RL Recipe for General Visual Reasoning

Vero: Open-source family of vision-language models matching proprietary systems on visual reasoning tasks using reinforcement learning with public recipes and data.

HN b-man 27d ago

Release Please

Marketing content for an orally dissolving peptide supplement product.

HN videobroker 27d ago

Law Students: AI Is Changing Things

Overview of how AI is transforming legal work by automating research, document review, and drafting tasks for lawyers and paralegals.

HN jaypatelani 27d ago

The BSDs in the AI Age

Thread-opening post about BSD operating systems and AI, with minimal content and formatting guidelines.