Ax Shaofeng Yin, Jiaxin Ge, Zora Zhiruo Wang, Chenyang Wang, Xiuyu Li, Michael J. Black, Trevor Darrell, Angjoo Kanazawa, Haiwen Feng 2d ago

Vision-as-Inverse-Graphics Agent via Interleaved Multimodal Reasoning

Vision-language agent framework combining inverse graphics with interleaved multimodal reasoning for reconstructing images into editable programs with spatial grounding.

Ax Ellen Xiaoqing Tan, Jack Lanchantin, Shehzaad Dhuliawala, Danwei Li, Thao Nguyen, Jing Xu, Ping Yu, Ilia Kulikov, Sainbayar Sukhbaatar, Jason Weston, Xian Li, Olga Golovneva 2d ago

Self-Improving Pretraining: using post-trained models to pretrain better models

Method improving language model pretraining by using post-trained models as data sources to instill desired behaviors like safety and reasoning earlier in training.

Ax Chanhyuk Lee, Jaehoon Yoo, Manan Agarwal, Sheel Shah, Jerry Huang, Aditi Raghunathan, Seunghoon Hong, Nicholas M. Boffi, Jinwoo Kim 2d ago

Flow Map Language Models: One-step Language Modeling via Continuous Denoising

Language model based on continuous flows over token embeddings, demonstrating faster generation than discrete diffusion and autoregressive models and improved few-step quality.

Ax Delip Rao, Chris Callison-Burch 2d ago

Autorubric: Unifying Rubric-based LLM Evaluation

Open-source framework unifying rubric-based LLM evaluation techniques including ensemble judging, bias mitigation, and few-shot calibration with consistent implementation.

Ax Tenny Yin, Zhiting Mei, Zhonghe Zheng, Miyu Yamane, David Wang, Jade Sceats, Samuel M. Bateman, Lihan Zha, Apurva Badithela, Ola Shorinwa, Anirudha Majumdar 2d ago

PlayWorld: Learning Robot World Models from Autonomous Play

PlayWorld pipeline that trains action-conditioned video models on autonomous robot play data to improve world-model physics prediction.

Ax Angelika Romanou, Mark Ibrahim, Candace Ross, Chantal Shaib, Kerem Oktar, Samuel J. Bell, Anaelia Ovalle, Jesse Dodge, Antoine Bosselut, Koustuv Sinha, Adina Williams 2d ago

Brittlebench: Quantifying LLM robustness via prompt sensitivity

Framework quantifying LLM robustness by measuring prompt sensitivity, going beyond static benchmarks.

Ax Yufei Xu, Fanxu Meng, Fan Jiang, Yuxuan Wang, Ruijie Zhou, Zhaohui Wang, Jiexi Wu, Zhixin Pan, Xiaojuan Tang, Wenjie Pei, Tongxuan Liu, Di Yin, Xing Sun, Muhan Zhang 2d ago

HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention

Hierarchical indexing method HISA optimizes sparse attention mechanisms in LLMs by reducing indexer bottlenecks in token selection.

Ax Qing Lyu, Jianxu Wang, Jeremy Hudson, Ge Wang, Christopher T. Whitlow 2d ago

MRI-to-CT synthesis using drifting models

Medical imaging technique using drifting models to synthesize CT images from MRI for pelvic imaging, avoiding the ionizing radiation of a CT scan.