Ax William Merrill, Yanhong Li, Tyler Romero, Anej Svete, Caia Costello, Pradeep Dasigi, Dirk Groeneveld, David Heineman, Bailey Kuehl, Nathan Lambert, Chuan Li, Kyle Lo, Saumya Malik, DJ Matusz, Benjamin Minixhofer, Jacob Morrison, Luca Soldaini, Finbarr Timbers, Pete Walsh, Noah A. Smith, Hannaneh Hajishirzi, Ashish Sabharwal 3d ago

Olmo Hybrid: From Theory to Practice and Back

Olmo Hybrid: theoretical and empirical analysis of hybrid models that combine linear RNNs with attention, an alternative to pure transformers that offers scaling benefits.

Ax David Sewell, Xingjian Li, Stepan Tretiakov, Krishna Kumar, David Fridovich-Keil 3d ago

Neural Operators for Multi-Task Control and Adaptation

Neural operator methods for multi-task optimal control problems, mapping task descriptions to control policies using permutation-invariant architectures.

Ax Wenjing Gong, Udbhav Srivastava, Yuchen Wang, Yuhao Jia, Qifan Wu, Weishan Bai, Yifan Yang, Xiao Huang, Xinyue Ye 3d ago

Earth Embeddings Reveal Diverse Urban Signals from Space

Benchmark of Earth embedding models (AlphaEarth, Prithvi, Clay) for neighborhood-scale urban monitoring from satellite imagery.

Ax Haocheng Ju, Guoxiong Gao, Jiedong Jiang, Bin Wu, Zeming Sun, Leheng Chen, Yutong Wang, Yuefeng Wang, Zichen Wang, Wanyi He, Peihao Wu, Liang Xiao, Ruochuan Liu, Bryan Dai, Bin Dong 3d ago

Automated Conjecture Resolution with Formal Verification

Framework for automated mathematical conjecture resolution combining LLMs with formal verification to improve reliability of research-level mathematical problem solving.

Ax Xiwen Chen, Jingjing Wang, Wenhui Zhu, Peijie Qiu, Xuanzhao Dong, Hejian Sang, Zhipeng Wang, Alborz Geramifard, Feng Luo 3d ago

SODA: Semi On-Policy Black-Box Distillation for Large Language Models

SODA: semi on-policy black-box knowledge distillation method for LLMs that balances off-policy simplicity with on-policy effectiveness while avoiding the instability of adversarial training.

Ax Shenzhi Yang, Guangcheng Zhu, Bowen Song, Sharon Li, Haobo Wang, Xing Zheng, Yingfan Ma, Zhongqi Chen, Weiqiang Wang, Gang Chen 3d ago

Can LLMs Learn to Reason Robustly under Noisy Supervision?

Analysis of how LLM reasoning models behave under noisy labels in reinforcement learning with verifiable rewards, identifying vulnerabilities to label noise.

Ax Haonian Ji, Kaiwen Xiong, Siwei Han, Peng Xia, Shi Qiu, Yiyang Zhou, Jiaqi Liu, Jinlong Li, Bingzhou Li, Zeyu Zheng, Cihang Xie, Huaxiu Yao 3d ago

ClawArena: Benchmarking AI Agents in Evolving Information Environments

ClawArena benchmark for evaluating AI agents in dynamic environments with evolving information, contradictions, and implicit user feedback.