Ax Chin-Chia Michael Yeh, Uday Singh Saini, Xin Dai, Xiran Fan, Shubham Jain, Yujie Fan, Jiarui Sun, Junpeng Wang, Menghai Pan, Yingtong Dou, Yuzhong Chen, Vineeth Rakesh, Liang Wang, Yan Zheng, Mahashweta Das 20d ago

TREASURE: The Visa Payment Foundation Model for High-Volume Transaction Understanding

TREASURE: a foundation model for transaction understanding in payment networks, enabling anomaly detection and consumer insights at scale.

Ax Run Shao, Ziyu Li, Zhaoyang Zhang, Linrui Xu, Xinran He, Hongyuan Yuan, Bolei He, Yongxing Dai, Yiming Yan, Yijun Chen, Wang Guo, Haifeng Li 20d ago

Asking like Socrates: Socrates helps VLMs understand remote sensing images

Socratic questioning framework improving VLM understanding of remote sensing images by addressing pseudo-reasoning and incomplete perception issues.

Ax Junnan Liu, Hongwei Liu, Songyang Zhang, Kai Chen 20d ago

Rectifying LLM Thought from Lens of Optimization

Analysis of chain-of-thought reasoning in LLMs through an optimization lens, addressing overthinking and performance issues in long-CoT prompting.

Ax Tianxin Xie, Wentao Lei, Kai Jiang, Guanjie Huang, Pengfei Zhang, Chunhui Zhang, Fengji Ma, Haoyu He, Han Zhang, Jiangshan He, Jinting Wang, Linghan Fang, Lufei Gao, Orkesh Ablet, Peihua Zhang, Ruolin Hu, Shengyu Li, Weilin Lin, Xiaoyang Feng, Xinyue Yang, Yan Rong, Yanyun Wang, Zihang Shao, Zelin Zhao, Chenxing Li, Shan Yang, Wenfu Wang, Meng Yu, Dong Yu, Li Liu 20d ago

PhyAVBench: A Challenging Audio Physics-Sensitivity Benchmark for Physically Grounded Text-to-Audio-Video Generation

Benchmark for evaluating physics-grounded audio in text-to-audio-video generation models.

Ax Alex Morehead, Miruna Cretu, Antonia Panescu, Rishabh Anand, Maurice Weiler, Tynan Perez, Samuel Blau, Steven Farrell, Wahid Bhimji, Anubhav Jain, Hrushikesh Sahasrabuddhe, Pietro Lio, Tommi Jaakkola, Rafael Gomez-Bombarelli, Rex Ying, N. Benjamin Erichson, Michael W. Mahoney 20d ago

Zatom-1: A Multimodal Flow Foundation Model for 3D Molecules and Materials

Open-source foundation model for 3D chemical systems combining generative and predictive capabilities for molecules and materials.

Ax Aichen Cai, Anmeng Zhang, Anyu Li, Bo Zhang, Bohua Cai, Chang Li, Changjian Jiang, Changkai Lu, Chao Xue, Chaocai Liang, Cheng Zhang, Dongkai Liu, Fei Wang, Guoqiang Huang, Haijian Ke, Han Lin, Hao Wang, Ji Miao, Jiacheng Zhang, Jialong Shi, Jifeng Zhu, Jingjing Qian, Junhui Luo, Junwu Xiong, Lam So, Liang Huang, Ming Ke, Mingyang Li, Panfeng Shi, Peng Hao, Qi Wang, Qian Lai, Qiaoqiao Yuan, Qingyu Yin, Qiong Cao, Qixiang Wang, Rongcheng Bian, Rongduo Han, Shaoqiang Zheng, Shi Hu, Shi Suo, Shijie Ren, Shijin Zhang, Shiying Fan, Shuai Xie, Tianyi Zhang, Wei Liu, Wentao Tan, Xianghan Meng, Xiaodong He, Xing Pan, Xiran Wang, Xuyang Peng, Ya Zhang, Yang Liu, Yangyang Duan, Yanxu Chen, Yicheng Gong, Yidan Huang, Yifei Liu, Yinhao Bai, Yongqiang Liu, Yuesong Zhang, Yuqi Zhang, Zerui Xie, Zhenfang Wang, Zhennan Shen, Zheyuan Liu, Zhuwei Zeng 20d ago

JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency

JoyAI-LLM Flash: Efficient Mixture-of-Experts language model in sub-50B parameter range, pretrained on 20 trillion tokens with optimized post-training.

Ax Xiaoan Liu, DaeHo Lee, Eric J Gonzalez, Mar Gonzalez-Franco, Ryo Suzuki 20d ago

VisionClaw: Always-On AI Agents through Smart Glasses

VisionClaw: Always-on wearable AI agent on Meta Ray-Ban glasses, integrating egocentric perception with speech-driven OpenClaw task execution.

Ax Ahsan Bilal, Muhammad Ahmed Mohsin, Muhammad Umer, Asad Aali, Muhammad Usman Khanzada, Muhammad Usman Rafique, Zihao He, Emily Fox, Dean F. Hougen 20d ago

$S^3$: Stratified Scaling Search for Test-Time in Diffusion Language Models

S³: stratified scaling search for test-time inference in diffusion language models using classical verifiers to improve generation without additional training.

Ax Apimuk Sornsaeng, Si Min Chan, Wenxuan Zhang, Swee Liang Wong, Joshua Lim, Dario Poletti 20d ago

SMT-AD: a scalable quantum-inspired anomaly detection approach

Quantum-inspired tensor network anomaly detection (SMT-AD) using superposition of bond-dimension-1 matrix product operators with Fourier feature embeddings.

Ax Zihan Wang, Chi Gui, Xing Jin, Qineng Wang, Licheng Liu, Kangrui Wang, Shiqi Chen, Linjie Li, Zhengyuan Yang, Pingyue Zhang, Yiping Lu, Jiajun Wu, Li Fei-Fei, Lijuan Wang, Yejin Choi, Manling Li 20d ago

RAGEN-2: Reasoning Collapse in Agentic RL

RAGEN-2 identifies reasoning collapse in RL-trained multi-turn LLM agents where models use input-agnostic templates despite stable entropy metrics.