Ax Esakkivel Esakkiraja, Sai Rajeswar, Denis Akhiyarov, Rajagopal Venkatesaramani 1d ago

Therefore I am. I Think

Linear-probe analysis showing that LLM reasoning models encode their decisions before generating chain-of-thought explanations.

Ax Shin'ya Yamaguchi, Kosuke Nishida, Daiki Chijiwa, Yasutoshi Ida 1d ago

Zero-shot Concept Bottleneck Models

Zero-shot concept bottleneck models that make interpretable predictions without any training on the target task.

Ax Jialin Yang, Dongfu Jiang, Lipeng He, Sherman Siu, Yuxuan Zhang, Disen Liao, Zhuofeng Li, Huaye Zeng, Yiming Jia, Haozhe Wang, Benjamin Schneider, Chi Ruan, Wentao Ma, Zhiheng Lyu, Yifei Wang, Yi Lu, Quy Duc Do, Ziyan Jiang, Ping Nie, Wenhu Chen 1d ago

StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs

StructEval benchmark systematically evaluates LLM capabilities in generating structured outputs across JSON, HTML, React, SVG and other formats.

Ax Rohit Kundu, Vishal Mohanty, Hao Xiong, Shan Jia, Athula Balachandran, Amit K. Roy-Chowdhury 1d ago

SAGA: Source Attribution of Generative AI Videos

SAGA: framework for source attribution of AI-generated videos that identifies the specific generative model used, rather than making a binary real/fake call.

Ax Chengqi Dong, Chuhuai Yue, Hang He, Rongge Mao, Fenghe Tang, S Kevin Zhou, Zekun Xu, Xiaohan Wang, Jiajun Chai, Guojun Yin 1d ago

Training Multi-Image Vision Agents via End2End Reinforcement Learning

IMAgent: open-source visual agent trained with end-to-end RL for multi-image reasoning tasks, addressing limitations of single-image VLM agents.

Ax Sashuai Zhou, Qiang Zhou, Jijin Hu, Hanqing Yang, Yue Cao, Junpeng Ma, Yinchao Ma, Jun Song, Tiezheng Ge, Cheng Yu, Bo Zheng, Zhou Zhao 1d ago

Unified Thinker: A General Reasoning Modular Core for Image Generation

Open-source image generation model with improved reasoning for logic-intensive instruction following, closing the gap to closed-source systems.

Ax Xiangyang Zhu, Yuan Tian, Qi Jia, Kaiwei Zhang, Zicheng Zhang, Chunyi Li, Kaiyuan Ji, Dongrui Liu, Zijian Chen, Lu Sun, Renrui Zhang, Yan Teng, Jing Shao, Wei Sun, Xia Hu, Yu Qiao, Guangtao Zhai 1d ago

SafeSci: Safety Evaluation of Large Language Models in Science Domains and Beyond

SafeSci: comprehensive benchmark and framework for evaluating LLM safety in scientific domains with multi-domain risk coverage and objective evaluation.