Co-Designing Quantum Codes with Transversal Diagonal Gates via Multi-Agent Systems
Multi-agent system with Lean 4 verification layer for exact scientific discovery in quantum code design, combining symbolic synthesis and automated verification.
ATLAS framework combines LLMs with model-driven workflows for generating structured artifacts that satisfy schemas, domain rules, and audit requirements through constraint compilation and validation.
Interpretable model for detecting implicit and explicit hate speech using prototype-based representations for transfer learning.
Research on financial fraud risks from collaborative LLM agents including MultiAgentFraudBench for simulating multi-agent fraud scenarios.
Fairness-aware stroke diagnosis framework combining domain-adversarial training with group distributionally robust optimization.
Synthetic environment generating visual reasoning puzzles with ground-truth solutions across 25 task types for benchmark construction.
Video compression framework using semantic conditioning and diffusion models for ultra-low bitrate encoding.
Benchmark for task-oriented spatio-temporal grounding in egocentric videos for embodied AI agents.
Physics-informed transformer model for socially-aware autonomous driving that learns social interaction dynamics.
Drag-based image editing method using diffusion models with token injection and attention mechanisms for precise visual manipulation.
Theoretical framework for analyzing causal effects at fine-grained levels in high-dimensional data like images and language models.
Interface design study on scaffolding divergent and convergent thinking in human-AI co-creation with generative models.
SWE-EVO benchmark for evaluating AI coding agents on long-horizon software evolution tasks spanning multiple files and iterations.
Research on using LLMs to generate multilingual counterfactual examples for model interpretability across languages.
LLM-based method for categorical data clustering that leverages semantic understanding to measure similarity among attribute values lacking inherent ordering.
Text-driven video reauthoring interface and study exploring how creators can edit video footage through natural language prompts rather than manual editing.
Analysis of fairness in automated decision-making for healthcare emergency triage using process mining and fairness-aware algorithms on empirical data.
Vision-language agent framework combining inverse graphics with interleaved multimodal reasoning for reconstructing images into editable programs with spatial grounding.
Large-scale empirical study analyzing how AI coding agents modify code and describe changes in GitHub pull requests compared to human contributions.
Open-source web platform and course teaching machine learning fundamentals to students aged 12-17 using LEGO robotics without programming.
Method improving language model pretraining by using post-trained models as data sources to instill desired behaviors like safety and reasoning earlier in training.
Few-shot fine-tuned LLM approach for categorizing intermittent CI pipeline failures caused by flaky tests and infrastructure issues rather than code defects.
Information-theoretic framework for optimizing shared visual tokenization in unified multimodal models that perform both image understanding and generation.
Safety alignment approach for Mixture-of-Experts language models addressing unique challenges from sparse routing mechanisms during fine-tuning.
Benchmark framework for evaluating multimodal large language models on spatio-temporal bimanual coordination tasks requiring synchronized multi-stream integration.
Method using linear probes on LLM pre-generation activations to predict success likelihood before generation, enabling selective deployment of expensive extended reasoning.
Study examining emergent social behavior and interactions among large-scale communities of AI agents on MoltBook, a social platform designed for agent-agent communication.
Token-level noise filtering framework for LLM fine-tuning datasets that identifies and explains problematic tokens to improve downstream task performance.
Language model based on continuous flows over token embeddings demonstrating faster generation than discrete diffusion and autoregressive models with improved few-step quality.
Open-source framework unifying rubric-based LLM evaluation techniques including ensemble judging, bias mitigation, and few-shot calibration with consistent implementation.
Research on human-AI agent collaboration exploring how agents can maintain workspace awareness and interpret concurrent user actions on shared artifacts during co-creative tasks.
Research on multi-agent reinforcement learning algorithm (NePPO) addressing training stability and convergence in general-sum games with heterogeneous agents.
PlayWorld pipeline training action-conditioned video models on autonomous robot play data for improved world model physics prediction.
Five prompt engineering strategies to reduce LLM hallucinations and improve consistency in industrial applications like design and IoT.
Human-LLM collaboration developing structural framework for Collatz map dynamics with theoretical proofs.
Hindsight-anchored policy optimization method addressing advantage collapse in sparse-reward RL for reasoning model post-training.
UtilityMax Prompting framework using formal mathematical language to specify multi-objective LLM tasks with influence diagrams.
Controlled experiments showing that LMs prefer correct answers because the compressibility structure of errors guides learning, not because of an inherent preference for truth.
Perplexity's recommendations on security considerations for frontier AI agents based on operating agentic systems at scale.
Quality diversity optimization for red-teaming vision-language-action robot models to improve robustness against prompt variations.
Brittlebench framework quantifying LLM robustness through prompt sensitivity evaluation beyond static benchmarks.
Contextual data fusion framework integrating vehicle sensors with environmental signals for predictive maintenance in connected vehicles.
Proposes a generate-then-correct method for aspect sentiment quad prediction in fine-grained opinion mining tasks.
Proposes parallel framework combining imitation and reinforcement learning for end-to-end autonomous driving instead of sequential fine-tuning.
Studies causal discovery in chain-reaction dynamical systems using interventional data with identifiability guarantees.
Philosophical analysis of moral dimensions in human-AI companion interactions and provider control structures.
Framework using LLMs to automatically synthesize reward programs for cooperative multi-agent reinforcement learning systems.
Combines flow matching with reward optimization for trajectory forecasting in autonomous driving and crowd surveillance scenarios.
Proposes a multimodal deception detection dataset and uses GSR-guided distillation to improve non-contact detection.
Introduces StackRepoQA, a repository-level QA benchmark for evaluating LLMs on multi-file program comprehension tasks beyond isolated code snippets.