Adaptive serverless resource management framework using slot-survival prediction and event-driven architecture to optimize cold start latency and utilization.
OntoTKGE model for temporal knowledge graph extrapolation leveraging ontological knowledge to handle sparse historical interactions and enable behavioral pattern inheritance.
GMRL-BD algorithm using bias-diffusion and multi-agent RL to detect untrustworthy topic boundaries for LLMs, identifying domains where model answers cannot be reliably trusted.
Auditable Agents framework establishing accountability, auditability, and auditing definitions for LLM agents with external effects, addressing post-deployment answerability.
SCMAPR stage-wise multi-agent refinement framework for complex scenario text-to-video generation that refines and self-corrects ambiguous prompts through agent collaboration.
Thinking Diffusion method adding reasoning penalization and guidance to diffusion multimodal LLMs, which combine Chain-of-Thought reasoning with parallel generation capabilities.
OmniDiagram unified framework for code generation across diverse diagram types and languages using visual interrogation reward for alignment with visual specifications.
UniCreative approach using reference-free reinforcement learning to balance long-form coherence and short-form expressiveness in LLM-based creative writing generation.
Market-Bench comprehensive benchmark evaluating LLM capabilities in economically relevant tasks via configurable multi-agent supply chain model with LLM retailer agents.
ActivityEditor dual-LLM-agent framework for zero-shot cross-regional human trajectory generation, synthesizing physically valid mobility patterns without region-specific historical data.
Analysis of 12,007 rank-invariant pseudo-Boolean landscapes introducing stronger notion of rank landscape equivalence under translation and rotation symmetries.
Echo memory framework for multimodal LLM agents enabling transfer of reusable knowledge across Minecraft tasks by decomposing experience into five interpretable dimensions.
SignalClaw framework using LLMs as evolutionary skill generators to synthesize interpretable traffic signal control strategies balancing effectiveness and explainability.
Introduces Tree Decision Diagrams generalizing OBDD for Boolean function representation with improved succinctness and tractable operations like model counting and conditioning.
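The OBDDs that these Tree Decision Diagrams generalize support model counting through Shannon expansion: branch on a variable and sum the counts of both cofactors. A minimal stdlib-only sketch of that generic operation (illustrative only; the paper's data structure and its sharing-based succinctness are not reproduced here):

```python
def model_count(f, n_vars, assignment=()):
    """Count satisfying assignments of a Boolean function f by Shannon
    expansion: branch on the next variable and sum both cofactor counts
    (the decomposition OBDD-style diagrams make tractable via node sharing)."""
    if len(assignment) == n_vars:
        return 1 if f(assignment) else 0
    return (model_count(f, n_vars, assignment + (False,)) +
            model_count(f, n_vars, assignment + (True,)))

# Example: 3-input XOR has 4 of 8 satisfying assignments (odd parity).
xor3 = lambda a: a[0] ^ a[1] ^ a[2]
count = model_count(xor3, 3)
```

A diagram representation avoids this exponential recursion by merging identical subfunctions; the sketch shows only the underlying decomposition.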
Neurosymbolic approach combining LLMs with Logic Tensor Networks for auditable offer validation in regulated procurement, ensuring factually correct and legally verifiable decisions.
COSMO-Agent tool-augmented RL framework teaching LLMs to bridge CAD-CAE gap by translating simulation feedback into valid geometric edits for iterative industrial design optimization.
ResearchEVO framework for automated scientific discovery using LLMs to conduct undirected experimentation and generate explanations, instantiating discover-then-explain paradigm computationally.
Research on LLM-as-a-Judge showing, via counterfactual design and eye-tracking, that both humans and LLMs favor content labeled as human-authored over identical content labeled as AI-generated.
Philosophical critique of behavioral evaluation paradigms for AI systems and proposal for cognitive assessment methods.
PECKER algorithm for efficient machine unlearning in diffusion models with directed gradient updates.
CuraLight framework combining RL and LLMs for traffic signal control with debate-guided data curation.
LudoBench benchmark evaluating LLM strategic reasoning in Ludo board game with 480 handcrafted scenarios.
Quality-aware mixture of experts for multimodal sentiment analysis robust to noise and modality missingness.
Unlearn-and-Reinvent pipeline testing whether LLMs can rediscover foundational algorithms after unlearning removal.
Study on cultural evolution showing minimal social learning can transmit higher-level representations without inference.
Hierarchical RL framework (STEP-HRL) for LLM agents using step-level transitions to reduce computational cost and history length.
Vision-language model critic for automated iterative refinement of frontend code generation with visual feedback loops.
Open-source framework for autonomous LLM agents conducting deep learning experiments with hypothesis formation, training, and iterative refinement.
Diagnostic framework determining when LLMs are necessary for contextual multi-armed bandits with textual and numerical context.
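For context, a numeric-context baseline of the kind such a diagnostic would compare an LLM policy against: a generic epsilon-greedy contextual bandit with per-(context, arm) running-mean reward estimates. This is a standard textbook sketch, not the framework's method; all names here are illustrative.

```python
import random

def eps_greedy_bandit(contexts, reward_fn, n_arms, eps=0.1, seed=0):
    """Epsilon-greedy contextual bandit: with probability eps explore a
    random arm, otherwise exploit the arm with the best running-mean
    reward estimate for the current context. Returns total reward."""
    rng = random.Random(seed)
    counts, means = {}, {}
    total = 0.0
    for ctx in contexts:
        if rng.random() < eps:
            arm = rng.randrange(n_arms)  # explore
        else:
            arm = max(range(n_arms), key=lambda a: means.get((ctx, a), 0.0))
        r = reward_fn(ctx, arm)
        key = (ctx, arm)
        counts[key] = counts.get(key, 0) + 1
        prev = means.get(key, 0.0)
        means[key] = prev + (r - prev) / counts[key]  # incremental mean
        total += r
    return total
```

When a tabular baseline like this already learns the context-to-arm mapping, an LLM adds cost without benefit; the diagnostic question is when textual context makes the LLM necessary.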
JTON format, JSON superset with Zen Grid encoding for token-efficient structured data processing in LLMs.
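The token-saving idea behind grid encodings of homogeneous JSON records can be sketched generically: emit the keys once as a header plus value rows instead of repeating keys per record. This is an assumption-laden illustration of the general technique only; the actual JTON/Zen Grid format is not specified here and is not reproduced.

```python
import json

def to_grid(records):
    """Generic grid encoding of homogeneous JSON records: keys appear once
    in a header, values as rows, avoiding per-record key repetition
    (the repetition that inflates token counts for LLM input)."""
    keys = list(records[0])
    return {"header": keys, "rows": [[r[k] for k in keys] for r in records]}

def from_grid(grid):
    """Lossless inverse: rebuild the original list of objects."""
    return [dict(zip(grid["header"], row)) for row in grid["rows"]]

records = [{"id": i, "name": f"item{i}", "qty": i * 2} for i in range(50)]
plain = json.dumps(records)
grid = json.dumps(to_grid(records))
# The grid form is shorter whenever keys repeat across many records.
```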
Joint knowledge base completion and QA using combined large and small language models for KB-related tasks.
KV cache compression technique for multimodal LLM inference, reducing memory overhead and latency with hybrid compression strategy.
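One common primitive in hybrid KV-cache compression schemes is low-bit quantization of cached keys/values; a minimal per-block absmax int8 sketch of that generic primitive (not the paper's specific strategy):

```python
def quantize_kv(block):
    """Per-block absmax int8 quantization: store int8 values plus one
    float scale per block, cutting memory roughly 4x vs. float32."""
    scale = max((abs(x) for x in block), default=0.0) / 127 or 1.0
    return [round(x / scale) for x in block], scale

def dequantize_kv(q, scale):
    """Recover approximate floats; error is bounded by scale / 2."""
    return [v * scale for v in q]

kv = [0.12, -0.5, 0.33, 0.0]
q, s = quantize_kv(kv)
approx = dequantize_kv(q, s)
```

Real systems combine such quantization with eviction or token merging; the trade-off is reconstruction error against memory and latency savings.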
Architecture for value-driven LLM agents addressing behavioral rigidity through context-value-action design.
Foundation model enabling single GPT-based agent to perform across diverse multi-agent reinforcement learning tasks and environments.
Research agent framework for generating trustworthy reports with confidence estimation and calibration mechanisms.
Multi-objective preference alignment for LLMs using Pareto-lenient consensus to handle diverse human values in model training.
AI agents for retail supply chain operations, automating demand forecasting, procurement, and inventory replenishment in supermarket chains.
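As a point of reference for the replenishment decision such agents automate, a classic moving-average forecast with a reorder-point rule; a hypothetical baseline sketch, not the described system, with all parameter names invented for illustration:

```python
def reorder_quantity(history, on_hand, lead_time=2, safety=0.2):
    """Order enough to cover forecast demand over the lead time plus a
    safety buffer, net of stock on hand. Forecast = 7-day moving average."""
    forecast = sum(history[-7:]) / min(len(history), 7)  # avg daily demand
    target = forecast * lead_time * (1 + safety)         # lead-time cover + buffer
    return max(0, round(target - on_hand))
```

An agentic system would replace the fixed moving average with learned forecasts and negotiate procurement, but the inventory-position arithmetic is the same.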
Proposes epistemic blinding, an inference-time auditing protocol to separate memorized priors from data-driven inference in LLM-assisted agentic analysis systems.
Investigates instruction-following mechanisms in LLMs through diagnostic probing, finding evidence for compositional skill deployment over universal mechanism.
Proposes ACE-Bench, agent evaluation benchmark with unified grid-based planning tasks, lightweight environments, and configurable difficulty/horizon control.
Introduces Claw-Eval, an end-to-end evaluation suite for autonomous agents addressing trajectory-opaque grading, safety, and interaction modality coverage.
Theoretical analysis of contextuality in quantum information systems as external bookkeeping cost under classical simulation.
Proposes Web Retrieval-Aware Chunking (W-RAC) for efficient RAG document chunking to balance retrieval quality, latency, and cost on web-scale content.
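The baseline that retrieval-aware chunking schemes improve on is fixed-size chunking with overlap; a minimal sketch of that baseline (the W-RAC scheme itself is not described here):

```python
def chunk_text(text, size=400, overlap=50):
    """Split text into fixed-size chunks with overlap so sentences cut at
    a boundary still appear whole in the neighboring chunk. Chunk size
    trades retrieval precision against index size and latency."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Retrieval-aware schemes instead adapt boundaries and sizes to content and query cost, which is the balance the summary describes.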
Proposes Task-Driven Alignment (TDA-RC) for improving reasoning chains in LLMs by bridging logical gaps between CoT and multi-round thought paradigms.
Evaluates bidirectional training objectives (MLM, masked attention) to mitigate the reversal curse in autoregressive language models.
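The bidirectional objective in question is standard masked language modeling: hide random tokens and predict them from both directions, in contrast to the left-to-right factorization associated with the reversal curse. A generic data-preparation sketch of that objective (illustrative; not the study's exact masking recipe):

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_id="[MASK]", seed=0):
    """Randomly replace tokens with a mask symbol for an MLM objective.
    Labels hold the original token at masked positions and None elsewhere,
    so loss is computed only where the model must fill in the blank."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            inputs.append(mask_id)
            labels.append(tok)      # predict the original token here
        else:
            inputs.append(tok)
            labels.append(None)     # no loss at unmasked positions
    return inputs, labels
```

Because masked positions condition on context from both sides, "A is B" training also exposes the model to "B follows from A" orderings.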
Introduces Inclusion-of-Thoughts (IoT), a strategy to reduce LLM instability on multiple-choice questions by filtering irrelevant distractors.
Proposes SUMMIR framework for ranking sports insights extracted by LLMs, addressing hallucinations with 7,900-article dataset across four sports.
Evaluates four open-source PDF-to-Markdown conversion frameworks (Docling, MinerU, Marker, DeepSeek OCR) for RAG document preprocessing impact on QA accuracy.
Studies how to design information retrieval systems for LLM agents versus humans, proposing learning-to-rank methods for agent trajectories.
Analysis of how generative AI enables social engineering fraud and trust manipulation attacks in financial crime scenarios.