Junlong Jia, Ziyang Chen, Xing Wu, Chaochen Gao, TingHao Yu, Feng Zhang, Songlin Hu 7d ago

PolicyLong: Towards On-Policy Context Extension

Method for extending LLM context windows that uses on-policy data synthesis to keep training data aligned with the model's capabilities throughout training.

Ivan Tjuawinata, Andre Gunawan, Anh Quan Tran, Nitish Kumar, Payal Pote, Harsh Bansal, Chu-Hung Chi, Kwok-Yan Lam, Parventanis Murthy 7d ago

A Systematic Framework for Tabular Data Disentanglement

Framework that disentangles tabular data by transforming complex attribute relationships into latent variables with reduced interdependencies.

Eleni Triantafillou, Ahmed Imtiaz Humayun, Monica Ribero, Alexander Matt Turner, Michael C. Mozer, Georgios Kaissis 7d ago

Is your algorithm unlearning or untraining?

Research clarifying the distinction between machine unlearning and untraining, two different approaches to removing data points or behaviors from trained models.

Zigeng Chen, Gongfan Fang, Xinyin Ma, Ruonan Yu, Xinchao Wang 7d ago

DMax: Aggressive Parallel Decoding for dLLMs

DMax enables efficient parallel decoding in diffusion language models through progressive self-refinement.

Andrey Bocharnikov, Ivan Ermakov, Denis Kuznedelev, Vyacheslav Zhdanovskiy, Yegor Yershov 7d ago

KV Cache Offloading for Context-Intensive Tasks

KV cache offloading technique that reduces memory and latency overhead during long-context LLM inference.